Help




ProGlycProt Tutorial


  • What is ProGlycProt?
  • ProGlycProt is a manually curated, exclusive repository of comprehensive information on experimentally characterized glycoproteins and protein glycosyltransferases that belong to eubacteria and archaea.

    For the benefit of users, the ProGlycProt repository is arranged in two sub-databases: ProGPdb and ProGTdb.

    ProGPdb: a compilation of experimentally validated bacterial and archaeal glycoproteins. Entries in ProGPdb are arranged in chronological order, and each entry has a unique identifier, the ProGP ID. ProGPdb is further arranged in two sections:

    1) ProCGP (Prokaryotic Characterized Glycoproteins): a compilation of prokaryotic glycoproteins for which at least one glycosite, i.e. the glycosylated residue, has been identified through experiments such as Edman degradation, mass spectrometry, or site-directed mutagenesis.

    2) ProUGP (Prokaryotic Uncharacterized Glycoproteins): a list of prokaryotic glycoproteins for which glycosylation is known from experiments such as aberrant migration on SDS-PAGE, lectin binding, or sugar-specific staining, but the glycosites are not. In addition, ProGlycProt provides a separate structure gallery, relevant links, and two new tools developed with the new sequon information obtained from the literature on prokaryotic glycoproteins in mind.

    ProGTdb: compiles and provides extensive information about experimentally characterized enzymes (GTs) involved in N-, O- and S-glycosylation in bacteria and archaea. It provides manually curated information about the native organism, genome, gene, and protein, along with a detailed description of the substrate specificity, catalytic linkages, mechanism, structure, and the experimental strategies/methodologies used for the biochemical, genetic and/or biophysical characterization of the glycosyltransferase activity of the enzyme. ProGTdb contains not only entries that are cross-linked to the corresponding ProGP IDs but also entries for which the acceptor substrate is either a synthetic peptide or a eukaryotic protein. Each entry in ProGT_Main and ProGT_Accessory has a unique identifier, the ProGT ID, assigned chronologically. ProGTdb is composed of the following two sections:

    1) ProGT_Main is a compilation of bacterial and archaeal protein glycosyltransferases for which at least one piece of genetic or biochemical evidence of glycosyltransferase activity is validated in the published literature. Each entry in ProGT_Main has a unique identifier, the ProGT ID.

    2) ProGT_Accessory is a compilation of bacterial and archaeal proteins/enzymes that have miscellaneous accessory roles in the protein glycosylation pathway of a characterized protein glycosyltransferase compiled in ProGT_Main. The qualifying criterion is at least one known piece of genetic or biochemical evidence of accessory function in the literature. Each entry in ProGT_Accessory has a unique identifier, the ProGT ID.
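    The section criteria described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not ProGlycProt's actual schema or code: the record fields, function names, and the choice to check ProGT_Main before ProGT_Accessory are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GlycoproteinRecord:
    """Illustrative record; field names are assumptions, not ProGlycProt's schema."""
    name: str
    # Experimentally identified glycosylated residues, e.g. ["Asn-10"].
    glycosites: List[str] = field(default_factory=list)
    # Experiments that showed glycosylation, e.g. ["lectin binding"].
    detection_methods: List[str] = field(default_factory=list)


def progpdb_section(record: GlycoproteinRecord) -> str:
    """Pick the ProGPdb section using the criteria stated in the text."""
    if record.glycosites:
        # At least one glycosite has been identified by experiment.
        return "ProCGP"
    if record.detection_methods:
        # Glycosylation is shown experimentally, but glycosites are unknown.
        return "ProUGP"
    raise ValueError("no experimental evidence of glycosylation")


def progtdb_section(gt_activity_evidence: int, accessory_evidence: int) -> str:
    """Pick the ProGTdb section from counts of published evidence.

    Checking glycosyltransferase activity first is an assumption of this sketch.
    """
    if gt_activity_evidence >= 1:
        return "ProGT_Main"
    if accessory_evidence >= 1:
        return "ProGT_Accessory"
    raise ValueError("no published genetic or biochemical evidence")


# Made-up example record, for illustration only.
example = GlycoproteinRecord(
    "example protein",
    glycosites=["Asn-10"],
    detection_methods=["mass spectrometry"],
)
print(progpdb_section(example))  # ProCGP
```

    A record with detection methods but no glycosites would fall under ProUGP, and a record with neither is rejected, mirroring the requirement that every entry be experimentally validated.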
  • What is the rationale and vision behind creation of ProGlycProt?
  • Once thought to be restricted to eukaryotes, glycoproteins are now known to exist in almost all major phyla of prokaryotes. In the last ten years, many new prokaryotic glycoproteins have been characterized for the precise location of glycosites, indicating a rising interest in the biology of protein glycosylation in prokaryotes. However, according to our search, none of the currently available protein informatics or glycoinformatics resources provides comprehensive and exclusive information about these experimentally identified or characterized glycoproteins of archaea and eubacteria. ProGlycProt is therefore designed to fill this void. The long-term vision of ProGlycProt is to provide as much as possible of the experimental information available in the published literature about these glycoproteins and to link it with the experimental information on the mechanisms and genetic machinery involved in their glycosylation.
  • What is different or unique in ProGlycProt?
  • To the best of our knowledge, ProGlycProt is the first manually curated, exclusive database of experimentally detected/characterized glycoproteins of prokaryotic origin. Though parts of the data available at ProGlycProt can be retrieved from SwissProt/PDB/BCSDB/O-GlycBase/CAZY, unlike all these repositories ProGlycProt is an exclusive compilation of experimentally validated glycoproteins, and of the enzymes instrumental in their glycosylation, in prokaryotes only. Furthermore, ProGlycProt provides additional information about a given glycoprotein, such as: the full structure of the attached glycan (IUPAC linear notation) as available from the literature, information about glycosylation-linked genes, the experimental methods used to characterize the glycoprotein, year of detection, year of characterization, observed sequon features, manual annotation of all characterized glycoprotein sequences to incorporate mutational changes/sequence conflicts/in vitro or in vivo engineered sequences, and a visual display of glycosites.

    Similarly, enzymes involved in the glycosylation (and glycosylation pathway) of a given protein are also described in detail. Manually curated information is provided on the native organism, genome, gene, and protein, along with a detailed description of the substrate specificity, catalytic linkages, mechanism, structure, and the experimental strategies/methodologies used for the biochemical, genetic and/or biophysical characterization of the glycosyltransferase activity of the enzyme. The first release of ProGlycProt contained information about at least 108 experimentally known glycoproteins that were absent from SwissProt because the genomes of the related organisms were unsequenced, and as many as 69 proteins, including peptides and engineered proteins, for which glycosites were not yet annotated in any of the above-mentioned protein databases.

    The second release provides at least 48 unique entries for protein glycosyltransferases that are not yet listed in CAZY. Further, of the 181 characterized GTs listed in ProGTdb, only 46 are listed as characterized in CAZY as of now. ProGlycProt additionally provides information about the in vitro engineered/evolved variants of these GTs, which are not compiled in CAZY.
  • Does ProGlycProt represent all known prokaryotic glycoproteins?
  • As of now and to the best of our knowledge, ProGlycProtdb represents the largest compilation of characterized prokaryotic glycoproteins available online, covering entries from 1968 to April 2017. Though we do not claim that ProCGP and ProGTdb are complete compilations, we have made our best effort to incorporate all the data we could find in the literature, continuing until our searches stopped returning further references, up to April 30, 2017, for the second release. Further, we have tried to provide information that is as accurate as possible, yet we encourage users to refer to the original literature cited alongside.

    On the other hand, ProUGP is compiled from various brief seed compilations available in some of the published reviews (included in the bibliography) and from other available research publications. In view of the increasing availability of data on the detection of new glycoproteins using high-throughput techniques like mass spectrometry, we believe ProUGP will keep growing exponentially in future updates. In the second release, we have aimed to provide maximal information about protein glycosylation, the enzymes involved, and the pathways defined, in one place and in a fully cross-referenced and searchable format, with information on several additional aspects.
  • What is the Scope of ProGlycProt?
  • In the last two decades, considerable interest has been generated in studying glycoproteins and the mechanisms of their glycosylation in bacteria and archaea. In the first release, we tried to provide all-round information about a number of glycoproteins that are implicated in virulence, host-pathogen interactions, immune modulation, disease diagnosis, and vaccination. The second release provides a 42% increase in the number of database entries, with a major expansion in the compilation of experimentally validated proteins/enzymes involved in protein glycosylation in prokaryotes.

    Staying true to the promises we made during the first release of ProGlycProt, in the second release we have improved and expanded the repository to:

    1. Include extensive experimental information about prokaryotic Oligosaccharyl Transferases (OSTs), Glycosyl Transferases (GTs) and other accessory proteins or enzymes.

    2. Facilitate better retrieval of biologically relevant and experiment-oriented information about each entry. The second release provides Search by Features and Compare entries tools, as well as Map/Location based search of a database entry and its associated research groups/laboratories.

    3. Address the gap in structural and image inputs for glycan entries corresponding to ProGlycProt entries: we are in the process of linking these entries to the international glycan structure repository (https://glytoucan.org/) made available recently.
  • Future Plans for ProGlycProt?
  • Apart from continued compilation of bacterial and archaeal glycoproteins and glycosyltransferases, in future updates we aim to expand the information on directed evolution and applications of the glycosyltransferases.


Glossary of Terms used in ProGlycProt

S.No. | Term/Acronym | Definition
--- | --- | ---
1. | AAL | Aleuria Aurantia Lectin
2. | ABEE | p-Aminobenzoic acid ethyl ester
3. | Amino sugar | Monosaccharide with one hydroxyl group (-OH) replaced by an amine group (-NH2).
4. | Bac | Bacillosamine (2,4-diacetamido-2,4,6-trideoxyglucopyranose).
5. | CAD/CID | Collision-Activated (or -Induced) Dissociation
6. | CapLC-MS/MS | Capillary Liquid Chromatography-Tandem Mass Spectrometry
7. | S (Cys) linked glycosylation | Refers to the covalent linkage between a glycan and the sulphur atom of a cysteine residue in a protein sequence
8. | COSY | Correlated Spectroscopy
9. | DATDH | 2,4-Diacetamido-2,4,6-Trideoxyhexose
10. | Deglycosylation | Removal of glycans from glycoproteins by chemical or enzymatic methods.
11. | DIG Glycan detection | Method of detection of Digoxigenin (DIG)-labeled glycoconjugates using enzyme immunoassay
12. | Dolichol | An isoprenoid lipid with 15-19 isoprenoid units and a terminal phosphorylated hydroxyl group. Dolichol acts as a membrane-bound carrier for sugars in the synthesis of glycoproteins and glycolipids
13. | DQF-COSY | Double Quantum Filtered Correlation Spectroscopy
14. | ECD | Electron Capture Dissociation
15. | Endo Hf | Endoglycosidase H leaves one GlcNAc residue attached to Asn by cleaving between the two GlcNAc residues of the N-glycan core.
16. | Engineered glycoprotein | A naturally unglycosylated protein or a synthetic peptide that is glycosylated in vitro or in vivo by chemical or enzymatic methods (usually after mutation of one or a few residues). Such proteins are also termed neoglycoproteins.
17. | ESI Q-TOF-MS | Electrospray Ionization Quadrupole Time Of Flight Mass Spectrometry
18. | ETD | Electron Transfer Dissociation
19. | FAB-MS | Fast Atom Bombardment-Mass Spectrometry
20. | FT-ICR-MS | Fourier Transform Ion Cyclotron Resonance Mass Spectrometry
21. | Fuc | Fucose
22. | FucNAc | N-Acetylfucosamine
23. | GAGs | Glycosaminoglycans
24. | Gal | Galactose
25. | GalNAc | N-Acetyl-D-Galactosamine
26. | GATDH | 2-Acetamido-4-Glyceramido-2,4,6-Trideoxyhexose or 2-Glyceramido-4-Acetamido-2,4,6-Trideoxyhexose
27. | GC | Gas Chromatography
28. | GC-MS | Gas Chromatography-Mass Spectrometry
29. | Glc | Glucose
30. | GlcA | Glucuronic acid
31. | GlcNAc | N-Acetyl-D-Glucosamine (NAG)
32. | Glycoform | One of the differentially glycosylated forms of a glycoprotein. Glycoforms of a glycoprotein have the same protein sequence but differ in the number and/or structure of oligosaccharides attached
33. | Glycoprotein | Protein with one or more covalently bound glycans added as a co-translational or post-translational modification. The glycan may be a monosaccharide, an oligosaccharide or a polysaccharide.
34. | Glycosidase | Enzyme catalyzing the hydrolysis of a glycosidic linkage
35. | Glycosidic linkage (bond) | The bond linking monosaccharides in disaccharides and polysaccharides. Formed by a condensation reaction between two OH groups, one from each of the two monosaccharides.
36. | Glycosite | An amino acid residue where glycosylation occurs in a protein sequence.
37. | Glycosyltransferase (GT) | Enzyme (with EC 2.4.X.X) catalyzing the transfer of a sugar from a nucleotide (nucleoside phosphate) sugar donor to an acceptor substrate to form a glycosidic linkage
38. | HMBC | Heteronuclear Multiple Bond Coherence
39. | HMQC | Heteronuclear Multiple Quantum Correlation
40. | HPAEC | High-Performance Anion-Exchange Chromatography
41. | HPLC | High Pressure Liquid Chromatography
42. | HSQC | Heteronuclear Single Quantum Coherence
43. | IdoA | Iduronic Acid
44. | Lectin | A glycan binding protein with a carbohydrate-recognition domain (CRD) homologous to the sugar binding region of leguminous plant lectin.
45. | LFA | Limax Flavus Agglutinin (Sialic acid-specific lectin)
46. | MALDI-TOF MS | Matrix Assisted Laser Desorption/Ionization Time Of Flight Mass Spectrometry
47. | Man | Mannose
48. | MS-MS | Tandem Mass Spectrometry
49. | Nano-LC-MS/MS | Nano Liquid Chromatography-Tandem Mass Spectrometry
50. | nESI-feCID-MS/MS | Nano-Electrospray Ionization-Front-End Collision-Induced Dissociation Tandem Mass Spectrometry
51. | NeuNAc (NANA) | N-Acetyl Neuraminic Acid (Sialic acid)
52. | N (Asn) linked glycosylation | Refers to the covalent linkage between a glycan and the amide nitrogen of an asparagine residue in a protein sequence
53. | NMR | Nuclear Magnetic Resonance
54. | NOESY | Nuclear Overhauser Effect Spectroscopy
55. | O (Ser/Thr/Tyr) linked glycosylation | Refers to the covalent linkage between a glycan and the oxygen of the hydroxyl group of serine, threonine or tyrosine in a protein sequence
56. | OST | Oligo Saccharyl Transferase, the enzyme responsible for catalyzing the transfer of a precurs