Web Services for N-Glycosylation Process Integrated Technology Resource for Biomedical Glycomics NCRR/NIH Satya S. Sahoo, Amit P. Sheth, William S. York,

1 Web Services for N-Glycosylation Process
Integrated Technology Resource for Biomedical Glycomics, NCRR/NIH
Satya S. Sahoo, Amit P. Sheth, William S. York, John A. Miller
Presentation at the International Symposium on Web Services for Computational Biology and Bioinformatics, VBI, Blacksburg, VA, May 26-27, 2005

2 Glycomics: the study of the structure, function and quantity of the complex carbohydrates synthesized by an organism
Glycosylation: the addition of carbohydrates to the basic protein structure
[Figure: folded protein structure (schematic)]

3 Glycosylation – why is it important?
- The genome (comprised of DNA) and the proteome (proteins) are not the only factors in the life functions of an organism
- Carbohydrates attached to different protein structures (by glycosylation) are important for:
  - Identification of foreign entities by immune system cells
  - Markers to accurately diagnose diseases
  - Regulation of signaling activities
- Glycosylation is categorized by the way carbohydrates are attached to proteins. Example: N-glycosylation

4 N-Glycosylation Process (NGP)
By N-glycosylation Process, we mean the identification and quantification of glycopeptides.
[Figure: NGP workflow diagram. Materials and data shown: Cell Culture, Glycoprotein Fraction, Glycopeptides Fraction, Peptide Fraction, ms data, ms/ms data, ms peaklist, ms/ms peaklist, Peptide list, N-dimensional array, Glycopeptide identification and quantification. Steps shown: extract, proteolysis, Separation technique I, PNGase, Separation technique II, Mass spectrometry, Data reduction, binning, Peptide identification, Signal integration, Data correlation. Multiplicity labels: 1, n, n*m.]

5 NGP – part of the Bioinformatics core
Integrated Technology Resource for Biomedical Glycomics
- This Resource was established by the National Center for Research Resources
- The aim is to develop the tools and technology to analyze glycoprotein and glycolipid expression of embryonic stem cells
- Our research provides bioinformatics support for four research groups:
  - Embryonic Stem Cell Culture Program
  - Glycomic Analysis of Glycoproteins
  - Glycomic Analyses of Glycosphingolipids and Sphingolipids
  - Transcript analysis by kinetic RT-PCR

6 NGP – need in Glycomics
- Unlike proteomics or genomics, high-throughput experimental protocols are still being established in glycomics
- NGP involves a multitude of heterogeneous tasks, including human-mediated tasks
- NGP attempts to encapsulate particular computational steps as platform-independent, scalable and Web-accessible tools: Web Services
- This enables glycobiologists to integrate automated data generation tasks with data processing tools (Web Services) into an end-to-end experimental lifecycle

7 N-Glycosylation identification - Problems
- It is extremely difficult to identify glycosylated peptide sequences using standard analytical methods
- N-glycosylation occurs at particular sites on the protein structure: consensus sequences
[Figure: an example glycopeptide (schematic). Labels: Peptide, Glycan, Consensus Sequence (N, X, S/T), PNGase F, Asparagine, Aspartate, D, J]
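A minimal sketch of how consensus sites might be located in a protein sequence, assuming the consensus sequence is the N-X-S/T motif shown in the figure. The function name `find_sequon_sites` and the example sequence are hypothetical, and excluding proline at position X is an assumption added here, not something stated on the slide.

```python
import re

# N-X-S/T consensus sequence. The lookahead keeps the match to the single
# N residue, so overlapping sites (e.g. "NNTS") are all reported.
# Excluding proline at X is an assumption of this sketch.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequon_sites(sequence: str) -> list[int]:
    """Return 0-based positions of N residues that start an N-X-S/T motif."""
    return [match.start() for match in SEQUON.finditer(sequence.upper())]


# Made-up sequence for illustration only.
sites = find_sequon_sites("MKNGSAANPTLNNTS")
print(sites)
```

The N at position 7 is skipped because it is followed by proline; the other three asparagines each start a consensus site.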

8 NGP - implementation
- NGP currently implements a Web Process constituted of two Web Services:
  - DB Modifier Web Service: modifies the search database by replacing N (in consensus sequences) by J
  - Collator Web Service: identifies a probable N-glycosylated peptide, using three parameters:
    - Calculated molecular mass
    - Presence of 'J' in a peptide sequence
    - MASCOT score assigned to a hit
- NGP also involves a proprietary mass spectrometer search engine service (MASCOT) as an intermediate task
- Hence, the NGP Web Process identifies probable glycosylated peptides, enabling rapid processing of data from high-throughput experiments
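The two services can be sketched as follows. This is an illustration under stated assumptions, not the NGP implementation: the slides do not say how the Collator combines its three parameters, so this sketch simply requires all three to hold. The names `modify_sequence`, `Hit` and `collate`, the thresholds, the `observed_mass` field and all numeric values are hypothetical.

```python
import re
from dataclasses import dataclass

# N-X-S/T consensus sequence; excluding proline at X is an assumption.
SEQUON = re.compile(r"N(?=[^P][ST])")


def modify_sequence(sequence: str) -> str:
    """DB Modifier step: replace N by J at each consensus site."""
    return SEQUON.sub("J", sequence)


@dataclass
class Hit:
    peptide: str          # peptide sequence reported by the search engine
    calc_mass: float      # calculated molecular mass of the peptide
    observed_mass: float  # hypothetical field: experimentally observed mass
    score: float          # search engine score assigned to the hit


def collate(hits, min_score=20.0, mass_tol=0.5):
    """Collator step: keep hits that contain J, score at least min_score,
    and whose calculated mass lies within mass_tol of the observed mass.
    Both thresholds are arbitrary illustrative defaults."""
    return [
        h for h in hits
        if "J" in h.peptide
        and h.score >= min_score
        and abs(h.calc_mass - h.observed_mass) <= mass_tol
    ]


# Invented values, chosen only to exercise each criterion.
hits = [
    Hit("AJGTK", 500.0, 500.2, 35.0),  # passes all three checks
    Hit("ANGPK", 500.0, 500.1, 40.0),  # rejected: no J
    Hit("LJVSR", 600.0, 603.0, 50.0),  # rejected: mass outside tolerance
    Hit("QJLTR", 650.0, 650.1, 5.0),   # rejected: score too low
]
print(modify_sequence("MKNGSAANPTLNNTS"))
print([h.peptide for h in collate(hits)])
```

Marking consensus-site asparagines with a distinct letter in the search database means any peptide hit containing J points at a candidate glycosylation site, which is what makes the later filtering step a simple string test.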

9 NGP – Architecture (current)
[Figure: architecture diagram. The ms/ms raw data is reduced to a peak list file. The Primary Sequence Database is processed by the ModifyDB Web Service. The MASCOT mass spectrometer search engine produces the MASCOT output file (contains both glycosylated and non-glycosylated peptide sequences). The Collator Web Service produces the deglycosylated peptide list.]

10 NGP Results
- A typical MASCOT output file is about 3MB!
- High-throughput experiment protocols generate thousands of such files, so manual identification is not feasible
Excerpt from a MASCOT output file:
q1_p1=-1
q2_p1=0, , ,2,APGVAGR,18, ,1.49, ,0,0;"gi| ":0:190:196:1
q2_p2=1, , ,2,APARGR,18, ,1.33, ,0,0;"gi| ":0:2:7:2
q2_p3=0, , ,2,APAVGGR,18, ,1.33, ,0,0;"gi| ":0:212:218:1,"gi| ":0:212:218:1
q3_p3=0, , ,4,DIIFK,12, ,25.26, ,0,0;"gi| ":0:364:368:2,"gi| ":0:328:332:2
q3_p4=0, , ,4,MPLFK,12, ,25.24, ,0,0;"gi| ":0:95:99:1,"gi| ":0:1:5:2
q3_p5=0, , ,3,NNLFK,12, ,15.34, ,0,0;"gi| ":0:539:543:1
q3_p6=0, , ,3,LDIFK,12, ,15.34, ,0,0;"gi| ":0:891:895:1
q3_p7=0, , ,3,NNIFK,12, ,15.34, ,0,0;"gi| ":0:212:216:1
q3_p8=0, , ,3,LDLFK,12, ,15.34, ,0,0;"gi| ":0:237:241:1
q3_p9=0, , ,3,EVIFK,12, ,13.61, ,0,0;"gi| ":0:67:71:1
q3_p10=0, , ,3,VELFK,12, ,13.61, ,0,0;"gi| ":0:493:497:1,"gi| ":0:99:103:1
q4_p1=-1
q5_p1=0, , ,5,DLLFR,14, ,18.41, ,0,0;"gi| ":0:84:88:1,"gi| ":0:17:21:1,"gi| ":0:647:651:1
q5_p2=0, , ,3,DLFLR,14, ,12.81, ,0,0;"gi| ":0:407:411:1,"gi| ":0:330:334:1,"gi| ":0:6:10:1
q5_p3=0, , ,3,DIFIR,14, ,12.81, ,0,0;"gi| ":0:924:928:1,"gi| ":0:1170:1174:1
q5_p4=0, , ,3,NNFIR,14, ,11.84, ,0,0;"gi| ":0:667:671:1
q5_p5=0, , ,4,IDLFR,14, ,9.98, ,0,0;"gi| ":0:602:606:1,"gi| ":0:536:540:1,"gi| ":0:646:650:1
q5_p6=0, , ,4,LDLFR,14, ,9.98, ,0,0;"gi| ":0:335:339:1
q5_p7=0, , ,4,VELFR,14, ,9.98, ,0,0;"gi| ":0:436:440:1
q5_p8=0, , ,4,LDIFR,14, ,9.98, ,0,0;"gi| ":0:2699:2703:1
q5_p9=0, , ,4,NLNFR,64, ,5.89, ,0,0;"gi| ":0:816:820:1
q5_p10=1, , ,2,NRFAR,14, ,3.37, ,0,0;"gi| ":0:97:101:1
q6_p1=0, , ,4,VSDNIK,35, ,11.27, ,0,0;"gi| ":0:935:940:1
q6_p2=0, , ,5,EGDLGGK,21, ,7.97, ,0,0;"gi| ":0:1058:1064:1
q6_p3=0, , ,5,EATVAGK,21, ,7.88, ,0,0;"gi| ":0:527:533:1
q6_p4=1, , ,3,QRMLK,14, ,7.46, ,0,0;"gi| ":0:467:471:2,"gi| ":0:638:642:2
q6_p5=0, , ,5,LSSSPGK,56, ,7.38, ,0,0;"gi| ":0:806:812:1
q6_p6=0, , ,4,WDLGGK,42, ,6.40, ,0,0;"gi| ":0:123:128:1
q6_p7=0, , ,4,QATDLK,56, ,6.21, ,0,0;"gi| ":0:451:456:1
q6_p8=1, , ,3,QTNKGK,14, ,6.03, ,0,0;"gi| ":0:85:90:1
q6_p9=1, , ,6,QMRIK,28, ,5.77, ,0,0;"gi| ":0:269:273:1,"gi| ":0:278:282:1
q6_p10=1, , ,6,QMRLK,28, ,5.77, ,0,0;"gi| ":0:300:304:1
q7_p1=0, , ,4,YDASLK,14, ,8.86, ,0,0;"gi| ":0:2761:2766:1
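Records like those in the excerpt can be read programmatically, which is the point of automating this step. The sketch below parses one record of the form shown in this transcript. It assumes, from the layout of the excerpt, that the fifth comma-separated field is the peptide sequence and the eighth is the score, and that a value of -1 means the query had no hit. The names `PeptideHit` and `parse_hit` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class PeptideHit(NamedTuple):
    query: int    # number after "q"
    rank: int     # number after "_p"
    peptide: str  # assumed: fifth comma-separated field
    score: float  # assumed: eighth comma-separated field


RECORD = re.compile(r"^q(\d+)_p(\d+)=(.*)$")


def parse_hit(line: str) -> Optional[PeptideHit]:
    """Parse one 'qN_pM=...' record; return None if there is no usable hit."""
    match = RECORD.match(line.strip())
    if match is None:
        return None
    query, rank, value = int(match.group(1)), int(match.group(2)), match.group(3)
    if value == "-1":  # assumed to mean that the query had no match
        return None
    # Everything after ';' lists the matching database entries; ignore it here.
    fields = value.split(";", 1)[0].split(",")
    if len(fields) < 8:
        return None
    return PeptideHit(query, rank, fields[4].strip(), float(fields[7]))


hit = parse_hit('q3_p5=0, , ,3,NNLFK,12, ,15.34, ,0,0;"gi| ":0:539:543:1')
print(hit)
```

Applied to every record in a results file, a parser of this kind would yield the peptide and score values that the Collator's checks operate on.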

11 NGP Web Services – Adding Semantics
- Two ontologies were developed as part of the NCRR-Glycomics project:
  - GlycO: a domain ontology embodying knowledge of the structure and metabolism of glycans
    - Contains 770 classes that describe structural features of glycans
    - URL:
  - ProPreO: a comprehensive process ontology modeling experimental proteomics
    - Contains 296 classes
    - Models three phases of experimental proteomics*: separation techniques, analytical techniques, and data analysis
    - URL:
* (PEDRO UML schema)

12 ProPreO - Experimental Proteomics Process Ontology
- ProPreO models the phases of a proteomics experiment using five fundamental concepts:
  - Data (example: a peaklist file from ms/ms raw data)
  - Data_processing_applications (example: MASCOT search engine)
  - Hardware: embodies instrument types used in proteomics (example: ABI_Voyager_DE_Pro_MALDI_TOF)
  - Parameter_list: describes the different types of parameter lists associated with experimental phases
  - Task (example: component separation, used in chromatography)

13 Service description using WSDL-S
- Formalize the description and classification of Web Services using ProPreO concepts
[Figure: the WSDL (Web Service Description Language) description of the ModifyDB Web Service shown next to its WSDL-S description, with links to concepts defined in the ProPreO process ontology: data, sequence, peptide_sequence]
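To make the idea of annotating a service description with ontology concepts concrete, the sketch below builds one schema element carrying a `modelReference` attribute that points at the `peptide_sequence` concept. Everything specific here is an assumption for illustration: the WSDL-S namespace URI is quoted from memory, the ProPreO base URI is a placeholder (the deck's URL is not preserved), and the element name `peptideSequence` and the helper `annotate` are hypothetical. It does not reproduce the actual ModifyDB description.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
# Assumed WSDL-S namespace URI; verify against the specification before use.
WSSEM = "http://www.ibm.com/xmlns/WebServices/WSSemantics"
# Placeholder base URI; the real ProPreO location is not given in the deck.
PROPREO = "http://example.org/ProPreO#"

ET.register_namespace("xsd", XSD)
ET.register_namespace("wssem", WSSEM)


def annotate(name: str, concept: str) -> ET.Element:
    """Build a schema element whose modelReference names an ontology concept."""
    element = ET.Element(f"{{{XSD}}}element", {"name": name, "type": "xsd:string"})
    element.set(f"{{{WSSEM}}}modelReference", PROPREO + concept)
    return element


element = annotate("peptideSequence", "peptide_sequence")
print(ET.tostring(element, encoding="unicode"))
```

The design point is that the WSDL structure itself is unchanged; the semantic information rides along as an extra attribute that a registry or composition tool can resolve against the ontology.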

14 Biological UDDI (BUDDI)
WS Registry for Proteomics and Glycomics
- There are currently no registries that use semantic classification of Web Services in glycoproteomics
- BUDDI classification is based on proteomics and glycomics classification; BUDDI is part of an integrated glycoproteomics Web Portal called Stargate
- NGP is to be published in BUDDI
- This can enable other systems such as myGrid to use NGP Web Services to build a glycomics workbench

15 Conclusions
- As part of the NCRR Integrated Technology Resource for Biomedical Glycomics, we implemented a Semantic Web Process for high-throughput glycomics in an open, web-centric environment
- Large domain-specific ontologies with process (ProPreO) and domain (GlycO) knowledge concepts were used to describe and classify Web Services at the semantic level
- We used the proposed Semantic Web Service specification (WSDL-S) to add semantics to Web Service descriptions
- Biological UDDI (BUDDI), part of Stargate, is being developed as a single-window resource to discover and publish Web Services in the glycoproteomics domain

16 Resources
- NCRR (Integrated Technology Resource for Biomedical Glycomics):
- Bioinformatics core of Glycomics project:
- ProPreO process Ontology:
- GlycO domain Ontology:
- Stargate – GlycoProteomics Web Portal:
- WSDL-S: joint UGA-IBM technical note

17 Acknowledgement
Special thanks:
- James Atwood (CCRC, UGA)
- Meenakshi Nagarajan (LSDIS Lab, UGA)
- Blake Hunter (LSDIS Lab, UGA)

18 Extra Slides: Stargate subsystems – a bit of detail
- BUDDI: BioUDDI is envisioned as the 'yellow pages' for all WS in life sciences
  - The classification of WS uses biological taxonomy
  - Open resource for the worldwide community of life sciences research
- Format Converter: enables conversion of two available representation formats into an XML-based representation
  - IUPAC to LINUCS to GLYDE (an XML-based representation)
- Web Service Generator: enables an existing Java application to be exposed as a Web Service
  - Generates the required files from a Java application to allow deployment as a Web Service
  - Enables the newly generated Web Service to be published on BioUDDI

19 Extra Slides: Stargate subsystems – a bit of detail
- Group Forum: members of the research group use it to foster a sense of community
  - Schedule meetings, discuss issues, collaborate on papers…
  - Post papers for peer review and publications on relevant topics
- Stargate Search: an integrated unit of Stargate
  - Enables search for research publications within the research group
  - Enables search on the internet
- Login: allows restrictions on accessibility of selected parts of Stargate

20 Extra Slides: The take home message…
[Figure: diagram with the labels Internet, Forum, BUDDI, Search, Web Service Generator]
