©2006 Ariadne Genomics. All Rights Reserved. Pathway Studio Workgroup/Enterprise training course.


1 Pathway Studio Workgroup/Enterprise training course

2 DAY 1: Technology overview, system architecture

3 Products
Pathway Studio Desktop
Pathway Studio Workgroup
Pathway Studio Enterprise
Main functionality:
1) Data mining and pathway building
2) Analysis of high-throughput data
3) Text-mining and fact extraction

4 Ariadne Corporate Offering
Software solution for knowledge management and pathway analysis of high-throughput data
Diagram labels: Knowledge Databases; ResNet Biological Association Networks; Pathway Building; Pathway collection; MedScan (1000 abstracts/min); Proprietary data; Public interaction data; Analysis of High-Throughput data; Text-mining

5 Accomplishments (April, 2007)
188 publications using AGI software and ResNet database
Gene expression microarray analysis (105)
Pathway Analysis (80)
Disease mechanism (64)
Human genetics (7)
Publications by Ariadne authors (13)
Text processing (9)
Reviews (6)
Databases (3)
Drug discovery (16)
Toxicogenomics (4)

6 Pathway Studio Workgroup client-server architecture
Diagram labels: Database; Read-only users; Data curators; Third party tools, in-house applications, API; SQL interface, bulk data management; PSW administrator

7 PathwayExpert Architecture
Diagram labels: Bioinformaticians via Pathway Studio; Database; Application server; Read-only users via web browser; Data editors via web browser; Third party tools, in-house applications, API; SQL interface, bulk data management

8 “Everyone is an Expert” decentralized deployment schema
Hundreds or thousands of users, some with read-only roles and some with editor or publisher roles, access one central database via Pathway Studio and/or a Web browser to analyze experiments, browse the pathway collection, do literature mining, and share data and analysis results.

9 “Bioinformatics service group” centralized deployment schema
A bioinformatics group serves scientists across the entire company by analyzing their experimental data and mining the literature. Analysis results are published via the Web browser interface for end users.
Bioinformatics group:
1) Analysis of experimental data
2) Text-mining and Pathway Building
End users:
1) Experimental data
2) Search requests
View-only access to pathways and analysis networks annotated with experimental data via web browser and links to PathwayExpert Web Services

10 “Disease area” decentralized clusters deployment schema
Disease area groups have bioinformaticians, biologists and chemists working as a team focused on one disease
Cardiovascular group; Cancer group; Digestive disorders group; CNS group

11 Day 1: Introduction to MedScan technology

12 Ariadne MedScan Text-To-Knowledge Technology
Extracting biological association networks from text
Diagram labels: Knowledge Databases; ResNet Biological Association Networks; Pathway Analysis in ResNet database; MedScan (1000 abstracts/min); Pathway Studio to navigate knowledgebase; MedScan output: RNEF XML

13 How does MedScan extract facts from text?
Sentence in PubMed: “Axin binds beta-catenin and inhibits GSK-3beta.”
Identify proteins in dictionary (shown in red on the slide): Axin, beta-catenin, GSK-3beta
Identify interaction type (shown in black on the slide): binds, inhibits
Extracted facts:
Axin - beta-catenin, relation: Binding
Axin -> GSK-3beta, relation: Regulation, effect: Negative
Syntactic layer: Noun Phrase, Verb Phrase, Noun Phrase
Semantic layer: Protein, Protein Relations, Protein
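The two-layer idea on this slide (dictionary lookup for proteins, then interaction words mapped to relation types) can be illustrated with a toy sketch. This is not MedScan's implementation: its grammar and lexicon are proprietary, and the `PROTEINS` and `VERBS` tables below are hypothetical stand-ins that only cover the example sentence.

```python
import re

# Hypothetical toy dictionaries; MedScan's real dictionaries and rules are not public.
PROTEINS = ["Axin", "beta-catenin", "GSK-3beta"]
VERBS = {
    "binds": {"relation": "Binding"},
    "inhibits": {"relation": "Regulation", "effect": "Negative"},
}


def extract_facts(sentence):
    """Return (subject, target, attributes) triples for 'P verb P and verb P' sentences."""
    # Longest names first so a short name never shadows a longer one
    names = "|".join(re.escape(p) for p in sorted(PROTEINS, key=len, reverse=True))
    verbs = "|".join(VERBS)
    # Keep only dictionary proteins and interaction verbs, in order of appearance
    tokens = re.findall(rf"{names}|{verbs}", sentence)
    facts, subject, verb = [], None, None
    for tok in tokens:
        if tok in VERBS:
            verb = tok
        elif subject is None:
            subject = tok  # the first protein is the shared subject
        elif verb is not None:
            facts.append((subject, tok, VERBS[verb]))
            verb = None
    return facts


facts = extract_facts("Axin binds beta-catenin and inhibits GSK-3beta.")
for subject, target, attrs in facts:
    print(subject, target, attrs)
```

The sketch assumes every verb shares the first protein as its subject, which holds for the example sentence but not for general text; a real parser would resolve subjects from the sentence structure.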

14 Describing MedScan
Manually curated: dictionaries and grammar rules
Fast: 14 million PubMed abstracts in 2 days on a modern PC
Comprehensive: fact recovery rate > 90%
Removes redundancy: 7,647,282 non-distinct relations => 1,000,000 distinct relations
Accurate: false positive rate of 10%
Customizable: dictionaries and patterns
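The redundancy-removal step can be pictured as collapsing repeated extractions of the same fact into one distinct relation that remembers its supporting references. The sketch below is an assumption about how such merging could work, not a description of MedScan or ResNet internals; the tuple layout and the placeholder reference IDs are made up for illustration.

```python
from collections import defaultdict


def collapse_relations(extracted):
    """Merge repeated extractions into distinct relations with reference counts."""
    distinct = defaultdict(set)
    for source, target, rel_type, effect, reference in extracted:
        # Assumption: Binding is undirected, so both orientations map to one key
        if rel_type == "Binding":
            source, target = sorted((source, target))
        distinct[(source, target, rel_type, effect)].add(reference)
    return {key: len(refs) for key, refs in distinct.items()}


# Placeholder reference IDs, not real PubMed identifiers
extracted = [
    ("Axin", "GSK-3beta", "Regulation", "Negative", "ref-1"),
    ("Axin", "GSK-3beta", "Regulation", "Negative", "ref-2"),
    ("beta-catenin", "Axin", "Binding", None, "ref-1"),
    ("Axin", "beta-catenin", "Binding", None, "ref-3"),
]
relations = collapse_relations(extracted)
for key, count in relations.items():
    print(key, "supported by", count, "references")
```

Keeping the reference count per distinct relation is what later makes a display style such as "By reference count" possible.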

15 MedScan Architecture
Modules: Entity recognizer; Semantic processor; Pattern matcher
Functions: Entity detection; Relationship extraction
Resources: Dictionaries; Rules; Patterns
Cartridges: Mammals; Plants; Toxicology; Yeast; Drosophila; C-elegans
Customizable by user
Output: RNEF XML
Future:
New modules: ConceptScan
New cartridges: Immunology, Clinical

16 Overview of MedScan Architecture
Components: Preprocessor; Tokenizer; Syntactic Parser; Semantic Interpreter; Ontological interpreter; Pattern Matcher; Converter
Data: Input Text; Tagged Sentences; Sequence of Words; Sentence Structure; Semantic tree; Extracted facts; Database of relations
Resources: Protein names dictionary; Grammar; Lexicon; Extraction rules; Extraction patterns
Dictionary-based: identifies proteins and small molecules
Context-free grammar: grammar and lexicon are proprietary. They are domain-independent by design but focused on the biomedical field.
Rule-based: rules are equivalent to ontology

17 MedScan Applications
Diagram labels: PubMed; Open access; Google; MedScan; Entity-based index; Semantic index; Automatic reader’s digest; Document summary
Indexing the scientific literature
Extracting interactions to create databases for systems biology

18 Text-mining tools in Pathway Studio
Tools -> Start MedScan Reader
–Web browser enhanced with MedScan technology
–Search PubMed and manually select abstracts for fact extraction
–Search Google Scholar and extract facts from top 100 hits
–Search Google and extract facts from top 30 hits
–Search Highwire and BioMed Central and extract facts from individual full-text articles
Tools -> MedScan: Extract pathways from text
–search PubMed
–from file
–from location
Tools -> Update pathway
Tools -> Pathway Reference summary
–Export to EndNote

19 MedScan Reader settings
1) Specifying MedScan cartridge
2) Tracking favorite entities via highlight
3) Filtering for favorite entities and relations
4) Filtering against entities and relations

20 Day 1: Ariadne ResNet database construction

21 ResNet Mammal Database
Shipped with >1,000,000 unique relations derived by MedScan between proteins, metabolites, chemicals, cell processes and diseases
ResNet physical interactions are manually curated
712 manually curated pathways
Gene Ontology
Optional pathway updates:
–>300 Regulome pathways
–>2500 Biological processes pathways
–>200 Cellular component pathways
–High-throughput interaction data
Automatic curation of ResNet is possible to remove redundancy and clean up false positives

22 Pathways collection in ResNet
Canonical pathways (included, curated)
Signaling line pathways (included, curated)
Regulome pathways (optional, automatic)
Biological processes pathways (optional, automatic)
Cellular component pathways (optional, automatic)
KEGG metabolic pathways (optional, imported)
STKE (commercial)
Metabolic vision (commercial)
PathArt (commercial)

23 Ariadne databases for other organisms
All databases contain:
- Relations extracted by the organism-specific MedScan cartridge from organism-specific abstracts and full-text articles
- Entrez Gene protein annotation
- Protein interactions from Entrez Gene (includes BIND, HPRD, BioGRID and EcoCyc datasets)
- Gene Ontology annotation
Model organism databases:
ResNet Plant: >400,000 relations, supports 6 plant species
– Optional entity co-occurrence data
– Additional protein physical interactions predicted by TAIR
ResNet Drosophila
– Additional interactions from published high-throughput datasets
ResNet C-elegans
– Additional interactions from published high-throughput datasets
ResNet Yeast
– Additional interactions from published high-throughput datasets
ResNet Bacteria (beta version)
– Additional interactions from published high-throughput datasets
Databases for non-model organisms containing interactions predicted from closest model organism are available from:

24 Additional Commercial Datasets
KEGG: >130 metabolic pathways from Kyoto University
STKE: >70 pathways from AAAS
Metabolic vision: >10,000 curated pathways for 587 organisms from Integrated Genomics Inc
Hynet: adds over 100,000 new protein physical interactions to ResNet 5.0, from Prolexys Inc
PathArt: >600 disease pathways from Jubilant Inc

25 Day 1: Pathway Studio maintenance, administration and technical support

26 Hardware requirements for Pathway Studio
Pathway Studio desktop or workgroup client
–CPU: 2 GHz or more
–RAM: 512 MB or more
–Disk space for application: 500 MB
–Disk space for one local database: 2 GB
Pathway Studio workgroup server
–1 CPU for 1-5 concurrent users: >3.0 GHz
–2 CPUs for 6-10 concurrent users: >3.0 GHz
–RAM for 1-5 concurrent users: >2 GB
–RAM for 6-10 concurrent users: >3 GB
–Disk space: 20 GB for the database
–Optimal disk configuration: 4 hard drives in RAID 0 for 1-5 concurrent users; RAID 10 mode for 6-10 concurrent users

27 Pathway Studio software requirements
Pathway Studio desktop or workgroup client
–Microsoft Windows Server (2000, 2003), Windows XP (Professional), Windows Vista (Professional, Ultimate, Corporate)
Pathway Studio workgroup server
–MS SQL Server 2000 or 2005 (Developer, Workgroup, Standard or Enterprise Edition) on Windows 2000, Windows 2003 Server, Windows XP Professional
–Oracle 10g or later on any supported Oracle platform including Windows 2003 Server, Linux, etc.

28 Connecting to the central workgroup database

29 Connecting to the server enterprise database

30 Database Index folder
Database statistics
Viewing entities in the list pane
Viewing pathways
Viewing groups
Expression experiments folder
Simulation model folder

31 PS Workgroup Admin console
User roles in Workgroup environment:
Administrator
Editor: can edit public objects
Publisher: can publish private pathways
Regular user: can work only in their private space
Ask your PSW administrator for an account and choose your role

32 Ariadne Technical Support

33 Summary of the introduction slides
MedScan technology
Software architecture, hardware and software requirements
User roles
ResNet database overview
Ariadne’s technical support

34 Summary for the rest of the day
Working with objects in database
Working with pathway diagram and layout algorithms
Database search in PS
Build pathway tool and strategy
Data import/export
Pathways in ResNet
Pathway comparison and statistical algorithms
Find groups/pathways
Text-mining in PS
Microarray analysis: data import options and algorithms
Pathway kinetics simulation in PS

35 DAY 1: Pathway Building in Pathway Studio
Manual
Automatic using Graph navigation tools
Using text-mining with MedScan

36 Viewing and editing pathways in Pathway Studio
Viewing entities in the List Pane
Entity and relation tables
Show all references
Pathway Reference summary
Export protein list
Display styles: By type, By effect, By reference count
UI options:
–magnifier
–fit text to entities
–simple and full graph view
–fit to window
–rotate
–move
–zoom by rectangle
–advanced graph scaling
–resizing nodes in pathway pane

37 Finding entities and relations in Pathway Studio database
Quick search
String search
Search by attribute
Build pathway tool

38 Viewing and editing entity/relation properties
Edit Entity property dialog, URN identifier
Links to external databases
Adding new properties
Declaring new properties in the database

39 Palette pane
Making a figure legend for your publication
Viewing group display styles
Drag & drop entity icon into pathway pane

40 Images pane
Drag & drop images into pathway pane
Importing your own images
Image properties

41 KEGG pathways layout
Node cloning in pathway graph
131 metabolic pathways
20,972 connected proteins

42 Several methods for adding objects and relations to Pathway pane
Adding objects:
–Drag & drop from the palette
–Drag & drop from the list pane
Adding relations:
–Connect selected entities button
–Enter a fact box
–Drag & drop from the list pane

43 Building pathways by manual curation in Pathway Studio
Side-by-side screenshots: In GenMAPP; In Pathway Studio

44 Building pathways by manual curation in Pathway Studio
Complex Nodes
Adding components to Complex Nodes
Side-by-side screenshots: In GenMAPP; In Pathway Studio

45 Questionnaire about the previous slides
How many chemical reactions are in the ResNet database?
What is the default image for Transcription factor in PS?
How many images for cell membrane can there be in PS?
What is the quickest search in PS?
What is the quickest way to add a relation to your pathway diagram?

46 DAY 1: Automatic Pathway Building using Graph navigation
Build pathway tool

47 Mining regulatory relations in database
Basic principle: regulatory interactions are mediated by a physical interaction network
–Regulomes
–Biological processes pathways
–Disease pathways

48 Build Pathway dialog
Build pathway options
Filtering by direction
Number of steps
Build pathway filter
The main application of the Build pathway tool is to quickly find connections between entities of interest; therefore its button is available from all panes:

49 Build pathway filters
Using entity filters to answer different biological questions
Using relation filter to analyze different types of high-throughput data
Filtering by properties

50 Build pathway Edit Results
Display filtering
Selecting results based on local connectivity
IsNew column

51 Automatic layout options
Direct force layout
–charges and springs
–Good to find hubs in the pathway
Hierarchical layout
–Directed graph
–Good for metabolic pathways (KEGG, ERGO)
Symmetric layout (Centric graph)
–Good for Expand pathway
Cell localization layout (Circular and linear membrane)
Configurable:
–Cell localization annotation
–Organelle images layout
–Association of Cell localization value and Organelle image
Dynamic layout
–Direct-force like with adjustable spring force
–Use cell localization if organelle

52 Regulome pathways: algorithm input

53 Regulome pathways: algorithm result

54 Building pathways by data mining: converting a regulatory network to a protein physical interaction network for Cell Processes, Diseases, Regulomes

55 Disease networks
2300 diseases, 230 cancers in ResNet 5.0
Entities associated with Endothelial cells cancer in ResNet

56 Endothelial cells cancer network

57 Data-mining techniques and hints
Different filter settings answer different biological questions. Know what each relation type means
Use the directional filter to perform upstream/downstream analysis
Relax the search by including Regulation relations
To mine for more specific relations, search Relation by Sentence including “your focus keyword”
–Find relations mentioned in a certain tissue
–Find a specific mechanism: trans-activation, cleavage, etc.
Filter by relation confidence using the Relation table to increase network confidence

58 DAY 1: Build pathway settings for asking different biological questions

59 Finding major regulators among DE genes
Screenshot callouts: First choice for expression data; Second choice for expression data; Third choice for expression data

60 Upstream analysis of DE genes and gene clusters
Screenshot callouts: First choice for expression data; Second choice for expression data; Third choice for expression data

61 Analysis of proteomics co-IP data

62 Analysis of proteomics phosphoprofiling experiments

63 Analysis of metabolomics experiment
Importing metabolomics experiment

64 Relaxing Build pathway settings
Replace “Find only direct interactions” with “Find shortest path”
Increase “Maximum number of steps” in “Find common regulators” or in “Find shortest path”
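The difference between the strict and relaxed settings can be sketched on a toy graph. This is only an illustration of the general graph idea (direct edges among seed entities versus a bounded breadth-first search through intermediates); the function names and the entities A, B, C, X are hypothetical and do not reflect how Pathway Studio implements the tool.

```python
from collections import deque


def direct_interactions(edges, seeds):
    """Strict setting: keep only relations whose two ends are both seed entities."""
    seeds = set(seeds)
    return [(a, b) for a, b in edges if a in seeds and b in seeds]


def shortest_path(edges, start, goal, max_steps):
    """Relaxed setting: breadth-first search using at most max_steps relations."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) - 1 >= max_steps:
            continue  # step budget used up on this branch
        for nxt in sorted(neighbors.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Toy network: A and B are linked only through the intermediate X
edges = [("A", "X"), ("X", "B"), ("B", "C")]
print(direct_interactions(edges, ["A", "B", "C"]))  # misses the A-B link
print(shortest_path(edges, "A", "B", max_steps=1))  # too strict to reach B
print(shortest_path(edges, "A", "B", max_steps=2))  # finds the path via X
```

Raising the step limit trades specificity for coverage: more seed entities become connected, but through intermediates that were not in the original list.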

65 Day 1: Pathway Building by text-mining
Example: non-melanoma skin cancer, >1,000,000 cases (<2,000 deaths) in USA

66 MedScan Reader: PubMed search
Keep searching and adding relations
At the end, send extracted relations to Pathway Studio

67 MedScan Reader: Import top 100 hits from Google Scholar search
Downloads the articles found and processes them with MedScan

68 MedScan Reader: Import top 30 hits from Google search
Downloads the web pages found and processes them with MedScan

69 Full-text article found on Highwire Press with “non-melanoma skin cancer” text search

70 “Non-melanoma skin cancer” literature network: result of text-mining by MedScan Reader
Every entity in this network was mentioned in the context of non-melanoma skin cancer

71 Protein interaction network for non-melanoma skin cancer using information from entire ResNet
Compare this pathway with your experimental patient data

72 Text-mining techniques and hints: controlling relevance of literature networks
Searching full-text articles with keywords, followed by MedScan fact extraction, only loosely associates the keywords with facts: you find all facts mentioned in the same article as your keywords
Searching PubMed abstracts with keywords, followed by MedScan fact extraction, gives better relevance of the extracted facts to your keywords: you find all facts mentioned in the same abstract as your keywords
Searching sentences extracted by MedScan with keywords gives the highest relevance of the extracted facts to your keywords: you find all facts mentioned in the same sentence as your keywords

73 DAY 1: Data Import/Export

74 Tools -> Import Protein List
Choice of identifiers
Lookup preview
Paste and Load from file
Import as New group of proteins

75 Tools -> Import Protein Network
Choice of identifiers
Lookup preview
Paste and Load from file
Import of Regulatory relations

76 Importing ChIP-on-chip data as PromoterBinding relations using Tools -> Import Protein Network

77 Import creates a new pathway with new relations

78 Database -> Import Wizard
Importing from Internet
Import formats and options
Specifying source for entities and relations
Specifying source folder for pathways

79 Database -> Export Wizard
Exporting pathways
Export filters
Export strategy
Exporting entity annotation in plain text format

80 DAY 1: Data management, pathway comparison, find groups/pathways

81 Working with groups in Pathway Studio
Create group
Add Entities to a group
Add group as a node into pathway pane
Select/Highlight by group
Maintaining group hierarchy

82 Edit -> Combine Pathway
Union
Intersection
Subtract
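The three Combine Pathway modes correspond to ordinary set operations if a pathway is modeled as a set of entities plus a set of relations. The sketch below makes that modeling assumption explicit; the dictionary layout and the rule for dropping dangling relations are illustrative choices, not documented Pathway Studio behavior.

```python
def combine(p1, p2, mode):
    """Combine two pathways given as {'entities': set, 'relations': set of pairs}."""
    ops = {
        "union": set.union,
        "intersection": set.intersection,
        "subtract": set.difference,
    }
    op = ops[mode]
    entities = op(p1["entities"], p2["entities"])
    relations = op(p1["relations"], p2["relations"])
    # Assumption: a relation survives only if both of its ends are still present
    relations = {(a, b) for a, b in relations if a in entities and b in entities}
    return {"entities": entities, "relations": relations}


# Hypothetical pathways sharing the B-C relation
p1 = {"entities": {"A", "B", "C"}, "relations": {("A", "B"), ("B", "C")}}
p2 = {"entities": {"B", "C", "D"}, "relations": {("B", "C"), ("C", "D")}}

for mode in ("union", "intersection", "subtract"):
    print(mode, combine(p1, p2, mode))
```

Intersection is the natural choice for asking what two pathways have in common, while subtract isolates what is unique to the first pathway.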

83 Tools for pathway comparison in Pathway Studio
Combine pathways
Select
Highlight

84 Statistical algorithms for pathway comparison in Pathway Studio
Find Pathways
Find Groups
Gene Ontology analysis

85 DAY 2: Analysis of high-throughput data in Pathway Studio

86 Experiment types
Gene expression
–Find major regulators
–Find biomarkers
–Gene clustering
Metabolomics
–Find major metabolism regulators
–Combined analysis with gene expression
Proteomics
–Mass-spec protein level
–Finding major kinases/phosphatases for phosphoprofiles

87 Data model in ResNet database
Use different networks for different types of experimental data
Relation types: Expression; PromoterBinding; DirectRegulation; ProtModification; Binding; MolSynthesis; MolTransport; Regulation
Uses: Interpretation of Gene Expression data; Interpretation of Proteomics data; Interpretation of Metabolomics data; Biomarker prediction and validation; and more

88 Analysis of gene expression microarray data: import and selection of responsive genes
Data import
–Tab-delimited and Excel files
–Affymetrix CEL files (with RMA normalization)
–GenePix (GPR)
Result: save the experiment in the Expression favorites
Selection of responsive genes
–Find differentially expressed genes (significance analysis via t-test) for analysis of two samples measured in multiple replicas
–Gene clustering via correlation networks (Pearson correlation)
–Find responsive genes in third-party software for statistical analysis of microarray data and import them as a list (Tools -> Import protein list)
Result: save as a group of genes in the Groups folder

89 Analysis of gene expression microarray data: Pathway Analysis
Network analysis
–Identification of differentially expressed protein complexes and physical networks
Build pathway: Find direct regulation, filter for physical interactions (Binding, DirectRegulation, ProtModification)
Build differentially expressed networks, filter by Binding (PS Enterprise only)
–Identification of major regulators and targets in expression network
Build pathway: Find direct regulation, filter for Expression and/or PromoterBinding interactions, use hierarchical layout
Find significant regulators (network enrichment analysis), filter by Expression, PromoterBinding (PS Enterprise only)
Result: save as pathway
Functional analysis
–Find groups/pathways
Gene ontology analysis
Comparative gene ontology analysis
–Build pathway: Find common targets, filter by CellProcess
–Find DE groups/pathways (Gene Set Enrichment analysis, GSEA)
Result: list of groups/pathways with p-values indicating statistical significance of differential expression. Save as a group, as analysis results, or export to Excel

90 Most common workflow for microarray analysis of disease in Pathway Studio
Identify genes differentially expressed in disease (DE genes)
Identify genes known to be associated with the disease according to the literature, using Pathway Studio
Identify DE genes that are linked to known disease genes, using Pathway Studio
Report novel disease genes

91 Expression Data Import wizard
Generic tab-delimited format
–Import any matrix expression data containing expression values and/or p-values. Minimum requirement: one column with gene identifiers and one column with sample
Import of Affymetrix CEL (RMA averaging)
Import of Molecular Devices GenePix format with Vera & Sam normalization

92 Expression experiment viewer in Pathway Studio
Experiment properties
Gene identifier column: views, sorting, find
Heat map scale
Filter genes by value
Filtering genes by pathway
Text view for expression matrix
Create group from selection

93 Finding differentially expressed genes in Pathway Studio (significance analysis): Two-sample t-test (between-groups t-test)
Finds genes that are differentially expressed between two classes of samples measured independently on single-color microarrays.
Examples:
–multiple replicas of one untreated sample (1) and multiple replicas of one treated sample (2)
–multiple replicas of one normal sample (1) and multiple replicas of one disease sample (2)
Calculated p-values indicate the significance of the expression difference between replicas marked 1 and replicas marked 2.
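For one gene, the between-groups comparison can be sketched as a textbook two-sample t statistic over replicas labeled 1 and 2. The pooled-variance (Student) form is an assumption here; the slides do not say which variant Pathway Studio uses, and the function name and data are illustrative only.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(values, labels):
    """Student's t statistic (pooled variance) for replicas labeled 1 versus 2."""
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    g2 = [v for v, lab in zip(values, labels) if lab == 2]
    n1, n2 = len(g1), len(g2)
    # Pooled sample variance across both classes
    pooled = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    t = (mean(g1) - mean(g2)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2  # statistic and degrees of freedom


# Made-up expression values for one gene: three untreated (1), three treated (2)
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [1, 1, 1, 2, 2, 2]
t, df = two_sample_t(values, labels)
print(round(t, 4), df)
```

The p-value then follows from the t distribution with `df` degrees of freedom; the standard library has no t distribution, so that last step is left out of the sketch.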

94 Finding differentially expressed genes in Pathway Studio (significance analysis): Paired samples t-test, usually for two-channel microarray platforms
Finds genes that are differentially expressed between two classes of samples when the comparison is performed within one experiment (two-color or two-channel microarray) and repeated multiple times. The first class is marked by a positive integer, and the corresponding sample from the second class measured on the same array is marked by the negative integer with the same absolute value.
Calculated p-values indicate the significance of the expression difference between the two sample classes.
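The positive/negative labeling scheme maps directly onto a paired t statistic: label k and label -k identify the two channels of the same array. The sketch below shows that pairing for a single gene, using the textbook paired t formula; the function name and the numbers are illustrative assumptions, not Pathway Studio code.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(values, labels):
    """Paired t statistic; label k and label -k mark the two samples on one array."""
    by_label = dict(zip(labels, values))
    # One difference per array: first class (k) minus second class (-k)
    diffs = [by_label[k] - by_label[-k] for k in sorted(by_label) if k > 0]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1  # statistic and degrees of freedom


# Made-up values for one gene measured on three two-channel arrays
labels = [1, -1, 2, -2, 3, -3]
values = [5.0, 4.0, 6.0, 4.0, 7.0, 4.0]
t, df = paired_t(values, labels)
print(round(t, 4), df)
```

Pairing removes array-to-array variation from the comparison, which is why it suits two-channel designs better than the between-groups test.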

95 Finding differentially expressed genes in Pathway Studio (significance analysis): DE genes in multiple experimental log ratio samples
If you have imported your data pre-calculated as log ratios of the normalized expression values, use this test to find differentially expressed genes across multiple replicas.
Calculated p-values indicate how far the ratio of a given gene deviates from the global mean of ratios across all genes and samples.

96 Gene expression clustering using Relevance network
Expression -> Build network from expression -> Pearson correlation

97 Parameters for Pearson correlation
Major parameters:
Percent of genes to remove: removes the less variable genes. Controls the number of vertices in the graph. Keep the number of proteins in the network under 1000
Threshold: allows correlation links above the threshold. Controls the number of edges in the graph.
Number of permutations: turns on automatic Threshold calculation using randomized expression samples.
P-value: selects the most non-random correlation links. Controls the number of edges in the graph. Value 0.01 corresponds to 10% of all possible links, equal to (number of vertices)^2
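How the first two parameters shape the graph can be seen in a small sketch: dropping the least variable genes limits the vertices, and the correlation threshold limits the edges. This is a generic relevance-network illustration with hypothetical names; comparing the absolute value of the correlation against the threshold is an assumption, and the permutation-based threshold and p-value options are not modeled.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def relevance_network(expression, percent_to_remove, threshold):
    """Return (gene1, gene2, r) edges among the most variable genes."""
    ranked = sorted(expression, key=lambda g: pvariance(expression[g]), reverse=True)
    n_removed = int(len(ranked) * percent_to_remove / 100)
    keep = ranked[: len(ranked) - n_removed]  # vertices of the graph
    edges = []
    for g1, g2 in combinations(sorted(keep), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:  # assumption: strong negative links also count
            edges.append((g1, g2, r))
    return edges


# Made-up expression profiles across four samples
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [5.0, 5.0, 5.1, 5.0],  # nearly flat, removed as least variable
}
edges = relevance_network(expression, percent_to_remove=25, threshold=0.9)
for g1, g2, r in edges:
    print(g1, g2, round(r, 3))
```

A gene with exactly zero variance would make the correlation undefined, so such genes need to be removed before the pairwise step.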

98 ©2006 Ariadne Genomics. All Rights Reserved. 98 ©2006 Ariadne Genomics. All Rights Reserved. Finding upstream regulator for a gene cluster using Build pathway option Find common regulators

99 ©2006 Ariadne Genomics. All Rights Reserved. Finding major transcription regulators among differentially expressed genes: Use the Build pathway tool option Find direct interactions, filtering for PromoterBinding and Expression, to reduce the complexity of your differential expression pattern.

100 ©2006 Ariadne Genomics. All Rights Reserved. Build pathway filter stringencies. Gene Expression: Promoter Binding > Expression > Regulation > Co-occurrence (relations); Protein > Complex > Functional Class (entities). Metabolomics: MolSynthesis > Regulation. Proteomics: Direct Regulation > ProtModification > Binding > Regulation (relations); Protein > Complex > Functional Class (entities).

101 ©2006 Ariadne Genomics. All Rights Reserved. Questionnaire Day 1. What is the quickest Entity search in Pathway Studio? What is the most comprehensive Entity search in Pathway Studio? How do you create a group in Pathway Studio and add entities to it? How do you Build pathway from the up-regulated genes in your microarray experiment?

102 ©2006 Ariadne Genomics. All Rights Reserved. Workflow 1. Build pathway for EDG regulation. Using the GenMAPP pathway as a guide, build the EDG1 pathway in Pathway Studio: –Find proteins for EDG1 pathway –Find relations for EDG1 pathway –Create additional relations missing from ResNet database –Arrange nodes by cell localization –Save pathway as HTML for web publication

103 ©2006 Ariadne Genomics. All Rights Reserved. Workflow 2. Create a pathway containing groups and sub-pathways as nodes. Continue building the EDG pathway by adding sub-pathways and groups. Complete the pathway by text-mining search with filtering.

104 ©2006 Ariadne Genomics. All Rights Reserved. Workflow 3. Find drugs regulating kinases. Find kinases in the database with connectivity >0 –Search by attribute for Functional class = Kinase and Connectivity >0. Find drugs regulating these kinases –Expand pathway from kinases with filter by small molecules –Select drugs in the expanded pathway –Select neighbors for drugs –Copy selection into the new pathway

105 ©2006 Ariadne Genomics. All Rights Reserved. Workflow 4. Find biological processes regulated by proteins involved in prostate cancer. Find the prostate cancer disease node. Find proteins regulating prostate cancer. Find cell processes affected by these proteins. Sort the found processes by number of prostate cancer protein regulators.
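
The final sorting step of this workflow amounts to counting, for each cell process, how many of the disease-related proteins regulate it. A minimal sketch with placeholder protein names and hypothetical relations (none of them taken from ResNet):

```python
from collections import Counter

def rank_processes(protein_to_processes, disease_proteins):
    """Count, per cell process, how many disease proteins regulate it; highest first."""
    counts = Counter()
    for protein in disease_proteins:
        # set() so that a protein is counted once per process
        for process in set(protein_to_processes.get(protein, ())):
            counts[process] += 1
    return counts.most_common()

# Placeholder relations: protein -> cell processes it regulates.
protein_to_processes = {
    "protein_A": ["apoptosis", "cell proliferation"],
    "protein_B": ["apoptosis"],
    "protein_C": ["apoptosis", "cell proliferation", "angiogenesis"],
    "protein_D": ["angiogenesis"],  # not linked to the disease in this example
}
disease_proteins = ["protein_A", "protein_B", "protein_C"]

ranked = rank_processes(protein_to_processes, disease_proteins)
print(ranked)
```

Processes at the top of the list are regulated by the largest number of disease proteins.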

106 ©2006 Ariadne Genomics. All Rights Reserved. Day 2 Advanced workflows in Pathway Studio

107 ©2006 Ariadne Genomics. All Rights Reserved. Workflow 1. Comparative Gene Ontology analysis (Folberg’s experiment). Import of CEL files. 1) Calculation of the differentially expressed genes 2) Creating a group from DE genes 3) Finding statistically significant GO groups 4) Creating a pathway from GO groups 5) Comparing two lists of GO groups 6) Finding DE genes in GO groups. Comparing lists of differentially expressed GO groups rather than lists of DE genes is more sensitive when comparing the responses in two cell lines, patients, or other samples.
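
The slide does not say which statistic is used in step 3. A common choice for this kind of over-representation question is the hypergeometric tail probability, sketched below as an assumption; the function name and the counts are hypothetical.

```python
from math import comb

def enrichment_p_value(n_universe, n_group, n_de, n_overlap):
    """P(overlap >= n_overlap) when n_de genes are drawn at random from n_universe
    genes, n_group of which belong to the GO group (hypergeometric upper tail)."""
    upper = min(n_group, n_de)
    total = comb(n_universe, n_de)
    favorable = sum(
        comb(n_group, k) * comb(n_universe - n_group, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return favorable / total

# Hypothetical counts: 20 measured genes, a GO group of 5 genes,
# 4 DE genes, 3 of which fall into the GO group.
p = enrichment_p_value(n_universe=20, n_group=5, n_de=4, n_overlap=3)
print(round(p, 4))
```

A small p-value means that the DE genes hit the GO group more often than expected by chance.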

108 ©2006 Ariadne Genomics. All Rights Reserved. Comparing lists of differentially expressed GO groups rather than lists of DE genes is more sensitive when comparing the responses in two cell lines, patients, or other samples. Subtracting groups: 6 genes. Subtracting genes: no significant groups.

109 ©2006 Ariadne Genomics. All Rights Reserved. Two groups of genes differentially expressed during growth in 3D culture vs. flat culture are selected, one for aggressive and one for non-aggressive tumors. [Diagram: non-aggressive: flat vs. 3D, no growth; aggressive: flat vs. 3D, growth.] 1. Genes of interest 2. Groups of interest

110 ©2006 Ariadne Genomics. All Rights Reserved. Comparative GO group analysis of aggressive vs. non-aggressive uveal melanoma. Open DE GO groups from aggressive tumors. Compare with DE GO groups from non-aggressive tumors.

111 ©2006 Ariadne Genomics. All Rights Reserved. Select GO groups related to your experimental goals (cell adhesion DE groups unique to aggressive tumors). These groups are significant in aggressive melanoma when we compare its growth in 3D matrix vs. flat culture. These groups are NOT significant in non-aggressive melanoma when we compare its growth in 3D matrix vs. flat culture.
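
Selecting groups that are significant in one comparison but not in the other is a set difference. A minimal sketch, assuming each comparison yields a p-value per GO group; the group names, p-values, and the 0.05 cutoff are hypothetical.

```python
def significant_groups(p_values, cutoff=0.05):
    """Names of GO groups whose p-value passes the cutoff."""
    return {group for group, p in p_values.items() if p <= cutoff}

# Hypothetical p-values for 3D vs. flat culture in each tumor type.
aggressive = {"cell adhesion": 0.001, "cell migration": 0.02, "cell cycle": 0.03}
non_aggressive = {"cell adhesion": 0.40, "cell migration": 0.30, "cell cycle": 0.01}

unique_to_aggressive = significant_groups(aggressive) - significant_groups(non_aggressive)
print(sorted(unique_to_aggressive))
```

Here "cell cycle" is significant in both comparisons, so it drops out of the result.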

112 ©2006 Ariadne Genomics. All Rights Reserved. A network of genes differentially expressed in aggressive uveal melanoma and involved in cell adhesion

113 ©2006 Ariadne Genomics. All Rights Reserved. 23 SP1 targets among DE genes in cell adhesion network unique for aggressive uveal melanoma during 3D growth

114 ©2006 Ariadne Genomics. All Rights Reserved. Supportive evidence for SP1 role in melanoma aggressiveness

115 ©2006 Ariadne Genomics. All Rights Reserved. Workflow 2. Three methods to find biological processes affected by DE genes: 1) Find groups from the Biological processes Gene Ontology classification 2) Find pathways indicating biological processes 3) Build pathway option Find common targets, filtering for Cell Process. Includes: –Finding proteins using Search by attribute (cell localization) and then determining their biological processes

116 ©2006 Ariadne Genomics. All Rights Reserved. Workflow 3. Three ways to find biomarkers in Pathway Studio. By text-mining: –Extract pathways from text: PubMed Search for your Disease. By data-mining: –Search for the disease of interest in the database –Use Build Pathway: Expand option to find Disease biomarkers. By gene expression data analysis: –Identify differentially expressed genes –Use Build pathway: Direct interaction option to find proteins that are downstream of many DE genes. These proteins are the most likely biomarkers according to your expression data (see also next slide).

117 ©2006 Ariadne Genomics. All Rights Reserved. Workflow 4. Building disease network using Build pathway tool. Includes: Finding the disease of interest in the database. Finding proteins contributing to the disease. Finding biomarkers for the disease. Building disease networks using: –Build pathway Find direct interactions for proteins regulating the disease –Build pathway Expand pathway for protein biomarkers –Combining two pathways –Layout by cell localization –Text-mining: updates?

118 ©2006 Ariadne Genomics. All Rights Reserved. Workflow 5. Building pathway by text-mining for Li-Fraumeni syndrome. Includes: Creating a new local database. Use of Search PubMed option (Db import). Consolidation of the db (db updates / groups). Understanding the major protein players in Li-Fraumeni syndrome. Understanding regulators / targets / cell processes associated with Li-Fraumeni syndrome.

