Presentation is loading. Please wait.

Presentation is loading. Please wait.

Cincinnati Comparative Mouse Genomics Centers Consortium: Bioinformatics Analysis Tools for Assessment of Human Gene Polymorphisms Anil G Jegga, Sivakumar.

Similar presentations


Presentation on theme: "Cincinnati Comparative Mouse Genomics Centers Consortium: Bioinformatics Analysis Tools for Assessment of Human Gene Polymorphisms Anil G Jegga, Sivakumar."— Presentation transcript:

1 Cincinnati Comparative Mouse Genomics Centers Consortium: Bioinformatics Analysis Tools for Assessment of Human Gene Polymorphisms Anil G Jegga, Sivakumar Gowrisankar, Jing Chen, Rafal Adamczak, Ashima Gupta, Marc A Ramirez, Kalyan SC Andra, James W Carman, Bruce J Aronow University of Cincinnati and Cincinnati Children’s Hospital Medical Center, Cincinnati, OH-45229 A principal goal of the NIEHS Comparative Mouse Genome Centers Consortium (CMGCC) is to systematically evaluate the effect of human genome polymorphisms on critical genes, pathways, and processes that alter the impact of environmental agents on human disease. The evaluation process is to identify polymorphisms, perform sequence analysis, and assess disease association and functional impact using mouse models. Sequence analysis provides an opportunity to prioritize the functional evaluation process in favor of polymorphisms likely to have harmful impact. The University of Cincinnati – CCMGC (Cincinnati Comparative Mouse Genomics Center; http://cmgcc.cchmc.org) has developed several bioinformatics’ analysis tools to improve visualization, assessment, and ranking of polymorphisms. We are now collaborating with the Universities of Washington and Utah and have developed methods to study all of the genes within the EGP and have focused analyses and tools development in three areas: genes, proteins, and pathways. In particular, we have analyzed most of the genes within DNA repair and cell cycle control categories. Support : NIEHS U01 ES11038 Mouse Centers Genomics Consortium Human RefSeq Proteins NCBI-dbSNP (non-EGP Genes) EGP-SNPs Domain Mapping PolyDom Pfam Domain Database Biological Implication Text Parsing PolyPhen/SIFT PolyDoms: We have now mapped all non-synonymous SNPs of EGP genes onto the corresponding conserved and known functional protein domains. The potential protein structure altering implications of the coding SNPs have been collected into a general visualizer for the polymorphic proteins (http://polydoms.cchmc.org) using the PolyPhen (Polymorphism Phenotyping; http://tux.embl- heidelberg.de/ramensky/) annotations and SIFT (Sorting Intolerant From Tolerant; http://blocks.fhcrc.org/~pauline/SIFT.html) server. Links to MedLine abstracts referring to the disease implications of any polymorphism of each protein is also provided when available, and are automatically updated for each of the proteins. We are also extending the mapping of the nsSNPs in the context of the 3D structure. TraFaC: To identify gene features potentially susceptible to the effects of polymorphisms, we have developed a system that uses mouse-human comparative genomic sequence feature analysis to identify conserved cis- element clusters that could act as gene regulatory regions. Our current implementation of this tool has focused on the identification and characterization of genomic features that are conserved between mouse and human (http://trafac.chmcc.org). Additional functionalities under development will allow for direct visualization of conserved features altered by insertions, deletions, and non-coding SNPs. >Seq 1 Genomic AGAGAAAATTGCTAGAGCTC AGGAGTTTGAGACCAGCCTG GGCAATAGAGTAAGACTTTG TCTCTATCAAAAATTTAAAA ATTAACTGGGCTTGGCGGTG TGCACCTGTGGTCCAGCTAC TCAGGAGGCTGAGGTGGGAG GATTGCTTGAGCCCAAGAGG TTGAGGCTGCAGTAAGCCGT >Seq 2 Genomic GACTGAGGGCTTGTGAAACA GCAAGAACCTGTCTCAAAAA ACAGTGGGCAGGGAGGGGAT TAATGAATAGGCAGCTACGT TCTGGGACTGGAGGGACTCG AGGTGGCTAGAAAGCAAGAG GTACTGGGAGACAAGGCTGC AGACATTTCTTTTTTTTTTT TTTTTTTTTGAGACAGAGTC Local Alignment Number 5 Similarity Score: 3074 Match Percentage: 51 % Number of Matches: 96 Number of Mismatches: 39 Total Length of Gaps: 52 Begins at (8281,8874) and Ends at (8416,9059) Seq 1 Seq 2 Sim% No. of Nt 8281-8300 8874-8893 70% (20 nt) 8301-8310 8902-8911 90% (10 nt) 8311-8324 8923-8936 57% (14 nt) 8325-8376 8947-8998 62% (52 nt) 8378-8386 8999-9007 67% (9 nt) 8387-8416 9030-9059 90% (30 nt) BlastZ TF Binding Sites Seq 1 Seq 2 Sim% Nt Hits 8301-8310 8902-8911 90% (10 nt)3 8311-8324 8923-8936 57% (14 nt)2 8325-8376 8947-8998 62% (52 nt)3 8378-8386 8999-9007 67% (9 nt)0 8387-8416 9030-9059 90% (30 nt)4 TraFaC CMGCC-UC BIOINFORMATICS TOOLS TraFaC (http://trafac.cchmc.org) EGP genes (234): Incidence of SNPs in the context of functional domains PolyDoms (http://polydoms.cchmc.org) PathMaker PathMaker: To represent the presence and impact polymorphisms in the context of biological pathways, we have sought to unify our representation of molecular, biological, and environmental entities such that biological knowledge from experts and biomedical literature could be assembled in a storyboard canvas. For example, the representation of a disease could consist of a biological process that is itself comprised of one or more pathways, within which, entities (gene products, complexes, and cellular and sub-cellular components) are subjected to one or more interactions and transitions to disease term associated states. We have begun the development of an application and database structure that can represent these processes, using a host of publicly available data sources including gene objects and biological ontologies to represent biomedical literature and expert knowledge. Ensembl GenBankSwiss-Prot KEGG OMIM PubMed Molecular Databases Processes & Associated Databases PathMaker Application Layers Search and Browse Organization and Modification Layer Display Layer PathMaker Canvas and Navigator Ontology Categorizer State Changer Ontology Browser Model Viewer Complex Builder Taxonomy Viewer General Search Molecule Search Ontology Explorer Generalized Biological Object Model Molecular Entities & Relationships Pathways Processes & Diseases Biological Ontologies PATHMAKER: A systems biology modeling and data mining tool built on a generalized biological object model able to represent the interaction of genes, biologic processes, environment, oncogenic pathways, and disease. Biomaterial Taxon Developmental Temporal Anatomic body part organ tissue cell subcellular Process Physiological gene ontology molecular function biological process cellular localization Temporal Pathologic Toxicologic Injury Genomic damage SNOMED pathology ICD-9/10 OMIM Exposure Xenobiotic Microbiologic Pharmacologic Biologic Clinical Outcome Pathway Biomaterial Process_In Process_out Molecule_in Molecule_out Action bind dissociate activate inhibit convert PathMaker uses ontologies of specific biological domain knowledge to model the effects of environment, genetic variation, and therapy on oncogenic molecular pathways and disease processes. Molecule Gene feature promoter cis-element product transcript splice form protein domain polymorphism snp ins / del Chromosome structure damage state Chemical structure reaction Molecular Complex composition modification state Human-Mouse Comparative Genomics Analysis of OGG1 for Coding and Non-Coding Regulatory Region Conservation Genomic Sequence with Exons (red) % sequence identity conserved Cis- element density between human and mouse Regions of sequence similarity between human and mouse conserved cis-elements in 2nd intron of Ogg1 * Disease Implication Esophageal cancer (Xing et al 2001) Lung cancer (Sugimura et al 1999) Prostate cancer (Xu et al 2002) Stomach cancer (Hanaoka et al 2001) * OGG1: Mapping Non-Synonymous SNPs onto Conserved Protein Domains & 3D Structure OGG1 Peptide Sequence 345 aa Protein domain-pfam00730, HhH-GPD superfamily base excision DNA repair protein Ala 288 Val Arg 229 Gln Asp 322Asn Clinical/Experimental-GeneChip DataSet Questions: what are the relevant patterns for disease/biology? Controls Poly-Articular JRA Course 11113.3, Pauci, 1_Pauci, MTX 0, XR_unknown 878, Pauci, 2_Poly, MTX 1, XR_erosions 845, Spond, JAS, MTX 0, XR_space narrowing 18057, Spond, JAS, MTX 0, XR_sclerosis 18036, Pauci, 1_Pauci, MTX 0, XR_normal 7029, Pauci, 1_Pauci, MTX 0, XR_normal 894, Pauci, 2_Poly, MTX 1, XR_normal 831, Poly, 2_Poly, MTX 1, XR_erosions 850, Poly, 2_Poly, MTX 1, XR_erosions 872, Poly, 2_Poly, MTX 1, XR_space narrowing 1073, Syst, 2_Poly, MTX 0, XR_space narrowing 1087PB, Poly, 2_Poly, MTX 1, XR_space narrowing 9272, Poly, 2_Poly, MTX 1, XR_unknown 1081, Poly, 2_Poly, MTX 0, XR_space narrowing 912, Syst, 2_Poly, MTX 1, XR_space narrowing 19, Pauci, 2_Poly, MTX 1, XR_space narrowing 9137, Poly, 2_Poly, MTX 0, XR_space narrowing993, Syst, 2_Poly, MTX 1, XR_space narrowing 8003, Pauci, 1_Pauci, MTX 0, XR_unknown 7177, Spond, JSPA, MTX 1, XR_space narrowing 9161, Pauci, 1_Pauci, MTX 1, XR_normal976, Pauci, 1_Pauci, MTX 1, XR_normal 1083, Control, na, MTX 0, XR_na 7145, Pauci, 1_Pauci, MTX 0, XR_normal7206, Pauci, 1_Pauci, MTX 0, XR_unknown 817, Pauci, 1_Pauci, MTX 0, XR_space narrowing 1082, Control, na, MTX 0, XR_na 1085, Control, na, MTX 0, XR_na1089, Control, na, MTX 0, XR_na 1095, Control, na, MTX 0, XR_na 7149.3, Control, na, MTX 0, XR_na 801, Pauci, 2_Poly, MTX 1, XR_erosions 1087ctrl, Control, na, MTX 0, XR_na 9245, Spond, JSPA, MTX 0, XR_space narrowing 824, Poly, 2_Poly, MTX 0, XR_normal 9264, Pauci, 1_Pauci, MTX 0, XR_erosions 7118.3, Control, na, MTX 0, XR_na 18042, Spond, JAS, MTX 0, XR_sclerosis 7021.31, Control, na, MTX 0, XR_na 1084, Control, na, MTX 0, XR_na 813.3, Control, na, MTX 0, XR_na 7108, Syst, 3_Systemic, MTX 1, XR_normal 9150, Spond, JAS, MTX 0, XR_sclerosis 7113.3, Control, na, MTX 0, XR_na 242 genes 105 Genes with Significantly Lower Expression In PolyArticular JRA 137 Genes with Significantly Higher Expression In PolyArticular JRA Individual: Individuals (33 patients + 12 controls)


Download ppt "Cincinnati Comparative Mouse Genomics Centers Consortium: Bioinformatics Analysis Tools for Assessment of Human Gene Polymorphisms Anil G Jegga, Sivakumar."

Similar presentations


Ads by Google