Textpresso Application and Extensibility Eimear Kenny GMOD Meeting, April 2004.


1 Textpresso Application and Extensibility Eimear Kenny GMOD Meeting, April 2004

2 Textpresso Advances
Application:
- Advanced literature search tool for curators
- Semi-automated curation tasks
- Automated curation tasks
Extensibility:
- Implementation of Textpresso for yeast literature

3

4

5

6
                                     ABSTRACT                                  FULL TEXT
Datatype         Human  Search term  True hits  Total hits  Recall  Precision  True hits  Total hits  Recall  Precision
Expression data  327    express*     221        398         67.6%   55.5%      327        901         100%    36.3%
Mapping data     36     map*         0          51          0%      -          31         482         86.1%   6.4%
RNAi data        220    rnai         60         84          27.3%   71.4%      210        353         95.5%   59.5%
Transgenes       95     transgenes*  8          23          8.4%    34.8%      69         381         72.6%   21.7%
TOTAL            678                 289        556         42.6%   52%        637        2,117       94%     30.1%
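
The recall and precision figures on this slide are consistent with recall = true hits / human-curated count and precision = true hits / total hits. A minimal Python sketch under that assumption follows; the function names and the `rows` layout are illustrative and not part of Textpresso. It recomputes the Expression data and RNAi data rows.

```python
def recall(true_hits, human_count):
    """Percentage of human-curated items that the search retrieved."""
    return 100.0 * true_hits / human_count


def precision(true_hits, total_hits):
    """Percentage of retrieved items that were true hits (0 if nothing was retrieved)."""
    return 100.0 * true_hits / total_hits if total_hits else 0.0


# (datatype, human count, abstract true/total hits, full-text true/total hits),
# copied from the slide.
rows = [
    ("Expression data", 327, (221, 398), (327, 901)),
    ("RNAi data", 220, (60, 84), (210, 353)),
]

for name, human, abstract, full_text in rows:
    for label, (true_hits, total_hits) in (("abstract", abstract), ("full text", full_text)):
        r = round(recall(true_hits, human), 1)
        p = round(precision(true_hits, total_hits), 1)
        print(f"{name} ({label}): recall {r}%, precision {p}%")
```

Searching full text rather than abstracts raises recall sharply in both rows while lowering precision, which is the trade-off the slide illustrates.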

7 Textpresso Ontology
Category groups: Relationships, Semantic, Biological Concepts
Categories: Gene, Transgene, Allele, Cell or Cell Group, Cellular Component, Nucleic Acid, Organism, Entity Feature, Life Stage, Phenotype, Strain, Sex, Clone, Molecular Function, Mutant, Drugs and Small Molecules, Association, Consort, Effect, Purpose, Pathway, Regulation, Comparison, Spatial Relation, Time Relation, Involvement, Characterization, Method, Biological Process, Action, Bracket, Determiner, Conjunction, Conjecture, Negation, Preposition, Pronoun, Punctuation
Example terms: “anti-rabbit IgG polyclonal antibody”, “eat-4”, “necessary for”, “Nomarski”, “epistasis”, “co-expressed with”, “homologue of”, “not”, “ZK512.6”

9 “... activation of let-7 RNA expression downregulates LIN-41 to relieve inhibition of lin-29.”
Categories: Biological Process, Regulation, Gene, Molecular Function, Biological Process
// activation of let-7 RNA expression down regulates LIN-41 to relieve inhibition of lin-29. // © Textpresso, 2004

10 Using Textpresso to expedite curation
Find sentences from the literature that describe genetic interaction!
>= 2 named “Gene” && (>= 1 “Association” || >= 1 “Regulation”)
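
The query on this slide can be read as a filter over sentences that have already been marked up with ontology categories. The sketch below assumes a sentence is represented as a list of (term, category) pairs; that representation, the function name, and the example tags are hypothetical illustrations, not Textpresso's actual data model or output.

```python
from collections import Counter


def matches_interaction_query(tagged_terms):
    """Return True if the sentence has >= 2 Gene terms and
    at least one Association or Regulation term."""
    counts = Counter(category for _term, category in tagged_terms)
    return counts["Gene"] >= 2 and (
        counts["Association"] >= 1 or counts["Regulation"] >= 1
    )


# Hypothetical markup of the let-7 / LIN-41 / lin-29 example sentence.
interaction_sentence = [
    ("let-7", "Gene"),
    ("downregulates", "Regulation"),
    ("LIN-41", "Gene"),
    ("lin-29", "Gene"),
]

# Hypothetical markup of a sentence naming only one gene.
single_gene_sentence = [
    ("eat-4", "Gene"),
    ("necessary for", "Association"),
]

print(matches_interaction_query(interaction_sentence))
print(matches_interaction_query(single_gene_sentence))
```

A filter like this only narrows the pool of candidate sentences; a curator still has to decide whether each retrieved sentence describes a genetic interaction.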

11
Interaction Type              A          B            C
Genetic Interactions          1 (0.5%)   13 (6.5%)    39 (19.5%)
Possible Genetic Interaction  3 (1.5%)   6 (3%)       14 (7%)
Non-genetic Interactions      4 (2%)     6 (3%)       12 (6%)
No Interaction                192 (96%)  175 (87.5%)  135 (67.5%)

12 100 sentences per hour!

13 1,986 articles → 17,851 sentences
31.4% Interaction Information
68.6% NO Interaction Information
- 1,224 Regulation (6.5%)
- 127 Physical Interaction (0.7%)
- 1,825 Possible Interaction (9.8%)
- 3,702 Genetic Interaction (19.8%)

14 Did you know? “The Molecular Database Collection” (NAR, 2001, 2002, 2003, 2004)
Chart labels: MODs; Disease/Expr/Mut/Other; Seqn/Str

15 Textpresso goes to Stanford ...
Rob Nash, Stan Dong, Eimear Kenny, Rama Balakrishnan, Christopher Lane, Eurie Hong, Mike Cherry

16 Implementing Textpresso for Yeast
                 Worm Build                        Yeast Build
Papers           >6,000 papers (~4,000 full text)  >60,000 journal articles (~15,000 full text)
Build time       1 week                            >2 weeks
Add papers       ~24 h                             ~3 d
Change ontology  rebuild                           rebuild
Database size    8G                                30G?
Platform         Linux                             Solaris

17 Adapting Textpresso Ontology for Yeast
Worm biology → Yeast biology
Life Stage, Cell Cycle, Life Cycle, Cell Name or Group, Sex
Phenotype → Phenotype
Method → Method
Gene → Gene
Allele → Allele
Transgene → Transgene
Strain → Strain
??
Clone → Clone

18

19

20 Implementing Textpresso for MODs
                 Worm Build                        Yeast Build                                   Fly Build
Papers           >6,000 papers (~4,000 full text)  >60,000 journal articles (~15,000 full text)  >140,000 journal articles (? full text)
Build time       1 week                            >2 weeks                                      ?
Add papers       ~24 h                             ~3 d                                          ?
Change ontology  rebuild                           rebuild                                       rebuild
Database size    8G                                30G?                                          ?G
Platform         Linux                             Solaris                                       Solaris

21 Textpresso Ontology
Category groups: Relationships, Semantic, Biological Concepts
Categories: Gene, Transgene, Allele, Cell or Cell Group, Cellular Component, Nucleic Acid, Organism, Entity Feature, Life Stage, Phenotype, Strain, Sex, Clone, Molecular Function, Mutant, Drugs and Small Molecules, Association, Consort, Effect, Purpose, Pathway, Regulation, Comparison, Spatial Relation, Time Relation, Involvement, Characterization, Method, Biological Process, Action, Bracket, Determiner, Conjunction, Conjecture, Negation, Preposition, Pronoun, Punctuation
FOR FLY: Life Cycle, Anatomy
1. Chromosomal aberrations? (inversion, polytene, substitution, deletion, balancers, P elements, hypomorphs, hypermorphs)
2. Stresses? (nutrition, temperature, sleep)

