Copyright OpenHelix. No use or reproduction without express written consent.

2 Textpresso: A Text-Mining System for Biological Literature. Materials prepared by: Mary E. Mangan, Ph.D., www.openhelix.com. Updated: Q1 2011. Version 1.0.

3 Textpresso Agenda: Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises. Textpresso: http://www.textpresso.org

4 Textpresso Introduction: Textpresso developed at WormBase; open-source text-mining software; ontology-based information retrieval and extraction system; initially used by WormBase biocurators; now applicable to other bodies of text.

5 Text-Mining: Separating Wheat from Chaff. Scientific literature corpus: "...but I just want papers for Hoxd11." Text-mining with keyword = Hoxd11.

6 Results Example. Keyword: lin-12

7 Textpresso: Beyond the Nematode... FUNGI: N. crassa, A. nidulans, M. grisea, Cryptococcus, Filamentous. http://ilex.caltech.edu:80/trac/alere/

8 Textpresso Homepage: About

9 Textpresso Agenda: Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises. Textpresso: http://www.textpresso.org

10 Information Retrieval and Extraction. Information Retrieval: the query Hoxd11 against the corpus retrieves documents. Information Extraction: the query Hoxd11 against the corpus extracts facts: "Hoxd11 is expressed in the limb bud." "Pax6 regulates Hoxd11 transcription." "The nucleus contains Hoxd11 protein."
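
The distinction on this slide can be sketched in a few lines of Python. This is a toy illustration, not Textpresso's code: the three-document corpus, its IDs, the filler sentences, and the function names `retrieve` and `extract` are invented for the example; only the three Hoxd11 sentences come from the slide. Splitting sentences on periods is a deliberate simplification.

```python
# Toy corpus: document IDs and grouping are hypothetical; the Hoxd11
# sentences are the examples shown on the slide.
CORPUS = {
    "doc1": "Hoxd11 is expressed in the limb bud. Pax6 regulates Hoxd11 transcription.",
    "doc2": "The nucleus contains Hoxd11 protein. Other genes are discussed elsewhere.",
    "doc3": "This paper does not mention the gene of interest.",
}

def split_sentences(text):
    """Naive splitter: assumes every sentence ends with a period."""
    return [s.strip() + "." for s in text.split(".") if s.strip()]

def retrieve(keyword, corpus):
    """Information retrieval: return IDs of documents containing the keyword."""
    return [doc_id for doc_id, text in corpus.items() if keyword in text]

def extract(keyword, corpus):
    """Information extraction (simplified): return the matching sentences."""
    return [
        (doc_id, sentence)
        for doc_id, text in corpus.items()
        for sentence in split_sentences(text)
        if keyword in sentence
    ]

documents = retrieve("Hoxd11", CORPUS)  # whole documents to read
facts = extract("Hoxd11", CORPUS)       # individual sentences stating facts
```

Retrieval hands back whole documents that the reader still has to scan, while extraction hands back the individual statements about the gene.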

11 Two Major Elements of Textpresso. 1 Information Retrieval: not titles and abstracts only, but titles, abstracts, and full-text (improves results). + 2 Information Extraction: sentences are marked up with ontology-based categories.

12 How Textpresso Works. TEXTPRESSO = Text Processing System. Paper: Sentence 1, Sentence 2, Sentence 3, Sentence 4, Sentence 5, Sentence 6; Word. Ontology category tags: association, molecular function, phenotype, anatomy, etc. Query = phenotype + anatomy. Extract = Sentence 1, Sentence 4.
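
A minimal sketch of the mechanism on this slide, assuming a tiny hand-made lexicon. The lexicon entries, the example sentences, and the names `tag_sentence` and `query` are hypothetical. Words are tagged with ontology categories, and a query for phenotype + anatomy extracts only the sentences that carry both tags. Textpresso's real lexicon and matching are far richer than this.

```python
import re

# Hypothetical mini-lexicon: category -> terms. "cuticle" and "tail" appear
# on other slides of this deck; the remaining terms are made up for illustration.
LEXICON = {
    "anatomy": {"cuticle", "tail"},
    "phenotype": {"sterile", "uncoordinated"},
    "regulation": {"regulates"},
}

# Made-up sentences standing in for the sentences of one paper.
SENTENCES = [
    "Mutants are uncoordinated and have a thin cuticle.",  # phenotype + anatomy
    "The gene regulates vulval development.",              # regulation only
    "Animals were grown at 20 degrees.",                   # no category
    "Sterile animals show a malformed tail.",              # phenotype + anatomy
]

def tag_sentence(sentence):
    """Return the set of categories whose terms occur in the sentence."""
    words = set(re.findall(r"[a-z0-9-]+", sentence.lower()))
    return {cat for cat, terms in LEXICON.items() if words & terms}

def query(categories, sentences):
    """Extract the sentences tagged with every requested category."""
    wanted = set(categories)
    return [s for s in sentences if wanted <= tag_sentence(s)]

matches = query(["phenotype", "anatomy"], SENTENCES)
```

Here `matches` holds the first and fourth sentences. Requiring all categories within a single sentence, rather than anywhere in the paper, is what turns a document search into sentence-level extraction.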

13 Textpresso Ontology: Categories grouped by Topic. Biological Concepts: allele, anatomy, developmental stage, phenotype, molecular function (GO), biological process (GO), etc. (anatomy = cuticle, dorsal cord, M1 neuron, etc.). Description: characterization, effect, localization, method, pathway, purpose, etc. Relationships: association, comparison, consort, involvement, regulation, spatial relation, etc.

14 Textpresso Agenda: Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises. Textpresso: http://www.textpresso.org

15 Similar Textpresso Sites: hosted on Textpresso server at Caltech; hosted by the groups using them. And more!

16 Textpresso for C. elegans: corpus

17 Document Finder: WBPaper00000646 (WormBase paper ID); document is in the corpus

18 Ontology Browser: tail

19 Category Lists: pick-list; all the terms for “association” in lexicon

20 Query Language

21 Textpresso Agenda: Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises. Textpresso: http://www.textpresso.org

22 Simple Search: enter search term let-23

23 let-23 Results: deselect “Matching Sentences” for list overview

24 Document Details: click; keyword highlighted

25 Matching Sentences

26 Supplemental Document Links

27 Adding Categories to Your Search: let-23

28 Detailed Query

29 Detailed Query Results: anatomy, molecular function, regulation, keyword, other genes

30 Textpresso Agenda: Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises. Textpresso: http://www.textpresso.org

31 Filtering the Results: +Clark[author]
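
The filter string on this slide combines a sign, a word, and a field in square brackets. The sketch below shows one way such a filter could be parsed and applied. It assumes that `+` means "must contain" and `-` means "must not contain". That reading, the record layout, the author name Smith, and the names `parse_filter` and `apply_filter` are assumptions for illustration, not Textpresso's implementation.

```python
import re

# sign, word, [field] -- e.g. "+Clark[author]"
FILTER_RE = re.compile(r"^([+-])(.+)\[(\w+)\]$")

def parse_filter(expr):
    """Split e.g. '+Clark[author]' into (required, word, field)."""
    m = FILTER_RE.match(expr.strip())
    if m is None:
        raise ValueError(f"unrecognized filter: {expr!r}")
    sign, word, field = m.groups()
    return sign == "+", word, field

def apply_filter(expr, records):
    """Keep records whose field does (+) or does not (-) contain the word."""
    required, word, field = parse_filter(expr)
    return [
        r for r in records
        if (word.lower() in r.get(field, "").lower()) == required
    ]

# Hypothetical search results; only the author name Clark is from the slide.
RESULTS = [
    {"title": "Paper A", "author": "Clark"},
    {"title": "Paper B", "author": "Smith"},
]

kept = apply_filter("+Clark[author]", RESULTS)
```

Under these assumptions `kept` contains only "Paper A", and `apply_filter("-Clark[author]", RESULTS)` would keep only "Paper B".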

32 Filtered Results

33 Turning on Advanced Search Options

34 Advanced Search Options: field of document to search; what scope of document to search; how to sort documents; exclude documents; optional filters; search mode

35 Textpresso Agenda: Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises. Textpresso: http://www.textpresso.org

36 Textpresso: Beyond the Nematode... FUNGI: N. crassa, A. nidulans, M. grisea, Cryptococcus, Filamentous

37 Textpresso: Beyond the Model Organisms

38 Textpresso for Neuroscience: category list

39 Neuroscience Categories: category list; adds neurobiology categories

40 Neuroscience Query Example: cocaine, synapse, binding

41 Pharmspresso

42 Pharmspresso Query

43 Pharmspresso Results

44 Textpresso Agenda: Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises. Textpresso: http://www.textpresso.org

45 Textpresso: Text Processing System. Paper: Sentence 1, Sentence 2, Sentence 3; Word; ontology category tags. Searches full-text (Information Retrieval) + marks up sentences and words (Information Extraction).

46 Textpresso: Mine the Biological Literature. Search using keywords + categories (anatomy, molecular function, regulation, keyword, other genes). Retrieval = pertinent documents. Extract = matching sentences.

47 Textpresso: Different Site, Same Interface. Learn it once. Use it everywhere.

48 Textpresso: Beyond Organisms: neurobiology, pharmacogenetics

49 Textpresso Agenda: Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises. Textpresso: http://www.textpresso.org
