Published by Giles Hensley. Modified over 9 years ago.
1
Moving beyond free text
2
Authors
3
Old Paradigm:
Scientist does research
Scientist publishes research results in journal article
4
Want: All genes involved in seed development (name, species, protein sequence)
5
Read 3,404 articles???
6
Read 592,000 articles???
7
Old Paradigm - extended:
Scientist does research
Scientist publishes research results as free text
manual curation (+ NLP…?)
Results extracted from free text and converted to a structured format (ontology annotations)
Database
Structured data combined with other data for queries, further analysis
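To make the payoff of structured data concrete, here is a minimal sketch of the kind of query the earlier "Want" slide asks for (all genes involved in seed development, with name, species, and protein sequence). Everything in it is an assumption for illustration: the record layout, the function name `genes_involved_in`, and the placeholder gene records are hypothetical and do not come from any real database.

```python
from typing import NamedTuple


class GeneRecord(NamedTuple):
    """One curated gene entry (hypothetical layout)."""
    name: str
    species: str
    protein_sequence: str
    processes: frozenset  # labels of annotated biological processes


# Placeholder records, invented purely for illustration.
RECORDS = [
    GeneRecord("gene_a", "species_1", "MAAA", frozenset({"seed development"})),
    GeneRecord("gene_b", "species_2", "MBBB", frozenset({"leaf development"})),
    GeneRecord("gene_c", "species_1", "MCCC",
               frozenset({"seed development", "leaf development"})),
]


def genes_involved_in(records, process):
    """Return (name, species, protein sequence) for genes annotated to a process."""
    return [
        (r.name, r.species, r.protein_sequence)
        for r in records
        if process in r.processes
    ]


result = genes_involved_in(RECORDS, "seed development")
```

Once results are stored as annotations, the question becomes a one-line filter rather than a reading list of thousands of articles.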
8
Example – Journal article about gene function
9
Example – Journal article about gene function
The goal: an annotation that captures the result
10
Example – Journal article about gene function
The goal: an annotation that captures the result
Manual curation: Time-consuming, does not scale well
NLP: Very challenging
11
Example – phylogenetic treatment
http://www.mobot.org/mobot/research/apweb/welcome.html
Relatively high degree of structure compared to journal article
May be more amenable to natural language processing, but still very challenging: complex information
12
Scientist does research
Scientist publishes research results as free text
manual curation (+ NLP)
Can we get authors involved?
Results extracted from free text and converted to a structured format (ontology annotations)
Database
Structured data combined with other data for queries, further analysis
13
Scientific Publishers are interested in this problem…
Link to external resource
14
Scientific Publishers are interested in this problem…
Science Direct: http://www.sciencedirect.com/science/article/pii/S0378111910001502
16
Databases are interested in this problem…
18
What if we had a good general tool for authors to do this themselves?
19
Example: Morphological description of species
http://herbarium.usu.edu/webmanual/
20
Example: Morphological description of species
http://herbarium.usu.edu/webmanual/
21
Example: Mutant phenotype description
PO:0025034 (leaf), PATO:0000599 (decreased width)
PO:0020003 (ovule), PATO:0000460 (abnormal)
PO:0009010 (seed), PATO:0001997 (reduced)
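The mutant phenotype annotations on this slide pair an anatomical entity (a PO term) with a quality (a PATO term). A minimal sketch of how such pairs might be held as structured data follows. The term IDs and labels are the ones on the slide; the class names, `as_text`, and `annotations_for_entity` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """An ontology term: identifier plus human-readable label."""
    term_id: str
    label: str


@dataclass(frozen=True)
class PhenotypeAnnotation:
    """An entity-quality pair (hypothetical structure for this sketch)."""
    entity: Term   # what is affected, e.g. a PO term
    quality: Term  # how it is affected, e.g. a PATO term

    def as_text(self) -> str:
        # Render in the same "ID (label), ID (label)" form used on the slide.
        return (f"{self.entity.term_id} ({self.entity.label}), "
                f"{self.quality.term_id} ({self.quality.label})")


# The three annotations shown on the slide.
ANNOTATIONS = [
    PhenotypeAnnotation(Term("PO:0025034", "leaf"),
                        Term("PATO:0000599", "decreased width")),
    PhenotypeAnnotation(Term("PO:0020003", "ovule"),
                        Term("PATO:0000460", "abnormal")),
    PhenotypeAnnotation(Term("PO:0009010", "seed"),
                        Term("PATO:0001997", "reduced")),
]


def annotations_for_entity(annotations, entity_id):
    """Select annotations whose entity matches the given term ID."""
    return [a for a in annotations if a.entity.term_id == entity_id]


seed_annotations = annotations_for_entity(ANNOTATIONS, "PO:0009010")
```

Because each annotation is keyed by term IDs rather than free text, a query such as "all phenotypes affecting the seed" is an exact match on an identifier instead of a text search.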
22
New Paradigm:
Scientist does research
Scientist publishes research results as free text and as annotations using ontology terms
Benefit to scientist – wider exposure and reuse of results
Benefit to publishers – tagged text allows enhanced presentation for subscribers
Benefit to research community – better access to data