Information Retrieval and its Application in Biomedicine Hong Yu 1,2, PhD Susan McRoy 1, PhD 1 Department of Computer Science 2 Department of Health Sciences.


1 Information Retrieval and its Application in Biomedicine
Hong Yu (1,2), PhD; Susan McRoy (1), PhD
(1) Department of Computer Science; (2) Department of Health Sciences
University of Wisconsin-Milwaukee
Sept 4: Introduction

2 What is Information Retrieval?
The field concerned with the acquisition, organization, and searching of knowledge-based information. (Hersh, 2003)

3 Speed Up Communication

4 Information
World Wide Web
Company documentation
Drug descriptions
Medical records
Books
Anything that is text, image, video, or sound and that can be digitized

5 Information in Biomedicine
Literature (over 17 million publications)
WWW
Electronic medical records
Genomics data
–DNA sequences, etc.
Knowledge representation
–Gene Ontology
Company databases
–Micromedex drug database

6 IR in Biomedicine
Index Medicus (Billings 1879)
MEDLARS (NLM 1966)
SAPHIRE (Hersh 1990)
PubMed (NLM 1996)
Arrowsmith (Smalheiser 1998)
BioText (Hearst 2003)
BioMedQA (Yu 2006)

7 Electronic and Open Publishing
The Internet and the Web have had a profound impact on the publishing of knowledge-based information
Most of the literature can be made available electronically
Open access
–The Bethesda Statement on Open Access Publishing (http://www.earlham.edu/~peters/fos/bethesda.htm) (April 11, 2003)
–The Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities (http://www.zim.mpg.de/openaccess-berlin/berlindeclaration.html) (2003)
–PubMed Central (NLM 2004)

8 Quality of Information
A lack of quality control
–Anyone can publish online
–A wealth of studies have concluded that healthcare information on the Web is of poor quality
Readability
–Hard to read

9 Information Needs and Seeking
Unrecognized needs
–Clinicians unaware of information needs or knowledge deficit
Recognized needs
–Clinicians aware of needs but may or may not pursue them
Pursued needs
–Information seeking occurs but may or may not be successful
Satisfied needs
–Information seeking successful

10 Evidence-Based Medicine

11 What You Will Learn
IR algorithms
–Indexing
–Query and Retrieval
–Evaluation
–Text Classification
–XML retrieval
–Web retrieval
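As a rough illustration of the first three topics (indexing, query and retrieval, evaluation), here is a minimal Python sketch. It is not taken from the slides: the three toy documents, the function names, and the choice of a Boolean AND query over an inverted index are all assumptions made for the example.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    tokens = (t.strip(".,;:()").lower() for t in text.split())
    return [t for t in tokens if t]


def build_index(docs):
    """Indexing: map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Retrieval: Boolean AND, i.e. documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def precision_recall(retrieved, relevant):
    """Evaluation: precision and recall of a retrieved set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy collection, invented for this example.
docs = {
    1: "Gene expression in breast cancer",
    2: "Drug interactions in elderly patients",
    3: "Breast cancer drug trials",
}
index = build_index(docs)
retrieved = search(index, "breast cancer")
# Suppose only document 3 is judged relevant to the user's need.
print(retrieved, precision_recall(retrieved, {3}))
```

Real systems add ranking, stemming, and stop-word handling on top of this; the sketch only shows how the three steps fit together.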

12 What You Will Learn (Cont.)
Open-Source IR tools
–What open-source IR tools are available
  Indexing/retrieval
  Part-of-speech and syntactic parsing
  Semantic parsing
  Discourse relations
  Machine-learning classifiers
How to use the tools?

13 What You Will Learn (Cont.)
State-of-the-art IR systems
–Baruch 1965 [BLIMP http://blimp.cs.queensu.ca/index.html ]
–SAPHIRE (Hersh 1990)
  Retrieval
–MedLEE (Friedman 1994)
  Extraction
–PubMed (NLM 1997)
–Arrowsmith Systems (Smalheiser 1998)
  Hidden Relation Discovery Tool
–GENIES (Friedman 2001)
  Extraction

14 BioNLP Systems
BioText (Hearst 2003 http://biotext.berkeley.edu/ )
–Retrieval+Categorization
GeneWays (Rzhetsky 2004 http://geneways.genomecenter.columbia.edu/ )
–Extraction+Visualization
TextPresso (Muller 2004 http://www.textpresso.org/ )
–Retrieval+Extraction
iHOP (Hoffman and Valencia 2005 http://www.ihop-net.org/UniPub/iHOP/ )
–Retrieval
BioMedQA (Yu 2006 http://monkey.ims.uwm.edu/MedQA )
–Question Answering

15 Advanced NLP applications

16 Beyond text: Image and Video
Image classification
–Finding concepts in captions and annotations
–Machine learning on textual & visual features
–Determining salient features in text and image separately and merging the results
Extracting text from image
–Understanding and correcting OCR (handwriting, equations)
–Finding text in images
Finding document text related to illustrations
Video retrieval

17 Beyond Extraction: Experimental Tools

18 Resources
Annotated collections (GENIA, Medstract, Yapex …)
Ontologies, tools, knowledge bases …
Publications, Conferences, Evaluations …
Centres and web portals

19 What We Provide
Textbook
–Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 2007. http://www-csli.stanford.edu/~schuetze/information-retrieval-book.html
Office hour:
–Tuesdays, 3-4 pm, EMS 710, and by appointment
–Hong Yu, 414-229-3344
–Susan McRoy, 414-229-6695

20 What We Expect
Undergraduate:
–30% Homework, 35% Midterm exam, 35% Final exam or project
Graduate:
–20% Midterm exam, 40% Homework, 40% Project
The project may be done individually or in a team of 2-3 people. The final project will include a software system, a 2-3 page written project report, and an oral presentation. The report should describe the problem, the approach, and evaluation and should cite related work where appropriate.

