Medline Text Searching Tools – a Comparison Experiment McDermott Center for Human Growth and Development Center for Biomedical Inventions.


1 Medline Text Searching Tools – a Comparison Experiment McDermott Center for Human Growth and Development Center for Biomedical Inventions

2 Your first biomedical experiment should be on a computer - a literature search and a data search!

3 Databases and Tools to Exploit Them. National Center for Biotechnology Information - http://www.ncbi.nlm.nih.gov/; UTSW Bioinformatics Tools - http://innovation.swmed.edu/; GeneCards - http://bioinformatics.weizmann.ac.il/cards/; Genome Data Base - http://www.gdb.org/; US Patent Office - http://www.uspto.gov/; Research Genetics, Stratagene….

4 Bioinformatics research requires significant hardware and software investment. Hardware – ~120 workstations, 3 Apache www servers, Sun servers, 3 HP Enterprise 4/8 CPU servers, 32 CPU Linux cluster, 3 TB RAID 5 storage system, 3D laser scanners, SGI visualization stations, HP visualization stations. Software – standard genomics search tools, unique new text data mining tools, 3D analysis tools. Databases – all major genetics/genomics databases, all of Medline, many custom local databases.

5 A family of bioinformatics tools and their computed databases has been developed: text data mining; genomic annotation; gene collection identification and analysis; polymorphism prediction.

6 Applied Computational Biology Toolset. POMOUS and SNIDE – polymorphism prediction software. PANORAMA – a DNA/protein sequence analysis and visualization tool. ARROGANT – a gene/clone collection analysis tool. eTBLAST, FRISC, TRITE – text similarity tools. IRIDESCENT – text data mining tool. ARGH – acronym resolving software. ELXR – exon locator and extractor for resequencing. Microarray BLAST – UTSW BLAST utility for comparison against EST/cDNA sequences from UTSW microarrays. MarC-V, Signal, SNPCEQer …….

7 Text data mining can speed reduction of data to knowledge. [Figure labels: DNA Sequencing Invented; Human Genome Project Begun; Gap. Source: Science (Genome Issue), 15 Oct. 1999]

8 Now over 10 million articles in MEDLINE®. 400,000 new articles added each year. Over 1 million are in Genetics and Molecular Biology. Online biomedical literature is growing rapidly… Most biomedical results are reported in scientific papers, which are now searchable. Can you keep up with the literature?

9 The differences between PubMed and eTBLAST. PubMed is database driven. PubMed performs a boolean search into Medline. PubMed returns hits sorted by date (and other criteria). eTBLAST is currently written in C. eTBLAST automatically removes common words from the text; the remaining words are treated as keywords. eTBLAST performs a similarity comparison using weighted keywords. eTBLAST returns hits sorted by similarity (and soon other criteria).
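
The contrast on this slide can be illustrated with a minimal Python sketch. It is not eTBLAST's or PubMed's actual code: the stop-word list, the smoothed IDF weighting, the cosine similarity measure, and the three sample documents are all assumptions chosen for illustration, since the slide only says that common words are dropped and keywords are weighted.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the slide does not say which common words are dropped.
COMMON_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "that", "with", "for"}

# Hypothetical mini-corpus standing in for Medline abstracts.
DOCS = {
    "d1": "WT1 encodes a zinc finger transcription factor mutated in Wilms tumor",
    "d2": "Nephroblastoma of the kidney in children and the WT1 tumor suppressor gene",
    "d3": "Zinc finger proteins in yeast transcription",
}


def keywords(text):
    """Lowercase the text, split it into words, and drop the common words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in COMMON_WORDS]


def boolean_and_search(terms, docs):
    """Boolean sketch: ids of documents containing every term, unranked."""
    return [doc_id for doc_id, text in docs.items()
            if all(term in keywords(text) for term in terms)]


def similarity_search(query, docs):
    """Similarity sketch: rank all documents against free text, best match first."""
    doc_counts = {doc_id: Counter(keywords(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: in how many documents each keyword occurs.
    doc_freq = Counter(word for counts in doc_counts.values() for word in counts)

    def weigh(counts):
        # Assumed weighting: term count times a smoothed inverse document frequency.
        return {w: c * (math.log((n_docs + 1) / (doc_freq[w] + 1)) + 1.0)
                for w, c in counts.items()}

    def cosine(a, b):
        dot = sum(a[w] * b[w] for w in a if w in b)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = weigh(Counter(keywords(query)))
    scores = [(doc_id, cosine(query_vec, weigh(counts)))
              for doc_id, counts in doc_counts.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `boolean_and_search(["wilms", "tumor"], DOCS)` only returns documents that contain both exact terms, whereas `similarity_search("A childhood kidney tumor, nephroblastoma, linked to the WT1 gene", DOCS)` scores every document and can rank one highest even when it never mentions "Wilms".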

10 Let's research a topic – Wilms’ Tumor. 79) Wilms' tumour: A childhood nephroblastoma (solid tumour of the kidney) affecting one in 10 000 children, usually appearing within the first five years of life. A susceptibility to Wilms' tumour is associated with inheritance of defects in several different genes, including the TUMOUR SUPPRESSOR GENE WT-1, which maps to chromosome 11p13. Tumours arise from mesenchymal STEM CELLS that would normally differentiate into parts of the nephron. Around 5-10% also contain ectopic tissues such as bone and cartilage. Wilms' tumour also appears as one of the manifestations of the WAGR syndrome - a CONTIGUOUS GENE SYNDROME. The WT-1 gene has been cloned and encodes a zinc finger protein which is presumed to be a TRANSCRIPTION FACTOR.

11 Entrez is a search and retrieval system that integrates information from databases at NCBI. Follow That Link

12 Our text data mining tools – eTBLAST, FRISC, TRITE. eTBLAST (2) – similarity comparison engine for electronic text using weighted keywords, concepts and grammar induction. Psi-eTBLAST is iterative. Example use of natural entry process. FRISC (2) – using eTBLAST, a UTSW faculty research interests page is checked regularly against new biomedical abstracts from Medline, which are ranked to cluster the information that best fits the researcher's interests. TRITE (2) (3) (4) – using eTBLAST, topical interests will be searched regularly against new biomedical abstracts in Medline.

13 You are being asked to conduct a series of reference checks on a set of topics. Each student will be given 3 topics to research, out of 120 different topics. You are asked to research the topics first in the standard way, using keyword-based searches in PubMed over the web, and then using a new code, eTBLAST, that performs keyword identification and searching automatically. For each topic, read the titles, abstracts and any other information for the documents returned to determine the relative sensitivity and selectivity of each method for finding the documents. You will also be asked to compare and contrast the results of the two approaches based on other criteria, such as speed and user interface.
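
The slide does not define sensitivity or selectivity, so the following Python sketch rests on assumed definitions: sensitivity is taken as the fraction of relevant documents a method finds (recall), and selectivity as the fraction of returned documents that are relevant (precision). The document ids are hypothetical, and the "relevant" set is assumed to be whatever you judged relevant after reading the results of both methods.

```python
def sensitivity(retrieved, relevant):
    """Assumed definition: share of the relevant documents that the search returned."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


def selectivity(retrieved, relevant):
    """Assumed definition: share of the returned documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)


# Hypothetical example: ids returned by one search method for one topic,
# and the ids judged relevant after reading titles and abstracts.
returned_ids = ["a", "b", "c", "d"]
relevant_ids = ["a", "b", "e"]

print("sensitivity:", sensitivity(returned_ids, relevant_ids))
print("selectivity:", selectivity(returned_ids, relevant_ids))
```

Computing both numbers once for the PubMed results and once for the eTBLAST results on the same topic gives a like-for-like comparison of the two methods.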

14 What you will be doing. Please obtain a floppy disk, instructions and 3 topic sheets. Find a computer; some are available in the North Campus Library, the South Campus Library, and the computer training room. Fill in the spreadsheet with the results of your search on each topic. Then write a brief paragraph in Word to compare and contrast the two methods. Come back to Lacynda’s office (NA2.504) to return the floppy disk. She will then log the disk in, open it, and verify that the files are there and complete. You are then free to go.

15 http://innovation.swmed.edu/

