Copyright OpenHelix. No use or reproduction without express written consent.
XplorMed Software for Text Mining Abstracts
  Materials prepared by: Mary Mangan, Ph.D.
  Updated: Q Version 2
XplorMed Agenda
  XplorMed: Introduction & Credits
  Yellow Gate: PubMed Query
  Yellow Gate: Relations
  Green Gate: Stored Abstracts
  Red Gate: Identifiers
  Summary
  Exercises
Text Mining Abstracts for Relevant Relationships
  XplorMed: get more from a PubMed search
  Statistical text mining of PubMed abstracts
  XplorMed finds relevant word associations and their context
  Discover new and more relevant relationships in the literature; use context to make better choices about what to read
Why Use XplorMed?
  PubMed is a great collection, but search results are often daunting: hundreds or thousands of abstracts
  Which should you read? Will you miss some? Read the titles and guess?
  Example query: depression AND hypothyroid
3 Types of Queries, or “Gates”: Yellow, Green, Red
  Yellow gate input: MEDLINE query
  Green gate input: files of saved abstracts
  Red gate input: identifiers
  Original site:
  Current site:
Overview of an XplorMed Analysis
  An analysis is completed in a series of steps
  Sample: depression AND hypothyroid
  Many iterations are possible to refine the set of abstracts
  1. Start query
  2. Select MeSH categories of interest
  3. Find related words
  4. Context, or iterate…
  Goal: identify relevant abstracts
Credits, References & Contact Information
  Developed at Peer Bork’s lab at EMBL
  Papers by: Carolina Perez-Iratxeta, Antonio Perez, HS Keer, Miguel Andrade and Peer Bork
Yellow Gate: Start with a PubMed Query
  Yellow: plain PubMed query
Yellow Gate: Step 1, Options
  Text search, examples shown: enter text; Boolean operators can be used
  Or retrieve a previous search (searches are stored for 1 week), e.g. marys_query1
  Submit: sort abstracts according to MeSH category
  See the Entrez PubMed documentation, or OpenHelix’s PubMed tutorial
Tips on Words to Use in Searching
  Word: use the “lemma” form of a word
  Lemma: for nouns, the singular form, e.g. gene [not genes]
  Stop words: a list of non-helpful words, which are ignored: and, the, or, … analyze, between, only, …
  TreeTagger (Schmid)
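As a rough illustration of the tips above, the sketch below drops stop words and reduces simple plural nouns to the singular. It is not XplorMed's code: the stop-word list holds only the examples named on the slide, and naive_lemma is a hypothetical stand-in for a real lemmatizer such as the TreeTagger cited on the slide. It will mishandle words like "analysis".

```python
# Illustrative only: the stop words are just the examples listed on the slide.
STOP_WORDS = {"and", "the", "or", "analyze", "between", "only"}


def naive_lemma(word):
    """Hypothetical helper: strip a trailing 's' from simple plurals."""
    w = word.lower()
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        return w[:-1]
    return w


def prepare_terms(text):
    """Lowercase each token, reduce it naively to a lemma, drop stop words."""
    terms = []
    for token in text.split():
        lemma = naive_lemma(token.strip(".,;:()"))
        if lemma and lemma not in STOP_WORDS:
            terms.append(lemma)
    return terms


print(prepare_terms("The genes between the proteins"))
```

Searching with "gene" rather than "genes" matches the form that the word lists are built from.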
Sample Query: Depression AND Hypothyroid
  Use the yellow gate to search PubMed with keywords: depression AND hypothyroid
  Step 1: enter the text, then click the NEXT ACTION button
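To check the same query outside XplorMed, one option is NCBI's E-utilities ESearch service. The sketch below only builds the request URL and makes no network call. Using E-utilities here is an assumption for illustration, not something the tutorial or XplorMed prescribes, and build_pubmed_search_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for searching Entrez databases.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(query, retmax=20):
    """Return an ESearch URL for a PubMed query (hypothetical helper)."""
    params = {"db": "pubmed", "term": query, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


url = build_pubmed_search_url("depression AND hypothyroid")
print(url)
```

The Boolean operator is passed through unchanged inside the term parameter, just as it is typed into the yellow gate.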
MeSH Terms: Organizing Your Results
  MeSH categories (Medical Subject Headings)
  From the National Library of Medicine (US NIH)
  Terms are assigned to the literature by professionals
  MeSH terms assigned to a record:
Yellow Gate: Step 1 Results
  Query: depression AND hypothyroid
  Step 1 results are shown by MeSH category; the same abstract can appear in multiple categories
Yellow Gate, Next Step: Categories of Interest
  You can use the whole set of abstracts, or
  select a subset of interesting categories with the checkboxes
  Sample: Diseases, Chemicals and Drugs, Psychiatry and Psychology
  Set a date range, if desired
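The category step can be pictured as grouping abstracts by MeSH category and then taking the union of the checked categories. The sketch below assumes made-up placeholder identifiers and category assignments; it shows why an abstract listed under several categories is still counted once in the selected subset.

```python
from collections import defaultdict

# Hypothetical records: placeholder IDs mapped to MeSH categories.
abstracts = {
    "pmid_a": ["Diseases", "Chemicals and Drugs"],
    "pmid_b": ["Psychiatry and Psychology", "Diseases"],
    "pmid_c": ["Organisms"],
}


def group_by_category(records):
    """Map each category to the set of abstracts assigned to it."""
    groups = defaultdict(set)
    for pmid, categories in records.items():
        for category in categories:
            groups[category].add(pmid)
    return groups


def select_subset(groups, chosen):
    """Union of the abstracts in the chosen (checked) categories."""
    selected = set()
    for category in chosen:
        selected |= groups.get(category, set())
    return selected


groups = group_by_category(abstracts)
subset = select_subset(groups, ["Diseases", "Psychiatry and Psychology"])
print(sorted(subset))
```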
Yellow Gate: Resulting Words
  Related words from the abstracts are displayed
  Ranked by score: how much the word appears together with others
Association Score
  Perez-Iratxeta et al., BioTechniques 32(6): 1380. Computing fuzzy associations for the analysis of biological literature
  Relatedness: essentially, the ratio of the number of abstracts that contain both words to the number in which either one or the other occurs
  Keywords: essentially, more relevant words have stronger relations to other words. You can count the co-occurrences and create a score.
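The simplified description above can be sketched in a few lines. This follows only the slide's wording, reading "either one or the other" as "at least one of the two"; it is not the fuzzy-association method of the cited paper. The abstracts are made-up strings, and keyword_score (the sum of a word's relatedness to every other word) is an assumed way to turn the counts into a score.

```python
def relatedness(word_a, word_b, abstracts):
    """Abstracts containing both words / abstracts containing at least one."""
    both = 0
    either = 0
    for text in abstracts:
        words = set(text.lower().split())
        has_a = word_a in words
        has_b = word_b in words
        if has_a and has_b:
            both += 1
        if has_a or has_b:
            either += 1
    return both / either if either else 0.0


def keyword_score(word, abstracts):
    """Assumed scoring: sum of the word's relatedness to all other words."""
    vocabulary = set()
    for text in abstracts:
        vocabulary.update(text.lower().split())
    vocabulary.discard(word)
    return sum(relatedness(word, other, abstracts) for other in vocabulary)


# Made-up miniature "abstracts" for illustration.
abstracts = [
    "depression thyroid hormone",
    "depression therapy",
    "thyroid gland",
    "unrelated text",
]
print(round(relatedness("depression", "thyroid", abstracts), 3))
print(keyword_score("depression", abstracts) > keyword_score("gland", abstracts))
```

Here "depression" and "thyroid" share one abstract out of the three that mention either word, and "depression" outscores "gland" because it co-occurs with more of the vocabulary.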
Yellow Gate: Resulting Words
  Related words from the abstracts are displayed, ranked by score
  Click a word for its context
  Link to PubMed
Yellow Gate: Other Options
  Click [R] for relations to other words
  Click [X] for co-relationships
Examine Shared Words and Context Using [X]
  If the words appear in the same abstract, results are shown
  Words in the same sentence: the sentence is BLUE
  Words immediately adjacent: MAGENTA
  Links to the complete abstract at PubMed
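The three levels of context described above (same abstract, same sentence, immediately adjacent) can be told apart with a short routine. This is a minimal sketch with naive sentence splitting and tokenizing, not XplorMed's implementation. The function name and the sample text are assumptions, and the two words are assumed to be different.

```python
import re


def cooccurrence_level(abstract, word_a, word_b):
    """Return 'adjacent', 'sentence', 'abstract', or None for two words."""
    a, b = word_a.lower(), word_b.lower()
    text = abstract.lower()
    all_tokens = re.findall(r"[a-z0-9]+", text)
    if a not in all_tokens or b not in all_tokens:
        return None  # the words do not share this abstract
    same_sentence = False
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        tokens = re.findall(r"[a-z0-9]+", sentence)
        if a in tokens and b in tokens:
            same_sentence = True
            for left, right in zip(tokens, tokens[1:]):
                if {left, right} == {a, b}:
                    return "adjacent"  # shown in MAGENTA on the slide
    return "sentence" if same_sentence else "abstract"  # 'sentence' is BLUE


text = ("Thyroid hormone levels were measured. "
        "Depression was common in patients with low thyroid function.")
print(cooccurrence_level(text, "thyroid", "hormone"))
print(cooccurrence_level(text, "depression", "thyroid"))
print(cooccurrence_level(text, "hormone", "depression"))
```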
Yellow Gate: Next Step = Find Chains
  From the word list set, compute chains
  A chain is an ordered set of words
  Alpha (strength value): increase for fewer connections
  Score (threshold): increase for fewer relations
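The effect of the score threshold on a chain can be illustrated with a toy example. The greedy procedure below is an assumption made purely for illustration: the slides do not describe XplorMed's chain algorithm, the alpha parameter is not modeled, and the relation scores are invented.

```python
def build_chain(relations, start, threshold):
    """Greedily follow the strongest unused relation at or above threshold."""
    chain = [start]
    visited = {start}
    current = start
    while True:
        candidates = [
            (score, other)
            for (a, b), score in relations.items()
            for other, here in ((b, a), (a, b))
            if here == current and other not in visited and score >= threshold
        ]
        if not candidates:
            return chain
        _, current = max(candidates)
        chain.append(current)
        visited.add(current)


# Invented relation scores between word pairs.
relations = {
    ("depression", "thyroid"): 0.6,
    ("thyroid", "hormone"): 0.5,
    ("hormone", "therapy"): 0.2,
    ("depression", "therapy"): 0.3,
}
print(build_chain(relations, "depression", threshold=0.4))
print(build_chain(relations, "depression", threshold=0.1))
```

Raising the threshold removes the weaker relations, so the chain built at 0.4 is shorter than the one built at 0.1.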
Chains of Related Words
  2 chains found in this example
  Check a box to proceed and rank by chains
  A large star denotes a review article
Chains, with Extra Features
  With the chains, you can also add other data types to your output: OMIM, SwissProt, or SMART
  Diagrams highlight the available linked items
Chains, Other Options
  Collect MeSH terms related to these abstracts
  Diseases are shown; others are available
  You must scroll down past the abstracts to see these results
Iterate
  Run XplorMed again on your results…
  Select YES to add “neighbors” of the papers you chose
  Go to the next level
Green Gate: Start with Saved Abstracts
  Use this gate if you already have a saved set of abstracts
  Accepted formats: Medline, EndNote, XML, or XplorMed format
  Start from the XplorMed homepage; give the file name/location and upload the file to begin
Input Abstract File Options: Medline
  In PubMed, use Send to: File, in “MEDLINE” or “XML” format
  Example from PubMed Advanced Search: microtubule AND muscle, limited to 1 year
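A file saved from PubMed in MEDLINE format is plain text: each field starts with a short tag padded to four characters and followed by "- ", continuation lines are indented, and records are separated by blank lines. The sketch below reads such text into dictionaries. It is an illustrative parser, not XplorMed's loader; the sample records and their PMIDs are placeholders, and joining repeated tags with "; " is an assumed convention.

```python
def parse_medline(text):
    """Parse MEDLINE-format text into a list of {tag: value} dictionaries."""
    records = []
    current = {}
    tag = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current record.
            if current:
                records.append(current)
                current, tag = {}, None
            continue
        if len(line) > 5 and line[4] == "-":
            # New field, e.g. "TI  - Some title".
            tag = line[:4].strip()
            value = line[6:].strip()
            # Repeated tags (such as MH) are joined with "; ".
            current[tag] = current[tag] + "; " + value if tag in current else value
        elif tag and line.startswith(" "):
            # Indented continuation of the previous field.
            current[tag] += " " + line.strip()
    if current:
        records.append(current)
    return records


# Placeholder records, not real PubMed entries.
sample = (
    "PMID- 00000001\n"
    "TI  - A placeholder title about microtubules\n"
    "      and muscle.\n"
    "AB  - Placeholder abstract text.\n"
    "MH  - Microtubules\n"
    "MH  - Muscles\n"
    "\n"
    "PMID- 00000002\n"
    "TI  - Second placeholder title.\n"
)
records = parse_medline(sample)
print(len(records), records[0]["TI"])
```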
Sample Query with My Saved File
  1. Locate the pubmed-results.txt file with the Browse button
  2. Indicate the format used
  3. Click the “Sort abstracts…” button to submit
Green Gate: Results with My Sample Saved Set
  Outcome of a green gate search with saved abstracts
  Click for related words, then proceed as in Yellow Gate searches
Red Gate: Start with Identifiers
  You can use a variety of IDs to collect abstracts
  Examples shown: SwissProt, OMIM, more…
  Start from the XplorMed homepage
Red Gate: Example Starting with an OMIM ID
  Sample query: OMIM , Major Depressive Disorder, MDD
Red Gate: Results
  Collection of abstracts, categorized
  Proceed with subsequent steps as for the yellow and green gates
Text Mining in Abstracts: Refine Your Searches
  Input: MEDLINE query
  Input: identifiers
  Input: files of abstracts
Keywords, Context & Relationships
  Pinpoint only relevant abstracts