Competências Básicas de Investigação Científica e de Publicação (Basic Skills in Scientific Research and Publication)
Lecture 3: Searching the Literature
Ganesha Associates, 14 May 2013



Types of scientific output
– Abstracts
– Primary journal articles: peer-reviewed interpretations of original research
– Reviews
– Book chapters, monographs
– Conference proceedings
– Lectures, seminars
– Sequences, data sets
– Patents, other forms of intellectual property
– Blogs, tweets…

Usage of output differs

Some sources of scientific content
– Google
– PubMed/Medline (NLM)
– Scopus (Elsevier)
– Web of Science (Thomson Reuters)
– Google Scholar
– PubMed Central, Europe PubMed Central
– SciELO, Biblioteca Virtual em Saúde
– ScienceDirect, Ovid, SpringerLink, Wiley Online Library, BioMed Central, Public Library of Science, SwetsWise…
– CAPES Portal de Periódicos

Each source is different
– Free: Google, Google Scholar, PubMed Central
– Subscription: Scopus, ScienceDirect
– Abstracts and citations only: PubMed, Web of Science
– Full text, single publisher: SpringerLink
– Full text, many publishers: PubMed Central, SwetsWise Online Content

Classify sources of content
– By content: abstract only or full text
– By access: free access or subscription

You can get access if…
– The journal is subscribed to by CAPES
– You have a personal subscription
– The journal is of the ‘Open Access’ type
  – Note: some journals only make their content ‘Open Access’ after 6 months or longer. Some journals contain a mixture of OA and non-OA articles.
Journals in the ‘red’ categories are available anywhere.
Most journals subscribed to by CAPES will be available from more than one source.
CAPES journals are only available from computers within the University network unless you have remote access privileges.

So which sources should I use?
– No single source contains all of the articles relevant to your research
– Google has the broadest coverage, but not all of the documents you find will be peer-reviewed articles
– Scopus, WoS and PubMed give you the best balance between quality and quantity and, in theory, should link to all the content subscribed to by CAPES, plus OA content

So usually you will visit several sources to find the information you are looking for
Diagram labels: Google; Scopus; Web of Science; PubMed; SciELO; HighWire; ScienceDirect; SpringerLink; national literature; CAPES Portal; OA (BMC or PLoS); other databases, e.g. NCBI

Components of a bibliographic database
– Content such as abstracts and full-text articles [or a pointer to where these may be found]
– Metadata [data about data]
– Index
– Search engine
– Ranking/relevance algorithm
– Plus many additional features

Content (Basic PDF)

Content (HTML)

Content (Page source)

Content (metadata)

Sources of article metadata
– Journal name, publisher, ISSN
– Date of publication, volume and page numbers
– Digital object identifier [DOI]
– Article title
– Authors' names
– Address, affiliation, contact details
– Article section identifiers
– Sources of funding
– Semantic tagging, e.g. protein name

The basis of search: Indexing
– The purpose of an index is to optimize speed and performance in finding relevant documents for a search query.
– Without an index, the search engine would have to scan every document in the corpus, which would require considerable time and computing power.
– Metadata helps the indexing algorithm to select different classes of terminology from which to make an index, so a search can be carried out on just the authors' names, for example.
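To make the point about metadata concrete, here is a minimal sketch of a field-specific index. The two records, the field names and the `build_field_index` helper are invented for illustration and do not come from any real database.

```python
from collections import defaultdict

# Hypothetical records: metadata fields stored alongside each document id.
records = [
    {"id": 0, "authors": "Smith J; Banana A", "title": "Banana ripening enzymes"},
    {"id": 1, "authors": "Jones K", "title": "Smith chart methods in imaging"},
]

def build_field_index(records, field):
    """Map each lower-cased word in one metadata field to the ids of records containing it."""
    index = defaultdict(set)
    for rec in records:
        for word in rec[field].replace(";", " ").lower().split():
            index[word].add(rec["id"])
    return index

author_index = build_field_index(records, "authors")
title_index = build_field_index(records, "title")

# The same word gives different hits depending on which field is searched.
print(sorted(author_index["smith"]))  # [0]
print(sorted(title_index["smith"]))   # [1]
```

Keeping one index per field is only one possible design; it is used here because it makes an author-only search a simple dictionary lookup.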

Example: Inverted index
Suppose we have three short documents:
– D[0] = "it is what it is"
– D[1] = "what is it"
– D[2] = "it is a banana"
The inverted index looks like:
– "a": {2}
– "banana": {2}
– "is": {0, 1, 2}
– "it": {0, 1, 2}
– "what": {0, 1}
So a search for “what” would produce a list of two documents, 0 and 1.
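The three-document example can be reproduced in a few lines of Python. This is a minimal sketch that splits on whitespace only; the function name `build_inverted_index` is chosen here for illustration.

```python
from collections import defaultdict

documents = ["it is what it is", "what is it", "it is a banana"]

def build_inverted_index(docs):
    """Map each word to the set of document numbers in which it occurs."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.split():
            index[word].add(doc_id)
    return index

index = build_inverted_index(documents)
for word in sorted(index):
    print(word, sorted(index[word]))

# A search is then a lookup rather than a scan of every document.
print(sorted(index["what"]))  # [0, 1]
```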

Search: stop words and stemming
– Not all search terms get used, and some others will get modified.
– Stop words such as “a”, “the”, “this” occur frequently. They are dropped at indexing time and thus ignored at search time.
– Often a stemming algorithm reduces words to their basic stem, e.g. "fishing", "fished", "fish", and "fisher" to the root word, "fish".
– Both strategies make search quicker, but at a cost: phrases built from stop words become hard to find, and unrelated words can collapse to the same stem.
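A rough sketch of both steps follows. The stop-word set holds only the three examples from the slide, and `naive_stem` is a deliberately crude suffix stripper written for this illustration; it is not the stemmer used by any particular search engine.

```python
STOP_WORDS = {"a", "the", "this"}  # only the three examples given on the slide
SUFFIXES = ("ing", "ed", "er")     # deliberately naive; real stemmers apply many more rules

def naive_stem(word):
    """Strip one known suffix, provided at least three letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Lower-case the text, drop stop words, and stem what is left."""
    return [naive_stem(w) for w in text.lower().split() if w not in STOP_WORDS]

print(index_terms("the fisher fished this fish"))  # ['fish', 'fish', 'fish']
print(naive_stem("number"))                        # 'numb'
```

The last line shows the cost mentioned on the slide: a crude rule turns "number" into "numb", so two unrelated words can end up under the same index entry.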

Search: how the result list is ranked
– Date of publication
– Relevance
  – Frequency with which search terms occur in the document
  – Proximity of search terms
– Google’s PageRank algorithm uses “link popularity”: a document is ranked higher if there are more links to it
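Two of these ranking signals can be sketched very simply. The documents, the link graph and both scoring rules below are made up for illustration; in particular, counting inbound links is a simplification of link popularity and is not the actual PageRank computation.

```python
def term_frequency_score(query_terms, document):
    """Count how often the query terms occur in the document."""
    words = document.lower().split()
    return sum(words.count(term) for term in query_terms)

# Hypothetical mini-corpus.
docs = {
    "A": "aspirin side effects and aspirin dosage",
    "B": "side effects of common painkillers",
    "C": "history of the willow bark remedy",
}
query = ["aspirin", "effects"]

# Relevance ranking: highest term frequency first.
ranked = sorted(docs, key=lambda d: term_frequency_score(query, docs[d]), reverse=True)
print(ranked)  # ['A', 'B', 'C']

# Link popularity, simplified to a count of inbound links.
links = {"A": ["B"], "B": ["C"], "C": ["B"]}  # page -> pages it links to
inbound = {page: 0 for page in links}
for targets in links.values():
    for target in targets:
        inbound[target] += 1
print(inbound)  # {'A': 0, 'B': 2, 'C': 1}
```

Note that the two signals disagree here: "A" wins on term frequency while "B" has the most inbound links, which is one reason different sites order the same results differently.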

The question behind the query
Search engines think in terms of words, but users think in terms of sentences!
– How do you spell Bousfield?
– What do we know about BRCA1?
– Given these symptoms, what is the most likely diagnosis?
– What are the side effects of aspirin?
– Has this chemical structure been synthesized before?
“Cancer causes X” vs. “Y causes cancer”

What real queries look like: Google
pharmacogenomics and disorders bacteria growth casein media effect waal pseudomonas TRPM2 PCR mouse Chitinases in carnivorous plants glycerophosphoinositol 4-phosphate Dai N, Gubler C, Hengstler P, Meyenberger C, Bauerfeind P. Improved capsule endoscopy after bowel preparation. Gastrointest Endosc 2005;61(1)

What real queries look like: PubMed
– ATR1 HAL2
– Fuzzy[ALL] AND Hanage[AU] AND 2005[DP]
– arndt and rhabdomyosarcoma
– "Vorster HH"[Author]
– (rotavirus infections[majr] OR rotavirus[majr]) AND english[la] AND humans[mh] NOT (editorial[pt] OR letter[pt])

Query changes people actually make
Query series 1
– latrunculin
– latrunculin fm3a cell arrest
– latrunculin fm3a arrest
– latrunculin fm3a
– latrunculin FM3A
Query series 2
– cytokinin signalling in arabidopsis
– "cytokinin signalling in arabidopsis"
– cytokinin delta
– spindly arabidopsis
Results
– Remember to look beyond the first page. Compare the results of Query 1 in PubMed and Google (add the term PubMed)

Improving search accuracy
– Wild card characters
  – "a * saved is a * earned"
– Operators
  – jaguar speed -car
  – Pandas -site:wikipedia.org
  – “ribosome”
– Synonyms
  – MeSH terms
– Boolean terms
  – AND, OR, NOT
– Faceted search
  – GO terms
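The Boolean terms map directly onto set operations over an inverted index. The sketch below reuses the index from the three short documents in the inverted-index example; the `lookup` helper is introduced here for illustration only.

```python
# Inverted index for D[0] = "it is what it is", D[1] = "what is it", D[2] = "it is a banana".
index = {
    "a": {2},
    "banana": {2},
    "is": {0, 1, 2},
    "it": {0, 1, 2},
    "what": {0, 1},
}

def lookup(term):
    """Return the documents containing a term, or an empty set if the term is not indexed."""
    return index.get(term, set())

print(sorted(lookup("what") & lookup("it")))      # what AND it     -> [0, 1]
print(sorted(lookup("what") | lookup("banana")))  # what OR banana  -> [0, 1, 2]
print(sorted(lookup("it") - lookup("what")))      # it NOT what     -> [2]
```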

Anatomy of a query: PubMed
Query as typed:
invasive fungal infections in young children
Query as translated by PubMed:
invasive[All Fields] AND ("mycoses"[MeSH Terms] OR "mycoses"[All Fields] OR ("fungal"[All Fields] AND "infections"[All Fields]) OR "fungal infections"[All Fields]) AND ("Young Child"[Journal] OR ("young"[All Fields] AND "children"[All Fields]) OR "young children"[All Fields])
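The shape of the translated query can be imitated with a small string-building function. This is only a sketch of the pattern visible in the example: the `expand_phrase` helper is hypothetical, and the mapping from "fungal infections" to the MeSH term "mycoses" is passed in by hand because the slide does not show the lookup tables PubMed uses.

```python
def expand_phrase(phrase, mesh_term=None):
    """Build an OR-group for a phrase in the style of the translated query shown above."""
    parts = []
    if mesh_term:
        parts.append(f'"{mesh_term}"[MeSH Terms]')
        parts.append(f'"{mesh_term}"[All Fields]')
    # Each word searched separately and combined with AND...
    words = phrase.split()
    parts.append("(" + " AND ".join(f'"{w}"[All Fields]' for w in words) + ")")
    # ...plus the phrase as a whole.
    parts.append(f'"{phrase}"[All Fields]')
    return "(" + " OR ".join(parts) + ")"

print(expand_phrase("fungal infections", mesh_term="mycoses"))
```

The printed group matches the middle clause of the translated query, which shows why a three-word idea can become a long Boolean expression.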

So…
Using the same search terms will produce different results in different databases because:
– The content is different
– Search terms are prepared differently, e.g. only PubMed uses MeSH terms
– The indexing process, the implementation of stemming and the removal of stop words are different
– The ranking algorithms are different

Search terms: summary
– Make sure you understand the search term syntax used by your preferred site, e.g. AND, +, “ ”, etc.
– Search engines ‘see’ only certain words, not sentences
– Do not use ‘stop’ words, e.g. a, the, of, before, unless they are part of a quoted text string search
– Try to think of different ways to search for the same subject
– Look beyond the first page of search results

Other useful features
– Related articles: PubMed, Scopus
– Cited by…: WoS, Scopus
– Saved searches: PubMed, Scopus
– Reference management: EndNote, Zotero, Mendeley

PubMed Related Articles Algorithm
– The similarity between documents is measured by the words they have in common.
– A list of 310 common but uninformative stop words is eliminated from processing at this stage.
– Next, a limited amount of stemming of words is done.
– Words from the abstract of a document are classified as text words.
– Words from titles are also classified as text words, but words from titles are added in a second time to give them a small advantage in the local weighting scheme.
– MeSH terms are placed in a third category, and a MeSH term with a subheading qualifier is entered twice, once without the qualifier and once with it.
– These three categories of words (or phrases in the case of MeSH) comprise the representation of a document. No other fields, such as Author or Journal, enter into the calculations.
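The document representation described here can be sketched as a bag of terms. Everything below is an assumption made for illustration: the five-word stop list stands in for the 310-word list, stemming is skipped, the two documents are invented, and the plain overlap count replaces the weighting scheme, which the slide does not spell out.

```python
from collections import Counter

STOP_WORDS = {"the", "of", "in", "and", "a"}  # tiny stand-in for the 310-word list

def represent(title, abstract, mesh_terms):
    """Abstract words count once, title words twice, MeSH terms are kept as whole phrases."""
    bag = Counter()
    for word in abstract.lower().split():
        if word not in STOP_WORDS:
            bag[word] += 1
    for word in title.lower().split():
        if word not in STOP_WORDS:
            bag[word] += 2  # entered as a text word, then added a second time
    for term in mesh_terms:
        bag["mesh:" + term] += 1
        if "/" in term:
            # A term with a subheading qualifier is also entered without the qualifier.
            bag["mesh:" + term.split("/")[0]] += 1
    return bag

def overlap(bag_a, bag_b):
    """Unweighted similarity: total count of the terms two documents share."""
    return sum((bag_a & bag_b).values())

doc1 = represent("Rotavirus vaccine trial",
                 "a trial of the rotavirus vaccine in infants",
                 ["Rotavirus Infections/prevention & control"])
doc2 = represent("Rotavirus infections in infants",
                 "rotavirus outbreak in a hospital",
                 ["Rotavirus Infections"])
print(overlap(doc1, doc2))
```

Author and journal are not passed to `represent` at all, mirroring the statement that no other fields enter the calculation.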

Quick tour

Break

Other types of database
– Some databases contain mainly text, but others contain image, sequence or structural data.
– The technologies required to search and retrieve these different data types are very different.
– There is a growing amount of information in publicly available databases. For example, in 2013 the Nucleic Acids Research online Molecular Biology Database Collection listed 1512 databases.
– The National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) host some of the most important databases used for biomedical research.

Linking different data types is a challenge
Diagram labels: Gene Expression Warehouse; Protein; Disease; SNP; Enzyme; Pathway; Known Gene; Sequence Cluster; Affy Fragment; Sequence; LocusLink; MGD; ExPASy SwissProt; PDB; OMIM; NCBI dbSNP; ExPASy Enzyme; KEGG; SPAD; UniGene; GenBank; NMR; Metabolite

Databases available at NCBI

Asking meaningful questions is a challenge


Other ways to search: BLAST, PubChem, UCSC Genome Browser
– By sequence: BLAST
>DinoDNA from JURASSIC PARK p. 103 nt
GAATTCCGGAAGCGAGCAAGAGATAAGTCCTGGCATCAGATACAGTTGGAGATAAGGACGGACGT
GTGGCAGCTCCCGCAGAGGATTCACTGGAAGTGCATTACCTATCCCATGGGAGCCATGGAGTTCGT
GGCGCTGGGGGGGCCGGATGCGGGCTCCCCCACTCCGTTCCCTGATGAAGCCGGAGCCTTCCTG
GGGCTGGGGGGGGGCG
– By structure: PubChem

Example of BLAST search results

PC Compound Record

UCSC Genome Browser
– The Genome Browser zooms and scrolls over chromosomes, showing the work of annotators worldwide.
– The Gene Sorter shows expression, homology and other information on groups of genes that can be related in many ways.
– Blat quickly maps your sequence to the genome.
– The Table Browser provides convenient access to the underlying database.
– VisiGene lets you browse through a large collection of in situ mouse and frog images to examine expression patterns.
– Genome Graphs allows you to upload and display genome-wide data sets.


For example: adding metadata with a gene ontology
– There is no universal standard terminology in biology and related domains, and the use of different terms may be specific to a species, research area or even a particular research group. This makes communication and sharing of data difficult.
– The Gene Ontology project provides an ontology of defined terms representing gene product properties.
– The ontology covers three domains:
  – cellular component: the parts of a cell or its extracellular environment
  – molecular function: the elemental activities of a gene product at the molecular level, such as binding or catalysis
  – biological process: operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units (cells, tissues, organs, and organisms)
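One way to see why shared terms help is to group gene products by the term they are annotated with. The sketch below is purely illustrative: the gene names and annotations are invented, "nucleus" is an assumed example of a cellular component, and real GO annotations carry identifiers and evidence codes that are omitted here.

```python
from collections import defaultdict

# Hypothetical annotations: (gene product, GO domain, term label).
annotations = [
    ("geneA", "molecular function", "binding"),
    ("geneB", "molecular function", "catalysis"),
    ("geneA", "biological process", "immune response"),
    ("geneC", "biological process", "immune response"),
    ("geneC", "cellular component", "nucleus"),
]

def group_by_term(annotations, domain):
    """Collect, for one GO domain, the gene products annotated with each term."""
    groups = defaultdict(set)
    for gene, annotated_domain, term in annotations:
        if annotated_domain == domain:
            groups[term].add(gene)
    return groups

process_groups = group_by_term(annotations, "biological process")
print(sorted(process_groups["immune response"]))  # ['geneA', 'geneC']
```

Because both groups used the same controlled term, the two genes are found together even if the original papers described them in different words.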

Microarray data analysis with GO
Figure labels: attacked; time; control; puparial adhesion; molting cycle; hemocyanin; defense response; immune response; response to stimulus; Toll regulated genes; JAK-STAT regulated genes; amino acid catabolism; lipid metabolism; peptidase activity; protein catabolism
Credit: Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL, and Eugene Schuster Group, EBI.

Learning points
– Google is a good place to start
– Learn to use several information resources
– Modify your search terms during the course of a search session
– Understand how the results are ranked and don’t just look on the first page