PubMed Central Archive at the US National Institutes of Health. Prepared by Martha R. Fishel, Deputy Chief, Public Services Division.


PubMed Central Archive at the US National Institutes of Health
Prepared by Martha R. Fishel, Deputy Chief, Public Services Division
Presented by Becky J. Lyon, Deputy Associate Director, Library Operations
10th International Conference of Medical and Health Librarians, Cluj-Napoca, Romania, September 19, 2006

Focus of Today's Presentation
1. What is PubMed Central?
2. What material is included? (current content, scanned back files, manuscripts, books)
3. How does the content get added?
4. How is it used?
5. Value-added features

What is PubMed Central?
Digital archive of life sciences journals
  – includes health policy, bioinformatics and other fields
Participation is open to journals that are:
  – covered by a major abstracting/indexing service
  – or that have 3 editorial board members with current grants from major non-profit funding agencies
Free access to full-text articles and supporting data
Integrated with PubMed and other bibliographic and factual databases in NCBI's Entrez network

PMC Basic Policy
Journal deposits an authoritative electronic copy that meets PMC data quality standards
  – full-text XML
  – original high-resolution graphics
  – PDF
  – supplementary data
Journal may delay free access (up to 2 years)
  – research articles usually free in one year or less
Copyright is retained by publisher or author
Deposits, and free access permissions, are permanent
  – journal may stop depositing new material but may not withdraw material already deposited

PMC Numbers – August 2006
Articles available: 696,000
  – 60 percent from back issue digitization
  – Oldest content (Trans Am Ophthalmol Soc) is from 1865
  – Total items incl. issue covers, corrections, etc.: 758,000
Unique IP addresses: 2.07 million
  – Estimated unique users (1.5x): 2.35 million
Articles retrieved (HTML full text, PDF, or scanned article summary): 7.7 million
Total page views, incl. searches: 11.5 million

How content gets in
1. SGML/XML from Publishers
2. Back Issue Scanning
3. NIH Manuscripts from Public Access
4. Books and other non-article content

Current Content – XML format
1. Each journal's workflow is different, and delivery frequency depends on the journal's publishing cycle
2. PMC accepts SGML or XML, which must meet certain criteria
3. Problems are often present (e.g. special characters that must be mapped to the NLM DTD)
4. Adjusting the data in every issue is time-consuming and error-prone

Text Processing
Inputs: Source SGML, Source XML
Steps, in the order shown on the slide:
  – OpenSX (SGML to XML)
  – Resolve Named Character Entities
  – Parse
  – Source-specific XSL Transform to PMC Style
  – Validate with PMC StyleChecker
  – Load to PMC
  – QA
These steps can take a lot of time and can cause NLM to reject content or send it back for rework. All articles must pass validation and the style checker.
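
The entity-resolution, parse and check steps listed above can be illustrated with a minimal Python sketch. It is not PMC's actual tooling: the function names are hypothetical, the entity table is the HTML one from the standard library rather than the NLM DTD's entity sets, the XSL transform step is omitted because the standard library has no XSLT processor, and `check_style` is a stand-in rule (a non-empty `article-title` element) invented for the example.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities that every XML parser already understands; leave them untouched.
_XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def resolve_named_entities(text):
    """Replace named entities (e.g. &eacute;) with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in _XML_PREDEFINED:
            return match.group(0)
        codepoint = name2codepoint.get(name)
        if codepoint is None:
            # An unmapped entity is the kind of problem that sends content back for rework.
            raise ValueError(f"unmapped entity: &{name};")
        return f"&#{codepoint};"
    return _ENTITY_RE.sub(replace, text)


def check_style(root):
    """Hypothetical style rule: the article must have a non-empty article-title."""
    problems = []
    title = root.find(".//article-title")
    if title is None or not (title.text or "").strip():
        problems.append("missing or empty <article-title>")
    return problems


def process(source_xml):
    """Resolve entities, parse, then apply the stand-in style check."""
    resolved = resolve_named_entities(source_xml)
    root = ET.fromstring(resolved)  # raises ParseError if the XML is not well-formed
    problems = check_style(root)
    if problems:
        raise ValueError("; ".join(problems))
    return root
```

For example, `process("<article><front><article-title>Caf&eacute;</article-title></front></article>")` returns the parsed tree, while an input containing an entity that is not in the table is rejected before parsing.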

How content gets in
1. SGML/XML from Publishers
2. Back Issue Scanning
3. NIH Manuscripts from Public Access
4. Books and other non-article content

Back Issue Digitization
Objective: Create a complete digital archive of PMC journals back to volume 1
Cover-to-cover digital copy of everything up to where the journal began producing electronic copy
  – (includes articles, covers, TOCs, advertisements and administrative matter)
Publisher gets free, unencumbered copy

Scanning Highlights
Currently in Production:
American Journal of Public Health from 1873
Biophysical Journal (1960)
Canadian Veterinary Medical Association Titles
Environmental Health Perspectives (1972)
Health Services Research (1966)
Public Health Reports (1896)
Transactions of the American Ophthalmological Society (1864)

Production Planning
Contractor's capacity at 5 facilities is approximately 250,000 to 300,000 pages per month
Production is currently planned through December 2006
Schedules are adjusted monthly based on deliveries
New shipments are sent from NLM every 6-8 weeks

Progress to Date
83: Journal titles included
5,606,059: Pages collected for processing
3,500,000: Page images delivered
297,000: XML citations created
470,000: Scanned articles in PMC
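
For a rough sense of scale, these figures can be set against the contractor capacity of 250,000 to 300,000 pages per month quoted under Production Planning. The sketch below assumes, which the slides do not state, that the remaining backlog is simply pages collected minus page images delivered; the variable names are illustrative.

```python
# Figures quoted on the slides
pages_collected = 5_606_059
page_images_delivered = 3_500_000
capacity_low, capacity_high = 250_000, 300_000  # pages per month

# Assumption: everything collected but not yet delivered still has to be scanned.
backlog = pages_collected - page_images_delivered

months_fast = backlog / capacity_high  # best case, at full capacity
months_slow = backlog / capacity_low   # worst case, at the lower bound

print(f"Backlog: {backlog:,} pages")
print(f"Estimated time to clear: {months_fast:.1f} to {months_slow:.1f} months")
```

The estimate ignores rework, fill-in pages and new shipments, so it is only an order-of-magnitude check.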

Digitized Samples

Wellcome Trust Collaboration
September: cooperative agreement signed
Expect to complete an additional 2.7 million pages
  – Biochemical Journal: largest archive to date, 350,000 pages scanned
  – Annals of Surgery
  – British Journal of Pharmacology
  – Journal of Physiology
  – Journal of Anatomy

Challenges To Date
Locating old, rare copies in good condition
Assessing donations for completeness (covers, TOCs)
Scanning and delivering fill-in pages at NLM
Feeding the pipeline
Quality Assurance (understanding requirements)
Information tracking
Every title is different

Related Activities
Citations for scanned articles are being phased into PubMed
Some completed archives delivered to their publishers
  – ASM titles (Highwire)
  – Plant Physiology (Highwire)
  – Biochemical Journal (Portland Press)

How content gets in
1. SGML/XML from Publishers
2. Back Issue Scanning
3. NIH Manuscripts from Public Access
4. Books and other non-article content

NIH Public Access Program
The National Institutes of Health (NIH) Policy on Enhancing Public Access to Archived Publications Resulting from NIH-Funded Research (Public Access Policy), which took effect on May 2, 2005, requests and strongly encourages all investigators to make their NIH-funded peer-reviewed, author's final manuscript available to other researchers and the public through the NIH National Library of Medicine's (NLM) PubMed Central (PMC) immediately after the final date of journal publication.
The NIH has developed a password-protected, Web-based NIH Manuscript Submission (NIHMS) system to implement the NIH Public Access Policy.
Links: PubMed Central; NIH Manuscript Submission

NIH Manuscript Submission System
Author deposits began May 2, 2005
Voluntary submissions by NIH-funded authors
Third-party deposits began in July 2005
August: just under 5% of qualifying authors are submitting manuscripts

NIH Public Access Policy - Status
As of February 10, 2006, two bills are in Congress that mandate participation in the NIHMS
Neither bill is expected to pass in this legislative session

NIH Manuscript Submission

How content gets in
1. SGML/XML from Publishers
2. Back Issue Scanning
3. NIH Manuscripts from Public Access
4. Books and other non-article content

Bookshelf
The books may be accessed in two ways:
(1) searched directly using any search term or phrase (in the same way as the bibliographic database PubMed);
(2) found through links to PubMed abstracts. Each PubMed abstract has a "Books" button that displays a facsimile of the abstract, in which some phrases are hypertext links.

NCBI Bookshelf
NLM prefers the content of the book to be supplied in SGML
The book text files are converted into XML according to the NCBI Book Document Type Definition (DTD)

PMC – Added Value
Separate, high-resolution images for all illustrations
Links from the PMC article to its bibliographic citation in PubMed
Links from the references in the bibliography to the citation in PubMed
Links from the original article to corrections and retractions and vice versa
Links to PubMed related articles
Links to PubMed articles by each author
Links to related resources, such as chemical compounds and protein sequences

PMC International
Collaboration between NLM, publishers in PMC and international partners
Portable PMC (pPMC)
Literature Archiving Software Suite
  – pPMC
  – NLM XML DTD Suite
  – NLM XML Authoring Tool
  – Portable NIHMS (pNIHMS)

PMC News – August 2006
Springer ‘Open Choice’ and Blackwell ‘Online Open’ articles now coming in to PMC
Also working with OUP ‘Oxford Open’
Detailed tagging guidelines released for NLM Journal Publishing DTD
Library of Congress and British Library are adopting NLM Journal DTD as a standard

Links
PubMed Central
NLM DTDs and documentation

Thank you! Contact: Martha Fishel

Costs
Costs vary from title to title. Factors include:
Quantity of color and/or grayscale images vs. straight black-and-white text
Quantity of new XML citations prepared
Errors in the deliverables (e.g., image quality, accuracy of XML, OCR)
Media (number of DVDs)

Costs – real example
Biochemical Journal (scanned): 287,000 pages
Total cost = $152,000 USD*
Approximately $0.53 per page
*Excludes project management costs at NLM and Wellcome/JISC
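
The per-page figure is the quoted total divided by the quoted page count. A short check, using only the two numbers on the slide (the variable names are illustrative):

```python
# Figures quoted for the Biochemical Journal scanning project
pages = 287_000
total_cost_usd = 152_000

cost_per_page = total_cost_usd / pages
print(f"${cost_per_page:.2f} per page")
```

As the slide's footnote says, this excludes project management costs at NLM and Wellcome/JISC, so the full per-page cost would be higher.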

Sample Pages