A Perspective Paul Price Dow Chemical Company

1 A Perspective Paul Price Dow Chemical Company

2 Publications are changing Leather-bound journals and dedicated libraries, the format of the scientific paper, weird abbreviations (Tox. & App. Pharm.) Recent on the need for packing materials Dump the filing cabinets - PDF/HTML replaces paper (free color!) Paper journals are evolving into curated web sites Upsetting the status quo – – No technical reason for not sharing detailed technical findings

3 Sharing data Ethical issues for not sharing – Privacy of individuals Economic reasons for not sharing – Intellectual property rights – Charging for access: the economics of journals and data owners – Academics: My career depends on mining my data on my schedule Internet-based expectations – I expect to see everything from home using my web browser

4 Social contracts Permission to sell is contingent on demonstrating safety Credence for findings is less contingent on peer review and more contingent on sharing relevant data Science that supports regulatory decisions needs to be in the sunlight

5 Parting thought When I share data I am asking the world “can someone do a better job than me in understanding the data?” When I withhold data I am saying “no one can do a better job than me in understanding the data.” Therefore journals should require the sharing of raw data as a condition of publication.

6 Data Access: Issues and Opportunities Alan F. Karr National Institute of Statistical Sciences February 13,

7 Points for Discussion The problem is hard – Players are responding rationally to incentives – Not “one size fits all” “The data” is ill-defined “Availability” is vague: what about – Cost – Liability – Tech support – Co-authorship – Data subjects Reproducibility (data + code) vs. replicability (data only?) There are effective mechanisms for access, based on statistical disclosure limitation

8 The Analysis Matters

9 Data Dissemination: High-Level View

10 Should Journals Require the Release of Supporting Data as a Condition of Publication? Jane C. Schroeder, DVM PhD Science Editor, Environmental Health Perspectives

11 No.

12 Why is access to raw data desirable? To advance scientific knowledge Is it a given that access to raw data will advance knowledge?

13 How would access advance knowledge? 1. Identify unintentional errors Data entry errors, transcribing, labeling Errors in coding, misconstrued variables Copy editing errors – Some can be identified by a careful review of reported results – Avoid via documentation, data management, internal review – Some would require truly raw data

14 How would access advance knowledge? 2. Identify scientific misconduct If the perpetrator is competent, unlikely to be evident If not competent, likely to be multiple cues – Plagiarism, inconsistent logic, incredible findings If access to raw data is the only way to prevent fraud, we are in trouble

15 How would access advance knowledge? 3. Identify “errors” in decision-making Such “errors” may represent legitimate differences – There is no single “best way” to analyze data However, decision-making should be completely transparent

16 How would access advance knowledge? 4. Reduce the time from data collection to full dissemination Investigators must be able to recoup their investment of time and effort – Lose jobs → no data for anyone Confidentiality, informed consent agreements

17 What should journals do? Careful & detailed reviews, including requests for code, data when appropriate Require complete methods – Rationale/criteria for decisions – Information on data management, QA/QC Require information to assess study quality – Missing data, participation, drop-out, numbers of observations

18 What should journals do? Require full reporting of all results used to support key analytic decisions and conclusions – Essential when interpretation is subjective or criteria are not widely accepted – Null findings as well as positive ones – Sensitivity analyses of assumptions, alternate approaches – Supplemental material, external archiving Review and update policies when it is in the best interest of science communication to do so

19 What should the community do? Discipline-appropriate standards for data management, QA/QC, and reporting Bona fide internal reviews before publication Support for costs of data sharing Encourage and reward analyses of combined data from multiple studies Avoid regulations that may ultimately impede scientific advancement by serving some members of the community at the expense of others

20 Introducing the Dryad Digital Repository Society of Toxicology webinar February 2013 Peggy Schaeffer

21 Many journals require data sharing upon request Psychology – Requested data from 141 articles – “6 months later, after … 400 e-mails, [sending] detailed descriptions of our study aims, approvals of our ethical committee, signed assurances not to share data with others, and even our full resumes…” data were obtained from 27% of articles. – Wicherts et al. (2006). Am. Psych. 61: Genetics – 47% of respondents had been denied a request for data or materials within 3 years – 28% were unable to confirm others’ published research as a result. – #1 reason for data withholding (80%): effort required to share it. – Campbell et al. (2002) JAMA (4):

22 Data archiving has many benefits Modified from Beagrie et al. (2009) Keeping Research Data Safe 2
Direct: Verification of published research; Preserving accessibility to data; Allowing reuse and repurposing of data; Discoverability of data
Indirect (costs avoided): Redundant data collection; Inefficient legacy data curation; Burden of sharing-upon-request; Opportunity cost of science not done
Near term: Protection against personnel turnover; Availability for review and validation
Long term: Secure long-term stewardship; Increased impact per publication
Private: Increased citations; New collaborations; New research opportunities; Fulfilling funding mandates
Public: More efficient use of research dollars; Public trust in science; Educational opportunities; Improved methodologies; More informed policy

23 Joint Data Archiving Policy [Journal] requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as [list of approved archives here]. Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. Authors may elect to have the data publicly available at time of publication, or, if the technology of the archive allows, may opt to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information such as human subject data or the location of endangered species.

24 Why use Dryad rather than Supplementary Online Materials (SOM)?
Discoverable: indexed and exposed to both web and bibliographic search engines. Dryad ✔, SOM ✗
Identifiable: DataCite DOIs within articles serve as permanent, resolvable identifiers. Dryad ✔, SOM ✗*
Permanent: processes in place to promote preservation (incl. format migration). Dryad ✔, SOM ✔ / ✗**
Curated: quality control by both automated processes and human inspection. Dryad ✔, SOM ✗*
Ease of deposit: streamlined deposit, allowance for large and complex datasets. Dryad ✔, SOM ✔ / ✗**
Formatted for reuse: support for non-PDF file formats. Dryad ✔, SOM ✔ / ✗**
Updatable: new versions of data files can be added, metadata can be enhanced. Dryad ✔, SOM ✗
Support for embargoes: can delay release of data in accordance with journal policy. Dryad ✔, SOM ✗
Free reuse: no paywall, clear terms of reuse (all data released under CC Zero). Dryad ✔, SOM ✔ / ✗**
Economy of scale: cost efficiency from shared infrastructure. Dryad ✔, SOM ✔ / ✗**
Alignment to organizational mission: focus on archiving and reuse of scientific data. Dryad ✔, SOM ✗
* A few publisher SOM sites are exceptions to the general rule.
** Practices differ among publishers, see Smit (2011), doi: /january2011-smit

25 Researchers are using Dryad for data archiving… As of 7 Feb 2013, Dryad contains 7306 data files associated with 2662 publications from 191 different journals

26 and using the data for research…

27 Over 25 integrated journals… and 20 more on the way

28 Trustworthy repository infrastructure
• Making data available is the primary mission of the organization
 – No pay-walls or restrictive licenses (all released under CCZero)
 – The same data may be hosted by other services (non-exclusivity)
• Built on the DSpace repository platform
 – An open source framework used by hundreds of institutional repositories
• Multiple machine and human interfaces for discovery and access
 – Dublin Core metadata harvestable through OAI-PMH
 – DOIs registered through DataCite
 – Curation-enhanced metadata to enhance keyword searching
 – Indexed by Web of Science and other bibliographic services
• Assurance of data integrity and permanent availability
 – Service mirroring and backup
 – File migration and bit-level integrity assurance
 – Organizational failover through DataONE and (soon) CLOCKSS
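As a rough illustration of what "Dublin Core metadata harvestable through OAI-PMH" means in practice, the sketch below builds a harvesting request URL. The `ListRecords` verb and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself. The base URL is a placeholder, not Dryad's actual endpoint, and the helper name `list_records_url` is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; not taken from the slides.
EXAMPLE_BASE_URL = "https://repository.example.org/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper).

    metadata_prefix="oai_dc" asks for unqualified Dublin Core records.
    from_date (YYYY-MM-DD) and set_spec are optional selective-harvest arguments.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Harvest every Dublin Core record changed since the date quoted on slide 25.
    print(list_records_url(EXAMPLE_BASE_URL, from_date="2013-02-07"))
```

A harvester would fetch this URL, parse the XML response, and repeat the request with the returned resumption token until the list is exhausted; that loop is omitted here.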

29 Governance
• Not-for-profit organization
 – Incorporated in North Carolina (USA)
• Membership is open to a diversity of stakeholder organizations
 – Scientific societies, publishers, funding agencies, universities, libraries, etc.
 – Members need not publish a partner journal
• Governed by a rotating 12-member Board of Directors, nominated and elected by the membership

30 Sustainability
• Long-term preservation requires an organization with a viable business model
 – Not dependent on the vagaries of grant funding
 – Or the largesse of an institution that may have other priorities
• Revenue will be primarily from deposit fees
 – This enables Dryad to make access to the data free in perpetuity
 – The time of deposit is when the majority of costs are incurred
 – Revenue scales with costs (i.e. volume of deposits)
 – The costs are distributed both fairly and widely
• Additional revenue
 – Membership fees ($1000/yr) will cover costs of annual Membership meetings
 – Project grants will supplement the operational budget for R&D activities
 – With research and development activities funded by grants at various institutions (e.g. Duke University, Univ. of North Carolina at Chapel Hill)

31 Payment plans
Plan: Subscription. Contract? yes. Paid by: Journal, society, or publisher, in advance. Non-member cost (1): Based on total annual volume of research, $30/article
Plan: Deferred payment. Contract? yes. Paid by: Journal or other sponsoring organization, invoiced periodically for prior deposits. Non-member cost (1): $75/data package (2)
Plan: Voucher. Contract? yes. Paid by: Journal or other sponsoring organization, paid in advance. Non-member cost (1): $70/data package
Plan: Pay on deposit. Contract? no. Paid by: Author, at time of deposit. Non-member cost (1): $80/data package, with a process for granting waivers for authors from less-developed countries
(1) Up to a fixed deposit size (currently 10GB). Additional charges for larger deposits.
(2) Data package = all the data associated with an article.
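To see how the four plans compare for one journal, the sketch below totals a year's cost under each, using only the non-member rates listed on the slide. It assumes the subscription is billed on every article published, while the other plans are billed per data package actually deposited. It ignores the surcharge above 10GB and any waivers. The function and plan names are hypothetical.

```python
# Non-member rates in US dollars, as listed on the payment plans slide.
SUBSCRIPTION_PER_ARTICLE = 30
PER_PACKAGE_RATES = {
    "deferred": 75,        # invoiced periodically for prior deposits
    "voucher": 70,         # paid in advance
    "pay_on_deposit": 80,  # paid by the author at time of deposit
}


def annual_costs(articles_published, packages_deposited):
    """Return the yearly cost under each plan (illustrative assumptions only).

    Assumes the subscription is charged on all published articles and the
    other plans are charged only for deposited data packages.
    """
    costs = {"subscription": SUBSCRIPTION_PER_ARTICLE * articles_published}
    for plan, rate in PER_PACKAGE_RATES.items():
        costs[plan] = rate * packages_deposited
    return costs


if __name__ == "__main__":
    # A hypothetical journal publishing 200 articles, 60 of which deposit data.
    for plan, cost in annual_costs(200, 60).items():
        print(f"{plan}: ${cost}")
```

Under these assumptions the subscription becomes cheaper than the voucher plan once more than 30/70 (about 43%) of a journal's articles deposit data.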

32 The value proposition
• For researchers, Dryad…
 – increases the impact of, and citations to, published research
 – preserves and makes available others’ data
 – frees researchers from the burden of data preservation and access
• For societies, journals, and publishers, Dryad…
 – offers more visibility for research outputs
 – promotes prestige for the discipline
 – supports a wide range of journal policies on data sharing
 – frees journals from the burden of maintaining supplemental data
• For libraries and institutions, Dryad…
 – makes data available at no cost, under clear terms of use
 – helps fulfill their research data management mandates
• For funders, Dryad…
 – provides a cost-effective mechanism to make research more accessible

33 To learn more Repository home: News: Project documentation: Facebook: contact us: Todd Vision, Project Director, Laura Wendell, Executive Director, Peggy Schaeffer, Communications Coordinator,
