ISTeC Data Management Forum #3 Pat Burns Dean of CSU Libraries & VP for IT Friday, May 2, 2014 05/02/2014 ISTeC DM Forum31.

Presentation transcript:

ISTeC Data Management Forum #3
Pat Burns, Dean of CSU Libraries & VP for IT
Friday, May 2, 2014

“A dangerous, foreboding, or deathlike influence or vapor.”

“We are drowning in information and starving for knowledge.” – Rutherford D. Rogers

• Understand needs for data management
• Understand support for data management
  ◦ Local IT infrastructure
  ◦ Central IT infrastructure: Globus
  ◦ Libraries’ Digital Repository, DM support
  ◦ Other, external
• Develop an institutional approach/strategy?

Data, Understanding, Knowledge, Information, Wisdom
IT Systems: ‘Doing’
Increasing access to data = Increasing research productivity? Innovation!

The 4th Way of Doing Science?
Analysis; Physical experiment; Numerical experiment; Big Data

Pubs & Data
• Agencies with > $100M research funding must require federally funded research to be made publicly available, including data sets

Faculty; Libraries; IT: Networks, Storage, Access Mgmt.
Research (Grants); Journal Publication w/ Data Sets; Journal Subscription; Journal Access

Working Data Sets
Scholarly Data Sets (linked to pubs) ✔
Locality

• Data sets, by themselves, are not “creative works,” and are therefore not copyrightable
  ◦ But publications and user’s manuals describing them are

“I’d sooner show someone my underwear, than share my data sets with them!”

• Unfunded mandate for us!
  ◦ Faculty
  ◦ RAs
  ◦ Students
  ◦ IT
  ◦ Libraries

NSF is SERIOUS About Data Sharing

Required as of Jan. 18, 2011
Sharing data will speed research and economic development!

• ARL Meeting Oct
  ◦ Myron Gutmann, Assistant Director, National Science Foundation, Directorate for Social, Behavioral & Economic Sciences

Proposal → Review → ‘Big’ Funding

Proposal → Review → ‘Small’ Funding → Produce Data Set → Measure Usage → If High: ‘Big’ Funding

Pubs w/ Data; Processed Data & Data Representations; Data Collections & Structured DBs; Raw Data & Datasets
1. Data within published articles
2. Supplementary files to articles
3. Data referenced from articles & held elsewhere
4. Published datasets
5. Data on local disks

• Discoverable
  ◦ Crawled by Google and others
• Accessible
  ◦ Robust infrastructure, accessibility
• Organized/searchable
  ◦ Metadata
  ◦ Catalogued
• Persistent
  ◦ Persistent Digital Object Identifier (DOI)
  ◦ Preserved and transcoded as needed
> 25% Failure Rate of URLs!!
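The persistence point can be illustrated with a short sketch: a DOI is cited through a resolver rather than through the file's current address, so the link survives when the file moves. The example below assumes the public resolver at https://doi.org/; the DOI shown in the usage note is a made-up placeholder, and `doi_to_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Public DOI resolver; the identifier itself never changes, only its target.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a DOI string into a resolver link suitable for a citation."""
    doi = doi.strip()
    # Accept common prefixes people paste in front of the identifier.
    for prefix in ("doi:", "https://doi.org/", "http://dx.doi.org/"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI starts with the "10." directory code and has a suffix after "/".
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path; keep "/".
    return DOI_RESOLVER + quote(doi, safe="/")
```

For example, `doi_to_url("doi:10.1234/example.dataset")` yields a link under `https://doi.org/` that stays the same even if the data set is later moved to another repository.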

1. Federal agency systems: NIH PubMed Central
  ◦ NASA only other federal agency deploying a DR?
2. Disciplinary repositories
3. Local repositories: CSU’s DigiTool
  ◦ Data Management Plan templates
  ◦ Discoverability, accessibility, & preservation
  ◦ Usage stats
  ◦ Linked to pubs
4. Local file share files from a data store
  ◦ Granular access control via a common infrastructure, e.g. Globus

1. Where do I put my scholarly data?
  ◦ Wherever it’s free!
    • Locality? Can I move it around?
  ◦ Repository: persistence/longevity
  ◦ Globus-enabled file share
2. How do I expose my data?
  ◦ PubMed Central, NASA repository
  ◦ Disciplinary repository
  ◦ Local repository (metadata)

• Association of Research Libraries (ARL) SHARE initiative: local repositories
  ◦ “Shared Access Research Ecosystem”
  ◦ For preservation and access of scholarly works and data sets
  ◦ Not for storage of working files
• CSU’s DigiTool Digital Repository (DR)

Remote: HPC Centers, Disciplinary Repositories; NASA?; Internet
Local: CSU ACNS Globus File Store; DigiTool DR Metadata
Globus User Access Control

• If files stored in (any) digital repository
  ◦ ~OK for metadata and discoverability
• If files stored elsewhere
  ◦ Metadata should be stored on a digital repository for discoverability, accessibility, citability, and persistence; and “point” to the data
• If files stored locally
  ◦ Can use Globus to manage access granularly
  ◦ What about backup and preservation?

DigiTool storage diagram: Main Site, DR Site, Local RAID, Off-campus RAID, Preservation RAID (shown twice), Internet
2X + 2X + 2X, Total = 6X (X = Storage Size)
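The 6X total on this slide is simple multiplication, sketched below. This assumes the total comes from three allocations of 2X each, which is one reading of the diagram; the function name and parameters are illustrative only.

```python
def raw_storage_tb(dataset_tb: float,
                   allocations: int = 3,
                   factor_per_allocation: float = 2.0) -> float:
    """Raw capacity needed to hold `dataset_tb` of repository content.

    Assumption: three allocations, each provisioned at 2X the data size,
    which reproduces the slide's "Total = 6X".
    """
    if dataset_tb < 0:
        raise ValueError("dataset size must be non-negative")
    return dataset_tb * allocations * factor_per_allocation
```

Under these assumptions, 10 TB of scholarly data sets would call for 60 TB of raw capacity, which is why the capital cost of a preserved repository grows much faster than the data itself.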

• NSF likes:
  ◦ Managed jointly by Librarians and IT Professionals
  ◦ Connected to ultrahigh-speed research network
  ◦ The best metadata, standards,…
  ◦ Discoverability (crawled by all services), accessibility
  ◦ For preservation of scholarly data and pubs
• Limitations
  ◦ Very limited access control (NSF likes this too)
  ◦ Does not support structured data (e.g. databases)
  ◦ Not for working data sets

1. Cultural change
2. More work – ugh!
3. Greater capital cost
  ◦ The cost of additional storage required could be extreme

• It depends!

• SSDs are required for performance, but expensive
• Amazon announced ~80% cost reduction in storage: $10/TB-mo!
  ◦ We are exploring!!!

Big Data; Us

• If “incidental,” can be included as a direct cost in the grant
• Some storage for scholarly data sets “free” to CSU Faculty
  ◦ First 100 GByte free, thereafter $4k/TB
• Storage of pubs+ always “free” to CSU faculty and students
  ◦ Assumes no “big” embedded data
• Exploring Amazon as an option
Game Changer!
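The pricing rule for scholarly data sets (first 100 GByte free, thereafter $4k/TB) can be worked through as below. This is a sketch under stated assumptions: the slide does not say whether the charge is one-time or recurring, whether it is prorated, or which TB definition applies, so the code assumes linear proration and 1 TB = 1000 GB. All names are hypothetical.

```python
FREE_GB = 100          # from the slide: first 100 GByte free
RATE_PER_TB = 4000.0   # from the slide: $4k/TB thereafter
GB_PER_TB = 1000       # assumption: decimal terabytes


def scholarly_storage_cost(size_gb: float) -> float:
    """Charge in dollars for storing a scholarly data set of `size_gb`.

    Assumption: only the portion above the free allowance is billed,
    prorated linearly by size.
    """
    if size_gb < 0:
        raise ValueError("size must be non-negative")
    billable_gb = max(0.0, size_gb - FREE_GB)
    return billable_gb / GB_PER_TB * RATE_PER_TB
```

With these assumptions a 600 GByte data set has 500 GByte billable and costs $2,000, while anything at or under 100 GByte costs nothing.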

• For a single file
  ◦ Small: 1 GByte
  ◦ Medium: GBytes
  ◦ Big: >100 GBytes
• For a set of files
  ◦ 100X that of a single file?
• How many files is ‘big?’
1 full rack stores ~350 TBytes usable and costs ~$150k (~$1,000/TB)

ACNS Storage; HPC & Other Devices; Local Storage

Network diagram (from CC-NIE Panel, 04/08/14): BiSON and FRGP research networks; Internet; Border 1 and Border 2; Core 1 and Core 2; Production LAN; Campus 100 Gig Core Routing Cluster; DYNES Server & Storage, FDT; “Science DMZ”; Commodity Users & Researchers (typ.); Dynamic VLANs; 3 ea. 10 Gbps Wave; DYNES IDC; 10 Gbps Research Connections (typ.); ~40 or 100 Gig?
Status markers: 1. ✔ 2. ✔ 3. ✔ 4. ✖ 5. ~

• Later in this forum, you will hear more from our consummate experts

• ORCID – the Open Researcher and Contributor ID
• A unique numeric identifier to “Connect Research and Researchers”
• Makes publication data gathering much easier
• Complements the DOI
• See
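An ORCID iD is 16 characters, usually written in four hyphenated groups, and its last character is a check character computed with the ISO 7064 MOD 11-2 scheme. The sketch below shows how a local system (for example, a repository's metadata form) might catch mistyped iDs before storing them; the function names are hypothetical and this is not an official client.

```python
def orcid_check_character(first_15_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for an ORCID iD."""
    total = 0
    for ch in first_15_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check length, digits, and check character of a hyphenated ORCID iD."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15]
```

A check like this only confirms that the iD is well formed; it does not confirm that the iD has been registered to a particular researcher.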

Make Work as Easy as Possible
“So much of what we call management consists of making it difficult for people to work.” – Peter Drucker

• Do we need a standing Data Management Committee?

Move in any direction, as long as it’s somewhat positive!

Sometimes it is better to journey hopefully, than to arrive!!!