ARL/CNI e-Science Fall Forum 2008 Data Curation Panel Pam Bjornson, Director General October 15, 2008
Introduction
Some common elements from yesterday’s presentations:
–E-science is a new way of working for scientists that requires new organizational support
–There is a series of layers, from the individual scientist or research team layer, through e-databases, archives and collections, to the tools and services layer, to the cyberinfrastructure layer of supercomputers, HPC and national networks
–Beyond collaboration to engagement with scientists and researchers: a new way of working for libraries across and within institutions
All of the above = Disruptive change? Transition? Is this an adoption of our earlier practices or real disruptive change?
Two Canadian Data Initiatives
Research Data Strategy Working Group
–Multiple agencies, cross-disciplinary
–Policy level
Ontario Council of University Libraries (OCUL)
–Data documentation project building on successful Scholars’ Portal initiative
Research Data Strategy Working Group
About the Research Data Strategy WG
What it is
A collaborative effort to address the challenges and issues surrounding access to and preservation of data arising from Canadian research.
Who it is
Multi-disciplinary group with representation from university research libraries and CIOs, national institutions, federal granting agencies, federal research institutes, and individual researchers, e.g. CARL, CUCCIO, LAC, NRC-CISTI, CANARIE, NSERC, SSHRC, CIHR, CFI, CODATA Canada
Activities to Date
Stewardship of Research Data in Canada: Gap Analysis
–Analysis of current state versus ideal state via 10 indicators
–Identification of gaps
–Final Report to be posted shortly
Three Task Groups formed:
Policies, Funding and Research
–Team Lead - Walter Stewart, CANARIE
Infrastructure and Services
–Team Lead - Chuck Humphrey, U. Alberta Data Centre
Capacity (Skills, Training, Rewards System)
–Team Lead - Margaret Haines, Carleton University Librarian
RDS Canada Current State - Gaps
Indicator                             Gap level
Policies                              Moderate
Funding                               Large
Roles and responsibilities            Large
[Trusted digital] data repositories   Moderate
Standards                             Moderate
Skills and training                   Large
Reward and recognition systems        Large
Research and Development              Moderate
Access                                Moderate
Preservation                          Large
Ontario Data Documentation, Extraction Service and Infrastructure
Background
An intuitive data portal for researchers, teachers and students; inspiring, developing and supporting research excellence.
–includes Statistics Canada and public opinion poll data
–jointly funded project between the Ontario Council of University Libraries (OCUL) and OntarioBuys (BPS Supply Chain Secretariat, Ontario Ministry of Finance)
–is a centralised, standardised web-based data exploration/extraction system
–delivered through the OCUL Scholars Portal
–only provincially available tool that allows the user to search multiple datasets at the variable level
Next steps
–seek new datasets
–create a suite of tutorials and other training materials
–investigate providing access to the wider education sector
–work toward creating a national co-ordination committee for DDI projects in Canada
–investigate use as a depository for Ontario research data
–explore links with CARL and CISTI with the aim of creating a national data archive
–explore international links, e.g. CESSDA, IFDO
Comments/Questions
Engagement with researchers and their workflow
–Right now, many science projects that collect data are curating their own data, without the help of the library (e.g. astronomy and some big science projects are well advanced in data management). In fact, not all are convinced that the library, or librarians, have a role to play in helping them manage data.
–We have heard about the skills gap: technical, team and even partnership (heedful interaction)
–My own institution, the National Research Council of Canada, is very diverse, as are your own institutions. We are planning a project to assess needs, but it is in early stages at present. The Data Audit noted by Liz Lyon is a great tool.
Comments/Questions - Resources
Universities and research organizations, and the agencies that fund them, have to consider the usual questions when we direct resources (people, operational $ and attention) to new activities:
–Will new funding be found for this activity, and if so from where?
–What will we stop doing or do less of?
–How will we re-allocate funds to support this rather than that?
At the granting agency level, there is concern that funding data could erode support available to new research. Or there is project money for start-up or transformational projects but no ongoing resources for sustainability.
Engagement: we have to define the problem of data stewardship as compelling and urgent (risks and opportunities), with economic and competitive consequences
Comments/Questions
Models and Issue of Coherence/Linkages
There are very different models (note examples) for data stewardship emerging in different countries and in different disciplines:
–Distributed versus centralized
–Disciplinary and international versus institutional and local (e.g. European Bioinformatics Institute, MARS in meteorology)
–Major national funding to non-existent
Question: Can you speak about some of the advantages and disadvantages of these models, and whether you see some as positive or others perhaps impeding the type of collaboration that seems to be demanded in order to realize the full value of access to data at web scale?
Comments/Questions - Policy Issues
Funding agency policies: will they begin to include and mandate access to data as well as to published literature?
Some data retention is already regulated and mandated by granting agencies, but there is not always the capacity to actually conform to the policy.
What is the readiness of our institutions to handle that if it were to happen?
Long term roles – looking forward
There is a lifecycle to data creation, curation, dissemination and preservation. Is there also a lifecycle to what we are experiencing now as we work with researchers to integrate data into the arena of accessible knowledge?
For example, data curation is labour-intensive, team-based (domain, computer, information skills), and particular to domains. Are we in an early “pioneer” stage for data? Will data curation evolve to be web-scaled, accomplished through network-enabled protocols and standards? Or will there always be a continuum?
Comments/Questions
Themes emerging from papers and recent reports:
In the flow – question of how libraries co-evolve with user behaviours, researcher workflows and also tie in to the network environment and networked flows.
Partners in research
- questions around resources, skills, long term commitment
In the cloud – if libraries use web-scaled tools and protocols, how much data curation would be simplified, how much work could be done collaboratively, in a federated way, collectively?
- questions around coherence and linkages
- whether that would be a self-organizing model that will emerge or one that will be guided/created?
Role of private sector versus public sector?
URLs: CARL Research Data Strategy Working Group http://data-donnees.gc.ca/eng/index.html ODESI CISTI