Unified Digital Format Registry (UDFR) Stakeholder Meeting Library of Congress Washington, DC April 13, 14, 2011.



1 Unified Digital Format Registry (UDFR) Stakeholder Meeting Library of Congress Washington, DC April 13, 14, 2011

2 Welcome!
Stephen Abrams, Associate director
Lisa Colvin, UDFR project manager
Alex Genadinik, UDFR project developer
University of California Curation Center
Bibliothèque nationale de France
Data Conservancy / Johns Hopkins U
DataONE / UC Santa Barbara
Deutsche Nationalbibliothek
Ex Libris
Family Search
Florida Center for Library Automation
GDFR / Harvard University
Georgia Institute of Technology
Government Printing Office [US]
Koninklijke Bibliotheek
Library of Congress
Los Alamos National Laboratory
National Archives [UK]
National Archives [US]
National Library of New Zealand
New York University
Open Planets F / Nationaal Archief
Tessella
University of Pennsylvania
Virginia Institute of Technology

3 Objectives
The desired outcomes of this stakeholder meeting are:
Agreement on the scoping of functional and non-functional requirements
Agreement on the data modeling process and ontology
Agreement on key technology decisions
Agreement on project plan and schedule
Groundwork for the administrative and technical continuity of UDFR as an ongoing service

4 Key questions
What subset (or superset) of PRONOM and GDFR functionality and data modeling should be supported?
Is there a useful distinction between format “facts” and “policies”?
What are the criteria for contributor eligibility?
To what level of technical review should/will contributed information be subject, and by whom?
Are new contributions immediately visible in an unreviewed state?
What is the appropriate granularity of provenance and review?
Should UDFR identifiers be transparent or opaque?
Should UDFR support static or dynamic inheritance of properties?
Must there be an explicit grant of license by content contributors?
What is the proper replication model: master/slave(s) or peer-to-peer?
Should UDFR support classes of information that are not replicated?
What are the criteria for node eligibility?
What is the ongoing relationship between PRONOM and UDFR?

5 Agenda
Time            Topic
09:00 – 09:20   Welcome and introductions
09:20 – 09:30   Review of objectives and agenda
09:30 – 10:00   Project background
10:00 – 10:30   Use cases and functional requirements
10:30 – 11:00   Break
11:00 – 11:30   Functional requirements (continued)
11:30 – 12:30   Data modeling and ontology
12:30 – 13:30   Lunch
13:30 – 14:30   Data modeling and ontology (continued)
14:30 – 15:00   Technical architecture
15:00 – 15:30   Break
15:30 – 16:30   Technical platform decisions
16:30 – 17:00   Questions and discussion
17:00           Adjourn

6 Agenda
Time            Topic
09:00 – 09:30   Project schedule
09:30 – 10:15   Initial population of UDFR
10:15 – 10:45   Community building
10:45 – 11:15   Break
11:15 – 12:30   Community building (continued)
12:30 – 13:00   Follow-up planning
17:00           Adjourn

7 Project background
Why worry about formats?
Information preservation
Bit preservation
Since formatted digital assets are inherently mediated by technology, they are particularly susceptible to disruptive technological change.
Format: a set of syntactic and semantic rules for mapping between an information model and a serialized bit stream.

8 Project background PRONOM http://www.nationalarchives.gov.uk/PRONOM/Default.aspx Global Digital Format Registry (GDFR) http://www.gdfr.info/ Unified Digital Format Registry (UDFR) http://www.udfr.org/ – “The Unified Digital Format Registry (UDFR) will provide a reliable, sustainable and publicly accessible knowledge base of file format information” – Fully open source implementation that “unifies” the function and data holdings of PRONOM and GDFR

9 UDFR project 1 year, 2+ FTE, funded by the Library of Congress Features – Use cases and functional requirements developed by the stakeholder community over the past two years – Support for linked data and semantic web – Support for a distributed network of independent but interoperable UDFR nodes Deliverables – Working, documented, single-node registry system, initially populated with an export from PRONOM, GDFR, and other appropriate sources – BSD license

10 Community building How can we ensure the administrative and technical continuity of the UDFR once the LC-funded work is completed? Policy and strategic planning Operation of the initial registry node Recruitment of additional nodes Technical maintenance and enhancement of the code base Content contribution Review of contributed information

11 Policy and strategic planning What is the lightest weight governance structure that is effective? Continue as an ad hoc group or develop a more formal organization? Operate as loose consortium under an MOU Look for an administrative umbrella under an existing organization

12 Operational considerations CDL is prepared to provide an operational home for the initial production node on an interim basis Any long-term commitment may require some (minimal) level of cost recovery Additional replication nodes Eligibility requirements? Minimal/maximal number desired?

13 Technical maintenance and enhancement Manage source code in a public code repository Enhancement planning and prioritization – Call for community-wide evaluation at 6/12 months of production operation Eligibility for contributors? Committers?

14 Content contribution Contributor eligibility – Are contributors recruited or self-selected? What can we do to encourage contribution? – Engagement by institution and discipline

15 Technical review Reviewer eligibility – Are reviewers recruited or self-nominated? Single or multiple levels of scrutiny? Standard criteria for evaluation – What is the appropriate level of due diligence?

16 Follow-up planning Next steps Ongoing project work with early prototype releases Production release (single node) in January 2012 Governance, policy, and planning structure Solicitation of replication nodes Solicitation of content contribution 6/12 month evaluation

17 Key questions … answered?!
What subset (or superset) of PRONOM and GDFR functionality and data modeling should be supported?
Is there a useful distinction between format “facts” and “policies”?
– Priority for “facts”; support for “policies” as time permits.
What are the criteria for contributor eligibility?
– No criteria, but user account required (i.e. no anonymous contribution).
To what level of technical review should/will contributed information be subject, and by whom? Are new contributions immediately visible in an unreviewed state?
– Opportunity (but not a requirement) for review. Strong provenance will be maintained, as well as explicit tagging indicating the level of review.
What is the appropriate granularity of provenance and review?
– Individual assertion.
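The answers on this slide (account required, contributions visible before review, provenance and review tagging per individual assertion) can be illustrated with a minimal sketch. Everything named here is hypothetical: the `Assertion`, `ReviewLevel`, and `Registry` names, the two review levels, and the field layout are assumptions made for illustration, not the UDFR data model.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from enum import Enum


class ReviewLevel(Enum):
    # Hypothetical tags; the slide only says the level of review is tagged explicitly.
    UNREVIEWED = "unreviewed"
    REVIEWED = "reviewed"


@dataclass(frozen=True)
class Assertion:
    """One statement about a format, carrying its own provenance and review tag."""
    subject: str            # identifier of the format being described
    predicate: str          # property name (illustrative)
    value: str              # asserted value
    contributor: str        # account that made the assertion
    asserted_at: datetime   # when the assertion was contributed
    review: ReviewLevel = ReviewLevel.UNREVIEWED


class Registry:
    def __init__(self):
        self.accounts = set()
        self.assertions = []

    def register(self, user):
        self.accounts.add(user)

    def contribute(self, user, subject, predicate, value):
        # No eligibility criteria beyond holding an account: anonymous input is rejected.
        if user not in self.accounts:
            raise PermissionError("anonymous contribution is not allowed")
        assertion = Assertion(subject, predicate, value, user,
                              datetime.now(timezone.utc))
        # Stored (and therefore visible) immediately, still tagged as unreviewed.
        self.assertions.append(assertion)
        return assertion

    def mark_reviewed(self, assertion):
        # Review is optional and applies to a single assertion, not a whole record.
        index = self.assertions.index(assertion)
        reviewed = replace(assertion, review=ReviewLevel.REVIEWED)
        self.assertions[index] = reviewed
        return reviewed
```

Keeping the contributor, timestamp, and review tag on each assertion, rather than on a whole format record, is what allows two statements about the same format to come from different people and sit at different review levels.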

18 Key questions … answered?!
Should UDFR identifiers be transparent or opaque?
– Opaque, and without a node identifier component (to avoid the co-reference problem).
Should UDFR support static or dynamic inheritance of properties?
– Not clear if inheritance is a feature of the model, the query system, or the UI.
Must there be an explicit grant of license by content contributors?
– Yes, ideally using CC0.
What is the proper replication model: master/slave(s) or peer-to-peer?
– Master/slave(s), but replication is not the highest immediate priority. However, nothing in the design or implementation of the registry should preclude adding support for replication in the future.
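One way to picture an opaque identifier with no node component is a randomly generated value. The sketch below is an assumption for illustration only: the slides do not say how UDFR mints identifiers, and `mint_identifier` is a hypothetical helper.

```python
import uuid


def mint_identifier() -> str:
    """Return an opaque identifier (hypothetical scheme, not UDFR's).

    uuid4() is purely random, so the value encodes neither the format's
    name nor the node that created it. uuid1() would be a poor fit here
    because it embeds a host identifier and a timestamp.
    """
    return uuid.uuid4().hex


first = mint_identifier()
second = mint_identifier()
print(first != second)  # independent calls yield different identifiers
```

Because nothing in the string names a node, a record copied to another node can keep the same identifier instead of acquiring a second, node-specific one that would then have to be reconciled with the first.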

19 Key questions … answered?!
Should UDFR support classes of information that are not replicated?
– Need to deal gracefully with legally encumbered information. In a master/slave configuration, data entered at a slave node would remain local.
What are the criteria for node eligibility?
– With no consensus on the immediate need for replication, this question does not require an immediate answer. Some identified criteria include: geographic dispersion and high-availability operation.
What is the ongoing relationship between PRONOM and UDFR?
– Continued close consultation and collaboration.
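The replication answer above (records flow from the master outward, while data entered at a slave node stays local) can be sketched as follows. This is a toy in-memory model under stated assumptions: `MasterNode`, `ReplicaNode`, and their methods are hypothetical names, and the rule that a local record shadows a replicated one on lookup is a choice made for the example, not something the slides specify.

```python
class ReplicaNode:
    """A slave node: holds a copy of the master's records plus local-only data."""

    def __init__(self, name):
        self.name = name
        self.replicated = {}  # records received from the master
        self.local = {}       # records entered here; never sent anywhere

    def add_local(self, key, record):
        # For example, legally encumbered information that must not be replicated.
        self.local[key] = record

    def lookup(self, key):
        # Assumption for this sketch: a local record shadows a replicated one.
        if key in self.local:
            return self.local[key]
        return self.replicated.get(key)


class MasterNode:
    """The master: the only source of replicated records."""

    def __init__(self, name):
        self.name = name
        self.records = {}
        self.replicas = []

    def attach(self, replica):
        # A newly attached replica first receives everything published so far.
        self.replicas.append(replica)
        replica.replicated.update(self.records)

    def publish(self, key, record):
        # Replication is one-way: master to every attached replica.
        self.records[key] = record
        for replica in self.replicas:
            replica.replicated[key] = record


master = MasterNode("master")
node_a, node_b = ReplicaNode("a"), ReplicaNode("b")
master.attach(node_a)
master.publish("fmt-1", {"name": "TIFF"})
master.attach(node_b)  # catches up on attach
node_a.add_local("fmt-x", {"name": "restricted format"})
print(node_b.lookup("fmt-1"), node_b.lookup("fmt-x"))
```

In this model `fmt-x` is visible only at `node_a`; neither the master nor `node_b` ever sees it, which is the behavior the slide describes for data entered at a slave node.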

20 Thank you! http://www.udfr.org/ Safe travels!

