National Archival Authorities Infrastructure Social Networks and Archival Context & National Archival Authorities Cooperative.


SNAC
– National Endowment for the Humanities
– Preservation and Access, Research and Development grant
– Mellon Foundation

Daniel V. Pitti § Institute for Advanced Technology in the Humanities § University of Virginia

Project Team
Daniel Pitti (PI) and Worthy Martin (Institute for Advanced Technology in the Humanities, University of Virginia)
Adrian Turner and Brian Tingle (California Digital Library, University of California)
Ray Larson (School of Information, University of California, Berkeley)

Project Objectives
Archival finding aids currently intermix description of records with description of the creators of records and persons evident in the records
Further the ongoing process of transforming archival description using advanced technologies
By facilitating the separation of the description of people from the description of records
Using EAC-CPF, an international archival authority control standard
Goal: enhance the economy and effectiveness of archival description to improve access to and understanding of archival resources

Rationale for Separation
Authority control of forms of names
Flexible description
Integrated access to cultural heritage
Biographical/historical resource
Social/historical context (social-professional networks)
Cooperative authority control (more later)

The Data
EAD-encoded finding aids
  – Library of Congress (1,546)
  – Online Archive of California (~15,400)
  – Northwest Digital Archive (5,160)
  – Virginia Heritage (8,390)
Authority records
  – Library of Congress: NACO/LCNAF (3.8M personal names; 900K corporate names)
  – Getty Vocabulary Program: Union List of Artist Names (293K personal and corporate names)
  – Virtual International Authority File (16M+ personal names, corporate, uniform titles, jurisdictions)

Methods and Processing
Extract EAC-CPF records from existing EAD-encoded archival descriptions
  – Extracting both creators and referenced CPF names
Match EAC-CPF records against one another and against existing authority records (ULAN, VIAF, LCNAF); merge records for the same entity
  – Enhance EAC-CPF by normalizing entries, adding alternative entries, titles (VIAF), and historical data (ULAN)
  – Key challenge: two or more people with the same name; two or more names for the same person
Create a prototype historical resource and access system
  – Historical data and social-professional networks
  – Links to archive, library, and museum resources (by and about)
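The extraction step above can be sketched in a few lines of Python. This is only an illustration, not the project's actual pipeline: the sample fragment is hypothetical and heavily simplified (no namespace, no dates), and the function name `extract_names` and the record layout are assumptions. It relies on the EAD elements `<persname>`, `<corpname>`, and `<famname>`, and treats names under `<origination>` as creators and all other tagged names as referenced entities.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified EAD fragment for illustration only.
SAMPLE_EAD = """
<ead>
  <archdesc>
    <did>
      <origination><persname>Oppenheimer, J. Robert</persname></origination>
    </did>
    <controlaccess>
      <persname>Bethe, Hans Albrecht</persname>
      <persname>Born, Max</persname>
      <corpname>Institute for Advanced Study (Princeton, N.J.)</corpname>
      <corpname>Los Alamos Scientific Laboratory</corpname>
    </controlaccess>
  </archdesc>
</ead>
"""

# EAD name elements mapped to the three EAC-CPF entity types.
NAME_TAGS = {"persname": "person", "corpname": "corporateBody", "famname": "family"}


def extract_names(ead_xml):
    """Return one dict per tagged name, in document order."""
    root = ET.fromstring(ead_xml)
    # Remember which name elements sit inside <origination> (the creators).
    creator_ids = {
        id(el)
        for orig in root.iter("origination")
        for el in orig.iter()
        if el.tag in NAME_TAGS
    }
    records = []
    for el in root.iter():
        if el.tag in NAME_TAGS:
            # Collapse internal whitespace in the element's text content.
            name = " ".join("".join(el.itertext()).split())
            role = "creator" if id(el) in creator_ids else "referenced"
            records.append({"type": NAME_TAGS[el.tag], "name": name, "role": role})
    return records


records = extract_names(SAMPLE_EAD)
for record in records:
    print(record["role"], record["type"], record["name"])
```

Each resulting record would then be the seed for one EAC-CPF record before matching and merging.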

EAD Source Data
Encoded Archival Description
  – Intermixes description of creators of records and, at the discretion of the archivists, names associated with the content of the records
  – Detailed description of creators of records
Widely varying quality
  – In the number of names identified and encoded
  – In the formation of the names (direct or inverted, capitalization, punctuation, and so on)
  – In the categorization of names (personal, corporate, or family)
Many names given but not identified as such
The most important of these occur in biographies/histories and in correspondence description
Extraction has focused on the “low hanging fruit,” that is, the names tagged as names
Attention is shifting to names not identified as such
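To make the variation in name formation concrete, here is a minimal sketch of a comparison key that ignores word order (direct or inverted), capitalization, punctuation, and diacritics. The function `match_key` is a hypothetical helper, not the normalization SNAC actually uses.

```python
import re
import unicodedata


def match_key(name):
    """Reduce a name string to a crude, order-insensitive comparison key."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase and replace punctuation with spaces.
    cleaned = re.sub(r"[^\w\s]", " ", unaccented.lower())
    # Sort tokens so "Surname, Forename" and "Forename Surname" agree.
    return " ".join(sorted(cleaned.split()))


print(match_key("Oppenheimer, J. Robert"))
print(match_key("J. Robert OPPENHEIMER"))
```

A key this coarse will also merge distinct people who share a name, so it can only propose candidate matches; dates, occupations, and other context are still needed to confirm them.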

Archival Records
Records are the by-products of people living and working as individuals, in organized groups, in families
Records document people living and working
People exist in social-professional contexts, in relation to others
Records document these relations
All records created by the same entity are described together (a fonds or collection)
  – Creators documented in detail
  – Many of the people documented in the record referenced in description
Archival descriptions document interrelations among people and records (documents)

Source: J. Robert Oppenheimer Papers (LoC)
Oppenheimer, J. Robert,
Oppenheimer, J. Robert,
Bethe, Hans Albrecht, Correspondence
Born, Max, Correspondence
Boyd, Julian P. (Julian Parks), Correspondence
Bush, Vannevar, Correspondence
Casals, Pablo, Correspondence
Institute for Advanced Study (Princeton, N.J.)
Los Alamos Scientific Laboratory

Source: Leonard Bernstein Collection (LoC)
1 Aaltonen, Erkki
Abbado, Claudio
[…]

Biographical Sketch
José Marcos Mugarrieta, prior to his term as Mexican consul in San Francisco , served in the Mexican army from He saw action in numerous battles and campaigns – Jamaica, under General Canalizo in 1841; Campeche, ; Merida, 1843; Veracruz, 1845; Mexico City, 1846; Angostura and Cerro-gordo, 1847; Guanajuato, 1848, and Sierra-Gorda under Bustamante, ; and Matamoros, […]
In April 1857 Mugarrieta received an appointment from the Comonfort government for the consulship in San Francisco. He did not actually begin his new duties until September 1, 1859, due to illness and to the political situation in Mexico. […]

Chronology
1900 Born on Jan. 20 in Hastings, Minnesota
Received baccalaureate from Princeton University, major in philosophy.
[…]
1965 Died on April 4.

EAC-CPF
Encoded Archival Context-Corporate bodies, Persons, Families
An international communication standard for archival authority control
Based on the International Council on Archives' International Standard Archival Authority Record for Corporate Bodies, Persons and Families (ISAAR(CPF))
SAA Standards Committee, Technical Subcommittee on Encoded Archival Context
Co-chairs
  – Katherine Wisser, Simmons College
  – Anila Angjeli, Bibliothèque nationale de France

Library and Archive Authority Control
Library (or bibliographic) authority control is almost exclusively about the control of names
Archival authority control involves biographical-historical description of the CPF entity
  – Descriptions based on controlled vocabularies or values, for example, occupations, place of birth and death
  – But also biographical-historical description
      Prose
      Chronological list
Archival authority control provides context for understanding records, the context of their creation, the provenance

person
Oppenheimer, J. Robert, AACR2
Oppenheimer, J. Robert (Julius Robert), VIAF
Oppenheimer, Julius Robert, VIAF
Oppenheimer, Robert VIAF
Ou-pẽn-hai-mo, VIAF

1904, Apr , Feb. 18
Science--Societies, etc.
Male
Physicists.

1904, Apr. 22 New York, N.Y. Born, New York, N.Y
Los Alamos, N. Mex. Director, Los Alamos Scientific Laboratory, Los Alamos, N. Mex
(1) Denied security clearance […] (2) Published Science and the Common Understanding […]
1967, Feb. 18 Princeton, N.J. Died, Princeton, N.J.

<cpfRelation xmlns:xlink=" xlink:type="simple" xlink:role=" xlink:arcrole="correspondedWith"> Bush, Vannevar, recordId: DLC.ms r007

<resourceRelation xmlns:xlink=" xlink:arcrole="creatorOf" xlink:role="archivalRecords" xlink:type="simple" xlink:href="
J. Robert Oppenheimer Papers, (bulk )
Papers (bulk )
MSS35188
Oppenheimer, J. Robert,
Manuscript Division. Library of Congress
Physicist and director of the Institute for Advanced Study, Princeton, New Jersey. [...] Topics include theoretical physics, development of the atomic bomb, the relationship between government and science, nuclear energy, security, and national loyalty.
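The two relation fragments above lost most of their markup in transcription. As a rough illustration of how such relations could be read back out of an EAC-CPF record, the sketch below parses a small hand-written sample. The sample record is hypothetical and omits the elided values (role URIs, href, dates, record identifiers); the EAC-CPF namespace URI and the `<relations>`/`<relationEntry>` structure reflect my understanding of the 2010 schema and should be checked against it. `list_relations` is an illustrative name.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
EAC = "urn:isbn:1-931666-33-4"  # assumed EAC-CPF (2010 schema) namespace

# Hypothetical minimal record; elided attribute values are left out.
SAMPLE = f"""
<eac-cpf xmlns="{EAC}" xmlns:xlink="{XLINK}">
  <cpfDescription>
    <relations>
      <cpfRelation xlink:type="simple" xlink:arcrole="correspondedWith">
        <relationEntry>Bush, Vannevar</relationEntry>
      </cpfRelation>
      <resourceRelation xlink:type="simple" xlink:arcrole="creatorOf"
                        xlink:role="archivalRecords">
        <relationEntry>J. Robert Oppenheimer Papers</relationEntry>
      </resourceRelation>
    </relations>
  </cpfDescription>
</eac-cpf>
"""


def list_relations(eac_xml):
    """Return (element name, arcrole, entry text) for each relation."""
    root = ET.fromstring(eac_xml)
    found = []
    for tag in ("cpfRelation", "resourceRelation"):
        for rel in root.iter(f"{{{EAC}}}{tag}"):
            entry = rel.find(f"{{{EAC}}}relationEntry")
            text = entry.text.strip() if entry is not None and entry.text else None
            found.append((tag, rel.get(f"{{{XLINK}}}arcrole"), text))
    return found


relations = list_relations(SAMPLE)
for relation in relations:
    print(relation)
```

The `cpfRelation` entries are the edges of the social-professional network; the `resourceRelation` entries link the entity to the records by and about it.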

Year Two Results-Extraction
Library of Congress: 43,702 EAC-CPF from 1,546 finding aids
  – corporateBody: 7,243
  – person: 36,012
  – family: 447
Northwest Digital Archive: 24,949 from 5,160
  – corporateBody: 10,303
  – person: 13,294
  – family: 1,352
Online Archive of California: 91,811 from ~15,400
  – corporateBody: 24,860
  – person: 66,329
  – family: 622

Year Two Results-Extraction
Virginia Heritage: 15,175 from 8,390
  – corporateBody: 4,783
  – person: 9,919
  – family: 473
Total: 175,637 EAC-CPF from 30,496
  – corporateBody: 47,189
  – person: 125,554
  – family: 2,894

Year Two Matching and Merging Results
Total: 128,783 EAC-CPF from 175,637
  – corporateBody: 31,282 from 47,189
  – person: 95,583 from 125,554
  – family: 1,918 from 2,894
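The effect of matching and merging is easier to see as a proportion. The short calculation below uses only the counts on this slide; the variable and function names are illustrative.

```python
# Record counts before and after matching and merging, from the slide.
extracted = {"corporateBody": 47_189, "person": 125_554, "family": 2_894}
merged = {"corporateBody": 31_282, "person": 95_583, "family": 1_918}


def reduction(before, after):
    """Fraction of records eliminated by merging duplicates."""
    return (before - after) / before


total_before = sum(extracted.values())
total_after = sum(merged.values())

for entity_type in extracted:
    share = reduction(extracted[entity_type], merged[entity_type])
    print(f"{entity_type}: {share:.1%} fewer records after merging")
print(f"total: {reduction(total_before, total_after):.1%} fewer records after merging")
```

Overall, merging removes a little over a quarter of the extracted records; corporate bodies and families shrink by about a third each, persons by a little under a quarter.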

Early Observations-Extraction
Depth of analysis and quality of description of CPF entities vary widely in EAD-encoded finding aids
  – LoC: a lot of names under authority control
  – OAC and NWDA have fewer names, and control varies
  – VH has still fewer names, more variance
To be fair, the finding aids were created without SNAC processing in mind!

Next on Extraction
Refine extraction processing, incorporating some NLP-like processing, for example
  – Verifying type of name: C or P or F
  – Massaging poorly formed names into better formed names
  – Identifying names in strings that are names-plus (but name not identified as such)
  – Providing context information to enhance matching, for example, date or dates of correspondence, or occupation of creator of records for referenced names
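Two of these refinements, separating a name from a "names-plus" string and checking the entity type, can be sketched with simple heuristics. Everything here is an assumption for illustration: the term lists are hypothetical and far from complete, the helper names are invented, and real processing would need much richer rules.

```python
# Hypothetical vocabularies; a real system would need far larger lists.
TRAILING_TERMS = ("correspondence", "papers", "photographs")
CORPORATE_HINTS = ("institute", "laboratory", "university", "library", "society")


def split_name_plus(entry):
    """Split 'Name, Forename, Term' into (name, term); term is None if absent."""
    parts = [part.strip() for part in entry.split(",")]
    if len(parts) > 1 and parts[-1].lower() in TRAILING_TERMS:
        return ", ".join(parts[:-1]), parts[-1]
    return entry.strip(), None


def guess_type(name):
    """Guess C, P, or F (corporateBody, person, family) from surface cues."""
    lowered = name.lower()
    if any(hint in lowered for hint in CORPORATE_HINTS):
        return "corporateBody"
    if lowered.endswith(" family"):
        return "family"
    return "person"


for entry in ("Bethe, Hans Albrecht, Correspondence",
              "Los Alamos Scientific Laboratory"):
    name, term = split_name_plus(entry)
    print(guess_type(name), "|", name, "|", term)
```

The trailing term that is split off ("Correspondence") is itself useful context for matching, since it indicates how the named entity relates to the creator of the records.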

SNAC
SNAC II: Mellon
  – 150,000 EAD-encoded finding aids
  – Most from U.S., but also U.K. and France
  – 1-2M WorldCat MARC archival descriptions
  – British Library: 300K names from mss. collections
  – Smithsonian Institution: entire agency history; expeditions; and correspondents of Joseph Henry
  – National Archives and Records Administration (80K authority records)
  – 16M VIAF clusters
  – And more... And more...

For more information on SNAC
(Project website) h
(Public prototype) h

National Archival Authorities Cooperative
Building a National Archival Authorities Infrastructure
  – IMLS funded two-year project, October September 2013
  – EAC-CPF SAA workshops: 140 scholarships
  – National Archival Authorities Cooperative planning
  – Transforming SNAC into a sustainable national cooperative program

Benefits for Archivists
Archival authority control at last!
Best done cooperatively
Consistent use of the same form of name across descriptions
This can only be accomplished effectively by maintaining a single, shared authority file
Economic benefits to cooperating: the creator in one description is the correspondent in another; people exist in social contexts, and records document these contexts
Working cooperatively will ensure that the interrelations of different collections are identified
Cooperative authorities will enable integrated access to distributed records: all of the records relevant to one person, corporate body, or family
A shared national authority file would be a substantial historical resource, quite apart from the access enabled by it

Benefits for Users
For scholars
Integrated access to distributed archival resources
Contextual data for not only the records of one creator, but other related records
Access to the socio-historical networks in which people lived and worked
A biographical-historical resource
Time for an anecdote

But Not Only Scholars
Use in K-12 education
Time for an anecdote
Lifelong learners
  – Historical curiosity
  – Genealogy

Building the Infrastructure
Institute of Museum and Library Services
Funding two activities
  – 140 scholarships to seven regional workshops on EAC-CPF (administered by Simmons College)
  – Series of meetings to develop a blueprint for a sustainable National Archival Authorities Cooperative (NAAC)
Transforming SNAC into NAAC, project into program

NAAC
Series of three meetings leading to the development of a blueprint
All hosted by the National Archives and Records Administration
Soliciting community input on the business, governance, and technological requirements
First meeting broad, consensus building and idea gathering, followed by two meetings of three teams to address the requirements

First Meeting May
Around 90 people
Archivists, librarians, scholars (40 or so)
Representatives of the federal repositories (40 or so)
Funders (10 or so)
Other stakeholders (OCLC and Getty Vocabulary Program)
One and one-half day meeting

Federal Repositories
National Archives and Records Administration
  – Including two presidential libraries
Library of Congress
Smithsonian Institution
National Library of Medicine
National Agricultural Library
National Park Service

Conclusion
This may well be a groundbreaking moment for the national archival profession
An opportunity to do something really important, really useful
To accomplish together what none can accomplish alone
I hope (or is it now hopefully?)