PREMIS Rathachai Chawuthai Information Management CSIM / AIT Issued document 1.0.


1 PREMIS Rathachai Chawuthai rathachai.chawuthai@live.com Information Management CSIM / AIT Issued document 1.0

2 Preservation Metadata; PREMIS Overview; Data Dictionary Conventions; PREMIS Data Model; The Data Dictionary; PREMIS in Use

4 Metadata Meaning
Metadata is often defined as “data about data”. It records one or more characteristics of the data, such as its name, description, purpose, creation date and time, and creator.
For example:
– Library catalogues: a small card records a book’s title, author, subject, category, and shelf, and so describes a resource in the library.
Furthermore:
– “Metadata is commonly understood as an amplification of traditional bibliographic cataloguing practices in an electronic environment.”
Source: wikipedia.org

5 Metadata Categories
– Descriptive: identifies and describes a resource, e.g. title and author.
– Administrative: helps manage a resource, e.g. version number, archiving data, technical information, and rights management.
– Structural: records relationships within and among resource objects, e.g. a web page that consists of HTML, image, CSS, and JavaScript files and links to other files.
Source: wikipedia.org

6 Overview
Preservation metadata is “an essential component of most digital preservation strategies”. [Wikipedia]
Its basic requirements are: [OCLC]
– To store technical information that supports preservation decisions and actions
– To document actions taken, such as migration
– To record the effects of preservation strategies
– To ensure the authenticity of digital resources over the long term
– To note information about collection management and rights management
Its basic functional objectives are: [OCLC]
– Providing knowledge about the actions needed to maintain digital resources over the long term
– Ensuring that digital resources can be rendered as they were originally
Sources: OCLC.org, wikipedia.org

7 Basic features
To meet the preservation requirements, preservation metadata should include the following information:
– Provenance: describes the history of creation, ownership, access, and change
– Authenticity: ensures trustworthiness (does the digital resource render as it did originally?)
– Preservation activities: records the processes that support preservation, such as migration
– Technical environment: gives the name and version of the hardware, platform, OS, and software required to render the digital resource
– Rights management: states the intellectual property rights and agreements that must be observed when carrying out preservation processes, e.g. does the creator allow his/her work to be copied?
Sources: OCLC.org, usenix.org, wikipedia.org

8 Example
16 preservation metadata elements (recommended by oclc.org, May 1998):
Date; Transcriber; Producer; Capture Device; Capture Details; Change History; Validation Key; Encryption; Watermark; Resolution; Compression; Source; Color; Color Management; Color Bar/Gray-scale Bar; Control Targets
Source: OCLC.org

9 Overview
A preservation metadata framework is an overview or description of the types of digital preservation metadata and how they are associated.
According to OCLC/RLG, the framework should meet 3 requirements:
– Comprehensive: the metadata covers all the information needed for the big picture of digital preservation data structures and processes.
– Structured: preservation metadata should be represented in a structured format that both humans and machines can understand clearly.
– Broadly applicable: digital object types, preservation activities, and their relationships should be flexible enough to implement in real-world settings, such as institutions.
Source: OCLC.org

10 Overview
To meet these requirements, the work should proceed in 3 steps:
1. Design a metadata model that supports the content model, long-term accessibility, and preservation activities.
2. Consider future interoperability, then modify the model to support metadata exchange and resource sharing.
3. Make the model flexible enough to integrate with external archives.
Source: OCLC.org

11 Example: AHDS Preservation Metadata Framework
[Diagram] Labels: AHDS; Technical Description; Persistent ID; File Description; Text (Format, Version, Structure division); Image (Format, Resolution, Size); Management Description; Created date; Storage information; Software Environment; Application required; OS (Name, Version); Functionalities; Ingest; Migrate; Agent; Date; Software; Version; Access; Share; Modify
Source: AHDS.ac.uk

12 Metadata is “Structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource” [LOC]
Preservation Metadata is “A metadata that supports and documents the digital preservation process” [LOC]
Preservation Metadata Framework is “An important contribution toward shaping an international consensus on the metadata requirements of archived digital objects and consolidating expertise on the use of metadata to support digital preservation” [OCLC]
Sources: LOC.gov, OCLC.org

14 What?
– PREMIS stands for PREservation Metadata: Implementation Strategies.
– It is sponsored by the Library of Congress (LOC).
– People usually use “PREMIS” to refer to the PREMIS Data Dictionary.
– It is represented in XML format.
Sources: LOC.gov, wikipedia.org

15 PREMIS Data Dictionary
A set of semantic units.
Metadata for digital objects, so that they:
– Can be read from media
– Can be rendered
– Are stored securely
– Have their format changes tracked
Metadata scope:
– Format-specific: e.g. audio, video, image, …
– Implementation-specific: how to access it (by application)
– Descriptive metadata: data properties, as in MARC, DC
– Detailed information: for media or hardware
– Agent information: e.g. people, organizations, or software
– Rights information: e.g. license, permission
Source: PREMIS from LOC.gov

16 Where is PREMIS?
PREMIS positions itself as a coordinator among several types of metadata in order to perform preservation functions on all digital resources. Thus, PREMIS is a small core at the heart of preservation metadata.
Source: PREMIS from LOC.gov

17 The PREMIS data dictionary covers:
Administrative metadata that supports the process of digital preservation, i.e. information provided to support preservation management:
– Technical information (characteristics), e.g. creator, creation date and time, creating software, …
– Information about actions on a digital object, e.g. ingest, migrate, verify, …
– Relationships
  Structural: show how objects are put together
  Derivative: result from preservation actions
– Rights, e.g. rights and agreement metadata associated with preservation
Source: PREMIS from LOC.gov

18 Usefulness
Supports managing a repository system:
– Long-term preservation
– Repository migration (to another repository)
Scope:
– Repository design
– Repository evaluation
– Exchange of archived “information packages” among repositories
Development view:
– Use PREMIS as a guideline for what information should be recorded
Source: PREMIS from LOC.gov

19 Use PREMIS if you have to
Support data preservation by recording:
– Inhibitors: passwords, encryption, … needed in order to access digital objects
– Digital provenance: records changes of object format, e.g. from .DOC to .PDF; contains application, version, environment, … needed in order to render digital objects
– Significant properties (if important): an object’s characteristics, e.g. font, formatting, color, …; look and feel
– Rights: copyright status, license terms
Source: PREMIS from LOC.gov

21 Data dictionary
Information a repository uses to support the digital preservation process:
– Guidelines/recommendations to support preservation processes such as creation, use, and management
Information is defined as:
– Things that most working repositories are commonly concerned with and need in order to support digital preservation
Source: PREMIS from LOC.gov

22 Semantic Unit
– PREMIS prefers the term “semantic unit” to “metadata element”.
– A semantic unit is an entry in the data dictionary.
– A semantic unit is defined as a property of an entity in the PREMIS data model.
– Semantic units support recording relationships between objects.
Examples:
– Identifier, size, format, environment, software, …
Source: PREMIS from LOC.gov

23 Example: Size
Source: PREMIS from LOC.gov

24 Container components
What should we do if a semantic unit’s value has to carry several meanings? The data dictionary allows the concept of a container, which groups a set of related semantic units together.
Software
– swName = “Windows”
– swVersion = “XP”
– swType = “OperatingSystem”
Container: Software = “Windows|XP|OperatingSystem”
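The container idea can be sketched in Python. This is only an illustration, not PREMIS tooling: the `Software` class is hypothetical and simply mirrors the three components named on the slide (swName, swVersion, swType).

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Software:
    """Hypothetical container grouping three related semantic components."""
    swName: str
    swVersion: str
    swType: str


# The container keeps each component addressable by name instead of
# packing everything into one string such as "Windows|XP|OperatingSystem".
software = Software(swName="Windows", swVersion="XP", swType="OperatingSystem")
components = asdict(software)
```

Each component can then be read on its own (`software.swVersion`) without parsing a delimited value.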

25 Example: Software
Source: PREMIS from LOC.gov

26 Extension Container (General)
– New in PREMIS 2.0
– Contains externally defined semantic units
– Allows PREMIS to be extended with semantic units that are more granular, non-core, or out of scope of the PREMIS data dictionary
– Data in the container may replace, refine, or be additional to the appropriate PREMIS semantic unit
– One schema per extension; if more schemas are needed, the extension element needs to be repeated
Source: PREMIS from LOC.gov

27 Example: normally, the information follows the PREMIS schema, like:
Source: PREMIS louis.xml from LOC.gov

28 Example: if more information is needed apart from what the PREMIS schema covers, information from other schemas (e.g. METS) can be included in PREMIS.
Source: PREMIS louis.xml from LOC.gov

30 Data Model
The data model includes:
– Entities: things relevant to digital preservation that are described by preservation metadata, namely Intellectual Entities, Objects, Events, Rights, and Agents
– Properties of entities (semantic units): such as identifier, size, format, environment, software
– Relationships between entities: link entities together, e.g. isPartOf, isSourceOf, isDerivedFrom, …
For example:
– Document X2 is a newer version of document X1
– Document AA is a chapter of document A
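A relationship between entities can be sketched as a simple triple. This is a minimal illustration under assumptions: the `relationships` list and the `related` helper are hypothetical, and mapping "newer version of" to isDerivedFrom is a guess made for the example, not something the slide states.

```python
# Each relationship is a (subject, relationship type, target) triple.
relationships = [
    ("Document X2", "isDerivedFrom", "Document X1"),  # assumed mapping for "newer version of"
    ("Document AA", "isPartOf", "Document A"),        # a chapter is part of the document
]


def related(subject: str, rel_type: str) -> list:
    """Return the targets linked to `subject` by relationships of type `rel_type`."""
    return [target for source, kind, target in relationships
            if source == subject and kind == rel_type]
```

For instance, `related("Document AA", "isPartOf")` looks up the document that the chapter belongs to.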

31 Entities
Source: PREMIS from LOC.gov

32 Overview: Intellectual Entities
– May be called “bibliographic entities”
– A set of content that is considered a single intellectual unit for purposes of management and description, e.g. a book, map, photograph, or database
– Not fully described in the PREMIS Data Dictionary; they can be described with other metadata standards, such as Dublin Core
Source: PREMIS tutorial from LOC.gov

33 Overview: Objects
Objects are what is stored and managed in the preservation repository. E.g.
– Intellectual Entity: “Thailand Map”; Object entity: image file
3 kinds of object:
– File: a computer file, like a PDF or JPEG
– Representation: a set of files that work together, e.g. a web page including HTML, image, CSS, and JavaScript files
– Bitstream: a part of a file, e.g. a frame image in a video file
Source: PREMIS tutorial from LOC.gov

34 Example
– Chapter1.pdf is a File.
– Chapter1.pdf + Chapter2.pdf + chapter3.pdf is a Representation of a book that has 3 chapters.
– A TIFF file contains a header and 2 images. This means there are 2 Bitstreams, one per image, and each bitstream (image) has its own set of semantic units.

35 Example: types of object that could be preserved for the Thailand Map
Intellectual Entity: Thailand Map
– Object 1 (File): 1 JPEG file
– Object 2 (File): 1 TIFF file that includes 3 bitstreams, one image per map layer: province, mountain, river
– Object 3 (Representation): a web page that contains 3 files: HTML, CSS, JPEG

36 Data Dictionary
Information recorded for an Object:
– a unique identifier for the object (type and value)
– fixity information, such as a checksum (message digest) and the algorithm used to derive it
– the size of the object
– the format of the object, which can be specified directly or by linking to a format registry
– the original name of the object
– information about its creation
– information about inhibitors
– information about its significant properties
– information about its environment, e.g. OS: MacOS, browser: Safari
– where and on what medium it is stored
– digital signature information
– relationships with other objects and other types of entities
Source: PREMIS from LOC.gov
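As a rough sketch of the fixity and size items above, the following Python computes a message digest and a byte count for some data. The `describe_object` helper and the dictionary layout are assumptions made for illustration; the key names are only loosely modeled on PREMIS semantic units, and SHA-256 is an arbitrary choice of algorithm.

```python
import hashlib


def describe_object(data: bytes, original_name: str) -> dict:
    """Build a small, PREMIS-inspired description of a byte sequence (hypothetical layout)."""
    return {
        "originalName": original_name,
        "size": len(data),  # size in bytes
        "fixity": {
            "messageDigestAlgorithm": "SHA-256",
            "messageDigest": hashlib.sha256(data).hexdigest(),
        },
    }


record = describe_object(b"hello", "hello.txt")
```

Recomputing the digest later and comparing it with the stored value is the basis of a fixity check.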

37 Example: Object metadata for a TIFF file (XML format)
Source: PREMIS louis.xml from LOC.gov

38 Example: Object metadata for a TIFF file (table format, part 1)
Source: PREMIS from LOC.gov

39 Example: Object metadata for a TIFF file (table format, part 2)
Source: PREMIS from LOC.gov

40 Example: Object metadata for a TIFF file (table format, part 3)
Source: PREMIS from LOC.gov

41 Overview: Events
An Event is an action that affects an object in the repository.
– The action must have at least one object and one agent recorded
– An event must have an outcome (the result of the event), such as success or failure
Source: PREMIS tutorial from LOC.gov

42 Event types (part 1)
– capture: the process whereby a repository actively obtains an object
– compression: the process of coding data to save storage space or transmission time
– creation: the act of creating a new object
– deaccession: the process of removing an object from the inventory of a repository
– decompression: the process of reversing the effects of compression
– decryption: the process of converting encrypted data to plaintext
– deletion: the process of removing an object from repository storage
Source: PREMIS from LOC.gov

43 Event types (part 2)
– digital signature validation: the process of determining that a decrypted digital signature matches an expected value
– dissemination: the process of retrieving an object from repository storage and making it available to users
– fixity check: the process of verifying that an object has not been changed in a given period
– ingestion: the process of adding objects to a preservation repository
– message digest calculation: the process by which a message digest (“hash”) is created
– migration: a transformation of an object creating a version in a more contemporary format
Source: PREMIS from LOC.gov

44 Data dictionary
Information recorded for an Event:
– a unique identifier for the event (type and value)
– the type of event (creation, ingestion, migration, etc.)
– the date and time the event occurred
– a detailed description of the event
– a coded outcome of the event (result of the event: success | fail | …)
– a more detailed description of the outcome
– agents involved in the event and their roles
– objects involved in the event and their roles
Source: PREMIS from LOC.gov
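The event fields listed above can be sketched as a small record type. This is a hedged illustration, not the PREMIS schema: the `Event` class, its field names, the identifiers, and the date are all made up for the example, and `is_complete` only encodes the deck's own rule that an event has an outcome and at least one object and agent.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    """Hypothetical event record with the fields listed on the slide."""
    identifier: str
    event_type: str
    date_time: datetime
    outcome: str
    detail: str = ""
    linked_agents: list = field(default_factory=list)   # (agent id, role) pairs
    linked_objects: list = field(default_factory=list)  # (object id, role) pairs

    def is_complete(self) -> bool:
        # True only if there is an outcome and at least one agent and one object.
        return bool(self.outcome and self.linked_agents and self.linked_objects)


event = Event(
    identifier="event-001",
    event_type="fixity check",
    date_time=datetime(2011, 1, 1, tzinfo=timezone.utc),  # illustrative date
    outcome="success",
    linked_agents=[("agent-001", "executing program")],
    linked_objects=[("object-001", "source")],
)
```

Roles live on the link between the event and the agent or object, which matches the note elsewhere in the deck that a role does not belong to the agent itself.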

45 Example: Validation
Source: PREMIS tutorial from LOC.gov

46 Overview: Agents
– An actor, e.g. a person, organization, or software
– Related metadata standards: e.g. FOAF, vCard, eduPerson, …
Note: an Agent can have many roles
– A role does not belong to the Agent
– It is determined by the Event or Rights entities
Source: PREMIS tutorial from LOC.gov

47 Data dictionary
Information recorded for an Agent:
– a unique identifier for the agent (type and value)
– the agent's name
– designation of the type of agent (person, organization, software)
Source: PREMIS from LOC.gov

48 Example: Adobe Reader
Source: PREMIS tutorial from LOC.gov

49 Overview: Rights
Information about rights and permissions that are directly relevant to preserving objects in the repository.
– Rights: assertions of one or more rights or permissions pertaining to a Digital Object and/or an Agent
Example:
– John Hebeler grants the AIT digital repository permission to make 10 copies of Semantic_Web_Programming.pdf for preservation purposes
Pattern:
– Agent A grants permission B to the repository in regard to object C
Source: PREMIS tutorial from LOC.gov
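The "Agent A grants permission B to the repository in regard to object C" pattern can be sketched as a record. This is an assumption-laden illustration: the `RightsStatement` class and its field names are hypothetical and are not PREMIS element names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RightsStatement:
    """Hypothetical record following the A / B / C pattern on the slide."""
    granting_agent: str  # Agent A
    permission: str      # permission B
    repository: str      # the repository receiving the permission
    object_name: str     # object C

    def summary(self) -> str:
        return (f"{self.granting_agent} grants {self.repository} permission "
                f"to {self.permission} in regard to {self.object_name}")


statement = RightsStatement(
    granting_agent="John Hebeler",
    permission="make 10 copies for preservation purposes",
    repository="AIT digital repository",
    object_name="Semantic_Web_Programming.pdf",
)
```

Keeping the agent, permission, and object as separate fields makes it possible to query, for example, every permission that applies to one object.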

50 Data dictionary
Information recorded for a Rights statement:
– a unique identifier for the rights statement (type and value)
– whether the basis for claiming the right is copyright, license or statute
– more detailed information about the copyright status, license terms, or statute, as applicable
– the action(s) that the rights statement allows
– any restrictions on the action(s)
– the term of grant, or time period in which the statement applies
– the object(s) to which the statement applies
– agents involved in the rights statement and their roles
Source: PREMIS from LOC.gov

51 Example: Copyright
Source: PREMIS tutorial from LOC.gov

53 Example: data dictionary entry for a semantic unit
Semantic unit: the name of the semantic unit.
Source: PREMIS from LOC.gov

54 Example: data dictionary entry for a semantic unit
Semantic components: if the unit contains child components, they are listed here; otherwise “None” is shown.
Source: PREMIS from LOC.gov

55 Example: data dictionary entry for a semantic unit
Definition: a description of the semantic unit.
Source: PREMIS from LOC.gov

56 Example: data dictionary entry for a semantic unit
Rationale: the reason PREMIS includes this semantic unit.
Source: PREMIS from LOC.gov

57 Example: data dictionary entry for a semantic unit
Data constraint: a specification of the values the semantic unit may take. For example:
– None (no constraint)
– Integer (the value must be an integer)
– Value from a controlled vocabulary (the value must come from a controlled vocabulary)
– Container (the unit is a container)
Source: PREMIS from LOC.gov

58 Example: data dictionary entry for a semantic unit
Object category: this section describes the rules that depend on each object type: Representation, File, Bitstream.
Source: PREMIS from LOC.gov

59 Example: data dictionary entry for a semantic unit
Applicability: states whether this semantic unit applies to the object type in question. If “Not applicable”, the semantic unit can be left out of the metadata. In this case, the semantic unit “Size” applies to the object types “File” and “Bitstream” only, not to “Representation”.
Source: PREMIS from LOC.gov

60 Example: data dictionary entry for a semantic unit
Example: an example value that this semantic unit may take.
Source: PREMIS from LOC.gov

61 Example: data dictionary entry for a semantic unit
Repeatability: indicates whether the semantic unit can take multiple values within the same container.
– “Not repeatable” = can be used at most once
– “Repeatable” = can be used more than once
Source: PREMIS from LOC.gov

62 Example: data dictionary entry for a semantic unit
Obligation: indicates whether the semantic unit must be recorded in the metadata.
– “Mandatory” = it is required
– “Optional” = it is not required
Source: PREMIS from LOC.gov
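Repeatability and obligation can be checked mechanically. The sketch below is a minimal illustration under assumptions: the `RULES` table and the `check` function are hypothetical, and the two entries in the table are illustrative settings rather than an authoritative copy of the data dictionary.

```python
# Hypothetical rule table: semantic unit name -> (mandatory, repeatable).
RULES = {
    "objectIdentifier": (True, True),
    "size": (False, False),
}


def check(record: dict) -> list:
    """Return rule violations; `record` maps each unit name to a list of values."""
    problems = []
    for name, (mandatory, repeatable) in RULES.items():
        values = record.get(name, [])
        if mandatory and not values:
            problems.append(f"{name}: mandatory but missing")
        if not repeatable and len(values) > 1:
            problems.append(f"{name}: not repeatable but has {len(values)} values")
    return problems
```

A record such as `{"objectIdentifier": ["id-1"], "size": ["2038937"]}` passes, while one that omits the identifier or repeats the size is reported.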

63 Example: data dictionary entry for a semantic unit
Creation / Maintenance notes: further detail on how the values are created and/or updated. In this case, the value is generated automatically by the repository.
Source: PREMIS from LOC.gov

64 Example: data dictionary entry for a semantic unit
Usage notes: information on how to use the semantic unit.
Source: PREMIS from LOC.gov

65 Example: list of PREMIS semantic units
Name: the name of the semantic unit (it is a container if it has component units).
Source: PREMIS from LOC.gov

66 Example: list of PREMIS semantic units
– M: Mandatory (must be defined)
– O: Optional (need not be defined)
Source: PREMIS from LOC.gov

67 Example: list of PREMIS semantic units
– R: Repeatable (can be used more than once)
– NR: Not repeatable (can be used at most once)
Source: PREMIS from LOC.gov

68 Example: list of PREMIS semantic units
– Name ends with [a,b]: applies to specific object types, e.g. representation and file
– No suffix: applies to all object types
Source: PREMIS from LOC.gov

69 Limitation of Data Dictionary
Although descriptive metadata is important for describing Intellectual Entities, it is not the focus of PREMIS because:
– Well-defined standards already exist, such as MARC, MODS, and Dublin Core.
– Descriptive metadata is often domain-specific, so each domain should use a suitable standard.
Source: PREMIS from LOC.gov

71 Carolina Digital Repository
– Institution: University of North Carolina at Chapel Hill
– Description: The Carolina Digital Repository (CDR) is being designed as a repository for material in electronic formats produced by members of the University of North Carolina at Chapel Hill community. Its chief purpose is to provide for the long-term preservation of such materials. By preservation we mean the ability to ingest the material, index and search it, replicate it, and keep it safe from alteration. The project is recording and/or mapping to PREMIS elements as the repository with a preservation focus is built.
– Link: http://www.lib.unc.edu/cdr/
– Tool: Locally developed Java web apps plus Fedora Commons, iRODS data grid, Solr search engine and the Duke Data Accessioner
Source: PREMIS registry from LOC.gov

72 Creating a digital repository at the Swedish National Archives using PREMIS
– Institution: The National Archives of Sweden
– Description: PREMIS is used for processing and storing digital objects in a digital repository. The National Archives is developing a transfer model for digital objects created in its scanning projects. A function is being developed for packaging and storing data about the digital objects in the archival information system ARKIS, partly stored as PREMIS metadata. The application is in use for storing data. An application for exporting PREMIS data as XML will be developed in the future.
– Tool: ESSArch
Source: PREMIS registry from LOC.gov

73 FCLA Digital Archive and DAITSS
– Institution: Florida Center for Library Automation
– Description: The FCLA Digital Archive is a preservation repository for the use of the libraries of the public universities of Florida. The FCLA Digital Archive uses a locally developed software application called DAITSS, which implements most of the PREMIS data elements.
– Link: http://www.fcla.edu/digitalArchive/
– Status: The archive is in production as of November 2005. Dissemination (DIPs) with PREMIS-conformant metadata is expected by July 2006.
– Document: http://www.fcla.edu/digitalArchive/daInfo.htm
Source: PREMIS registry from LOC.gov

74 Digital Data Archive (DDA) Project
– Institution: National Archives of Scotland
– Description: The NAS is preparing for the ingest of digital objects from the Scottish Executive (the government of Scotland) and the Scottish Courts. An application is under development that aims to be compliant with OAIS, PD0008 and PREMIS to meet this requirement.
– Tool: The DDA aims to implement the DROID API of PRONOM, developed by the National Archives, among other tools.
Source: PREMIS registry from LOC.gov

76 References
– OCLC/RLG Working Group on Preservation Metadata, Preservation Metadata for Digital Objects: A Review of the State of the Art, January 31, 2001. http://www.oclc.org/research/activities/past/orprojects/pmwg/presmeta_wp.pdf
– http://en.wikipedia.org/wiki/Metadata
– http://en.wikipedia.org/wiki/Preservation_metadata
– Michael Factor, Ealan Henis, Dalit Naor, Simona Rabinovici-Cohen, Petra Reshef, Shahar Ronen (IBM Research Lab in Haifa, Israel) and Giovanni Michetti, Maria Guercio (University of Urbino, Italy), Authenticity and Provenance in Long Term Digital Preservation: Modeling and Implementation in Preservation Aware Storage. http://www.usenix.org/event/tapp09/tech/full_papers/factor/factor.pdf
– Raivo Ruusalepp (Estonian Business Archives, Ltd), AHDS Preservation Metadata Framework, September 2002. http://www.ahds.ac.uk/preservation/preservation-metadata-review.pdf
– http://www.loc.gov/standards/premis/understanding-premis.pdf
– http://www.loc.gov/standards/premis/v2/premis-2-0.pdf
– http://www.loc.gov/standards/premis/premis-registry.php
– http://www.loc.gov/standards/premis/tutorials.html
– http://www.loc.gov/standards/premis/louis-2-0.xml

