InterParty 1 InterParty: Common Metadata and Linking Public Identities A presentation to the final InterParty Seminar The Hague 13 June 2003 Robina Clayphan.

1 InterParty 1 InterParty: Common Metadata and Linking Public Identities A presentation to the final InterParty Seminar The Hague 13 June 2003 Robina Clayphan The British Library

2 InterParty 2 Outline
Common Metadata for Public Identities: background and issues
Proposed Common Metadata set
InterParty Links
Proposed Link Record

3 InterParty 3 Common functionality and metadata
InterParty will be a network of InterParty members (IPMs) who have databases containing party metadata
All potential members share a common need for accurate metadata to support the identification of parties
All member databases have identification of parties as a common function
Sharing access to party metadata between databases can substantially reduce costs of data creation and improve data quality
Creating links will add value to the metadata held in separate systems

4 InterParty 4 What common metadata is required?
We need sufficient metadata to allow “disambiguation” between parties with shared or similar attributes
… because different people use the same name...
[Diagram labels: “Is known as John Williams”; “Is not the same person as”]

5 InterParty 5 What common metadata is required?
… and the same person uses different names
We need sufficient metadata to allow “collocation” of the same party with different attributes
… in different contexts
– e.g. Iain Banks and Iain M Banks
… to hide their identity
– use of pseudonyms
… sometimes it’s simply a matter of language
– Mao Tse Tung or Mao Zedong?
[Diagram labels: “Is also known as John Williams”; “Is also known as a member of the group called ‘Sky’”]

6 InterParty 6 What common metadata is required?
How much is sufficient? The answer is contextual
– Metadata about people (as about anything else) is essentially unbounded
– A unique identifier may be enough (if you trust its source)
For InterParty, we need enough metadata to make a comparison between parties in different databases in order to make a decision
[Diagram label: “The same person?”]

7 InterParty 7 The InterParty approach
InterParty is concerned with “public identities” not “persons”
InterParty “Common Metadata” is a subset of what may be publicly known
[Diagram labels: Person; “Personal” metadata; Sensitive; Not publicly known; Publicly known; Public Identity]

8 InterParty 8 The nature of Public Identities
One person usually has only one public identity
But some people have more than one, with different attributes
– For example, a person may write under a pseudonym
[Diagram labels: Person; Public Identity; Public Identity]

9 InterParty 9 The nature of Public Identities
Relationships between real persons and public identities are out of scope
[Diagram labels: Person; Public Identity]

10 InterParty 10 InterParty and Public Identities
InterParty is concerned with Public Identities in different namespaces
Within the InterParty network, each Public Identity will require a Public Identity Identifier or “PIDI”. This is a combination of a Namespace and a Unique Identifier within that namespace
[Diagram: two Public Identities, labelled “PIDI Namespace A : Brian Green” and “PIDI Namespace B : 876X5”]
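
The namespace-plus-identifier idea can be sketched in a few lines of Python. This is only an illustration: the class name `Pidi`, its field names and the display format are assumptions made for the example, not part of the InterParty proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pidi:
    """Hypothetical model of a PIDI: a namespace plus an identifier unique within it."""
    namespace: str   # the IPM's namespace, e.g. "Namespace A"
    identifier: str  # unique only within that namespace

    def __str__(self) -> str:
        # Display form modelled on the slide: "Namespace A : Brian Green"
        return f"{self.namespace} : {self.identifier}"


a = Pidi("Namespace A", "Brian Green")
b = Pidi("Namespace B", "876X5")
# The same local identifier in another namespace is a different PIDI.
clash = Pidi("Namespace A", "876X5")

print(a)           # Namespace A : Brian Green
print(b == clash)  # False
```

Because equality takes both parts into account, an identifier only has to be unique inside its own namespace to be unique across the network.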

11 InterParty 11 Common Metadata and Public Identities Metadata that IPMs are willing and able to share over the network Information that is in the public domain Excluding information that is private or sensitive

12 InterParty 12 Common Metadata
Designed to be a practicable set of elements:
– To enable disambiguation
– To enable the creation of Links asserting a relationship between Public Identity records in IPM databases
– That IPMs will be willing and able to provide (it is not expected that all IPMs will be able to provide all the elements)
Feasibility is the biggest issue
– Certainly for the demonstrator we will not be able to achieve what we might see as “the ideal solution”

13 InterParty 13 Common Metadata “standards”
Need to define appropriate rules and format conventions
– The more standardised the Common Metadata (in terms, for example, of controlled “values”) the higher its value, but the higher its cost
To what extent will the “common metadata” need to adhere to common forms of semantic or syntactical expression?
– Manual links: only to a limited extent, if its function is primarily for human interpretation
– Automated links: algorithmically-based linking would require more standardised “common metadata”

14 InterParty 14 Metadata questionnaire
Unique identifier? Persistent?
Standard & variant name forms?
Party types? Corporate? Personal?
Pseudonyms - single or multiple public Ids?
Other standard identifiers?
Dates of birth/death? Dates of incorporation? Period of activity?
Address/contact details?
Works?
Roles? Author? Composer? Artist?
Associations?
Other distinguishing metadata? Nationality? Citizenship?

15 InterParty 15 Limitations to common metadata
Is the element there? Is it held in a discrete field that could be mapped to a common metadata set?
Many elements are missing from individual data sets, e.g. addresses, dates, works, roles, etc.
Where data is held there is variable practice in how it is held, e.g.
– works data contained within broader Notes fields
– data held outside the “party file” as links within a given IPM’s working context (library authority file links to bibliographic files)
Options for automated, algorithm-based linking are limited

16 InterParty 16 Common metadata - conclusions
There are unique party records to be linked
There are sufficient metadata elements in different systems to support judgements about links
Different databases bring different strengths, with potential for enrichment and more accurate identification
– examples to be shown in demonstrator
Proposal for a reduced metadata set based on areas of greatest commonality
Assumption that the system is primarily a manual search/edit facility to support accurate identification, with Links themselves built via usage

17 InterParty 17 Proposed Common Metadata Set
Elements defined for the demonstrator system
PIDI
– An identifier assigned to a public identity by an IPM that is unique within the domain of that IPM
– The unique ID comprises Namespace/identifier to ensure it is unique on the network
– Must be persistent, though the associated metadata will typically change
Standard Name
– The standard, preferred or usual name by which a Public Identity is known
Party Type
– Nature of the public identity
– Categories: personal, corporate, unknown
Variant Names
– different forms of names belonging to the public identity
Related Ids
– may contain names of other public identities linked to the public identity within the IPM’s own system, e.g. pseudonyms
Date of birth
– usually 4 digit year of birth

18 InterParty 18 Proposed Common Metadata Set
Elements defined for the demonstrator system
Works
– Works with which the Public Identity is associated, represented by title
– Accompanied by date & role of Public Identity if known
Address/contact details
– permissible only where Party Type = Corporate
Notes
– includes other identifiers, associations, roles, other metadata that may include works, dates, etc. where not recorded in a discrete field for display
InterParty Links themselves
– Access to the Links is a key element of Common Metadata
Currently, the only mandatory elements are expected to be the PIDI and a Name, BUT more data will be essential to support the task of identification
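
As a rough illustration, the proposed element set could be modelled as a record with two mandatory fields and a small rule check. The class, field and method names below are hypothetical, the PIDI is simplified to a "namespace:identifier" string, and works are reduced to titles only; none of this is the demonstrator's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PARTY_TYPES = {"personal", "corporate", "unknown"}


@dataclass
class CommonMetadata:
    # Mandatory elements: the PIDI and a name
    pidi: str                  # assumed "namespace:identifier" string form
    standard_name: str
    # Optional elements
    party_type: str = "unknown"
    variant_names: List[str] = field(default_factory=list)
    related_ids: List[str] = field(default_factory=list)
    date_of_birth: Optional[str] = None  # usually a 4-digit year
    works: List[str] = field(default_factory=list)  # titles only in this sketch
    address: Optional[str] = None
    notes: str = ""

    def problems(self) -> List[str]:
        """Return rule violations; an empty list means the record is acceptable."""
        found = []
        if not self.pidi:
            found.append("PIDI is mandatory")
        if not self.standard_name:
            found.append("a name is mandatory")
        if self.party_type not in PARTY_TYPES:
            found.append(f"unknown party type: {self.party_type}")
        if self.address is not None and self.party_type != "corporate":
            found.append("address is permissible only for corporate parties")
        return found


ok = CommonMetadata(pidi="NSA:123456", standard_name="Iain Banks",
                    party_type="personal", variant_names=["Iain M Banks"])
bad = CommonMetadata(pidi="NSB:Brian Green", standard_name="Brian Green",
                     party_type="personal", address="1 Example Street")

print(ok.problems())   # []
print(bad.problems())  # ['address is permissible only for corporate parties']
```

Keeping almost every field optional mirrors the point that not all IPMs will be able to supply all elements.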

19 InterParty 19 InterParty Links

20 InterParty 20 InterParty’s added value proposition - InterParty Links
All metadata is ultimately about expressing relationships that someone claims to exist
– e.g. Book A ‘has’ Author B
All the participating databases in InterParty will express such relationships internally
– Sometimes the same relationships, sometimes different relationships
InterParty will create value to the extent that it enables new relationships to be expressed between databases...
– e.g. Person X in Namespace A ‘is the same as’ Person Y in Namespace B
… and recorded as an “InterParty Link”

21 InterParty 21 Establishing new relationships
The establishment of InterParty Links will require effort and judgement
– hence the need for enough metadata to make a decision about relationships
These Links need to be recorded and made available to others to be of real value
– therefore InterParty will need a facility to record the decision and store it as a Link record. For the demonstrator this new Link data will be held centrally as part of the “aggregated metadata”
[Diagram label: “Is the same person”]

22 InterParty 22 InterParty Links
An InterParty Link is the assertion of a relationship between two PIDIs
[Diagram: “PIDI Namespace B : 876X5” and “PIDI Namespace A : Brian Green” joined by “is”]
Links may only be made by the owner of one of the PIDIs…
– …and endorsed or disputed by the owner of the other PIDI
The assertion of a relationship between the two PIDIs is held in a single Link record
For the purposes of the demonstrator project, the relationships expressed in a Link will be restricted to “is”, “is complex” and “is not”

23 InterParty 23 Types of Relationship
PIDI 1 “is” PIDI 2
– It is asserted that PIDI 1 and PIDI 2 have a functional and reciprocal equivalence for the purposes of InterParty
PIDI 1 “is not” PIDI 2
– It is asserted that PIDI 1 does not have a functional equivalence with PIDI 2 despite appearances
PIDI 1 “has a complex relationship with” PIDI 2
– It is asserted that PIDI 1 has a partial equivalence or complex relationship with PIDI 2 that is not necessarily reciprocal

24 InterParty 24 What is a “Complex” assertion?
IPMs hold records for parties and names in different ways - there may not be easy one-to-one relationships between databases
– IPM A assigns a single PIDI for Ruth Rendell, with a note that Barbara Vine is a pseudonym of Ruth Rendell
– IPM B assigns separate PIDIs for both Ruth Rendell and Barbara Vine (with or without an internal assertion between them)
– The “complex” relationship must be used
There are numerous other circumstances where IPMs may take a different approach to identification
– It is not proposed to define all the relationships covered by “Complex” any more precisely for the demonstrator

25 InterParty 25 Establishing a Link
The status of a link will relate to how it is established and to what degree the two IPM owners have been involved. There are four status types:
“Proposed”
– The relationship has been asserted by one IPM owner only
“Authorised”
– Concurring assertions have been made by both IPM owners
“Disputed”
– Assertions have been made by both IPM owners but they do not concur
“Inferred”
– Generated automatically based on inference from “is” relationships only
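
The three statuses that depend on owner assertions can be derived mechanically. The function below is a minimal sketch of that rule; the name `link_status` and the use of `None` for "no assertion yet" are assumptions made for the example. “Inferred” is not covered because it is set by an automatic process rather than by owners.

```python
from typing import Optional

RELATIONSHIPS = {"is", "is not", "is complex"}


def link_status(assertion_1: Optional[str], assertion_2: Optional[str]) -> str:
    """Derive a manual Link's status from the two owners' assertions.

    Each argument is one owner's asserted relationship, or None if that
    owner has not yet made an assertion.
    """
    made = [a for a in (assertion_1, assertion_2) if a is not None]
    for a in made:
        if a not in RELATIONSHIPS:
            raise ValueError(f"unknown relationship: {a!r}")
    if not made:
        raise ValueError("a Link needs at least one owner assertion")
    if len(made) == 1:
        return "Proposed"      # asserted by one owner only
    # Both owners have asserted: they either concur or they do not.
    return "Authorised" if made[0] == made[1] else "Disputed"


print(link_status("is", None))      # Proposed
print(link_status("is", "is"))      # Authorised
print(link_status("is", "is not"))  # Disputed
```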

26 InterParty 26 Inferred Links
[Diagram: three PIDIs (NSA:123456, NSB:Brian Green, NSC:876X54). IF one PIDI “is” a second AND the second “is” a third, THEN there is an inferred relationship between the first and the third]
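
One way to generate inferred Links is to treat “is” as reciprocal and transitive and group PIDIs into connected sets. The sketch below does this with a small union-find; the function name, the "namespace:identifier" string form, and the particular pairing of the three example PIDIs are assumptions, since the original diagram's layout is not recoverable from the transcript.

```python
from itertools import combinations


def inferred_links(is_links):
    """Return the "is" pairs implied by, but not present in, is_links.

    is_links is an iterable of (pidi, pidi) pairs. Only "is" relationships
    should be passed in; "is not" and "is complex" are never used here.
    """
    parent = {}

    def find(x):
        # Find the representative of x's group, registering x if it is new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    asserted = set()
    for a, b in is_links:
        asserted.add(frozenset((a, b)))
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for pidi in list(parent):
        groups.setdefault(find(pidi), []).append(pidi)

    inferred = set()
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            if frozenset((a, b)) not in asserted:
                inferred.add((a, b))
    return inferred


links = [("NSA:123456", "NSB:Brian Green"), ("NSB:Brian Green", "NSC:876X54")]
print(inferred_links(links))  # {('NSA:123456', 'NSC:876X54')}
```

Whether inference should draw on every “is” Link or only on Authorised ones is not stated in the slides; this sketch treats all supplied pairs equally.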

27 InterParty 27 Outline of a Link Record
Elements defined for the Link Record
Link ID
– a unique identifier for the Link Record
PIDI 1 and PIDI 2
– the Identifiers being linked
Link relationship
– the relationship asserted (is, is not, is complex)
Link status
– Proposed, Authorised, Disputed, Inferred
Link method
– Manual or automatic
Link timestamp
– when the record was created or last updated

28 InterParty 28 Outline of a Link Record
Elements defined for the Link Record
Owner Assertion composite
– a group of elements which record each Owner IPM’s assertion about the Link, including
- Owner ID
- PIDI owned
- Owner assertion: used to set up or amend Link Relationship
- Assertion comment: notes field
- Asserted by: name of individual
- Assertion timestamp
Comment composite
– a group of elements to allow other IPMs to add further notes or comments to the record without directly affecting status of the assertion
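
Taken together, the Link Record elements suggest a structure along the following lines. This is a hedged sketch: all class and field names are invented for the example, PIDIs are simplified to strings, and the contents of the Comment composite (`ipm_id`, `text`, `timestamp`) are guesses, since the slides do not list them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class OwnerAssertion:
    owner_id: str
    pidi_owned: str
    assertion: str             # "is", "is not" or "is complex"
    comment: str = ""          # assertion comment (notes field)
    asserted_by: str = ""      # name of individual
    timestamp: datetime = field(default_factory=_now)


@dataclass
class Comment:
    ipm_id: str                # hypothetical: which IPM added the note
    text: str
    timestamp: datetime = field(default_factory=_now)


@dataclass
class LinkRecord:
    link_id: str
    pidi_1: str
    pidi_2: str
    relationship: str          # "is", "is not" or "is complex"
    status: str                # Proposed, Authorised, Disputed or Inferred
    method: str                # "manual" or "automatic"
    timestamp: datetime = field(default_factory=_now)
    owner_assertions: List[OwnerAssertion] = field(default_factory=list)
    comments: List[Comment] = field(default_factory=list)

    def add_comment(self, ipm_id: str, text: str) -> None:
        """Attach a note from any IPM; relationship and status are untouched."""
        self.comments.append(Comment(ipm_id, text))


link = LinkRecord(
    link_id="L-0001",
    pidi_1="NSA:Brian Green",
    pidi_2="NSB:876X5",
    relationship="is",
    status="Proposed",
    method="manual",
    owner_assertions=[OwnerAssertion("IPM-A", "NSA:Brian Green", "is")],
)
link.add_comment("IPM-C", "Our file records the same date of birth.")
print(link.status, len(link.comments))  # Proposed 1
```

Keeping comments in their own list is what lets a third IPM contribute evidence without altering the owners' assertions.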

29 InterParty 29 The InterParty network
[Diagram labels: A, B, C; “Common metadata”; “InterParty Link records”; “Resolution Service”; Metadata User]

30 InterParty 30 To summarise...
A limited set of discrete metadata fields
The fields are mostly optional to allow for variable practices among IPMs
The system will be built on human judgements and interpretation of the metadata found in searches across the network
Adding links will make connections that will mutually enrich metadata in any two systems by associating the different strengths of the different systems
Adding links will be a simple procedure
– illustrated by the demonstrator

31 InterParty 31 Thank you

