Using OpenURL Activity Data
Activity Data Online Exchange Event
Sheila Fraser
2nd June 2011
2 What are the OpenURL Router Data?
–Learners, teachers and researchers in UK HE institutions seek journals and papers for academic use
–A paper may be available from many different service providers, so which is the appropriate copy for a user?
–The OpenURL Router directs the request to the appropriate institutional resolver
–Each redirect request is logged, providing a record of the article the user was attempting to find
–The Router also logs non-bibliographic metadata: lookup requests (registry searches) and preferred button image requests
[Diagram: Existing process … A request reaches the OpenURL Router, which sends a redirect request to the Institutional Resolvers and logs the request as Level 0 Data (Log)]
3 What are we doing in this project?
[Diagram: the existing process (request, redirect request and log request, producing Level 0 Data (Log)), extended with the processing steps below]
Survey institutions to enable opt-out
Process to Level 1:
–Exclude data from opted-out institutions
–Anonymise IP addresses
–Anonymise institution & remove button & lookup data that would identify institution
–Parse OpenURL request into constituent parts
Level 1 Data
Process to Level 2:
–Include only redirect to resolver requests
Level 2 Data
–Use for prototypes & services
Aim 1: Make this data available under open licence
Aim 2: Develop prototype service using this activity data
Aim 3: Explore including other institutions' data in aggregation
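The Level 0 to Level 2 processing on this slide could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the input record layout (`ip`, `institution`, `requestType`, `query`), the salt, the opt-out list and the sample values are all invented for the example, and a salted SHA-256 digest stands in for whatever anonymisation method the project really uses. It also assumes one reading of the slide, namely that button and lookup records lose their identifying detail at Level 1 and are dropped entirely at Level 2.

```python
import hashlib
from urllib.parse import parse_qs

SALT = "example-salt"    # hypothetical secret salt
OPTED_OUT = {"inst-b"}   # hypothetical opt-out list gathered by the survey


def anonymise(value: str) -> str:
    """Return a salted SHA-256 digest (shortened) in place of the raw value."""
    return hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()[:16]


def to_level1(level0: list) -> list:
    """Level 0 -> Level 1: apply opt-outs, anonymise, parse the OpenURL query."""
    level1 = []
    for rec in level0:
        if rec["institution"] in OPTED_OUT:
            continue  # exclude data from opted-out institutions
        out = {
            "logDate": rec["logDate"],
            "logTime": rec["logTime"],
            "encryptedUserIP": anonymise(rec["ip"]),
            "institutionResolverID": anonymise(rec["institution"]),
            "requestType": rec["requestType"],
        }
        if rec["requestType"] == "redirect":
            # Parse the OpenURL request into its constituent parts.
            # parse_qs returns a list per key; keep the first value only.
            parts = parse_qs(rec["query"])
            out.update({key: values[0] for key, values in parts.items()})
        # Button and lookup records carry no query detail forward, because
        # that detail could identify the institution.
        level1.append(out)
    return level1


def to_level2(level1: list) -> list:
    """Level 1 -> Level 2: include only redirect-to-resolver requests."""
    return [rec for rec in level1 if rec["requestType"] == "redirect"]


# Invented sample log entries (Level 0).
level0 = [
    {"logDate": "2011-04-01", "logTime": "09:15:02", "ip": "192.0.2.10",
     "institution": "inst-a", "requestType": "redirect",
     "query": "genre=article&issn=1234-5678&volume=12&spage=34"},
    {"logDate": "2011-04-01", "logTime": "09:15:40", "ip": "192.0.2.11",
     "institution": "inst-b", "requestType": "redirect",
     "query": "genre=article&issn=8765-4321"},
    {"logDate": "2011-04-01", "logTime": "09:16:05", "ip": "192.0.2.10",
     "institution": "inst-a", "requestType": "lookup",
     "query": "q=inst-a"},
]

level1 = to_level1(level0)
level2 = to_level2(level1)
print(len(level2), level2[0]["issn"], level2[0]["spage"])  # 1 1234-5678 34
```

Because the same salt is used for every record, requests from one address map to the same `encryptedUserIP`, which is what lets the value double as a session identifier.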
4 What's in the data set?
Log-specific data (based on OpenURL Router log entries):
–logDate (Date the record was logged)
–logTime (Time the record was logged in format HH:MM:SS)
–encryptedUserIP (Anonymised IP address/session identifier)
Request-specific data (based on the OpenURL standard):
–institutionResolverID (Anonymised institutional identifier)
–routerRedirectIdentifier (Redirect identifier passed as part of the URL)
–aulast (First author's last name)
–aufirst (First author's first name)
–auinit (First author's first and middle initials)
–auinit1 (First author's first initial)
–auinitm (First author's middle initial)
–au (Full name of a single author)
–aucorp (Organization or corporation that is the author or creator of the document)
–atitle (Article title)
–title (Journal title, for compatibility with version 0.1)
–jtitle (Journal title)
–stitle (Short journal title)
–date (Date of publication)
–ssn (Season (chronology). Legitimate values are spring, summer, fall, winter)
–quarter (Quarter (chronology). Legitimate values are 1, 2, 3, 4)
–volume (Volume designation, usually expressed as a number but could be non-numeric)
–part (Part can be a special subdivision of a volume or it can be the highest level division of the journal. Parts are often designated with letters or names)
–issue (Designation of published issue of a journal)
–spage (First page number. Pages are not always numeric)
–epage (Second (ending) page number)
–pages (Start and end pages)
–artnum (Article number assigned by the publisher)
–issn (International Standard Serial Number)
–eissn (ISSN for electronic version of the journal)
–isbn (International Standard Book Number)
–coden (Alphanumeric bibliographic code)
–sici (Serial Item Contribution Identifier)
–genre
–btitle (The title of the book; can also be expressed as title)
–place (Place of publication)
–pub (Publisher name)
–edition (Statement of the edition of the book)
–tpages (Total pages)
–series (The title of a series in which the book or document was issued)
–doi (Digital Object Identifier)
–sid (Service ID, the provider of the item (journal, article, etc.))
Further details:
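As an illustration of how the fields listed on this slide overlap, the sketch below picks one value from each group of alternative keys (journal title, author, pages) to build a short description of a journal-article record. The helper names, the order of preference and the sample record are assumptions made for the example; they are not defined by the data set or by the OpenURL standard.

```python
def journal_title(rec: dict) -> str:
    """Prefer jtitle, then the version 0.1 'title' key, then the short title.

    Assumes a journal record: for books, 'title' can hold the book title.
    """
    for key in ("jtitle", "title", "stitle"):
        if rec.get(key):
            return rec[key]
    return ""


def author_name(rec: dict) -> str:
    """Use 'au' if present, otherwise combine aulast with a first name or initials."""
    if rec.get("au"):
        return rec["au"]
    last = rec.get("aulast", "")
    given = rec.get("aufirst") or rec.get("auinit") or rec.get("auinit1") or ""
    return ", ".join(part for part in (last, given) if part)


def page_range(rec: dict) -> str:
    """Use 'pages' if present, otherwise build a range from spage and epage."""
    if rec.get("pages"):
        return rec["pages"]
    start, end = rec.get("spage", ""), rec.get("epage", "")
    return f"{start}-{end}" if start and end else start


def describe(rec: dict) -> str:
    """Join whichever descriptive parts are available, skipping empty ones."""
    parts = [author_name(rec), rec.get("atitle", ""), journal_title(rec),
             rec.get("volume", ""), page_range(rec), rec.get("date", "")]
    return "; ".join(part for part in parts if part)


# Invented sample record using keys from the list above.
record = {"aulast": "Smith", "auinit": "J", "atitle": "An example article",
          "title": "Example Journal", "volume": "12",
          "spage": "34", "epage": "41", "date": "2010"}

print(describe(record))
# Smith, J; An example article; Example Journal; 12; 34-41; 2010
```

Values are kept as strings throughout, since the slide notes that volumes and page numbers are not always numeric.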
5 How might the data be used?
–Article/journal recommendations
–Student analysis
–Research thesis
–Publishers comparing listings with texts sought
–Identifying priorities for eJournal preservation
–Innovative services to meet your users' needs
–Other, unanticipated uses
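To make the first of these uses concrete, here is one possible and deliberately simple approach to journal recommendations: count how often two journals were requested in the same anonymised session, then suggest the journals most often seen alongside a given one. This is only a sketch of a co-occurrence technique, not the prototype service the project describes; the function names and the sample records (including the ISSNs) are invented, and it assumes records shaped like the Level 2 data with `encryptedUserIP` and `issn` keys.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(records: list) -> dict:
    """Count, for each pair of ISSNs, the sessions in which both were requested."""
    by_session = defaultdict(set)
    for rec in records:
        if rec.get("issn"):
            by_session[rec["encryptedUserIP"]].add(rec["issn"])

    counts = defaultdict(Counter)
    for issns in by_session.values():
        for a, b in combinations(sorted(issns), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts: dict, issn: str, n: int = 3) -> list:
    """Return up to n ISSNs most often requested in the same sessions as issn."""
    return [other for other, _ in counts.get(issn, Counter()).most_common(n)]


# Invented sample records: three sessions requesting journals by ISSN.
records = [
    {"encryptedUserIP": "s1", "issn": "1111-1111"},
    {"encryptedUserIP": "s1", "issn": "2222-2222"},
    {"encryptedUserIP": "s2", "issn": "1111-1111"},
    {"encryptedUserIP": "s2", "issn": "2222-2222"},
    {"encryptedUserIP": "s2", "issn": "3333-3333"},
    {"encryptedUserIP": "s3", "issn": "1111-1111"},
    {"encryptedUserIP": "s3", "issn": "2222-2222"},
]

counts = build_cooccurrence(records)
print(recommend(counts, "1111-1111"))  # ['2222-2222', '3333-3333']
```

A real service would need to decide what counts as a session (for example, limiting by `logDate` and `logTime`), since one anonymised address may cover many users.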
6 Explore including other institutions' data
Can it be aggregated?
–Data compatibility (OpenURL standard)
What are the issues?
–Legal: the DPA (Data Protection Act) means personal data cannot be shared without permission
–Technical: Can we / how do we extract resolver data at the same level of detail? How to identify duplicates? Regular sharing & maintainability?
–Financial: What potential effort could be involved?
What other issues are there?