1 CrossRef User Group Meeting
Crystal Gateway Marriott, Arlington, VA
June 6, 2006

Chuck Koscher, Technology Director, CrossRef (ckoscher@CrossRef.org)
Ed Pentz, Executive Director, CrossRef (epentz@CrossRef.org)

2 Agenda *
2:30 - 3:00  CrossRef Overview & Update - Ed Pentz, Executive Director
3:00 - 3:20  System Update - Chuck Koscher, Technology Director
3:20 - 3:45  Data Quality Initiative - Chuck Koscher
3:45 - 4:00  Services: Forward Linking & Simple Text Query - Ed Pentz
4:00 - 4:15  Break
4:15 - 4:30  Multiple Resolution Pilot - Ed Pentz
4:30 - 5:15  CrossRef Web Services
             Program overview & status - Ed Pentz
             OpenURL, RSS, & OAI-PMH interfaces - Chuck Koscher
5:15 - ?     Questions
* Please feel free to open discussion at any point on any topic.

3 System Update (status)
System performance has remained acceptable, but load has increased.
- Query response times remain under 1 second (typically 300-500 ms)
- Stored query cycle time is ~4 weeks
- Deposit times (number of deposits, share of the month's total):

  Deposit time        April           May
  Less than 5 min     47285 (57 %)    40255 (46 %)
  Less than 1 hr      14194 (17 %)    23384 (26 %)
  Less than 6 hr      16616 (20 %)    16867 (19 %)
  Less than 12 hr       926 (1 %)      4903 (5 %)
  Less than 18 hr      1301 (1 %)       651 (0 %)
  Less than 24 hr       895 (1 %)        13 (0 %)
  More than 24 hr      1350 (1 %)       446 (0 %)

Upgrades:
- Move the Oracle DB to a 4-CPU dual-core x86 machine (from Sun SPARC)
- Postpone the major re-architecture project until the effects of the new machine are known
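
A breakdown like the deposit-time table can be derived from raw processing durations. The following is a minimal sketch, not CrossRef's reporting code: the function names are hypothetical, durations are assumed to be in minutes, and the percentages are rounded down, which is an assumption about how the slide's figures were produced.

```python
from bisect import bisect_right
from collections import Counter

# Upper bounds of the buckets in minutes, in the order used on the slide.
BOUNDS_MIN = [5, 60, 6 * 60, 12 * 60, 18 * 60, 24 * 60]
LABELS = [
    "Less than 5 min", "Less than 1 hr", "Less than 6 hr",
    "Less than 12 hr", "Less than 18 hr", "Less than 24 hr",
    "More than 24 hr",
]


def bucket(minutes):
    """Return the label of the bucket a deposit duration falls into."""
    # A duration equal to a bound goes into the next bucket,
    # because every bound is a strict "less than".
    return LABELS[bisect_right(BOUNDS_MIN, minutes)]


def breakdown(durations_min):
    """Return (label, count, integer percent) rows for a list of durations."""
    counts = Counter(bucket(m) for m in durations_min)
    total = sum(counts.values())
    return [
        (label, counts[label], 100 * counts[label] // total if total else 0)
        for label in LABELS
    ]
```

Calling `breakdown` on one month's durations yields one column of the table; running it per month gives the April and May columns side by side.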

4 System Update (new features)
As of April we have been accepting deposits for extended content types:
- Technical reports / working papers
- Dissertations / theses
- Standards
- Soon we will add support for deposit of DOIs for database records
New query result format: UNIXREF
- Returns the exact data the publisher deposited for the individual DOI
- Returned to the DOI's owner
  http://doi.crossref.org/servlet/query?usr=X&pwd=X&format=unixref&qdata=|Journal of Neuroscience Research|Chen|66|4|612|2001|||
  format=xsd_xml
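
The example query URL can be assembled programmatically. This is a minimal sketch: the endpoint and the `usr`, `pwd`, `format`, and `qdata` parameters come from the slide, while the function names and the meaning assigned to each piped field (journal title, author, volume, issue, page, year) are assumptions inferred from the example values.

```python
from urllib.parse import quote, urlencode

QUERY_ENDPOINT = "http://doi.crossref.org/servlet/query"  # from the slide


def build_piped_query(journal, author, volume, issue, page, year):
    """Build a piped qdata string shaped like the slide's example.

    The leading pipe and the trailing "|||" leave the remaining
    fields empty, exactly as in the example.
    """
    return "|" + "|".join([journal, author, volume, issue, page, year]) + "|||"


def build_query_url(usr, pwd, qdata, fmt="unixref"):
    """Return a percent-encoded query URL for the given piped query."""
    params = {"usr": usr, "pwd": pwd, "format": fmt, "qdata": qdata}
    # quote (not quote_plus) encodes spaces as %20 and pipes as %7C.
    return QUERY_ENDPOINT + "?" + urlencode(params, quote_via=quote)


qdata = build_piped_query(
    "Journal of Neuroscience Research", "Chen", "66", "4", "612", "2001"
)
url = build_query_url("X", "X", qdata)
```

Passing `fmt="xsd_xml"` requests the other format value shown on the slide.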

5 System Update (new features)
Improved queue management
- Deposits and batch queries are uploaded as files to CrossRef. They are then processed out of a single queue of jobs.
- We run up to 12 processors (SPs) to work on these jobs.
- Processors are controlled by limiting file size, specifying a user, or excluding a user:

  Processors   File size limit
  1-2 SPs      < 20,000
  2-3 SPs      < 50,000
  2-3 SPs      < 200,000
  2-3 SPs      no size limit

- If you are going to submit a large number of back files or other significant volumes, please contact us to discuss creating a special user name for this activity.
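
The queue policy described above (one shared queue, processors restricted by file size and by user) can be sketched as follows. All class and function names are hypothetical, the unit of `size` is whatever unit the slide's limits use (the slide does not say), and this is an illustration of the idea rather than CrossRef's scheduler.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class Job:
    user: str
    size: int  # same (unstated) unit as the size limits on the slide


@dataclass
class Processor:
    name: str
    max_size: Optional[int] = None           # None means no size limit
    only_users: FrozenSet[str] = frozenset()     # if non-empty, serve only these
    exclude_users: FrozenSet[str] = frozenset()  # never serve these users

    def accepts(self, job):
        """Return True if this processor is allowed to run the job."""
        if self.max_size is not None and job.size >= self.max_size:
            return False
        if self.only_users and job.user not in self.only_users:
            return False
        return job.user not in self.exclude_users


def next_job(queue, processor):
    """Remove and return the oldest queued job the processor accepts."""
    for index, job in enumerate(queue):
        if processor.accepts(job):
            return queue.pop(index)
    return None  # nothing suitable is waiting
```

With this arrangement a processor limited to small files skips past a large backfile at the head of the queue, so routine deposits are not stuck behind it, and a dedicated user name can be pinned to its own processor through `only_users`.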

6 Data Quality Initiative
Metadata quality: initially CrossRef was not intended to be a metadata distribution service; metadata simply had to be good enough to match DOIs. Now (primarily due to forward linking) CrossRef metadata is displayed on publishers' web sites.
Focus areas:
1) publication title & ISSN accuracy
2) complete metadata record
3) author name accuracy
Tactic: make publishers aware of their data quality.
Link persistence: an unacceptable number of DOIs no longer work! This is primarily due to journals that move between publishers: the old publisher abandons its DOIs and the new publisher assigns new ones.
Tactic: ping-test all journals on a regular basis, notify publishers, and publicize the results.
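
A ping test of this kind can be sketched as a loop that resolves each DOI and records the ones that fail. This is a minimal illustration under stated assumptions: the function names are hypothetical, `https://doi.org/` is assumed as the resolver, and a HEAD request is assumed to be enough (some publisher sites may only answer GET correctly).

```python
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import Request, urlopen


def http_status(url, timeout=10):
    """Return the HTTP status after following redirects, or None on failure."""
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code   # the server answered with an error status
    except OSError:
        return None       # DNS failure, refused connection, timeout, ...


def ping_dois(dois, fetch_status=http_status, resolver="https://doi.org/"):
    """Return (doi, status) pairs for DOIs that do not resolve cleanly."""
    broken = []
    for doi in dois:
        status = fetch_status(resolver + quote(doi, safe="/"))
        if status is None or status >= 400:
            broken.append((doi, status))
    return broken
```

The `fetch_status` parameter lets the check run against a stub in tests; in a real run the default performs one request per DOI, so sampling a few DOIs per journal keeps the load reasonable.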

9
06:53:04 - Missing Conflict Checker for 6-JUN-2006 started.
Submission contained email: firstname.lastname@example.org and journalciteids: 2040
================================
Checking DOIs in file: 2040.xml
10.1016/S0169-5150(02)00072-5 : 10.1111/j.1574-0862.2002.tb00125.x : Titles match(The dynamics of land-cover change in western Honduras: exploring spatial and temporal complexity)
10.1016/S0169-5150(02)00073-7 : 10.1111/j.1574-0862.2002.tb00124.x : Titles match(Land use dynamics in the central highlands of Vietnam: a spatial model combining village survey data with satellite imagery interpretation)
10.1016/S0169-5150(02)00074-9 : 10.1111/j.1574-0862.2002.tb00123.x : Titles match(Temporal and spatial modelling of tropical deforestation: a survival analysis linking satellite and household survey data)
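
The report pairs DOIs whose titles match, which suggests a comparison on normalized titles. How the actual checker compares records is not stated on the slide, so the following is only a sketch of one plausible approach, with hypothetical function names.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lower-case a title and collapse punctuation and spacing differences."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def find_title_conflicts(records):
    """Group DOIs that share a normalized title.

    records: iterable of (doi, title) pairs.
    Returns a list of sorted DOI lists, one per title held by several DOIs.
    """
    by_title = defaultdict(list)
    for doi, title in records:
        by_title[normalize_title(title)].append(doi)
    return [sorted(dois) for dois in by_title.values() if len(dois) > 1]


records = [
    ("10.1016/S0169-5150(02)00072-5",
     "The dynamics of land- cover change in western Honduras: "
     "exploring spatial and temporal complexity"),
    ("10.1111/j.1574-0862.2002.tb00125.x",
     "The dynamics of land-cover change in western Honduras: "
     "exploring spatial and temporal complexity"),
    ("10.1000/unrelated", "An unrelated article"),  # hypothetical record
]
conflicts = find_title_conflicts(records)
```

Normalizing before comparing means small differences such as "land- cover" versus "land-cover" do not hide a duplicate.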

10 CrossRef Web Services: Interfaces
OAI-PMH
Current practice: local hosters (members or affiliates) receive one of three forms of XML data for use in internal linking systems and data clean-up. No redisplay is permitted. No citation data is made available.
The OAI-PMH interface will support all user types and all data formats (replacing the current local hoster methods). It will allow selective delivery based on publisher opt-in/opt-out profile and on user agreement (available Q3 2006).
- HTTP-based protocol
- Defined set of verbs: Identify, GetRecord, ListIdentifiers, ListMetadataFormats, ListRecords, ListSets
  http://www.crossref.org/OAI?verb=ListSets
- IP authentication (OAI-PMH standard method)
- Username authentication (added to allow each CrossRef publisher to retrieve their own data)
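
An OAI-PMH request is a plain HTTP URL carrying a `verb` plus verb-specific arguments, and the response is XML in the OAI-PMH 2.0 namespace. The sketch below builds request URLs against the endpoint shown on the slide and parses a `ListSets` response; the function names and the sample set in the XML are hypothetical, and the interface itself was still being planned when the slide was written.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_ENDPOINT = "http://www.crossref.org/OAI"  # from the slide
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}
VERBS = {
    "Identify", "GetRecord", "ListIdentifiers",
    "ListMetadataFormats", "ListRecords", "ListSets",
}


def oai_url(verb, **arguments):
    """Return the request URL for an OAI-PMH verb and its arguments."""
    if verb not in VERBS:
        raise ValueError("unknown OAI-PMH verb: " + verb)
    return OAI_ENDPOINT + "?" + urlencode({"verb": verb, **arguments})


def parse_list_sets(xml_text):
    """Return (setSpec, setName) pairs from a ListSets response."""
    root = ET.fromstring(xml_text)
    return [
        (s.findtext("oai:setSpec", namespaces=OAI_NS),
         s.findtext("oai:setName", namespaces=OAI_NS))
        for s in root.iterfind(".//oai:set", OAI_NS)
    ]


# Hypothetical response body, for illustration only.
SAMPLE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListSets>'
    "<set><setSpec>J</setSpec><setName>Journals</setName></set>"
    "</ListSets></OAI-PMH>"
)
sets = parse_list_sets(SAMPLE)
```

Selective harvesting then becomes a `ListRecords` request with a `set` argument, alongside the `metadataPrefix` argument the protocol requires.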

11 CrossRef Web Services: Interfaces
OpenURL
CrossRef currently operates a NISO Z39.88-2004 compliant resolver at http://www.crossref.org/openurl
It allows public query and resolution services (no login required):
  http://www.crossref.org/openurl?url_ver=Z39.88-2004&rft_id=info:doi/10.1103/PhysRev.47.777
  http://www.crossref.org/openurl?aulast=Maas%20LRM&title=JOURNAL%20OF%20PHYSICAL%20OCEANOGRAPHY&volume=32&issue=3&spage=870&date=2002
  http://www.crossref.org/openurl?id=doi:10.1103/PhysRev.47.777&noredirect=true
  http://www.crossref.org/openurl?issn=03770273&aulast=Walker&volume=54&spage=117&date=1983&noredirect
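
The example URLs follow two patterns: resolve a known DOI, or look one up from citation fields. A minimal sketch that reproduces both patterns is shown here; the endpoint and parameter names (`id`, `noredirect`, `aulast`, `title`, `volume`, `issue`, `spage`, `date`) are taken from the slide, and the function names are hypothetical.

```python
from urllib.parse import quote, urlencode

OPENURL_ENDPOINT = "http://www.crossref.org/openurl"  # from the slide


def openurl_for_doi(doi, noredirect=False):
    """Return an OpenURL that resolves (or, with noredirect, looks up) a DOI."""
    params = {"id": "doi:" + doi}
    if noredirect:
        params["noredirect"] = "true"
    # Keep ":" and "/" readable, as in the slide's examples.
    return OPENURL_ENDPOINT + "?" + urlencode(params, safe=":/")


def openurl_for_citation(**fields):
    """Return an OpenURL query built from citation fields."""
    # quote encodes spaces as %20, matching the slide's examples.
    return OPENURL_ENDPOINT + "?" + urlencode(fields, quote_via=quote)


doi_url = openurl_for_doi("10.1103/PhysRev.47.777", noredirect=True)
citation_url = openurl_for_citation(
    aulast="Maas LRM", title="JOURNAL OF PHYSICAL OCEANOGRAPHY",
    volume="32", issue="3", spage="870", date="2002",
)
```

Both helpers only build the URL; fetching it is left to the caller, since no login is required for these public queries.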

12 CrossRef Web Services: Interfaces
RSS
RSS has been discussed as a method for distributing metadata.
- Possibly a daily feed listing new DOIs (an alerting service?)
Challenges:
- RSS does not easily support large bulk distributions (it is great for daily changes and newsy content)
- It does not have integral support for discovery (if you want only a subset of the data or don't know exactly what is available). An RSS feed is just a URL to an XML file:
  http://server.com/my_rss.xml
- What is the real business value of a CrossRef metadata feed using RSS (is it complementary to or in conflict with publisher feeds)?
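
Because an RSS feed is a single XML file, a client reads the whole file and picks out the items. The sketch below parses an RSS 2.0 document with the standard library; the feed content and the idea of putting a DOI in each item's title are hypothetical, since no CrossRef feed format is defined on the slide.

```python
import xml.etree.ElementTree as ET


def read_feed_items(rss_text):
    """Return a list of {"title", "link"} dicts from an RSS 2.0 document."""
    channel = ET.fromstring(rss_text).find("channel")
    if channel is None:
        return []
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in channel.iterfind("item")
    ]


# Hypothetical daily feed of newly registered DOIs.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>New DOIs</title>
<item><title>10.1000/example.1</title>
<link>http://dx.doi.org/10.1000/example.1</link></item>
<item><title>10.1000/example.2</title>
<link>http://dx.doi.org/10.1000/example.2</link></item>
</channel></rss>"""

items = read_feed_items(SAMPLE_FEED)
```

The client has no way to ask the feed for a subset or a date range, which is the discovery limitation noted above and the gap that OAI-PMH verbs such as ListSets are meant to fill.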