OCoLR 20041025 #53928015 OCLCR Making data work harder Lorcan Dempsey OCLC OVGTSL 2005 Conference Newark, May 11-13.

1 Making data work harder
Lorcan Dempsey, OCLC
OVGTSL 2005 Conference, Newark, May 11-13

2 Overview
• Some context
• Looking at data in action
  - OpenWorldCat
  - FRBR
  - Data mining

3 Context: value
• Amazoogle: what should we be doing that fits into a world they occupy? Where do we provide unique value?
• ROI: libraries invest in data but do not extract as much value from it as they might. Unless we release more value, the argument for this investment becomes weaker.
• User: how do we co-create value with users? What opportunities are there for mixing catalog data and user-contributed data?
• Management intelligence: how do we use data better to inform management decisions?

4 Context: consequences
• The role of the catalog?
• The role of structured data?
• The role of the library?

6 Data
• Open WorldCat
• FRBR
• WorldCat Wiki
• Management intelligence

11 FRBR
• ‘Interim FRBR’ in OWC
• FRBR in research projects
  - FictionFinder
  - Curioser
  - xISBN
  - Algorithm
  - Top 1000
• FRBR in FirstSearch (late this year)

18 Top Sets for Fiction (Records)
Records  Key
1,296    defoe, daniel\1661 1731/robinson crusoe
1,267    carroll, lewis\1832 1898/alices adventures in wonderland
971      cervantes saavedra, miguel de\1547 1616/don quixote
828      stevenson, robert louis\1850 1894/treasure island
689      twain, mark\1835 1910/adventures of huckleberry finn
624      twain, mark\1835 1910/adventures of tom sawyer
618      swift, jonathan\1667 1745/gullivers travels

19 Top Sets for Fiction (Holdings)
Holdings  Key
29,043    twain, mark\1835 1910/adventures of huckleberry finn
26,088    carroll, lewis\1832 1898/alices adventures in wonderland
20,843    twain, mark\1835 1910/adventures of tom sawyer
19,410    defoe, daniel\1661 1731/robinson crusoe
18,566    cervantes saavedra, miguel de\1547 1616/don quixote
18,492    stevenson, robert louis\1850 1894/treasure island
18,123    dickens, charles\1812 1870/christmas carol
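The keys in the two "Top Sets" slides look like normalized author/dates/title strings. The following is a minimal sketch of how such keys could be built and used to rank work sets by record count and by holdings count. It is an illustration only, not the OCLC work-set algorithm: the normalization rules, the function names, the record fields, and the holdings figures in the sample are all assumptions.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase, turn hyphens into spaces, drop other punctuation except commas."""
    text = text.lower().replace("-", " ")
    text = re.sub(r"[^a-z0-9, ]", "", text)
    return re.sub(r"\s+", " ", text).strip()


def work_key(author, dates, title):
    """Build an author\\dates/title key in the shape shown on the slides (assumed rules)."""
    return f"{normalize(author)}\\{normalize(dates)}/{normalize(title)}"


def top_sets(records):
    """Return (by_records, by_holdings): work keys ranked by each count."""
    record_counts = Counter()
    holding_counts = Counter()
    for rec in records:
        key = work_key(rec["author"], rec["dates"], rec["title"])
        record_counts[key] += 1                  # one bibliographic record in this set
        holding_counts[key] += rec["holdings"]   # holdings attached to that record
    return record_counts.most_common(), holding_counts.most_common()


# Hypothetical sample records; the holdings figures are made up.
records = [
    {"author": "Twain, Mark", "dates": "1835-1910",
     "title": "Adventures of Huckleberry Finn", "holdings": 40},
    {"author": "Defoe, Daniel", "dates": "1661-1731",
     "title": "Robinson Crusoe", "holdings": 5},
    {"author": "Defoe, Daniel", "dates": "1661-1731",
     "title": "Robinson Crusoe.", "holdings": 7},
    {"author": "Carroll, Lewis", "dates": "1832-1898",
     "title": "Alice's Adventures in Wonderland", "holdings": 30},
]
by_records, by_holdings = top_sets(records)
```

In this toy data the two Defoe records collapse into one set, so it leads the record ranking while the Twain set leads the holdings ranking, which mirrors the difference between the two slides.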

20 Taking FRBR onto the open web
• Curio(u)ser

24 MetaWiki
• WIKI: web pages
• metaWIKI: data
• Capture user input in structured ways

25 Extending Wiki’s utility
Wiki:
• supported markup: wikitext
• page editing: a single text block
• searches: full text searching
• collections managed: one per wiki
MetaWiki:
• supported markup: wikitext; structured data (e.g., MARC, METS, DC…)
• page editing: a single text block, or field level
• searches: full text searching; fielded searching
• collections managed: one/multiple per OaiWiki

26 Lorcan: note that this is a work in progress

27 Management intelligence
• So we have all this data: what can it tell us?
• Several projects underway: only some discussed here

28 Making Data Work Harder
• Activities “shed” data:
  - Cataloging → bibliographic information
  - Web site traffic → transaction logs
  - Reference queries → search term lists
• Need to mine this data for intelligence that creates value for libraries and users
• OCLC Research is undertaking a number of data-mining projects aimed at:
  - Knowing more about the characteristics of library collections
  - Creating interesting and useful data displays
  - Generating intelligence to support library decision-making

29 Data mining
• OCLC has a new collection analysis service
• Some research projects looking at systemic questions are described here

30 Looking at Library Print Book Collections … Systematically
• 32 million print books, representing 26 million distinct works
• Half of print books published after 1977; more than 80% still “in copyright”
• Rareness is common! Only a third of print books have more than five holdings; half have two or fewer
• OCLC/Ithaka collaboration: use WorldCat to characterize the “system-wide” print book collection, i.e., aggregate print book holdings in WorldCat
• Intelligence of this kind can help establish digitization priorities and inform preservation planning
• Only about 120,000 works had both print book and e-book manifestations
• More information: http://www.oclc.org/research/presentations/lavoie/cni2005.ppt
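The "rareness is common" figures are shares of records falling above or below a holdings threshold. A minimal sketch of that calculation, assuming the input is simply a list of holdings counts, one per print book record; the function name and the sample numbers are hypothetical and are not WorldCat data.

```python
def holdings_profile(holding_counts):
    """Summarize how widely held a set of records is.

    holding_counts: one integer per record, the number of holdings attached.
    """
    total = len(holding_counts)
    if total == 0:
        raise ValueError("no records to summarize")
    more_than_five = sum(1 for n in holding_counts if n > 5)
    two_or_fewer = sum(1 for n in holding_counts if n <= 2)
    return {
        "records": total,
        "share_more_than_five": more_than_five / total,
        "share_two_or_fewer": two_or_fewer / total,
    }


# Made-up sample: six records with these holdings counts.
sample = [1, 1, 2, 3, 6, 40]
profile = holdings_profile(sample)
```

Run over a full holdings file, the same two shares would correspond to the "a third have more than five" and "half have two or fewer" statements on the slide.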

31 The Implications of GooglePrint …
• Potentially covers about one third of print books in WorldCat
• ~60 percent of “GooglePrint” books held by only one of the Google 5
• Less than 5 percent held by all of the Google 5
• ~20 percent of “GooglePrint” books out of copyright
• Paper forthcoming …

32 Know Your Audience!
• Holdings represent selection decisions by librarians … implies there are about 1 billion individual selection decisions in the WorldCat holdings file
• Selections are made to serve the interests of a library’s target community …
• Associate target community (audience level) to particular library profiles, e.g., ARL, non-ARL academic, public, K-12 school …
• Implies: we can infer materials’ audience level from holdings patterns, which in turn can support:
  - Collection management
  - Readers’ advisory services
  - Reference services
  - Information retrieval
• Paper forthcoming!
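One way to picture "inferring audience level from holdings patterns" is a holdings-weighted average over library profiles. The sketch below is only an assumption about how such a score could be computed; the slide does not give the method, and the per-profile scores and the function name are invented for illustration.

```python
# Hypothetical audience-level scores per library profile
# (0 = juvenile material, 1 = specialist research material).
PROFILE_SCORES = {
    "K-12 school": 0.1,
    "public": 0.3,
    "non-ARL academic": 0.6,
    "ARL": 0.9,
}


def audience_level(holdings_by_profile):
    """Holdings-weighted average of the profile scores for one title.

    holdings_by_profile: mapping of library profile -> number of holdings.
    Returns None when the title has no holdings.
    """
    total = sum(holdings_by_profile.values())
    if total == 0:
        return None
    weighted = sum(PROFILE_SCORES[p] * n for p, n in holdings_by_profile.items())
    return weighted / total


# Made-up examples: a title held mostly by schools and public libraries,
# and one held mostly by research libraries.
picture_book = audience_level({"K-12 school": 30, "public": 60, "ARL": 10})
monograph = audience_level({"ARL": 45, "non-ARL academic": 5})
```

Under these assumed scores the first title lands near the low end of the scale and the second near the high end, which is the kind of signal that could feed readers' advisory or collection management.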

33 “Last Copy”: Identifying At-Risk Materials
• ~23 million WorldCat records have only a single holding attached
• Libraries need to know what portions of their collections are:
  - Rare …
  - Rare and valuable …
  - “Last copy” (artifact and/or content)
• Identification of rare materials is essential intelligence in support of storage, digitization, and preservation decision-making
• Data-mining study of Vanderbilt holdings in WorldCat:
  - Identified 23,000 items held uniquely by Vanderbilt
  - ~60% are print books
  - ~60% produced prior to 1950; ~25% produced after 1970
• Paper forthcoming!
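Finding single-holding records and the items held uniquely by one institution is a simple filter over a holdings file. A minimal sketch, assuming holdings are available as a mapping from record identifier to the set of holding-library symbols; the data structure, function names, record identifiers, and library symbols are hypothetical.

```python
def single_holding_records(holdings):
    """Return the ids of records with exactly one holding attached."""
    return {rec for rec, libs in holdings.items() if len(libs) == 1}


def uniquely_held_by(holdings, library):
    """Return the sorted ids of records held by `library` and by no one else."""
    return sorted(rec for rec, libs in holdings.items() if libs == {library})


# Made-up holdings file: record id -> set of library symbols.
holdings = {
    "rec1": {"AAA"},
    "rec2": {"AAA", "BBB"},
    "rec3": {"BBB"},
    "rec4": {"AAA"},
}
at_risk = single_holding_records(holdings)
unique_to_aaa = uniquely_held_by(holdings, "AAA")
```

The resulting list could then be broken down by format or publication date, as in the print-book and date percentages reported for the Vanderbilt study.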

34 Thank you!
OCLC Research: http://www.oclc.org/research/
Lorcan: http://orweblog.oclc.org/

