Data Mining the Largest Library Database in the World Roy Tennant OCLC Research Leveraging WorldCat.


1 Data Mining the Largest Library Database in the World Roy Tennant OCLC Research Leveraging WorldCat

2 Europe, Middle East & Africa Regional Council. Worldcat.org/identities/: Algorithmically constructed from WorldCat records

3 Viaf.org: A union database of authority records

4 The Responsible Party: Thom Hickey, Chief Scientist, OCLC Research

5 290+ million records

6 Language Coverage, 30 June 2012
60.2%
Total: 274 million
German: 36.5 million
French: 25.5 million
Spanish: 11.3 million
Italian: 4.7 million
Dutch: 4.3 million
Russian: 3.6 million
Latin: 3.5 million

7 Worldcat.org/identities/

8

9

10 (J.K. Rowling) (Diana Gabaldon) (Galileo)

11

12

13 Viaf.org

14 VIAF Participants

15

16 “Super” Authority File

17

18

19 Our Cataloging Future “Moving from cataloging to catalinking” Eric Miller, Zepheira

20

21 Some Lessons
- Widespread collaboration is essential
- Normalizing the data is essential
- Normalizing the data is complicated
- Everything is interrelated:
  – You can’t bring names together if titles don’t match
  – You can’t bring titles together if names don’t match
- Batch mode processing still rules (but we’re getting better and faster at it)
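To make the "everything is interrelated" lesson concrete, here is a minimal Python sketch of why names and titles have to be normalized and matched together. It is an illustration only, not the processing OCLC actually runs: the record layout (a dict with `name` and `titles`) and the helper names `normalize`, `name_key`, and `same_identity` are assumptions made for this example.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^\w\s]", " ", no_marks.lower())
    return " ".join(cleaned.split())


def name_key(name: str) -> str:
    """Order-independent key, so inverted and direct name forms agree."""
    return " ".join(sorted(normalize(name).split()))


def same_identity(rec_a: dict, rec_b: dict) -> bool:
    """Hypothetical rule: merge only if the names match AND a title is shared."""
    if name_key(rec_a["name"]) != name_key(rec_b["name"]):
        return False
    titles_a = {normalize(t) for t in rec_a["titles"]}
    titles_b = {normalize(t) for t in rec_b["titles"]}
    return bool(titles_a & titles_b)


# Made-up records for illustration only.
a = {"name": "Rowling, J. K.", "titles": ["Harry Potter and the Chamber of Secrets"]}
b = {"name": "J.K. Rowling", "titles": ["Harry Potter and the chamber of secrets."]}
c = {"name": "J. K. Rowling", "titles": ["Some Unrelated Book"]}

print(same_identity(a, b))  # names and a title agree after normalization
print(same_identity(a, c))  # names agree, but no shared title to confirm it
```

A real pipeline would need far more than this (dates, variant spellings, transliteration, fuzzy title matching), which is the sense in which normalization is both essential and complicated.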

22 Conclusions
- Data mining isn’t just useful, it’s essential
- Extracting data from MARC that is useful in other contexts is possible, but will require sophisticated processing
- Only very large organizations (e.g., OCLC, national libraries) have the data and resources to do this work
- Thankfully, we are doing it, but there is much more to be done
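As a rough illustration of what "extracting data from MARC" involves at the smallest scale, the sketch below pulls an author (field 100, subfield a) and a title (field 245, subfield a) out of a single MARCXML record using only the Python standard library. The sample record is made up for this example, and the helper names `subfield` and `extract` and the punctuation-trimming rules are assumptions, not a description of OCLC's processing.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Made-up sample record for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Gabaldon, Diana.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Outlander /</subfield>
    <subfield code="c">Diana Gabaldon.</subfield>
  </datafield>
</record>"""


def subfield(record, tag, code):
    """Return the text of the first matching subfield, or None if absent."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, MARC_NS)
    return node.text.strip() if node is not None and node.text else None


def extract(record_xml):
    """Extract a bare author and title, trimming trailing ISBD punctuation."""
    record = ET.fromstring(record_xml)
    author = subfield(record, "100", "a")
    title = subfield(record, "245", "a")
    return {
        # Naive trimming: this would also strip the final period of an
        # initial, one small reason real processing has to be more careful.
        "author": author.rstrip(" .,") if author else None,
        "title": title.rstrip(" /:") if title else None,
    }


print(extract(SAMPLE))
```

Even this toy case has to strip cataloging punctuation before the values are usable elsewhere; doing that reliably across hundreds of millions of records is where the sophisticated processing comes in.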

23 Roy Tennant
tennantr@oclc.org
@rtennant
roytennant.com

