Managing Libraries with Creative Data Mining: Learning to Use Your Library's Data Warehouse to Understand and Improve the Services You Provide. Ted Koppel.

Presentation transcript:

Managing Libraries with Creative Data Mining: Learning to Use Your Library's Data Warehouse to Understand and Improve the Services You Provide. Ted Koppel, The Library Corporation. Computers in Libraries 2005, Session B203, March 17, 2005

The Plan What is data mining and why is it useful? Who else does it? Does it make sense for libraries? Are libraries already doing data mining? What data can libraries mine? How much sophistication do I need?

What is Data Mining? Collection and analysis of one's own data in order to make better business decisions. More than simple data storage. Business intelligence technology for discerning unknown patterns from large databases. Uses statistics, artificial intelligence, various modeling techniques. Related to, but different from, bibliomining.

Value and Importance By identifying patterns and predicting future trends … –Make decisions based on facts, not guesswork –Develop sensible processes –Reduce costs or increase services by efficient use of resources Serve the customer better

High Level planning Remember -- GIGO. Define the data mining goals Data collection Data organization and normalization Analysis Reiteration

Who is Data Mining now? Manufacturing – process control. Banks and financial institutions – full service. Government and law – fraud, abuse. Sports – RHP versus LHB? Sucker for a curve ball? Service industries – almost all CRM systems. Retail: product stock and placement. Travel: airline overbooking. Las Vegas: guest tracking for comps and benefits. Groceries: affinity cards. Internet: GoogleAds.

Nuggets Found by Mining Chase Bank: minimum balance versus other bank business Home Depot hurricane planning WalMart (UK) diapers and beer (actually a hoax, but an informative one) Casino security in Las Vegas - fraud

Implementer Level Tools Oracle® Data Mining Suite Microsoft SQL Server 2000 SPSS and similar Statistica STATSOFT Open Source: –Cornell Univ. Himalaya Data Mining Tools –WEKA Waikato Environment for Knowledge Analysis (Univ. of Waikato, NZ)

Looking for the Dog that Doesn't Bark NORA – Non Obvious Relationship Awareness –Examines third-level and deeper relationships between datasets ANNA – Anonymized Data –Double-blind application/offshoot of NORA that deals with personal attributes anonymously

Vocabulary Lesson Bagging (averaging) Boosting (calculating predictive data) Drilling down Stacking (combining predictions from different models) Predictive mining (using X to predict Y) Data Models: –CRISP = Cross Industry Standard Process for DM –SEMMA = Sample, Explore, Modify, Model, Assess
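To make "bagging (averaging)" concrete, here is a minimal sketch, assuming the simplest possible model (the mean of a sample). It draws bootstrap resamples, fits the trivial model to each, and averages the results. The function name and the checkout figures are hypothetical, not from the presentation.

```python
import random
from statistics import mean

def bagged_estimate(values, n_models=50, seed=0):
    """Bagging with a trivial 'mean' model: average the means of bootstrap resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_models):
        # Bootstrap: draw len(values) items with replacement.
        sample = [rng.choice(values) for _ in values]
        estimates.append(mean(sample))
    # Aggregate the individual model outputs by averaging them.
    return mean(estimates)

# Hypothetical weekly checkout counts for one branch.
weekly_checkouts = [120, 135, 128, 150, 110, 142, 138]
print(round(bagged_estimate(weekly_checkouts), 1))
```

A real predictive model would replace the per-sample mean; the resample-then-average pattern stays the same.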

Value to Libraries as a Tool Citizens demand more/better service at a time of reduced funding. Anticipate USER behavior Anticipate STAFF behavior Service hours and staffing needs, facilities planning Collection development – anticipating customer needs

Do Libraries Use DM? Association of Research Libraries ARL Spec Kit 274 (2003) – Mento and Rapple –124 surveys, 65 responses –40% already doing some data mining –90% had plans Major areas of activity –Research and Collection Support –Administration –Repository management (future)

ARL Member Benefits Seen Serials cancellation projects Collection Development tuning Budget allocation by material use Workflow analysis Weeding OPAC and Web presence usability and redesign Hacking and break-in analysis (defensive data mining)

Other Library Data Mining Kun Shan University of Technology (Taiwan) –ABAMDM Model = Acquisition Budget Allocation Model based on Data Mining –More material use, more money –Compared: circulation, collection size, department size, # of courses, # students/faculty per department
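The "more material use, more money" idea can be sketched as a proportional split. This is not the ABAMDM model itself, which compares several variables; it is a minimal illustration assuming circulation is the only input. The function name and all figures are hypothetical.

```python
def allocate_budget(total, circulation_by_dept):
    """Split `total` across departments in proportion to circulation counts."""
    all_circs = sum(circulation_by_dept.values())
    if all_circs == 0:
        # No usage data at all: fall back to an even split.
        share = total / len(circulation_by_dept)
        return {dept: share for dept in circulation_by_dept}
    return {dept: total * circs / all_circs
            for dept, circs in circulation_by_dept.items()}

# Hypothetical figures, for illustration only.
circulation = {"History": 4000, "Biology": 2500, "Music": 1500}
allocation = allocate_budget(100_000, circulation)
for dept, amount in allocation.items():
    print(f"{dept}: {amount:.2f}")
```

Adding the other compared variables (collection size, department size, courses, headcount) would mean weighting several such shares rather than using circulation alone.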

Other Library Data Mining (2) OCLC's ACAS (Automated Collection Analysis System) (recently upgraded!) –Analyzes bibliographic records by call number ranges (LC 4-digit, Dewey tens for example) –Subdivides by years and aggregated years –Subdivides by branch / collection –Collection conspectus as a way to: Compare library collections Identify collection deficiencies
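Grouping records by call number range and aggregated years can be sketched as below. This is not ACAS code; it assumes Dewey call numbers that start with the class number and groups publication years by decade. The function names and sample records are hypothetical.

```python
from collections import Counter

def dewey_tens(call_number):
    """Return the Dewey 'tens' range label, e.g. '813.54 KIN' -> '810-819'."""
    base = int(float(call_number.split()[0])) // 10 * 10
    return f"{base:03d}-{base + 9:03d}"

def collection_profile(records):
    """Count titles per (Dewey tens range, publication decade)."""
    counts = Counter()
    for call_number, pub_year in records:
        decade = pub_year // 10 * 10
        counts[(dewey_tens(call_number), decade)] += 1
    return counts

# Hypothetical (call number, publication year) pairs.
records = [
    ("813.54 KIN", 1986),
    ("813.6 SMI", 2003),
    ("004.6 TAN", 2002),
    ("025.04 BAT", 2004),
]
profile = collection_profile(records)
for (call_range, decade), n in sorted(profile.items()):
    print(call_range, decade, n)
```

Comparing two libraries' profiles range by range is one way to surface the collection deficiencies the slide mentions.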

Other Library Data Mining (3) Univ. of Florida with FCLA –Decision Support System for acquisitions activities –Extracted from NOTIS bib files; saved to DB2 –Screen scraped Acq files –Created large database of bib and in-process records which allowed querying: Circ history of approval versus firm orders? $ spent on titles that never circulate? Do originally-cataloged items circulate? More or less than copy-cataloged items? How many items circulate more than n times? –Assesses collection development and tech service activity
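Questions like these map directly onto SQL once bib and acquisitions data sit in one table. The sketch below uses an in-memory SQLite database instead of DB2, with a hypothetical `items` table and made-up rows; it does not reproduce the Florida system's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (title TEXT, order_type TEXT, cost REAL, circs INTEGER)"
)
# Hypothetical rows: order_type is 'approval' or 'firm'.
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", [
    ("Title A", "approval", 40.0, 0),
    ("Title B", "approval", 35.0, 6),
    ("Title C", "firm", 60.0, 12),
    ("Title D", "firm", 25.0, 0),
    ("Title E", "firm", 30.0, 3),
])

# Circ history of approval versus firm orders: average circs per order type.
avg_circs = dict(conn.execute(
    "SELECT order_type, AVG(circs) FROM items GROUP BY order_type"
))

# $ spent on titles that never circulate.
never_circulated_cost = conn.execute(
    "SELECT SUM(cost) FROM items WHERE circs = 0"
).fetchone()[0]

# How many items circulate more than n times?
n = 5
heavy_use = conn.execute(
    "SELECT COUNT(*) FROM items WHERE circs > ?", (n,)
).fetchone()[0]

print(avg_circs, never_circulated_cost, heavy_use)
```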

Libraries are fountains of data

Everything is countable (example: circulation transaction)
Book: branch location, media type, pubdate, size, color, thickness, #circs, cost, vendor, holds
Extractable: census tract, curriculum, holds, circ history, repairs
User: age, location, language, sex, zipcode, phone#, school, loan history, delinquencies
Multiply this by 10 million times a year!
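One way to picture a single circulation transaction as analyzable data is a record that joins book and user attributes. The sketch below uses a small subset of the attributes listed above. The class and field names are hypothetical and do not come from any particular ILS.

```python
from dataclasses import dataclass, asdict

@dataclass
class Book:
    branch: str
    media_type: str
    pub_year: int
    cost: float
    vendor: str

@dataclass
class User:
    age: int
    zipcode: str
    language: str

@dataclass
class CircTransaction:
    book: Book
    user: User

    def flat(self):
        """Flatten to one row of prefixed columns, ready for counting or export."""
        row = {f"book_{k}": v for k, v in asdict(self.book).items()}
        row.update({f"user_{k}": v for k, v in asdict(self.user).items()})
        return row

# Hypothetical transaction.
t = CircTransaction(
    Book("Main", "DVD", 2004, 19.95, "Vendor X"),
    User(34, "37201", "en"),
)
print(t.flat())
```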

Expand to: Acquisitions information (book attributes, vendor history and performance, fund history, requester and department, etc.) OPAC searching and navigation (databases, searches, not founds) Metasearch usage (databases, usage) Reference desk interactions (who, what, how long?). VRD by extension Resource sharing (NCIP, ILL) In-house usage transactions Physical plant: elevator, restroom, copier use

Crunch (Data) Creatively Unlikely variables give interesting data Ideas: –Sex of user versus color of book –Call # range vs. age of item vs. circulation ratio by avg. $ paid per item –Story hour attendance vs. Adult circ vs. Fines collected –Best sellers cost vs. Trade books by cost per circ –Etc.
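Cost per circ, used in the best sellers versus trade books comparison, is total spend divided by total circulations for a group of items. A minimal sketch, with hypothetical categories and figures:

```python
from collections import defaultdict

def cost_per_circ(items):
    """Return {category: total cost / total circs}; None when nothing circulated."""
    totals = defaultdict(lambda: [0.0, 0])  # category -> [cost, circs]
    for category, cost, circs in items:
        totals[category][0] += cost
        totals[category][1] += circs
    return {
        category: (cost / circs if circs else None)
        for category, (cost, circs) in totals.items()
    }

# Hypothetical (category, cost, circs) rows.
items = [
    ("bestseller", 25.0, 50),
    ("bestseller", 25.0, 50),
    ("trade", 40.0, 8),
    ("reference", 90.0, 0),
]
ratios = cost_per_circ(items)
print(ratios)
```

Returning None for groups that never circulated keeps them visible in the report instead of failing on a division by zero.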

If you can count it, you can analyze it But remember - QUALITY and CONSISTENCY

Library Automation vendor for over 30 years Family-owned, customer focused LibrarySolution® LibrarySolution for Schools CARLSolution® CARLX

LibrarySolution Reports Utilizes ReportNet software Drag and Drop Report Design Completely Web-based Fitted to Library.Solution data framework Zero footprint on workstations Central reporting with enhanced distribution Multiple export formats Charts, tables, etc. Powerful

Using Library Data Outside the Library City, County, RCOG, State Planning and Development Authorities –Require solid statistics about population, educational level, etc. –Quality of Life and capital budget services planning Preserve user anonymity but share trends Input to GIS systems for real time projection of future library needs
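"Preserve user anonymity but share trends" can be approached by sharing only aggregate counts and withholding groups too small to hide an individual. The sketch below is one possible approach, not a method described in the presentation. The threshold of 5 and the sample data are assumptions.

```python
from collections import Counter

def trend_by_zip(transactions, min_count=5):
    """Count transactions per zip code, dropping patron IDs and small groups."""
    # Patron identifiers are discarded here; only the zip code is counted.
    counts = Counter(zip_code for _patron_id, zip_code in transactions)
    # Suppress cells below the (assumed) disclosure threshold.
    return {z: c for z, c in counts.items() if c >= min_count}

# Hypothetical (patron id, zip code) pairs.
transactions = [("p1", "37201")] * 6 + [("p2", "37215")] * 2
shareable = trend_by_zip(transactions)
print(shareable)
```

The resulting table carries no patron identifiers and could feed a planning agency or GIS layer.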

Applying GIS in the Library Market Library.Decision product Works with ILS vendors including TLC Focus collections development Strengthen advocacy planning; undertake cardholder development campaigns Support grant applications Site new facilities Calculate service indicators Evaluate service delivery in relation to the unique needs of your community

In closing … Libraries are producing data every minute of every day. You need: –Some tools –Some creativity –Some analytical ability Knowledge is Power!

Acknowledgements
Nicholson and Stanton, Gaining strategic advantage through bibliomining. At
Banerjee, Is Data Mining Right for your library? Computers in Libraries, Nov. 98
Kao, Chang, and Lin. Decision Support for the Academic Library…, Information Processing and Management 39 (2003)
Fabris. Advanced Navigation. CIO, May 1998
Library Administration and Management (journal), Winter 1996, section on Data Mining

Thank You Contact information Ted Koppel The Library Corporation (800)