Search to Discovery: Finding Global Scholarly Resources with Primo


1 Search to Discovery: Finding Global Scholarly Resources with Primo
Pascal Calarco & Alison Hitchens, Library
December 6, 2011

2 Agenda
- The state of search in libraries (Pascal)
- Expanding Primo beyond the local catalogue (Alison)
- Questions

3 Library Information Systems: Milestones
- Discovery
- Metasearch
- Citation linking
- ILS 3rd gen (client-server; 1990s)
- ILS 2nd gen (mainframe; 1980s)
- OCLC (library network; 1972)
- Early systems
- MARC

4 In the beginning, there was the card catalog (1901+)
- Indexes: subject, author, title
- Interfiled cards, call number access

5 Library of Congress National Union Catalog (pre-1956)

6 Henriette Avram, Developer of MARC
- Programmer/analyst at the Library of Congress
- Developed a system for printing card catalog information (MARC)
- MARC became an ISO standard in 1973

7 Later, there was the Online Public Access Catalog (OPAC)
- Machine Readable Cataloging (MARC)
- Inventory of the print/physical holdings of a library
- Better than the card catalog: keyword searching & Boolean functionality
- Non-intuitive; required training or intermediation (information professional)
- Generally limited to a single library

8 Library networks & resource sharing

9 Print to Electronic

10 Now: Electronic Almost Ubiquitous
- 85%+ of journal literature digital
- Hundreds of specialized scholarly databases
- Mass print book digitization efforts
- Electronic books going mainstream
- Aggregated meta-indexes: 750 million metadata records for journal/newspaper articles

11 Goal: improve user experience
- Users want to FIND, not search
- Source required information to user regardless of format or location
- Leverage our knowledge of academic uWaterloo
- Integrate into key services: LMS, CMS, other library services

12 Database Content Silos
[Diagram]
- Database content silos: ScienceDirect, Web of Science, JSTOR, ETDs, EEBO
- System silos: Catalog, ILL, Website, Meta-search, eReserve

13 Metasearch: an interim step
- aka federated search; emerged 2003
- Distributed search from one interface via web services, SOAP/XML gateways
- Idiosyncratic and slow; vendors implemented variously
- Relevancy of merged results problematic
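Note: a minimal sketch of the fan-out-and-merge pattern described on this slide, not any vendor's implementation. The two source functions are hypothetical stand-ins for remote gateways. Because scores from different engines are not comparable, the sketch simply interleaves results round-robin, which illustrates why relevancy of merged results is problematic; response time is also bounded by the slowest source.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import zip_longest


# Hypothetical stand-ins for remote databases; a real metasearch tool would
# call each vendor's gateway over the network here.
def search_catalog(query):
    return [f"catalog hit {i} for {query}" for i in (1, 2, 3)]


def search_articles(query):
    return [f"article hit {i} for {query}" for i in (1, 2)]


def federated_search(query, sources):
    """Send the query to every source in parallel, then interleave results."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # map() preserves the order of `sources`, so output is deterministic.
        per_source = list(pool.map(lambda search: search(query), sources))
    merged = []
    # Round-robin merge: rank 1 from each source, then rank 2, and so on.
    for tier in zip_longest(*per_source):
        merged.extend(hit for hit in tier if hit is not None)
    return merged
```

Calling `federated_search("primo", [search_catalog, search_articles])` alternates catalog and article hits regardless of how relevant each one actually is.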

14 Problems with catalog searching & evolution to discovery
- UCLA & Berkeley: information retrieval & user behavior ( )
- Google Books: “digitize the world’s knowledge” (2002)
- Karen Schneider, Andrew Pace, Roy Tennant: “The OPAC ‘Sucks’” (2002)
- Next generation catalogs -> Discovery (2008+)

15 Catalogs: Information Science Research
- Christine L. Borgman (1986) “Why are online catalogs hard to use? Lessons learned from information retrieval studies” Journal of the American Society for Information Science
- Ray R. Larson (1991) “The decline of subject searching: Long-term trends and patterns of index use in an online catalog” Journal of the American Society for Information Science
- Ray R. Larson (1992) “Evaluation of advanced retrieval techniques in an experimental online catalog” Journal of the American Society for Information Science
- Ray R. Larson (1996) “Cheshire II: designing a next-generation online catalog” Journal of the American Society for Information Science
- Christine L. Borgman (1996) “Why are online catalogs still hard to use?” Journal of the American Society for Information Science

16 How Users Search: What We’ve Learned
- Most people make typos at least some of the time
- Most searches are 2, 3, 4 words with no Boolean operators
- Most searches use keyword
- Search is a hesitant, iterative, often random process of discovery
- Most people start elsewhere
- Few read help screens
- Few use advanced search; this is true even in Google

17 The Google Effect
Expectations for web search tools now:
- Radically simplified UI, fast results
- Aggregated content
- Relevant results on first page
- Natural language queries
- Spelling correction/adaptation

18 The OPAC “Sucks”
The OPAC lacks common features of most search engines:
- Relevance ranking vs. last in, first out
- Spell checking (related: “did you mean?”)
- Popular query operators like + and –
- Refine search
- Sort flexibility
- Faceting
- Citation indexing vs. full text
- Developed for print materials; limitations with electronic materials or atomized items (like articles)
- Difficult for certain known-item searches
Karen Schneider, Andrew Pace, Roy Tennant

19 Industry Trends
- Decouple the front end (search and discovery) from the back end (inventory and cataloguing)
- Service Oriented Architecture: many programs loosely coupled
- Cloud services: SaaS
- The 5th generation of library business systems emerging now: hosted, cloud solutions

20 Discovery Characteristics
Enhanced Search Functionality
- Faceted browse
- Relevance ranking
- “Did you mean?” / spell checking: auto-correction, resubmit search
- Content aggregation
- Integrating search for books, articles, etc.
- Single, simple search box
- FRBR (Functional Requirements for Bibliographic Records): grouping editions
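Note: a minimal sketch of a “Did you mean?” suggestion, assuming the suggestion comes from fuzzy-matching the query term against terms already in the index. This uses Python's standard `difflib`; it is an illustration only, not how Primo or any other discovery product implements spell checking. The vocabulary and the cutoff value are hypothetical.

```python
import difflib

# Hypothetical vocabulary; a real system would draw terms from its index.
VOCABULARY = ["library", "liberty", "catalogue", "discovery"]


def did_you_mean(term, vocabulary, cutoff=0.8):
    """Return the closest indexed term, or None if no suggestion is needed."""
    if term in vocabulary:
        return None  # the term is already known; nothing to suggest
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A search interface could offer the returned term as a link that resubmits the search.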

21 Discovery Characteristics, cont.
Enhanced Experience
- Sometimes fun and engaging
- Interactive/collaborative
- User-centered design
Enhanced Services
- Find it / Get it for me
- Book covers / synopsis
- Full text
- Availability on same page as results

22 Discovery Characteristics, cont.
Enhanced Content Article Searching Commercial Data Merging Special Collections Harvesting Online Collections Grey Literature Free Content Enhanced Access Syndication - Getting into users tools Course Management Systems Browser and Desktop Tool Bars Portals

23 Discovery Components
[Diagram: Next Generation Catalog / Next Generation “Unified Search”. Labels: User Interface; Normalization & Apache SOLR/Lucene; MetaSearch; ILS OPAC; MARC; Circ Data; Vendor Data; Full Text; OAI; Aid]

24 Content Components
[Diagram labels: Primo; TUG; Phase I; Phase II; Future; Others; Primo Central; OCUL; HathiTrust; Archives; Geospatial; RACER]

25 Evolution of Discovery
[Diagram labels: Catalog; Primo; Meta-search; Primo Central]

26 Options for Expanding Primo
- Local ingestion of resources using FTP or OAI harvesting
- Searching remote resources in Primo using the Primo DeepSearch API*
- Subscribing to a large centralized index, such as Primo Central
*Application Programming Interface

27 Local ingestion of records
Example: HathiTrust Digital Library
- Harvest the public domain records from HathiTrust Digital Library
- Normalize the records
- Index the records in our local Primo database
- Schedule updates from HathiTrust into Primo

From Wikipedia: HathiTrust is a very large-scale collaborative repository of digital content from research libraries including content digitized via the Google Books project and Internet Archive digitization initiatives, as well as content digitized locally by libraries. HathiTrust was founded in October 2008 by the thirteen universities of the Committee on Institutional Cooperation and the University of California. The partnership includes over 50 research libraries across the United States and Europe, and is based on a shared governance structure. Costs are shared by the participating libraries and library consortia. The repository is administered by Indiana University and the University of Michigan. The Executive Director of HathiTrust is John Price Wilkin, who has led large-scale digitization initiatives at the University of Michigan since the mid 1990s. As of December 2010, HathiTrust comprises over 7.5 million volumes, over 1.8 million of which are public domain. HathiTrust provides a number of discovery and access services, notably, full-text search across the entire repository. As of October 2011, the Authors Guild is pursuing legal action against HathiTrust (Authors Guild v. HathiTrust) citing massive copyright violation.
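Note: a minimal sketch of the harvest and normalize steps listed on this slide, assuming the records arrive as an OAI-PMH `ListRecords` response carrying Dublin Core metadata. The namespaces are the standard OAI-PMH 2.0 and Dublin Core ones; the sample record, its identifier, and the output field names are made up for illustration. Fetching over the network, indexing, and scheduling are left out, and nothing here reflects Primo's actual harvesting pipeline.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 and Dublin Core namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up sample response, trimmed to the elements used below.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>  A History of Waterloo County </dc:title>
          <dc:date>1895</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "id": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            # Light normalization: trim stray whitespace from harvested values.
            "title": rec.findtext(".//dc:title", default="", namespaces=NS).strip(),
            "date": rec.findtext(".//dc:date", default="", namespaces=NS).strip(),
        })
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    # A non-empty token means the repository has another page to request.
    return records, token or None
```

A scheduled update would repeat the request with the returned token until the repository stops supplying one.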

28 Normalization: creating local sort field (Date – Oldest)
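Note: a minimal sketch of what a normalization rule for a “Date – Oldest” sort field might do, assuming the goal is to turn free-text catalogue dates into a four-character key that sorts correctly as a string. The function name, the fallback for partial years, and the `9999` sentinel are all hypothetical; Primo's real normalization rules are configured in its own administration tools and are not shown here.

```python
import re

UNDATED = "9999"  # hypothetical sentinel so undated records sort last


def date_sort_key(raw_date):
    """Derive a four-digit, string-sortable year from a free-text date."""
    text = raw_date or ""
    # First preference: a complete four-digit year, e.g. "c1923." or "1895-1901".
    match = re.search(r"\d{4}", text)
    if match:
        return match.group(0)
    # Fallback: partial years such as "[18--?]" or "192-"; pad with zeros.
    match = re.search(r"(\d{2,3})-{1,2}", text)
    if match:
        return match.group(1).ljust(4, "0")
    return UNDATED
```

Sorting records by this key in ascending order gives an oldest-first list with undated items at the end.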

29 Primo Normalized XML (PNX)

30 Open source & Open platform
- Primo uses Lucene for its indexing
- SOLR exposes Lucene as a web service and allows for faceting
- APIs and web services allow flexibility and customization
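Note: a minimal sketch of faceting through Solr's web service interface, using a generic standalone Solr instance rather than Primo's own APIs. The parameters `q`, `rows`, `wt`, `facet`, and `facet.field` are standard Solr query parameters; the base URL, core name, and field name are hypothetical. No request is sent; the code only builds the URL and reads facet counts from a response shaped like Solr's default JSON output.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, query, facet_fields, rows=10):
    """Build a Solr select URL that asks for facet counts on the given fields."""
    params = [("q", query), ("rows", rows), ("wt", "json"), ("facet", "true")]
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}?{urlencode(params)}"


def facet_counts(solr_response, field):
    """Solr's default JSON lists facets flat: [value, count, value, count, ...]."""
    flat = solr_response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))
```

The counts returned by `facet_counts` are what an interface would display beside each facet value.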

31 We can’t index everything!
- Trying out a subscription to Primo Central, a centralized index of scholarly journal articles, newspapers, conference proceedings, etc.
- User sees one interface; user is searching 2 indexes

32 What is Primo Central Index?
A centralized index
- of free and restricted resources
- primarily articles & e-books
- based on metadata & full-text provided by publishers/aggregators
- based on the collections selected by the library in the Primo Administration module
- created & maintained by our vendor, Ex Libris

Providers shown: Alexander Street Press; IOP Publishing; Project MUSE; National Academy of Sciences; NLM/PubMed; American Institute of Physics; SPIE; OECD; LexisNexis; Publishing Technology (Ingenta Connect); Springer; My iLibrary; Gale (e.g. Academic OneFile, ECCO); Accessible Archives; Web of Science; American Psychological Association; ACM; Bepress; BioOne; OUP; Wiley-Blackwell

33 What is Primo Central Index?
A centralized index of records
- harvested using the same process as our local Primo database
- created using the same PNX record structure as our local Primo database
- indexed using the same indexing tools as our local Primo database

34 Blending local and remote resources
- Both local and remote results are represented in the facets
- Blended relevance ranking
- Can configure Primo to boost high-ranking local results so that, when Primo is doing relevance ranking on our 4 million records alongside 100s of millions of Primo Central records, local results aren’t missed by the user
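Note: a minimal sketch of the idea behind boosting local results in a blended ranking, assuming both indexes produce comparable relevance scores and that the boost is a simple multiplier. The function, the scores, and the default boost of 1.5 are hypothetical; Primo's actual boosting is a configuration setting whose internals are not described in this presentation.

```python
def blend_results(local_hits, remote_hits, local_boost=1.5, limit=10):
    """Merge two scored result lists, multiplying local scores by a boost.

    Each hit is a (title, score) pair; higher scores rank first.
    """
    blended = [(score * local_boost, title, "local") for title, score in local_hits]
    blended += [(score, title, "remote") for title, score in remote_hits]
    # Highest adjusted score first; ties broken alphabetically by title.
    blended.sort(key=lambda hit: (-hit[0], hit[1]))
    return [(title, source) for _, title, source in blended[:limit]]
```

With a boost of 1.0 a local record scoring 0.6 ranks below remote articles scoring 0.7 and 0.8; with the boost applied it moves to the top, which is the effect the slide describes.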

35 Search = local resources & Primo Central
- uWaterloo decided on a 3-tabbed approach and we are questioning our users about this approach; uGuelph decided to try just one basic search box
- The first tab is what Ex Libris calls a “blended” search
- Hover text is: “Search books, articles & more”

36 How does it work?
- Ex Libris has created & indexed records for millions of items based on information from the publishers
- Primo searches Primo Central the same way it searches the local database
- Full text availability is determined in advance by our URL resolver SFX, i.e. Delivery of the resource uses menu for

37 New features: snippets give context
If your search term is found in the full-text, Primo supplies a snippet highlighting the term
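Note: a minimal sketch of how a snippet with a highlighted term can be produced from full text, assuming a simple first-match, fixed-width window. The function name, the window width, and the `**` highlight markers are arbitrary choices for illustration; this is not how Primo generates its snippets.

```python
import re


def make_snippet(full_text, term, width=30):
    """Return text around the first match of `term`, with the match marked."""
    match = re.search(re.escape(term), full_text, flags=re.IGNORECASE)
    if match is None:
        return None  # term not in the full text, so no snippet to show
    start = max(match.start() - width, 0)
    end = min(match.end() + width, len(full_text))
    # Ellipses signal that the snippet was cut from a longer text.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(full_text) else ""
    return (prefix + full_text[start:match.start()]
            + "**" + match.group(0) + "**"
            + full_text[match.end():end] + suffix)
```

The match is case-insensitive, and the original casing from the text is kept in the output.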

38 New features: expanding the search
Defaults to our library’s electronic subscriptions but users can expand the search to all of Primo Central

39 New Facets & Facet Values
- resource pre-filter
- top level -- online resource for full-text
- resource type
- creator
- topic
- journal title (need to turn on in view)
- collection (database name) ?? with our static facets
- genre
- creation date
- language

40 Added value: bX Recommender
- bX Recommender was activated in our Primo Central trial, and a “recommendations” tab now appears with citations for recommended related scholarly articles
- One-on-one interviews were conducted with 12 students during the last week of August to get feedback on the bX Recommender service. Overall, the students agreed that they liked the idea of a scholarly recommender service for articles, and they thought that the recommended articles that they saw in bX Recommender were relevant. We will continue to investigate student and faculty opinions over the course of the fall term

41 Trouble-shooting remote resources
- We can view the PNX records using web services but we have no control over the content or the normalization rules
- Records have the same structure as our local records but are missing local fields and don’t reflect local policies

42 Assessing Primo Central
- Over 65 hours of one-on-one usability testing and focus groups with undergraduate students, graduate students, faculty, staff and alumni
- Library staff survey
- Feedback form
- Statistics from Cognos

- want multiple tabs but don’t understand labels
- interested in expand beyond --- but didn’t understand without an explanation
- thought the recommender service was useful
- generally pleased with breadth of resources, ease of use and flexibility
- some concerns about library instruction and information literacy: don’t just rely on relevance ranking and these resources; how do we still teach critical thinking, etc.

43 Looking to the future
- What other content should be added to Primo?
- How can we improve/enhance the interface?
- What is the right balance for boosting local physical resources?
- How do we point users to resources that can’t be searched using Primo?

44 Questions?
Pascal Calarco, Associate University Librarian, Digital & Discovery Services
Alison Hitchens, Cataloguing & Metadata Librarian

