ELISQ Systems Demonstration Sagnik Ray Choudhury Doha -- May 2015.



SeerQ: SeerSuite for Qatar
SeerSuite: a digital library management system developed at Penn State.
Key features:
- Crawls the web to gather scholarly documents
- Extracts metadata from PDFs (title, author name, citation) using machine learning
- Stores extracted metadata in a database and allows metadata and full-text search
Differences from Google Scholar:
- Stores the metadata and exposes it through OAI-PMH
- Stores the citation graph, which can be used later to measure scholarly impact
- Collects and stores the PDFs, which can be used later for advanced processing such as table/figure extraction and understanding the semantics
SeerQ: the instance of SeerSuite running at Qatar University, crawling scholarly content from the Qatari Web.
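As a rough illustration of how a stored citation graph can feed an impact measure, the sketch below counts how many distinct papers cite each paper. It is not SeerSuite code: the function name `citation_counts`, the edge-list representation, and the paper IDs are assumptions made for this example only.

```python
from collections import defaultdict


def citation_counts(citations):
    """Return {paper_id: number of distinct papers that cite it}.

    `citations` is an iterable of (citing_id, cited_id) pairs.
    This edge-list format is an assumption for the sketch,
    not the schema SeerSuite uses.
    """
    cited_by = defaultdict(set)
    for citing, cited in citations:
        if citing != cited:  # skip self-loops, e.g. from extraction noise
            cited_by[cited].add(citing)
    return {paper: len(citers) for paper, citers in cited_by.items()}


# Hypothetical edges: (citing paper, cited paper); one edge is duplicated.
edges = [("p1", "p2"), ("p3", "p2"), ("p3", "p1"), ("p1", "p2")]
counts = citation_counts(edges)
print(counts)
```

Using a set per cited paper means duplicate edges, which can arise when the same reference is extracted twice, are counted once.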

SeerQ: Search Results

SeerQ: Details from Search Results

SeerQ: Components and Statistics
System running at (available from within Qatar University, from outside use VPN).
Components:
- Heritrix 3 and OAI based crawler (PSU uses Heritrix 1.2)
- Solr 3.6 (PSU just moved from Solr 1.2)
- MySQL and front end (same as PSU)
Document collections:
- Documents crawled from QScience
- Documents crawled from the Web: seedlist provided by QNL
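For readers unfamiliar with how a front end talks to Solr, the following is a minimal sketch of building a query URL for Solr's standard `/select` handler. The host, port, and the `title` field name are assumptions for illustration; the actual SeerQ address and schema are not given in the slides.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, query, rows=10, start=0):
    """Build a URL for Solr's /select handler that asks for JSON output.

    q, rows, start and wt are standard Solr query parameters.
    `base_url` is a placeholder, not the real SeerQ endpoint.
    """
    params = {"q": query, "rows": rows, "start": start, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and field name, for illustration only.
url = solr_select_url("http://localhost:8983/solr", 'title:"digital library"', rows=5)
print(url)
```

The URL can then be fetched with any HTTP client; `rows` and `start` together give simple pagination over the result list.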

Some Statistics from SeerQ
Total documents in the repository (as of May 2015): 3900
Documents from QScience: 2000
Main sources: QScience, RAND, Doha Institute, Doha Film Institute
What can we do with the system:
- Scholarly analysis: How many authors are from Qatar/Doha/Qatar University?
- Citation analysis: QScience papers have an inter-journal citation rate of only 0.15%.
- Use the stored PDFs to extract valuable information (Research: PSU RA).
- Expose the metadata through OAI-PMH.
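The scholarly-analysis question above (how many authors are from Qatar, Doha, or Qatar University) could be approached with a simple keyword match over extracted affiliation strings. The sketch below assumes each author record is a dict with an `affiliation` field; that record layout, the function name, and the sample data are all hypothetical.

```python
def count_by_affiliation(authors, keywords):
    """Count authors whose affiliation mentions each keyword (case-insensitive).

    An author can count toward several keywords, e.g. an affiliation of
    "Qatar University, Doha, Qatar" matches all three keywords below.
    """
    counts = {kw: 0 for kw in keywords}
    for author in authors:
        affiliation = author.get("affiliation", "").lower()
        for kw in keywords:
            if kw.lower() in affiliation:
                counts[kw] += 1
    return counts


# Made-up author records, for illustration only.
authors = [
    {"name": "A. Author", "affiliation": "Qatar University, Doha, Qatar"},
    {"name": "B. Author", "affiliation": "Doha Institute"},
    {"name": "C. Author", "affiliation": "Penn State"},
]
result = count_by_affiliation(authors, ["Qatar", "Doha", "Qatar University"])
print(result)
```

Substring matching is deliberately naive; real affiliation strings extracted from PDFs would likely need normalization before counts like these are trustworthy.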

SeerQ: Exposing Extracted Metadata through OAI-PMH
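To make the OAI-PMH exposure concrete, here is a minimal sketch of what a harvester does: build a `ListRecords` request and read identifiers and Dublin Core titles from the response. The verb, the `oai_dc` metadata prefix, and the XML namespaces come from the OAI-PMH standard; the base URL and the sample response are invented for this example and do not reflect the actual SeerQ endpoint.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Hand-written sample response, for illustration only.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

request_url = list_records_url("http://example.org/oai")  # hypothetical base URL
records = parse_titles(SAMPLE)
print(request_url)
print(records)
```

A full harvester would also follow the `resumptionToken` element to page through large result sets; that is omitted here to keep the sketch short.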

A searchable database for handwritten documents (both in English and Arabic)
Motivation:
- Retrieve handwritten documents matching the search term
- Compare the difference in handwriting for Arabic words (recognize the writer)
- Demonstrate handling of images + text (in both languages)
Arabic handwriting project interface: Arabic/English Bilingual Handwriting Database

Handwriting Project: Search Results

Handwriting Project: Image with Metadata

Fusion: A Search Eco System
Fusion is a free search eco-system developed by LucidWorks.
Includes crawler, Solr for indexing, tools for query log analysis and error reporting
Advantages over simple Solr: Enhanced Admin UI Security Data Enrichment Machine Learning Advanced Relevancy Tuning Reporting Admin Signal Processing Recommendations API (Configuration, History, Node, System, Usage) Connector Framework

Using Fusion to collect Qatari Digital Content
Around 2 million English & Arabic documents related to Qatar have been crawled and are accessible using Fusion.
Specific collections:
- Qatari Newspapers: >1 million documents from Al-Raya, Gulf-Times, Qatar-tribune
- Sports: QA domain sports sites, 5000 documents
- Government: government websites in Qatar, documents
- Arabic News Articles Templates Summary: 120,000 newspaper articles along with their summaries, generated automatically (Research from VT RA)
- Qatar University
Fusion can help in providing a data curation service: users request a collection, a curator creates it and exposes the curated content to the user through an interface.
archive-it provides some similar functionality, on a broader scope.

Fusion: for Curators

Fusion: Creating a New Collection

Fusion: How to Combine Multiple Datasources

Fusion: How to Combine Multiple Datasources: 2

Fusion: Two Step Web Crawling: Step 1

Fusion: Two Step Web Crawling: Step 2

Search Interface for Fusion: End User
Designed by the ELISQ team for demonstrations.

Search Result on Newspaper Summary Collection