Problem Based Learning To Build And Search Tweet And Web Archives Richard Gruss Edward A. Fox Digital Library Research Laboratory Dept. of Computer Science.

Presentation transcript:

Problem Based Learning To Build And Search Tweet And Web Archives
Richard Gruss, Edward A. Fox
Digital Library Research Laboratory, Dept. of Computer Science
Virginia Tech, Blacksburg, VA, USA
21 May 2015, for Qatar National Library

Integrated Digital Event Archiving and Library (IDEAL)

IDEAL provided data for 2 courses:
Over 11 terabytes of webpages and about 1 billion tweets
Natural disasters (e.g., earthquakes, storms, floods)
Man-made disasters (e.g., protests, terrorism, conflicts)
Community / government events

Problem-Based Learning (PBL)
A single question drives and organizes the learning activities.
Learning is done “Just-In-Time.”
Emphasis is placed on the process rather than the product.
The task of the instructor is to provide a relevant and authentic question and serve as a facilitator.

CS 4984: Computational Linguistics, Fall 2014
Undergraduate capstone course
Driving Question: “What is the best summary that can be automatically generated for your type of event?”
7 teams, all performing the same analysis on different collections of text
Institutional Repository entries from the teams:

CS 4984: Computational Linguistics Text Collections

CS 4984: Computational Linguistics Unit Objectives

CS 4984: Computational Linguistics Sample Results

CS 5604: Information Retrieval
Introductory graduate-level course
Driving Question: “How can we best build a state-of-the-art IR system in support of a large digital library project?”
8 teams, all performing different tasks along a processing pipeline
Institutional Repository entries from the teams:

CS 5604: Information Retrieval

Team Responsibilities

CS 5604: Information Retrieval Indexing/analyzing Tweets into Solr
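The slide's figure is not preserved in this transcript. As a rough illustration of what indexing tweets into Solr can involve, the following minimal sketch flattens tweet JSON into Solr documents and builds the body for a JSON update request. The Solr field names (`user_screen_name`, `hashtags`) and the helper function names are assumptions made for illustration, not the course team's actual schema or code. The input layout follows the classic Twitter API tweet object.

```python
import json


def tweet_to_solr_doc(tweet):
    """Flatten one tweet dict into a Solr document.

    The target field names are hypothetical; the real IDEAL schema
    is not shown in the slides.
    """
    hashtags = tweet.get("entities", {}).get("hashtags", [])
    return {
        "id": str(tweet["id"]),
        "text": tweet["text"],
        "user_screen_name": tweet["user"]["screen_name"],
        "hashtags": [tag["text"] for tag in hashtags],
    }


def build_update_payload(tweets):
    """Serialize tweets as a JSON array, the form accepted by Solr's JSON update handler."""
    return json.dumps([tweet_to_solr_doc(t) for t in tweets])


# Made-up sample tweet, for illustration only.
sample_tweet = {
    "id": 1,
    "text": "Flood warning issued #flood",
    "user": {"screen_name": "example_user"},
    "entities": {"hashtags": [{"text": "flood"}]},
}
payload = build_update_payload([sample_tweet])
```

The resulting `payload` string would be sent in an HTTP POST to the collection's update handler with a JSON content type. The sketch stops short of the network call so that it runs without a Solr server.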

CS 5604: Information Retrieval
Custom Solr Search
Field weights
Custom result list processing
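The slide does not give the actual weights or processing steps. One common way to express field weights in Solr is the eDisMax query parser's `qf` parameter, where each field carries a `^boost` suffix. The sketch below assumes that approach. The field names, boost values, and the de-duplication step are all hypothetical stand-ins, not the team's configuration.

```python
from urllib.parse import urlencode

# Hypothetical fields and boosts; the slides do not list the real values.
FIELD_WEIGHTS = {"text": 1.0, "hashtags": 2.0, "user_screen_name": 0.5}


def build_query_params(user_query, field_weights, rows=10):
    """Return Solr request parameters for an eDisMax query with per-field boosts."""
    qf = " ".join(f"{field}^{weight}" for field, weight in field_weights.items())
    return {
        "q": user_query,
        "defType": "edismax",
        "qf": qf,
        "rows": rows,
        "wt": "json",
    }


def dedupe_results(docs, key="text"):
    """One possible result list post-processing step: keep only the
    first document for each distinct value of the key field."""
    seen = set()
    unique = []
    for doc in docs:
        value = doc.get(key)
        if value in seen:
            continue
        seen.add(value)
        unique.append(doc)
    return unique


params = build_query_params("hurricane", FIELD_WEIGHTS)
query_string = urlencode(params)  # append to the collection's /select URL
```

De-duplication is a plausible post-processing step for tweets because retweets often repeat the same text, but it is only an example of what "custom result list processing" could mean here.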

CS 5604: Information Retrieval Search Performance (first 1000 results)

Acknowledgements
US National Science Foundation, DUE
US National Science Foundation, IIS