Validation of Ebola LOD


Validation of Ebola LOD
Team Members: Jonathan Downs, Yash Pant
Instructor: Dr. Edward Fox
Client: S. M. Shamimul Hasan
School: Virginia Tech, Blacksburg, VA
Date: 4/28/2016

Background
Ebola epidemic of 2014
Largest database of Ebola data
Large amount of data that needs to be verified
Validate data

Project Goal
Goal: To create a script that allows a user to validate data in the Ebola database
Gather data sources that can validate the data in the database
Input a CSV file of data into the script
Input the RDF file of the Ebola database into the script
Run the script to compare the input CSV and Ebola database values
Produce an output CSV showing the result of the comparison

RDF

Prototype - Overview
Written in Python
Focused on a specific dataset (time-series data in Guinea)
Focused on one validation file

Prototype - Algorithm
Matched every row in an input CSV file to one data point in the RDF database
Searched for data points by looking at country, date, and parameter
Compared the value at each data point to validate it
[Figure: example of a row in the CSV]
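The matching step on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the team's prototype: the column names (`country`, `date`, `parameter`, `value`), the in-memory dictionary standing in for the RDF database, the function name `validate_rows`, and all sample values other than the Guinea 8/4/2014 data point are hypothetical.

```python
import csv
import io

# Hypothetical stand-in for the RDF database: each data point is keyed by
# (country, date, parameter) and maps to the stored value.
DATABASE = {
    ("Guinea", "8/4/2014", "New Cases of Probables"): "0",
    ("Guinea", "8/5/2014", "New Cases of Probables"): "2",  # made-up value
}


def validate_rows(csv_text, database):
    """Compare each CSV row against the data point with the same key."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["country"], row["date"], row["parameter"])
        stored = database.get(key)  # None when no data point matches the key
        if stored is None:
            status = "missing"
        elif stored == row["value"]:
            status = "match"
        else:
            status = "mismatch"
        results.append((key, row["value"], stored, status))
    return results


# Made-up validation file with the assumed column layout.
SAMPLE_CSV = """country,date,parameter,value
Guinea,8/4/2014,New Cases of Probables,0
Guinea,8/5/2014,New Cases of Probables,3
Guinea,8/6/2014,New Cases of Probables,1
"""

for key, csv_value, db_value, status in validate_rows(SAMPLE_CSV, DATABASE):
    print(key, csv_value, db_value, status)
```

Keying the lookup on the full (country, date, parameter) tuple means each CSV row resolves to at most one data point, which mirrors the one-row-to-one-data-point matching described above.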

Prototype - Sample Results

Final Script - Overview
Final script generalizes the prototype to take various input sources
Allows a user to select any relevant CSV file as input, using a GUI
Delivers an output CSV showing a comparison of the input and the Ebola DB
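One possible shape for the output step is sketched below. The output column names, the `write_comparison` helper, and the sample values are assumptions made for illustration; the slides do not specify the actual layout of the output CSV.

```python
import csv
import io


def write_comparison(results, out_file):
    """Write one output row per validated data point (hypothetical layout)."""
    writer = csv.writer(out_file)
    writer.writerow(["field", "input_value", "database_value", "result"])
    for field, input_value, db_value in results:
        if db_value is None:
            result = "not found"
        elif input_value == db_value:
            result = "match"
        else:
            result = "mismatch"
        # An absent database value is written as an empty cell.
        writer.writerow(
            [field, input_value, "" if db_value is None else db_value, result]
        )


# Made-up comparison results: (description, input CSV value, Ebola DB value).
sample_results = [
    ("Guinea 8/4/2014 New Cases of Probables", "0", "0"),
    ("Guinea 8/5/2014 New Cases of Probables", "3", None),
]

buffer = io.StringIO()  # stands in for a file opened with newline=""
write_comparison(sample_results, buffer)
output_text = buffer.getvalue()
print(output_text)
```

Writing to any file-like object keeps the sketch independent of how the file path is chosen, for example through a GUI file dialog.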

Modifications to Prototype - Parsing CSVs
Changed the script to parse any input CSV
CSV-to-RDF relationship: Predicate, Object, Subject
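A generic parsing step might look like the sketch below. It assumes, based on the (predicate, object) pairs used on the searching slide, that each column header plays the role of a predicate, each cell value the role of an object, and each row corresponds to one subject (data point). The `Value` column name and the helper `rows_to_pairs` are hypothetical.

```python
import csv
import io


def rows_to_pairs(csv_text):
    """Turn each CSV row into a list of (predicate, object) pairs.

    Assumption: header -> predicate, cell -> object, row -> one subject.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [list(row.items()) for row in reader]


# Made-up input file; only the header names vary between inputs, so the
# same function handles any CSV that has a header row.
SAMPLE_CSV = """Country,Incident Date,Time Series Parameter,Value
Guinea,8/4/2014,New Cases of Probables,0
"""

pairs_per_row = rows_to_pairs(SAMPLE_CSV)
for pairs in pairs_per_row:
    print(pairs)
```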

Modifications to Prototype - Searching
Create a composite key to search the database for data points
Take pairs of predicate and object to find the subject
Validate one predicate (data field) at a time
Validating: 0 Cases in Guinea on 8/4/2014
Create a composite key with the following pairs of (predicate, object):
(Country, Guinea)
(Incident Date, 8/4/2014)
(Time Series Parameter, New Cases of Probables)
Run a search based on this key to find the associated RDF subject (data point)
Get the number of cases from the subject found
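The composite-key search can be illustrated with plain Python tuples standing in for RDF triples. This is a sketch under stated assumptions rather than the final script: the subject identifiers (`dp1`, `dp2`), the `Value` predicate name, the helper functions, and the second data point are hypothetical, and a real implementation would query the RDF file instead of a list.

```python
# Triples are (subject, predicate, object). Subject IDs are hypothetical.
TRIPLES = [
    ("dp1", "Country", "Guinea"),
    ("dp1", "Incident Date", "8/4/2014"),
    ("dp1", "Time Series Parameter", "New Cases of Probables"),
    ("dp1", "Value", "0"),
    ("dp2", "Country", "Guinea"),
    ("dp2", "Incident Date", "8/5/2014"),  # made-up second data point
    ("dp2", "Time Series Parameter", "New Cases of Probables"),
    ("dp2", "Value", "2"),
]


def find_subjects(triples, composite_key):
    """Return the subjects that carry every (predicate, object) pair."""
    matches = None
    for predicate, obj in composite_key:
        subjects = {s for s, p, o in triples if p == predicate and o == obj}
        # Intersect so that only subjects matching all pairs survive.
        matches = subjects if matches is None else matches & subjects
    return matches or set()


def get_value(triples, subject, predicate):
    """Return the object stored for one predicate of a subject, or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


key = [
    ("Country", "Guinea"),
    ("Incident Date", "8/4/2014"),
    ("Time Series Parameter", "New Cases of Probables"),
]
subjects = find_subjects(TRIPLES, key)
for subject in sorted(subjects):
    print(subject, get_value(TRIPLES, subject, "Value"))
```

Intersecting the subject sets pair by pair is what makes the key composite: no single pair identifies the data point, but together the three pairs narrow the search to one subject.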

GUI Elements

Sample Results

Sources http://fadyart.com/en/images/stories/rdf.png