Mapping Existing Data Sources into VIVO Pedro Szekely, Craig Knoblock, Maria Muslea and Shubham Gupta University of Southern California/ISI.

1 Mapping Existing Data Sources into VIVO Pedro Szekely, Craig Knoblock, Maria Muslea and Shubham Gupta University of Southern California/ISI

2 Outline: Problem; Current methods for importing data into VIVO; Karma approach; Demo; Conclusions. Pedro Szekely http://isi.edu/integration/karma

3 Problem: Data Ingest. Data ingest refers to any process of loading existing data into VIVO other than by direct interaction with VIVO's content editing interfaces. Typically this involves downloading or exporting data of interest from an online database or a local system of record. Source: VIVO Data Ingest Guide.

4 Current Methods for Importing Data into VIVO

5 VIVO Provided Ingest Methods. Writing SPARQL queries: convert external data (e.g., CSV) into RDF; map data onto the VIVO ontology; construct SPARQL query → VIVO RDF. Harvester data ingest: Option 1: convert data into the predefined CSV format (supports a limited set of data fields); Option 2: edit existing XSL scripts for your data (= programming).
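
The first step listed above, converting external CSV data into RDF, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `example.org` namespaces, the column names, and the `rows_to_ntriples` helper are hypothetical and are not taken from VIVO or the Data Ingest Guide.

```python
import csv
import io

# Hypothetical namespaces; the slides do not give the actual IRIs.
BASE = "http://example.org/individual/"
ONT = "http://example.org/ontology#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value):
    """Escape a string as a plain N-Triples literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def rows_to_ntriples(csv_text, id_column, class_name):
    """Emit one rdf:type triple per row plus one triple per non-empty cell."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{class_name.lower()}{row[id_column]}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{ONT}{class_name}> .")
        for column, value in row.items():
            # Skip the id column and empty cells (e.g. a missing middle name).
            if column != id_column and value:
                lines.append(f"{subject} <{ONT}{column}> {literal(value)} .")
    return lines


sample = "hrid,first,middle,last\n101,Ada,,Lovelace\n"
triples = rows_to_ntriples(sample, "hrid", "Person")
print("\n".join(triples))
```

Each column here maps straight to a property of the same name. Mapping the columns onto real ontology terms is the part the later steps of the guide handle with SPARQL.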

6 Example Data: People, Organizations, Positions

7 VIVO Data Ingest Guide (http://www.vivoweb.org/data-ingest-guide). Step #1: Create a Local Ontology. Data Ingest Menu: Step #2: Create Workspace Models; Step #3: Pull External Data File into RDF; Step #4: Map Tabular Data onto Ontology; Step #5: Construct the Ingested Entities; Step #6: Load to Webapp.

9 VIVO Ontology

11 Step #5: Construct the Ingested Entities. Write the following SPARQL query, which constructs the people entities:
Construct { ?person. ?person ?fullname. ?person ?first. ?person ?middle. ?person ?last. ?person ?title. ?person ?phone. ?person ?fax. ?person ?email. ?person ?hrid. }
Where { ?person ?fullname. ?person ?first. optional { ?person ?middle. } ?person ?last. ?person ?title. ?person ?phone. ?person ?fax. ?person ?email. ?person ?hrid. }

12 SPARQL Ingest Is Difficult
Construct { ?person. ?person ?fullname. ?person ?first. ?person ?middle. ?person ?last. ?person ?title. ?person ?phone. ?person ?fax. ?person ?email. ?person ?hrid. }
Where { ?person ?fullname. ?person ?first. optional { ?person ?middle. } ?person ?last. ?person ?title. ?person ?phone. ?person ?fax. ?person ?email. ?person ?hrid. }
Construct { ?org. ?org ?deptID. ?org ?name. }
Where { ?org ?deptID. ?org ?name. }
Construct { ?position. ?position ?year. ?position ?title. ?position ?person. ?person ?position. }
Where { ?position ?orgID. ?position ?year. ?position ?title. ?position ?posthrid. ?person ?perhrid. FILTER((?posthrid)=(?perhrid)) }
Construct { ?position. ?position ?year. ?position ?title. ?org ?position. ?position ?org. }
Where { ?position ?year. ?position ?title. ?position ?postOrgID. ?org ?orgID. FILTER((?postOrgID)=(?orgID)) }
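
The last two queries above join positions to people and to organizations by comparing identifiers in a FILTER. The same join can be sketched in Python to make the logic explicit. This is an illustrative sketch, not part of the VIVO tooling: the dictionary keys, the short URIs, and the `link_positions` helper are hypothetical. Only `positionForPerson` appears elsewhere in the slides, and the other three property names are assumed placeholders.

```python
def link_positions(positions, people, orgs):
    """Join position rows to people (by HR id) and to organizations (by org id).

    Mirrors FILTER((?posthrid)=(?perhrid)) and FILTER((?postOrgID)=(?orgID)):
    a link is produced only when the identifiers are equal.
    """
    person_by_hrid = {p["hrid"]: p["uri"] for p in people}
    org_by_id = {o["deptID"]: o["uri"] for o in orgs}
    triples = []
    for pos in positions:
        person = person_by_hrid.get(pos["hrid"])
        if person is not None:
            triples.append((pos["uri"], "positionForPerson", person))
            # Hypothetical inverse property name.
            triples.append((person, "personInPosition", pos["uri"]))
        org = org_by_id.get(pos["orgID"])
        if org is not None:
            # Hypothetical property names for the organization links.
            triples.append((pos["uri"], "positionInOrganization", org))
            triples.append((org, "organizationForPosition", pos["uri"]))
    return triples


people = [{"hrid": "101", "uri": "person101"}]
orgs = [{"deptID": "D7", "uri": "org7"}]
positions = [
    {"uri": "position1", "hrid": "101", "orgID": "D7"},
    {"uri": "position2", "hrid": "999", "orgID": "D0"},  # no match, so no links
]
links = link_positions(positions, people, orgs)
for triple in links:
    print(triple)
```

Indexing people and organizations by identifier first avoids comparing every position against every person, which is what the FILTER formulation asks the query engine to work out.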

13 Harvester Data Ingest: Program in XSLT

14 Karma Approach: Sources → KARMA → RDF

15 Overall Karma Effort

16 Using Karma to Ingest Data into VIVO

17 Karma Benefits: no programming, interactive, easy, fast

18 Karma Workspace: Model, Worksheets, Command History

19 Karma Models: Semantic Types. Semantic types capture the semantics of the values in each column in terms of classes and properties in the ontology, e.g., the peopleID of a FacultyMember or the label of an Organization. Karma learns to recognize semantic types each time the user assigns one manually.
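
The idea of learning from each manual assignment can be illustrated with a toy suggester. The slide does not describe Karma's actual learning method, so this sketch swaps in a simple nearest-profile comparison over character statistics. The `SemanticTypeSuggester` class and its features are hypothetical and only show the workflow: assign once, get suggestions afterwards.

```python
class SemanticTypeSuggester:
    """Toy stand-in: remembers a character profile for each assigned column."""

    def __init__(self):
        self.examples = []  # list of (profile, semantic_type) pairs

    @staticmethod
    def profile(values):
        """Fractions of digits, letters and '@' signs across all values."""
        text = "".join(values)
        n = max(len(text), 1)
        return (
            sum(c.isdigit() for c in text) / n,
            sum(c.isalpha() for c in text) / n,
            sum(c == "@" for c in text) / n,
        )

    def assign(self, values, semantic_type):
        """Record a manual assignment made by the user."""
        self.examples.append((self.profile(values), semantic_type))

    def suggest(self, values):
        """Return the semantic type whose stored profile is closest, if any."""
        if not self.examples:
            return None
        target = self.profile(values)

        def distance(example):
            return sum((a - b) ** 2 for a, b in zip(example[0], target))

        return min(self.examples, key=distance)[1]


suggester = SemanticTypeSuggester()
suggester.assign(["1001", "1002"], ("FacultyMember", "peopleID"))
suggester.assign(["Computer Science", "Physics"], ("Organization", "label"))

id_guess = suggester.suggest(["2040", "2051"])
name_guess = suggester.suggest(["Mathematics"])
print(id_guess, name_guess)
```

A semantic type is represented here as a (class, property) pair, matching the slide's examples of the peopleID of a FacultyMember and the label of an Organization.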

20 Karma Models: Relationships. Relationships capture how columns relate to one another in terms of classes and properties in the ontology, e.g., the relationship between Position and FacultyMember is positionForPerson. Karma automatically computes relationships based on the object properties defined in the ontology.
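
Deriving candidate relationships from object properties can be sketched as a lookup over domains and ranges. The slide does not say how Karma selects among candidates, so this minimal sketch only enumerates them. The `ObjectProperty` tuple, the `candidate_relationships` helper, the domain and range given for `positionForPerson`, and the `positionInOrganization` entry are all assumptions made for illustration.

```python
from collections import namedtuple

ObjectProperty = namedtuple("ObjectProperty", "name domain range")

# A tiny hand-written ontology fragment (assumed, not loaded from VIVO).
ONTOLOGY = [
    ObjectProperty("positionForPerson", "Position", "FacultyMember"),
    ObjectProperty("positionInOrganization", "Position", "Organization"),
]


def candidate_relationships(classes, ontology):
    """Return object properties whose domain and range are both among the
    classes already assigned to columns of the worksheet."""
    present = set(classes)
    return [p for p in ontology if p.domain in present and p.range in present]


found = candidate_relationships(["Position", "FacultyMember"], ONTOLOGY)
for prop in found:
    print(f"{prop.domain} --{prop.name}--> {prop.range}")
```

With only Position and FacultyMember present, the lookup yields `positionForPerson`, the relationship named on the slide. Adding Organization to the list would also surface the second, hypothetical property.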

21 Karma Demo: using Karma to ingest the data samples from the “Data Ingest Guide”

22 Conclusions

23 Conclusions. Generic data-to-ontology-to-RDF mapping tool. Easy to use: interactive, no programming. Used Karma to populate the USC VIVO instance. Open source: you can use it too.

24 From Simon Gaeremynck, Sakai Foundation

25 More Information. http://youtu.be/EQcMc4TrfuE: Using Karma to ingest VIVO data. http://isi.edu/integration/karma: publications and videos, software download (open source). Contacts: pszekely@isi.edu, knoblock@isi.edu

