The Linked Government Data Landscape Today: data.gov and TWC LOGD
Li Ding, Jim Hendler and Deborah L. McGuinness
Tetherless World Constellation, Rensselaer Polytechnic Institute
Nov 8, 2010
2 Data.gov and World-Wide Open Government Data Activities
Putting government data online:
–January 1, 2009: “Openness will strengthen our democracy and promote efficiency and effectiveness in Government.” --- President Obama
–May 21, 2009: data.gov online
–January 19, 2010: data.gov.uk online
–May 21, 2010: data.gov relaunch with semantic web featured
–June 30, …
Many countries: US, UK, Australia, New Zealand, …
3 Semantic Web Featured at data.gov
–data.gov adopted Semantic Web technologies
–Web-based mashups
–Downloadable RDF data
–Billions of triples
4 Data-gov Wiki: Innovations at RPI
The Data-gov Wiki explores and teaches the use of semantic web technologies, especially linked data, in producing, processing, and using government data from data.gov. The Data-gov Wiki is run by the Tetherless World Constellation at RPI, headed by Professors Jim Hendler and Deborah McGuinness and led by Li Ding. Other student team members include: Dominic DiFranzo, Sarah Magidson, James Michaelis, Alvaro Graves, Adam Bell, Jin Guang Zheng, Xian Li, Tim Lebo, Gregory Todd Williams, Peter Coons, Zhenning Shangguan, Devin Gaffney, William Cooper, Brian Zaik, and Johanna Flores.
–40+ demos
–400+ datasets
–Tutorials and videos
5 The Data-gov Wiki - Architecture
[Architecture diagram. Labels: Data; Web; Linked Data; LGD in RDF; Enhancement; Conversion; Knowledge; Provenance; …; Consume]
LGD: Linked government data
6 Conversion: From Raw Tabular Data to RDF
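As a rough illustration of the conversion step, the sketch below turns each row of a small CSV table into N-Triples, one triple per non-empty cell. It is not the Data-gov Wiki converter: the base URI, the dataset id, the column names, and the sample values are all hypothetical, and headers are assumed to be simple enough to use directly in a URI.

```python
import csv
import io

# Hypothetical base URI; the URIs actually minted by the Data-gov Wiki may differ.
BASE = "http://example.org/dataset/"


def csv_to_ntriples(csv_text, dataset_id):
    """Convert tabular CSV text to a list of N-Triples lines.

    Each row becomes a subject URI; each non-empty cell becomes one triple
    whose predicate is derived from the column header.
    """
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        subject = f"<{BASE}{dataset_id}/row/{row_number}>"
        for column, value in row.items():
            if value is None or value == "":
                continue  # skip missing cells rather than emit empty literals
            name = column.strip().lower().replace(" ", "_")
            predicate = f"<{BASE}{dataset_id}/vocab/{name}>"
            # Escape backslashes and double quotes inside the literal.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} {predicate} "{literal}" .')
    return triples


# Made-up sample table, for illustration only.
SAMPLE = "site_id,state,ozone\nS1,CT,0.041\nS2,ME,\n"
ntriples = csv_to_ntriples(SAMPLE, "demo")
for line in ntriples:
    print(line)
```

Every value is emitted as a plain string literal here; typing values or promoting them to resources belongs to the enhancement step.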
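7 Enhancement: Linking Open Government Data
[Diagram: a raw table linked to a correlated dataset, a complement dataset, and metadata]
–Raw table columns: ID, year, PHSY_ST, site-id, cost (example value: NY)
–Linked tables (labelled "Correlated dataset" and "Complement dataset"): site-id, Latitude, longitude; Year, claims
–Metadata (field definition): PHSY_ST: state abbreviation; ID: unique id; cost: unit is million US dollars; year:
–Metadata (value definition): owl:sameAs DS123:NY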
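The value-definition link on this slide (owl:sameAs for DS123:NY) can be sketched as follows. This is a minimal example under stated assumptions: the dataset namespace and the target URIs are hypothetical placeholders, and the lookup table stands in for whatever mapping the real enhancement step uses.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical namespace for values local to dataset DS123.
DATASET_NS = "http://example.org/dataset/DS123/value/"

# Hypothetical lookup from state abbreviation to a shared external URI.
STATE_TARGETS = {
    "NY": "http://example.org/place/New_York",
    "CT": "http://example.org/place/Connecticut",
}


def link_state_values(abbreviations, targets):
    """Return (subject, predicate, object) triples linking local values
    to external resources via owl:sameAs.

    Abbreviations without a known target are left unlinked.
    """
    triples = []
    for abbreviation in sorted(set(abbreviations)):
        target = targets.get(abbreviation)
        if target is None:
            continue  # no mapping known, so emit nothing for this value
        triples.append((DATASET_NS + abbreviation, OWL_SAME_AS, target))
    return triples


# Values as they might appear in the PHSY_ST column; "ZZ" has no mapping.
links = link_state_values(["NY", "NY", "ZZ"], STATE_TARGETS)
for subject, predicate, obj in links:
    print(f"<{subject}> <{predicate}> <{obj}> .")
```

Once two datasets link their local values to the same external URI, they can be joined on that URI instead of on matching strings.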
8 Consumption: Mashing up LOGD Data
Created by Dominic DiFranzo, PhD student at RPI
–Sources: Data.gov CASTNET Ozone (CSV); epa.gov CASTNET Site (CSV)
–Mashup layers: Data Mashup; Web Application Mashup; Visualization Mashup (Exhibit, Visualization API)
–Steps shown in the figure (callouts 1 to 4): convert raw datasets into linkable RDF; query multiple RDF datasets via SPARQL endpoint; drill down for details; surf to EPA applications
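The data mashup on this slide queries multiple RDF datasets through a SPARQL endpoint. The sketch below swaps in a plain in-memory join to show the underlying idea without a network call: ozone readings are combined with site coordinates on a shared site identifier. The field names and sample values are hypothetical and are not real CASTNET data.

```python
def mash_up(ozone_rows, site_rows):
    """Join ozone readings with site locations on 'site_id'.

    Readings whose site is unknown are dropped, similar to an inner join.
    """
    sites = {row["site_id"]: row for row in site_rows}
    merged = []
    for reading in ozone_rows:
        site = sites.get(reading["site_id"])
        if site is None:
            continue  # no location for this reading
        merged.append({
            "site_id": reading["site_id"],
            "ozone": reading["ozone"],
            "latitude": site["latitude"],
            "longitude": site["longitude"],
        })
    return merged


# Made-up rows standing in for the two converted datasets.
OZONE = [
    {"site_id": "S1", "ozone": 0.041},
    {"site_id": "S9", "ozone": 0.050},  # no matching site record
]
SITES = [
    {"site_id": "S1", "latitude": 41.84, "longitude": -72.01},
    {"site_id": "S2", "latitude": 44.38, "longitude": -68.26},
]

points = mash_up(OZONE, SITES)
print(points)
```

The merged records are what a map-based visualization layer would plot, one point per site with its ozone reading.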
9 Provenance: Tracking Create, Derive, Revision Events
[Workflow diagram. Stages: Access; Convert; Enhance; Version; SemDiff. Event labels: create; derive; revision]
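The slide names a SemDiff step alongside versioning but does not describe it. One plausible reading, offered purely as an assumption, is a triple-level diff between two versions of a converted dataset. The sketch below treats each version as a set of (subject, predicate, object) tuples and assumes there are no blank nodes, which would need extra matching logic.

```python
def triple_diff(old_version, new_version):
    """Return the triples added and removed between two dataset versions.

    Both inputs are iterables of (subject, predicate, object) tuples.
    """
    old, new = set(old_version), set(new_version)
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
    }


# Hypothetical versions of a tiny dataset; names and values are made up.
V1 = [
    ("ex:row1", "ex:cost", "10"),
    ("ex:row2", "ex:cost", "20"),
]
V2 = [
    ("ex:row1", "ex:cost", "10"),
    ("ex:row2", "ex:cost", "25"),  # revised value
    ("ex:row3", "ex:cost", "30"),  # new row
]

changes = triple_diff(V1, V2)
print(changes)
```

A revised value shows up as one removed triple plus one added triple, which is the kind of change a revision event would record.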
10 TWC LOGD Status: Website Statistics
Page Rank = 5
Site traffic
–378,128 page hits
–28,481 visits
–16,041 visitors
–4,126 cities
–34 countries
Ranked 6th in Google “data gov” search
Note: the above statistics cover the website only; dataset access is not counted.
11 TWC LOGD Status: Part of LOD Cloud
[LOD cloud diagram with a "We are here" marker]
12 Conclusion and Future Work
Now
–Billions of triples
–“Data + visualization + mashup”
–Low-cost solutions
–Education
New LOGD site is coming
–More raw data, catalog, links
–More technologies, RDFa
–More tools, services
–More demos, tutorials, videos
–More domain applications
Future Research
–Data integration, link, search
–Social machine
–Provenance, versions, trust
–Usability and data quality
–Scalability