Open Data Portals Andrew Ferlitsch OpenGeoCode.Org, Co-Founder Sharp Labs of America, Principal Researcher http://www.opengeocode.org/articles/Open%20Data.pptx
Open Data INDEX: What is Open Data? What are Open Data Portals? Data Portals in the US
What is Open Data? “Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.” (opendatahandbook.org/en/what-is-open-data/) “Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control.” (en.wikipedia.org/wiki/Open_data)
What are Open Data Portals? A single point of access to open data, provided freely by: – Government: Federal, State, Regional and local Municipal governments. – Institutions: Universities – Organizations: International and National standards bodies and NGOs. – Private: Corporations, Individuals
Open Data Portals in the US US government agencies have been mandated to make data collected and compiled using US tax dollars accessible to the public. In 2009, the Obama administration launched the Open Government Initiative (OGI) to provide a centralized repository for “public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.” (data.gov). On May 9, 2013, President Obama signed an executive order that made open and machine-readable data the new default for government information. Making information about government operations more readily available and useful is also core to the promise of a more efficient and transparent government. (whitehouse.gov/open).
Open Data Portals INDEX: Open Data Portal in North America Open Data Portals in the World
Open Data Portals in Northern America Number of “government” data portals in the United States (~350 as of 10/2014): – Federal Government: 50+ Data Portals – State Government: 150+ Data Portals – County Government: 50+ Data Portals – City Government: 100+ Data Portals Number of “government” data portals in Canada (~90 as of 10/2014): – Federal Government: 5+ Data Portals – Provincial Government: 20+ Data Portals – County Government: 15+ Data Portals – City Government: 50+ Data Portals
Open Data Portals in the World 212 of 250 countries worldwide have at least one government open data portal (as of 10/2014). Countries with the most portals, excluding US/CA (as of 10/2014): – United Kingdom: 35+ – Spain: 30+ – Italy: 25+ – Australia: 20+ – France: 20+ – Germany: 15+ – Austria: 10+ – Finland: 10+ – Netherlands: 10+ – Brazil: 10+
Data Portal Catalogs INDEX: Data Portal Catalogs OpenGeoCode DataCatalogs Sunlight Foundation
OpenGeoCode - Catalog Largest Catalog / Crowd Sourced Major Categories: Data Portals Transparency Portals GIS/Gazetteer Census/ Demographics View List View by Country Map View Filter by Category CSV Dump of Catalog
DataCatalogs - Catalog Most-Established / Maintained by Curators / Crowd Sourced Built using CKAN 2.0 (open source) View Browse Alphabetically Map View Has tags, but is not yet searchable by tag.
Sunlight Foundation - Catalog List of US Portals Maintained in GitHub
Data Portal Providers INDEX: Socrata CKAN ESRI – Geospatial Server Development Seed
Primary Data Portal Providers Socrata – Private Strong presence in federal, state and municipal in the US. Hosting Service Interactive Search Developer API
Primary Data Portal Providers CKAN – Open Source Strong presence outside the US. Hosting Interactive Search Developer API
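As an illustration of the developer API mentioned on this slide, here is a minimal sketch that builds a search request for CKAN's Action API (`/api/3/action/package_search`) and pulls dataset titles out of the JSON reply. The portal base URL and the sample response are hypothetical placeholders, not taken from the presentation, and no network request is made.

```python
import json
from urllib.parse import urlencode


def ckan_search_url(portal_base, query, rows=10):
    """Build a CKAN Action API package_search URL for a given portal."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal_base.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response body."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [pkg.get("title", "") for pkg in payload["result"]["results"]]


# Hypothetical portal base URL, for illustration only.
EXAMPLE_URL = ckan_search_url("https://demo.example.org/", "transit", rows=5)

# Hand-written sample shaped like a package_search reply (not real data).
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [{"title": "Transit Stops"}, {"title": "Bus Routes"}],
    },
})
SAMPLE_TITLES = dataset_titles(SAMPLE_RESPONSE)
```

In practice the URL would be fetched with an HTTP client and the response body passed to `dataset_titles`.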
Primary Data Portal Providers ESRI – Geoportal Server Strong presence in GIS/Mapping, Land/Property Free / Open Source Interactive Search
ESRI: Geoportal Server – Example Sites Other Sites: Abu Dhabi SDI GeoPortal Australia E-NRIMS Digital Geographic Information Austria Energeo Geoportal Canada Saskatchewan GeoSask Portal GeoPortal Genie Malaysia GeoPortal Poland IKAR Geoportal Portugal National System for Geographic Information (SNIG) Sweden Geodata Portal USA New York Ocean and Great Lakes Ecosystem Conservation Council
ESRI: Geoportal Server – Features Geoportal Catalog Service for GIS Resources – OGC (Open Geospatial Consortium) WS compliant – Publish resources to the geoportal by registering the resource's metadata with the catalog service: datasets, analyses, tools, and web services. Search – Keyword and Location – Clip-Zip-Ship (emails packaged Zipfiles) – Search from ArcGIS applications
Primary Data Portal Providers Development Seed NGO, Strong presence Internationally (UN, World Bank, 3rd World). Builds integrated systems and tools for open data deployment.
US Census INDEX: Tiger/Line Shapefiles KML Boundary Files
US Census: TIGER/Line Shapefiles Streets/Roads Obtained from County Survey data Street addresses extrapolated Administrative Boundaries Nation, Region State, County, Place, ZCTA (~zip) PUMA, MSA CBSA, Tract Voting Districts School Districts Native American Reservations Infrastructure Railroads
US Census: TIGER/KML Boundary Files NEW for 2013 Administrative Boundaries Nation, Regions State, County, Place, ZCTA (~zip) PUMA, MSA CBSA, Tract Voting Districts School Districts Native American Regions
Data Portal - Portland, OR INDEX: Portland Portal – CivicApps.Org Portland Data Portal – Trimet Portland Data Portal – Other Shapefiles
Portland Data Portal (CivicApps.Org) Local Design Listing of Datasets Selected APIs Street Addresses Business Licenses Crime Statistics Parks / Trees Restaurant Inspections (CSV/Text) Trimet (KML) Boundaries / Bridges / etc (Shapefiles)
Portland Data Portal - Trimet Table columns: Dataset | Shapefile | KML | CSV | WS Datasets: Trimet Boundary, Schedule, Detours, Fare Zones, Park n Ride, Rail Lines, Rail Stops, Arrival Prediction, Routes, Route Stops, Transit Centers
Portland Data Portal – Other Shapefiles Boundaries City / County/ Zip Codes* Urban Renewal Address Points / Streets / Center Lines* Enterprise Zones Local Improvement Districts* Parks Neighborhood Associations* Snow/Ice Routes Business Associations* Watershed Areas Metro Council Districts Footprints / POI Sidewalks / Curbs / Ramps* Guardrails Bicycle Parking* Hospitals Bridges* Libraries Capital Improvement Projects* ITS Signs / Cameras City Halls* Garbage Routes / Leaf Pickup Fire Stations* Parking Meters Schools* Traffic Devices / Signals AND MORE
PDX Crime Analysis - Datasets Datasets from CivicApps.Org – Crime Incidents 2013: by type, date and time – Trimet Transit Stops: location and coordinates – Business Licenses: business type (NAICS code) Analysis – Look for correlation between transit stops with high crime in the vicinity and: Presence of Alcohol Establishments; Route and Time of Day Application Source and Data: www.opengeocode.org/PDX/crimePDX.zip
PDX Crime Analysis - Process PDX CivicApps.Org (Crime Incidents, Trimet Stops, Business Licenses) -> ETL (Automated Extract-Transform-Load; ODI Linked CSV Format, CUDE Ontology) -> Analyze (Custom Analysis Tool (Java), Cmdline) -> Output Results (CSV, KML)
PDX Crime Analysis – High Crime Stops - Filtered to Personal Crime Categories (e.g., assault, robbery, prostitution, drugs). - Filter for Trimet stops with over 300 reported (personal) crimes last year within 1/10th of a mile. - Alcohol establishments within same radius average 1 to 9. - Concentration around Downtown and Burnside.
PDX Crime Analysis – Low Crime Stops - Filtered to Personal Crime Categories (e.g., assault, robbery, prostitution, drugs). - Filter for Trimet stops with under 100 reported (personal) crimes last year within 1/10th of a mile. - Alcohol establishments within same radius average 0 to 1.
PDX Crime Analysis – Time Flow - Filtered to Personal Crime Categories (e.g., assault, robbery, prostitution, drugs). - Filter for Trimet stops with over 200 reported (personal) crimes last year within 1/10th of a mile, between 6am and noon. - Alcohol establishments within same radius average 4 to 8. - Concentration around Downtown Transit Center.
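The radius filter shared by the three analysis slides above can be sketched as follows. This is not the presentation's Java tool. It is a minimal Python illustration that assumes stops and crime incidents are already available as WGS84 latitude/longitude pairs and uses the haversine great-circle distance. The function names, stop IDs, and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def count_within(stop, points, radius_miles=0.1):
    """Count points lying within radius_miles of a stop given as (lat, lon)."""
    return sum(
        1 for lat, lon in points
        if haversine_miles(stop[0], stop[1], lat, lon) <= radius_miles
    )


def stops_over_threshold(stops, crimes, threshold, radius_miles=0.1):
    """Return {stop_id: count} for stops with more than `threshold` crimes nearby."""
    result = {}
    for stop_id, location in stops.items():
        n = count_within(location, crimes, radius_miles)
        if n > threshold:
            result[stop_id] = n
    return result


# Made-up coordinates for illustration; not data from the presentation.
DEMO_STOPS = {"A": (45.5200, -122.6800), "B": (45.6000, -122.6800)}
DEMO_CRIMES = [(45.5200, -122.6800), (45.5205, -122.6800), (45.5300, -122.6800)]
DEMO_HIGH = stops_over_threshold(DEMO_STOPS, DEMO_CRIMES, threshold=1)
```

The same `count_within` helper could be applied to alcohol establishment locations to get the per-stop establishment counts. For a full year of incidents, a spatial index would scale better than this pairwise loop.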
Open Source Tools ETL PDX Usage Usage: pdxETL [-p params] url The url argument is the Civicapps.Org location of a CSV file on the FTP site: – E.g., ftp://ftp02.portlandoregon.gov/CivicApps/address.zip – Will automatically download – Extract from ZIP file – Transform fields into our standardized Linked CSV Vocabulary: http://www.opengeocode.org/cude1.2/LinkedCSV-Vocab.php – Export to CSV for loading into database / application – Application and Data: www.opengeocode.org/PDX/pdxETL.zip
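The extract, transform, and export steps listed above can be sketched in a few lines. This is not the pdxETL source. It is a minimal Python illustration in which the download step is omitted, the ZIP archive is built in memory, and the column names and field mapping are hypothetical rather than the actual Linked CSV vocabulary.

```python
import csv
import io
import zipfile


def extract_csv_rows(zip_bytes):
    """Read the first .csv member of a ZIP archive into a list of dict rows."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        name = next(n for n in archive.namelist() if n.lower().endswith(".csv"))
        with archive.open(name) as member:
            text = io.TextIOWrapper(member, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


def transform(rows, field_map):
    """Rename source columns to target names; unmapped columns are dropped."""
    return [
        {target: row[source] for source, target in field_map.items()}
        for row in rows
    ]


def to_csv(rows, fieldnames):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def _demo_zip():
    """Build a tiny in-memory ZIP standing in for a downloaded dataset."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("address.csv", "ADDR,ZIP\n100 MAIN ST,97201\n")
    return buffer.getvalue()


# Hypothetical source-to-target column mapping, for illustration only.
FIELD_MAP = {"ADDR": "street_address", "ZIP": "postal_code"}
DEMO_ROWS = transform(extract_csv_rows(_demo_zip()), FIELD_MAP)
DEMO_OUTPUT = to_csv(DEMO_ROWS, list(FIELD_MAP.values()))
```

Keeping the mapping as plain data means supporting a new dataset is mostly a matter of supplying a different `FIELD_MAP`.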
Open Source Tools ETL PDX Parseable Datasets Supports ETL for PDX datasets in CSV format: – Address Points ftp://ftp02.portlandoregon.gov/CivicApps/address.zip -e unit_value=ADDRESS:SPC – Building Permits ftp://ftp02.portlandoregon.gov/CivicApps/permits.zip -p FC=S,FD=BLDG,FX=permit – Business Licenses ftp://ftp02.portlandoregon.gov/CivicApps/business_licenses.zip -e BusinessName=NAME:LEGAL – Crime Incidents ftp://ftp02.portlandoregon.gov/CivicApps/crime_incident_data.zip – Crime Incidents 2004 - 2013 ftp://ftp02.portlandoregon.gov/CivicApps/crime_incident_data_2004.zip ftp://ftp02.portlandoregon.gov/CivicApps/crime_incident_data_2014.zip – Park Finder ftp://ftp02.portlandoregon.gov/CivicApps/ParkFinder.zip – Public Art ftp://ftp02.portlandoregon.gov/CivicApps/public_art.zip -p FC-S,FD-ARTP – Earthquake (BEECN) http://www.portlandoregon.gov/pbem/article/
Join Open Source Project Tasks to Do We invite developers in the Portland area to make community contributions to the ETL PDX tool. – KML format (Trimet datasets) – Conversion of State Plane Coordinates to WGS84 (lat/lon) – Shapefile format (large number of datasets) – [Geo]JSON output – Extend to State of Oregon Data Portal
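As a starting point for the [Geo]JSON output task listed above, here is a minimal sketch that turns tabular rows into a GeoJSON FeatureCollection of points. It assumes each row already carries WGS84 latitude and longitude (so the State Plane conversion is a separate step), and the column names `lat` and `lon` and the sample row are hypothetical.

```python
import json


def rows_to_geojson(rows, lat_key="lat", lon_key="lon"):
    """Convert dict rows with lat/lon columns into a GeoJSON FeatureCollection."""
    features = []
    for row in rows:
        # Every column other than the coordinates becomes a feature property.
        properties = {k: v for k, v in row.items() if k not in (lat_key, lon_key)}
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as [longitude, latitude].
                "coordinates": [float(row[lon_key]), float(row[lat_key])],
            },
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up row for illustration; not data from the portal.
DEMO_ROWS = [{"name": "Example Stop", "lat": "45.52", "lon": "-122.68"}]
DEMO_GEOJSON = rows_to_geojson(DEMO_ROWS)
DEMO_TEXT = json.dumps(DEMO_GEOJSON)
```

Rows read with `csv.DictReader` hold strings, which is why the coordinates are passed through `float` before being written.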