Analyzing open gov data with IBM Bluemix Carlos Hoyos.


1 Analyzing open gov data with IBM Bluemix Carlos Hoyos

2 Objective: Show how to use Bluemix to import large open gov data sets and prepare them for analysis or for use in applications. Bluemix is a simple platform that allows anyone to run applications with little coding or deployment effort. Data scientists and developers can use it to ingest and understand open data.

3 Scenario: Analyze parking violations data from NYC. Import it into BigInsights. Massage it using BigSheets (a big data spreadsheet). Visualize it on a map. Create a heat map to find the “hottest spots” for parking violations. End goal: give users a way to query how likely they are to get a parking violation at a specific location.
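The heat map idea can be sketched outside BigSheets as simple grid binning. This is a minimal illustration, not the method used in the slides: it assumes each ticket has already been geocoded to a latitude/longitude pair (the slides do not say the raw file contains coordinates), and the function name `heat_map_cells`, the cell size, and the sample points are all hypothetical.

```python
from collections import Counter
import math

def heat_map_cells(points, cell_size=0.01):
    """Count points per grid cell; each cell spans cell_size degrees per side."""
    cells = Counter()
    for lat, lon in points:
        # floor() keeps negative longitudes in consistent cells
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        cells[cell] += 1
    return cells

# Made-up coordinates, for illustration only
sample_points = [
    (40.751, -73.994),
    (40.752, -73.995),
    (40.781, -73.961),
]
cells = heat_map_cells(sample_points)
hottest_cell, hottest_count = cells.most_common(1)[0]
print(hottest_cell, hottest_count)
```

The cell with the highest count is the “hottest spot”; dividing a cell's count by the total number of tickets gives a rough share of violations for that location.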

4 Scenario (cont.) Correlate parking tickets with parking regulations. Are there streets that have meters but are not frequented by ticketing agents?
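One way to approach the question above is to compare the list of metered streets with the ticket counts per street. The sketch below assumes both data sets expose a street name that can be normalized to the same form; the function name `rarely_ticketed_metered_streets`, the `max_tickets` threshold, and the sample streets are hypothetical.

```python
from collections import Counter

def rarely_ticketed_metered_streets(metered_streets, ticket_streets, max_tickets=0):
    """Return metered streets with at most max_tickets tickets issued on them."""
    # Normalize names so "Main St" and "MAIN ST" match
    counts = Counter(name.strip().upper() for name in ticket_streets)
    metered = {name.strip().upper() for name in metered_streets}
    # Counter returns 0 for streets that never appear in the ticket data
    return sorted(street for street in metered if counts[street] <= max_tickets)

# Made-up street names, for illustration only
metered = ["Main St", "Oak Ave", "Pine Rd"]
tickets = ["MAIN ST", "main st", "Pine Rd"]
quiet_streets = rarely_ticketed_metered_streets(metered, tickets)
print(quiet_streets)
```

Real street names would likely need more careful cleaning (abbreviations, borough qualifiers) before the two lists line up.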

5 Step 1 – Prepare the data

6 1. Find data sources on data.gov. 2. NYC parking violations are here: http://catalog.data.gov/dataset/parking-violations-issued-fiscal-year-2014-august-2013-june-2014-c1a76 3. Download the file.

7 Step 2 – Deploy Analytics for Hadoop and import data

8 Getting started Register for a free Bluemix account at ibm.biz/Datafest

9 Deploy IBM Analytics for Hadoop Under Catalog, search for “Analytics” and deploy the IBM Analytics for Hadoop package.

10 Deploy IBM Analytics for Hadoop Once deployed, you can launch the service from your dashboard.

11 Upload data to be analyzed Once you launch, go to Files, and under “user” create a folder “imports”. Files section: 1- Under “user” create a new folder 2- Call it “imports”

12 Upload file Upload the file you downloaded in step 1. 1- Select “upload file” 2- Select the file from your local machine

13 Create a new BigSheets workbook (i) 1- Select BigSheets > new workbook 2- Name it “ticket data” and select the file you uploaded

14 Create a new BigSheets workbook (ii) 1- Select BigSheets > new workbook 2- Name it “ticket data” and select the file you uploaded

15 Use a CSV importer 1- Select reader > CSV 2- Select “headers” since the file has them
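For readers who want to check the file outside BigSheets, reading a CSV with a header row can be sketched as follows. The sample rows and column names are assumptions made for illustration and may not match the real file; `read_rows` is a hypothetical helper.

```python
import csv
import io

# Made-up sample; the real file's column names may differ
SAMPLE = """Summons Number,Violation Precinct,Street Name
1001,1,Main St
1002,5,Oak Ave
1003,1,Pine Rd
"""

def read_rows(text_stream):
    """Read CSV rows, using the first line as column headers."""
    return list(csv.DictReader(text_stream))

rows = read_rows(io.StringIO(SAMPLE))
print(rows[0]["Violation Precinct"])
```

With the downloaded file, the same function could be given `open(path, newline="")` instead of the in-memory sample.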

16 Let's explore the data First, let's find out which precincts have the most tickets. Select “add new chart”. Create a big data heat map.

17 Create a chart 1- Select violations by precinct for the X axis 2- Select count occurrences of X axis 3- Run the chart
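The “count occurrences of X axis” step amounts to a group-and-count over one column. A minimal sketch of the same computation, assuming rows are dictionaries keyed by column name; the column name "Violation Precinct", the helper `count_by_column`, and the sample values are assumptions, not taken from the slides.

```python
from collections import Counter

def count_by_column(rows, column):
    """Count how many rows share each value of the given column."""
    return Counter(row[column] for row in rows)

# Made-up rows, for illustration only
rows = [
    {"Violation Precinct": "1"},
    {"Violation Precinct": "5"},
    {"Violation Precinct": "1"},
]
counts = count_by_column(rows, "Violation Precinct")
# List precincts from most to fewest tickets
for precinct, n in counts.most_common():
    print(precinct, n)
```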

18 Visualize the data When it comes to parking tickets, precincts 14 & 19 seem to be the toughest ones.

