3 DAYS ON JANUARY 16th, 17th & 18th 2015 Santa Clara Convention Center, 5001 Great America Parkway, Santa Clara, CA 95054, United States.


1 3 DAYS ON JANUARY 16th, 17th & 18th 2015 Santa Clara Convention Center, 5001 Great America Parkway, Santa Clara, CA 95054, United States

2 Data to Analytics in 2 hours Using Hue, Hive and Tableau Public

3 About me Technology geek and Data Evangelist with deep dive expertise in Big Data, Decision Support and operational based systems. www.itversity.com www.linkedin.com/in/durga0gadiraju/ https://www.youtube.com/c/TechnologyMentor

4 Agenda Using NYSE data, we will generate reports and a dashboard to perform top-down volume analysis for the year 2013. Target audience: architects, developers, analysts and almost every IT professional. You will learn about the different open source tools, tips and techniques that are available for quick turnaround of POCs.

5 Agenda
– Understanding data and tools
– Gather data (eoddata.com)
– Prepare or format data
– Upload data to HDFS using Hue
– Process data using Hive
– Develop reports and dashboard using Tableau Public

6 Understanding data and tools
– NYSE EOD data and company list
– Understand tools
  – Apache Hadoop
    – HDFS (distributed, logical file system)
    – MapReduce (distributed batch computing framework)
  – Hue
    – Web interface which consolidates all Hadoop ecosystem tools
    – Useful for developers, testers and analysts
  – Apache Hive
    – Query language based on HDFS and MapReduce
    – Used as a database that can complement or replace an existing data warehouse
  – Tableau Public
    – Reporting tool

7 Gather data
– NYSE data
– Companylist data

8 Prepare or format data
– NYSE EOD data is provided as an individual file for each day, so loading it as-is runs into the too-many-small-files issue.
– Concatenate the small files into larger ones (the best way is to use partitioned tables).
– Companylist is delimited by "," which causes some issues, so change the delimiter to "|".
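
The two preparation steps on this slide can be sketched in Python. This is a minimal illustration, not the presenter's actual procedure. It assumes the daily files are named like NYSE_YYYYMMDD.txt (a hypothetical naming pattern) and that the comma issue in Companylist comes from quoted fields that themselves contain commas. The function names are made up for this example.

```python
import csv
import io
from collections import defaultdict
from pathlib import Path


def concatenate_by_month(src_dir, dest_dir):
    """Merge daily files into one file per month and return the new paths.

    Assumes (hypothetically) that files are named NYSE_YYYYMMDD.txt.
    """
    groups = defaultdict(list)
    for path in sorted(Path(src_dir).glob("NYSE_*.txt")):
        month = path.stem.split("_")[1][:6]  # YYYYMM taken from the file name
        groups[month].append(path)

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for month, paths in sorted(groups.items()):
        out_path = dest / f"NYSE_{month}.txt"
        with out_path.open("w", newline="") as out:
            for path in paths:
                text = path.read_text()
                if text:
                    # Make sure each daily file ends with a newline so
                    # records from consecutive days do not run together.
                    out.write(text if text.endswith("\n") else text + "\n")
        written.append(out_path)
    return written


def comma_to_pipe(csv_text):
    """Re-delimit CSV text with '|', parsing quoted fields first."""
    rows = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()
```

Parsing with the csv module before re-delimiting means a quoted company name that contains a comma stays in one field. A field that already contains "|" would be quoted again by the writer, so such values would need separate handling.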

9 Upload data to HDFS using Hue
– Use File Browser
– Create 2 directories (nyse and companylist)
– Upload files

10 Process data using Hive
– Create 2 tables, one for NYSE EOD data and one for the company list
– Create a user-defined function to transform the date to a sortable date format
– Develop and execute the query, and validate the results
– Create a stage table
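
The date transformation mentioned on this slide can be illustrated outside Hive. The sketch below shows only the conversion logic in Python. It is not the Hive user-defined function from the talk, and it assumes the raw dates look like 02-Jan-2013 (an assumption about the feed format, not stated in the slides).

```python
# Month abbreviations are mapped by hand so the result does not depend
# on the locale of the machine running the code.
MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}


def to_sortable_date(raw):
    """Convert an assumed DD-Mon-YYYY date string into YYYYMMDD."""
    day, month, year = raw.strip().split("-")
    return f"{int(year):04d}{MONTHS[month.lower()]:02d}{int(day):02d}"


# Hypothetical sample values, not taken from the real data set.
raw_dates = ["15-Feb-2013", "02-Jan-2013", "01-Mar-2013"]

# Sorting the raw strings orders them by day-of-month text, which is
# not chronological; sorting by the converted key is.
chronological = sorted(raw_dates, key=to_sortable_date)
```

With the year first, then month, then day, plain string comparison matches chronological order, which is what makes the format sortable.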

11 Develop reports and dashboard using Tableau Public
– Download data using Hue
– Determine the granularity for the report: we need to compute monthly volume per stock ticker in each of the sectors
– Develop reports and dashboard using Tableau Public
  – Filters, Calculated Fields and other features will be covered
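
The report granularity described here (monthly volume per stock ticker within each sector) amounts to a join followed by a grouped sum. The Python sketch below illustrates that aggregation under stated assumptions: each end-of-day row is reduced to (ticker, YYYYMMDD date, volume), the company list is reduced to a ticker-to-sector mapping, and tickers missing from the company list are grouped under "Unknown". The function name and sample values are hypothetical, and in the talk this step is done with a Hive query rather than Python.

```python
from collections import defaultdict


def monthly_volume(eod_rows, sector_by_ticker):
    """Sum volume per (sector, ticker, YYYYMM).

    eod_rows: iterable of (ticker, yyyymmdd, volume) tuples.
    sector_by_ticker: dict mapping ticker -> sector name.
    """
    totals = defaultdict(int)
    for ticker, trade_date, volume in eod_rows:
        # Tickers absent from the company list fall under "Unknown"
        # (a choice made for this sketch, similar to a left join).
        sector = sector_by_ticker.get(ticker, "Unknown")
        month = trade_date[:6]  # YYYYMM prefix of the sortable date
        totals[(sector, ticker, month)] += int(volume)
    return dict(totals)


# Made-up sample rows for illustration only.
sample_rows = [
    ("A", "20130102", 100),
    ("A", "20130115", 50),
    ("A", "20130201", 10),
    ("ZZZ", "20130102", 5),
]
sample_sectors = {"A": "Capital Goods"}
report = monthly_volume(sample_rows, sample_sectors)
```

Aggregating to this level before export keeps the downloaded file small, since the reporting tool then only needs one row per sector, ticker and month.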

12 Thank You http://www.itversity.com https://www.linkedin.com/in/durga0gadiraju https://github.com/dgadiraju https://www.youtube.com/c/TechnologyMentor https://twitter.com/itversity https://www.facebook.com/itversity

