Big Data Analytics: HW#3

1 Big Data Analytics: HW#3
By J. H. Wang
May 16, 2017

2 Programming Exercise: MapReduce
Goal: A MapReduce program for analyzing check-in records
Input: Check-in records from the social networking site Foursquare
- Check-in records: <user_id, venue_id, checkin_time>
- Venues: <venue_id, category, latitude, longitude>
Output: Analysis results (to be detailed later)
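The two record types above can be modeled as small typed records before any analysis. This is a minimal sketch, assuming comma-separated input lines and an ISO 8601 timestamp for checkin_time; the slides do not specify the file format, so the delimiter, the timestamp format, and the names CheckIn, Venue, parse_checkin, and parse_venue are illustrative assumptions only.

```python
from datetime import datetime
from typing import NamedTuple


class CheckIn(NamedTuple):
    user_id: str
    venue_id: str
    checkin_time: datetime


class Venue(NamedTuple):
    venue_id: str
    category: str
    latitude: float
    longitude: float


def parse_checkin(line: str) -> CheckIn:
    # Assumed layout: user_id,venue_id,checkin_time (ISO 8601 timestamp).
    user_id, venue_id, ts = line.strip().split(",")
    return CheckIn(user_id, venue_id, datetime.fromisoformat(ts))


def parse_venue(line: str) -> Venue:
    # Assumed layout: venue_id,category,latitude,longitude.
    venue_id, category, lat, lon = line.strip().split(",")
    return Venue(venue_id, category, float(lat), float(lon))
```

If the real data uses tabs or a different timestamp format, only the two parse functions would need to change.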

3 Output
Detailed analysis including the following tasks:
- List the top checked-in venues (most popular)
- List the top checked-in users
- List the most popular categories
- List the most popular times for check-ins (in one-hour time slots, for example, 7:00-8:00 or 18:00-19:00)
All of the above results *should* be sorted by the number of check-ins
You also have to output the efficiency (running time) of each task
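All four tasks reduce to the same pattern: emit a key per check-in, count per key, and sort by count. The sketch below imitates the map, shuffle, and reduce phases in a single Python process to show the logic; it is not Hadoop or Spark code, and the helper names (map_reduce_count, hour_slot, timed), the sample records, and the venue_category lookup table are hypothetical.

```python
import time
from collections import defaultdict
from datetime import datetime


def map_reduce_count(records, key_fn):
    """Count records per key, mimicking map -> shuffle -> reduce."""
    # Map: emit (key, 1) for every record.
    mapped = ((key_fn(r), 1) for r in records)
    # Shuffle: group the emitted values by key.
    groups = defaultdict(list)
    for key, one in mapped:
        groups[key].append(one)
    # Reduce: sum the values of each key.
    counts = {key: sum(ones) for key, ones in groups.items()}
    # Sort by number of check-ins, descending; ties are broken by key.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def hour_slot(checkin_time: datetime) -> str:
    """Label the one-hour slot of a timestamp, e.g. '18:00-19:00'."""
    h = checkin_time.hour
    return f"{h}:00-{(h + 1) % 24}:00"


def timed(task, *args):
    """Run a task and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = task(*args)
    return result, time.perf_counter() - start


# Hypothetical sample data: (user_id, venue_id, checkin_time).
checkins = [
    ("u1", "v1", datetime(2012, 4, 3, 18, 5)),
    ("u2", "v1", datetime(2012, 4, 3, 18, 40)),
    ("u1", "v2", datetime(2012, 4, 4, 7, 15)),
]
# Hypothetical venue_id -> category lookup, used as a small map-side join.
venue_category = {"v1": "Cafe", "v2": "Gym"}

top_venues, secs = timed(map_reduce_count, checkins, lambda r: r[1])
top_users, _ = timed(map_reduce_count, checkins, lambda r: r[0])
top_categories, _ = timed(
    map_reduce_count, checkins, lambda r: venue_category[r[1]]
)
top_slots, _ = timed(map_reduce_count, checkins, lambda r: hour_slot(r[2]))
```

Only the key function differs between tasks. On a real cluster the in-memory category lookup would likely become a join between the check-in and venue datasets, and the running time would be measured around the whole job rather than a single function call.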

4 Note on Programming Exercises
- Programming exercises can be done as a team (at most two persons per team)
- You can implement your program in any programming language supported by Hadoop or Spark: Java, Scala, Python, or R

5 Optional functions
More analysis
- Using other attributes: latitude, longitude, …
Further extension
- Automatically integrating more information by your program: venue names and information from the <venue_id>, <latitude>, <longitude>, …
- More information about venue_id
- Geographic coordinates: for example, converting latitude, longitude to address

6 Homework Submission
For implementation projects, please submit a compressed file containing:
- Your cluster environment setup: how many PCs, what spec, network setup, …
- Your source code
- Documentation on how to compile, install, or configure the environment
Due: 3 weeks (Jun. 6, 2017)

7 Evaluation
- Each of the four tasks you complete earns part of the score
- The efficiency of the implemented algorithm also counts
- Optional functions earn extra credit
- You might need to give a demo if the program cannot be run

8 Thanks for Your Attention!

