Yiqun Xie, Yongbo Chen CSCI 5715 – Fall 2014 10/21/2014.


1 Yiqun Xie, Yongbo Chen CSCI 5715 – Fall 2014 10/21/2014

2 Spatial Data vs. Spatial Big Data

Simple use cases
  Traditional spatial data: U of M gopher-way map
  Spatial big data: Real-time map of traffic using Waze user-generated content
Examples
  Traditional spatial data: Point data: restaurants; Graph: static roadmaps
  Spatial big data: Check-ins; temporally detailed roadmaps
Volume
  Traditional spatial data: Gigabytes of roadmaps
  Spatial big data: Temporally detailed maps can reach 10^13 items per year; GPS traces from smart phones
Variety
  Traditional spatial data: Raster, vector, graph
  Spatial big data: Moving objects, temporal graphs, frequently updated satellite/UAV imagery, lidar/laser, geo-located tweets about disasters
Velocity
  Traditional spatial data: Limited velocity (census: once per decade)
  Spatial big data: High velocity (show a near real-time map of 400 million tweets/day related to disasters)
Dimensionality
  Traditional spatial data: 2D, 3D
  Spatial big data: Time dimension

Smart phone GPS-trace data size estimation
  Record: Time (date, clock), Location (x, y, z), Metadata = 64 bytes
  One sample every 10 min: 64 x (160 x 10^6) x (6 x 24) -> 1.5 TB per day
  One sample every 10 sec: 90 TB per day
  One sample every 1 sec: 900 TB per day, close to 1 PB
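The GPS-trace size estimate on slide 2 can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, the 160 x 10^6 figure is taken from the slide as the number of reporting devices (the slide does not say so explicitly), and 1 TB is assumed to mean 10^12 bytes.

```python
BYTES_PER_RECORD = 64            # time, location (x, y, z), metadata (from the slide)
NUM_DEVICES = 160 * 10**6        # the slide's 160 x 10^6 figure, assumed to be devices
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_bytes(sampling_interval_s: int) -> int:
    """Bytes generated per day if every device reports once per interval."""
    samples_per_day = SECONDS_PER_DAY // sampling_interval_s
    return BYTES_PER_RECORD * NUM_DEVICES * samples_per_day


if __name__ == "__main__":
    for label, interval in [("10 min", 600), ("10 sec", 10), ("1 sec", 1)]:
        terabytes = daily_volume_bytes(interval) / 10**12  # assumes 1 TB = 10^12 bytes
        print(f"{label}: {terabytes:.1f} TB per day")
```

Under these assumptions the three cases come out to roughly 1.5 TB, 88 TB, and 885 TB per day, which matches the slide's rounded figures of 1.5 TB, 90 TB, and 900 TB.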

3 Big Data vs. Spatial Big Data

Examples
  Big data: Facebook/Twitter posts; Google search terms
  Spatial big data: Geo-located tweets and posts; Open Street Map
Data types
  Big data: Text keywords; web logs
  Spatial big data: GPS traces; geo-located social platform posts; temporally detailed roadmaps; frequently collected satellite/UAV imagery
Questions
  Big data: Google Brain: Does an image contain a cat?
  Spatial big data: Are there any hotspots of recent disaster-related tweets? Where?
Representative computational paradigms
  Big data: Hadoop; hashing; sub-problem optimization (learning)
  Spatial big data: Spatial Hadoop, GIS in Hadoop; declustering; spatial partitioning; spatial queries: partitioning data skew; boundary objects
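The two partitioning issues named on slide 3, data skew and boundary objects, can be illustrated with a small uniform-grid sketch. This is not the Spatial Hadoop API or the presenters' method: the function name, the rectangle format, and the choice to replicate a boundary-crossing object into every cell it overlaps are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Cell = Tuple[int, int]                    # (column, row) of a grid cell


def grid_partition(rects: List[Rect], cell_size: float) -> Dict[Cell, List[int]]:
    """Assign each rectangle (by index) to every grid cell it overlaps.

    A rectangle that crosses a cell boundary is replicated into all
    overlapping cells, which is one simple way to handle boundary objects.
    """
    partitions: Dict[Cell, List[int]] = defaultdict(list)
    for idx, (xmin, ymin, xmax, ymax) in enumerate(rects):
        col0, row0 = int(xmin // cell_size), int(ymin // cell_size)
        col1, row1 = int(xmax // cell_size), int(ymax // cell_size)
        for col in range(col0, col1 + 1):
            for row in range(row0, row1 + 1):
                partitions[(col, row)].append(idx)
    return dict(partitions)


# Hypothetical sample data: three objects cluster near the origin (skew),
# and one of them straddles the boundary x = 10.
rects = [
    (1.0, 1.0, 2.0, 2.0),
    (3.0, 3.0, 4.0, 4.0),
    (9.0, 1.0, 11.0, 2.0),    # boundary object: overlaps two cells
    (15.0, 15.0, 16.0, 16.0),
]
parts = grid_partition(rects, cell_size=10.0)

if __name__ == "__main__":
    for cell, members in sorted(parts.items()):
        print(cell, members)
```

In this sample, one cell receives three of the four objects while the others receive one each, which is the skew a uniform grid cannot avoid on clustered data. The replicated boundary object also means a query that merges results across cells would need to remove duplicates.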

4 Relationship between data volume and use-case complexity

[Figure: use-case complexity (1, log n, n, n log n, n^2, n^3) plotted against volume of dataset n (10^6, 10^9, 10^12, 10^15), with regions for Laptop (10^3 MIPS), Cluster (10^6 MIPS), and Cloud computer (10^9 MIPS).]

Example use cases by complexity
  1: Query using hash map
  log n: Search: binary search
  n: Map check-ins from Facebook
  n^2: Distance between point-pairs

Source: M. Evans, D. Oliver, K. Yang, X. Zhou, S. Shekhar. Enabling Spatial Big Data via CyberGIS: Challenges and Opportunities. Springer, 2014. (Book chapter)
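The trade-off on slide 4 can be made concrete with a back-of-the-envelope runtime estimate. The sketch below is a rough model only: it assumes one abstract operation costs one machine instruction, uses base-2 logarithms, and takes the MIPS ratings from the slide; the dictionary and function names are hypothetical.

```python
import math

# MIPS ratings as given on the slide (million instructions per second).
PLATFORM_MIPS = {"laptop": 1e3, "cluster": 1e6, "cloud": 1e9}

# Operation counts for each complexity class on the slide's vertical axis.
COMPLEXITY = {
    "1": lambda n: 1.0,
    "log n": lambda n: math.log2(n),
    "n": lambda n: float(n),
    "n log n": lambda n: n * math.log2(n),
    "n^2": lambda n: float(n) ** 2,
    "n^3": lambda n: float(n) ** 3,
}


def estimated_seconds(complexity: str, n: int, platform: str) -> float:
    """Estimated runtime, assuming one operation equals one instruction."""
    operations = COMPLEXITY[complexity](n)
    instructions_per_second = PLATFORM_MIPS[platform] * 1e6
    return operations / instructions_per_second


if __name__ == "__main__":
    n = 10**9
    for platform in PLATFORM_MIPS:
        seconds = estimated_seconds("n^2", n, platform)
        print(f"n^2 on {platform}, n = 10^9: {seconds:.3g} s")
```

Under this crude model, computing all pairwise distances (n^2) for 10^9 points takes on the order of 10^9 seconds (about 32 years) on the laptop but about 10^3 seconds on the cloud platform, which is the kind of gap the figure is meant to convey.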

5 Use cases of SBD - Vestas Wind Systems

Goal: improve wind turbine placement for optimal energy output
Platform: IBM BigInsights software + IBM "Firestorm" supercomputer (#53 on the Top500 list)
Data volume: 2.8 petabytes; 20+ petabytes over the next four years
Analysis time: weeks -> less than 1 hour
Base resolution of wind data grids: 27 x 27 km to 3 x 3 km

6 Big Data Creates Big Jobs

McKinsey: a shortage of 140,000 to 190,000 big data professionals by 2018
Gartner: 2 million job openings in the U.S. by 2015

7 Reference

M. Evans, D. Oliver, K. Yang, X. Zhou, S. Shekhar. Enabling Spatial Big Data via CyberGIS: Challenges and Opportunities. Springer, 2014. (Book chapter)
Big data: The next frontier for innovation, competition and productivity. McKinsey Global Institute, May 2011.
http://www.emarketer.com/Article/Smartphone-Users-Worldwide-Will-Total-175-Billion-2014/1010536
http://www.statista.com/statistics/201182/forecast-of-smartphone-users-in-the-us/
http://www-03.ibm.com/press/us/en/pressrelease/35737.wss
http://www.gartner.com/newsroom/id/2207915

