Yiqun Xie, Yongbo Chen CSCI 5715 – Fall 2014 10/21/2014.

1 Yiqun Xie, Yongbo Chen CSCI 5715 – Fall 2014 10/21/2014

2 Spatial Data vs. Spatial Big Data
Aspect | Traditional Spatial Data | Spatial Big Data
Simple use cases | U of M gopher-way map | Real-time map of traffic using Waze user-generated content
Examples | Point data: restaurants; Graph: static roadmaps | Check-ins; temporally detailed roadmaps
Volume | Gigabytes of roadmaps | Temporally detailed maps can reach items per year; GPS traces from smart phones
Variety | Raster, vector, graph | Moving objects, temporal graph, frequently updated satellite/UAV imagery, lidar/laser, geo-located tweets about disasters
Velocity | Limited velocity (census-decade) | High velocity (show near real-time map of 400 million tweets/day related to disasters)
Dimensionality | 2D, 3D | Time dimension
Smart phone GPS-trace data size estimation
Record: Time (date, clock), Location (x, y, z), Metadata: 64 bytes
10 min: 64 x (160 x 10^6) x (6 x 24) -> 1.5 TB per day
10 sec: 90 TB per day
1 sec: 900 TB per day, close to 1 PB
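The GPS-trace size estimate on this slide can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it assumes that 160 x 10^6 is the number of reporting devices and that 6 x 24 is the number of samples per device per day at a 10-minute interval (the slide does not label either factor), and the names `daily_volume_bytes` and `NUM_DEVICES` are made up for this example. Terabytes are taken as 10^12 bytes.

```python
# Rough reproduction of the slide's GPS-trace volume estimate.
# Assumptions (not stated on the slide): 160e6 is a device count,
# and each device emits one 64-byte record per sampling interval.

RECORD_BYTES = 64            # time + location + metadata, per the slide
NUM_DEVICES = 160 * 10**6    # assumed meaning of "160 x 10^6"
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_bytes(sampling_interval_s, num_devices=NUM_DEVICES,
                       record_bytes=RECORD_BYTES):
    """Bytes generated per day if every device reports once per interval."""
    samples_per_day = SECONDS_PER_DAY // sampling_interval_s
    return record_bytes * num_devices * samples_per_day


def to_terabytes(num_bytes):
    """Convert bytes to decimal terabytes (10^12 bytes)."""
    return num_bytes / 10**12


if __name__ == "__main__":
    for label, interval_s in [("10 min", 600), ("10 sec", 10), ("1 sec", 1)]:
        tb = to_terabytes(daily_volume_bytes(interval_s))
        print(f"{label}: {tb:.1f} TB per day")
```

Under these assumptions the results are about 1.47, 88.5, and 885 TB per day, which the slide rounds to 1.5, 90, and 900 TB.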

3 Big Data vs. Spatial Big Data
Aspect | Big Data | Spatial Big Data
Examples | Facebook/Twitter posts; Google search terms | Geo-located tweets and posts; Open Street Map
Data Types | Text keywords; Web logs | GPS traces; geo-located social platform posts; temporally detailed roadmaps; frequently collected satellite/UAV imagery
Questions | Google brain: Does an image contain a cat? | Are there any hotspots of recent disaster-related tweets? Where?
Representative Computational Paradigms | Hadoop; Hashing; Sub-problem optimization (learning) | Spatial Hadoop, GIS in Hadoop; Declustering; Spatial partitioning; spatial queries: partitioning data skew; boundary objects
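The two partitioning issues named in the last row, data skew and boundary objects, can be illustrated with a uniform grid. This is a minimal sketch, not how Spatial Hadoop or GIS in Hadoop actually partition data: it assumes axis-aligned bounding boxes given as `(min_x, min_y, max_x, max_y)`, replicates an object into every cell it overlaps, and uses hypothetical helper names (`cells_for_box`, `partition_boxes`, `partition_sizes`).

```python
import math
from collections import defaultdict


def cells_for_box(box, cell_size):
    """Return every (col, row) grid cell that a bounding box overlaps."""
    min_x, min_y, max_x, max_y = box
    first_col = math.floor(min_x / cell_size)
    last_col = math.floor(max_x / cell_size)
    first_row = math.floor(min_y / cell_size)
    last_row = math.floor(max_y / cell_size)
    return [(col, row)
            for col in range(first_col, last_col + 1)
            for row in range(first_row, last_row + 1)]


def partition_boxes(boxes, cell_size):
    """Assign objects to grid cells; boundary objects land in several cells."""
    partitions = defaultdict(list)
    for object_id, box in boxes.items():
        for cell in cells_for_box(box, cell_size):
            partitions[cell].append(object_id)
    return dict(partitions)


def partition_sizes(partitions):
    """Objects per cell; a wide spread between cells indicates data skew."""
    return {cell: len(ids) for cell, ids in partitions.items()}


if __name__ == "__main__":
    boxes = {
        "a": (0.5, 0.5, 0.8, 0.8),   # entirely inside cell (0, 0)
        "b": (0.9, 0.2, 1.1, 0.4),   # straddles the boundary at x = 1
        "c": (0.1, 0.1, 0.2, 0.2),   # entirely inside cell (0, 0)
    }
    partitions = partition_boxes(boxes, cell_size=1.0)
    print(partitions)
    print(partition_sizes(partitions))
```

In this toy input, object "b" is a boundary object and is replicated into two cells, so a query that merges results from both cells would need to remove the duplicate. Cell (0, 0) also holds three objects while cell (1, 0) holds one, a small instance of skew.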

4 Relationship between data volume and use-case complexity
[Figure axes: use-case complexity and volume of dataset (n). Platforms shown: laptop (10^3 MIPS), cluster (10^6 MIPS), cloud computer (10^9 MIPS).]
Complexity levels and example use cases:
1: query using hash map
log n: search (binary search)
n: map check-ins from Facebook
n log n
n^2: distance between point-pairs
n^3
M. Evans, D. Oliver, K. Yang, X. Zhou, S. Shekhar. Enabling Spatial Big Data via CyberGIS: Challenges and Opportunities. Springer, (Book chapter)
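To get a feel for how dataset volume and use-case complexity interact with the three platform classes, the sketch below makes a back-of-the-envelope runtime estimate. It assumes, very crudely, that each elementary operation costs one machine instruction and that a platform sustains its full MIPS rating; the function and dictionary names are hypothetical and the numbers are orders of magnitude only.

```python
import math

# Platform ratings from the slide, in millions of instructions per second.
PLATFORM_MIPS = {"laptop": 1e3, "cluster": 1e6, "cloud": 1e9}

# Operation counts for each complexity class named on the slide.
OPERATION_COUNT = {
    "1": lambda n: 1.0,
    "log n": lambda n: math.log2(n),
    "n": lambda n: float(n),
    "n log n": lambda n: n * math.log2(n),
    "n^2": lambda n: float(n) ** 2,
    "n^3": lambda n: float(n) ** 3,
}


def estimated_seconds(n, complexity, platform):
    """Crude runtime estimate, assuming one instruction per operation."""
    operations = OPERATION_COUNT[complexity](n)
    instructions_per_second = PLATFORM_MIPS[platform] * 1e6
    return operations / instructions_per_second


if __name__ == "__main__":
    n = 10**6  # e.g. one million points
    for platform in PLATFORM_MIPS:
        seconds = estimated_seconds(n, "n^2", platform)
        print(f"n^2 on {platform}: about {seconds:g} s")
```

With one million points, the pairwise-distance case (n^2) comes out at roughly 1000 seconds on the laptop-class machine and about one second on the cluster-class machine under these assumptions, while an n^3 use case on the same data would take on the order of a thousand seconds even on the cloud-class platform.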

5 Use cases of SBD - Vestas Wind Systems
Improve wind turbine placement for optimal energy output
IBM BigInsights software + IBM "Firestorm" (#53 on the Top500 supercomputer list)
Data volume: 2.8 petabytes; 20+ petabytes over the next four years
Analysis time: weeks -> less than 1 hour
Base resolution of wind data grids: 27x27 km to 3x3 km

6 Big Data Creates Big Jobs
McKinsey: a shortage of 140,000 to 190,000 big data professionals by 2018
Gartner: 2 million job openings in the U.S. by 2015

7 Reference
M. Evans, D. Oliver, K. Yang, X. Zhou, S. Shekhar. Enabling Spatial Big Data via CyberGIS: Challenges and Opportunities. Springer, (Book chapter)
Big data: The next frontier for innovation, competition and productivity, McKinsey Global Institute, May,
Billion-2014/

