BIG DATA: Challenges & Opportunities. Lei Chen


1 BIG DATA: Challenges & Opportunities. Lei Chen

2 Outline
Background: Big data is a term acknowledging the exponential growth, availability and use of …
Challenges: Big data poses grand challenges for data capture, storage, analysis …
Opportunities: Many applications can benefit from big data …

3 Background: We are capturing more data
Satellite imagery, mobile stations, distributed sensor networks, geographical plotting …
Super-exponential growth in data volume
Copyright belongs to Data Analysis Challenges, JSR-08-142, Dec

4 Background: We are using more data
Intelligent transportation; digital health care

5 Background: We need quick processing of the data
Volcano monitoring; hurricane path prediction

6 Background: We are exploring the unknowns with different means of data measurement
Ocean science; exploring the universe

7 Background: We are discovering new rules from data
The well-formed.eigenfactor project visualizes information flow in science. This diagram shows the citation links of the journal Nature.
Copyright belongs to http://well-formed.eigenfactor.org

8 Background: Defining Big Data
Wiki: Big data are datasets that grow so large that they become awkward to work with using on-hand database management tools. Difficulties include capture, storage, search, sharing, analytics and visualizing.
Gartner (2011): Big data is a popular term used to acknowledge the exponential growth, availability and use of information in the data-rich landscape of tomorrow.

9 Background: Features of Big Data
3V: Variety, Velocity and Volume

10 Challenges
Network Topology
Applications
Storage (Reliability, Scalability, Availability)
Data Model (Interpretation, Representation)
Data Processing (Processing language, Optimization, Visualization)
Data Extraction (Acquisition, Integration, Representation)

11 Challenges: Data model challenges
Outline of challenges: Data Model, Storage, Management, Processing
Volume: scale up, scale out, and scale in
Velocity: interactive properties to facilitate processing
Variety: simple but unified, to adapt to heterogeneity
Existing data models are not satisfactory
Functionality vs. simplicity

12 Challenges: Storage challenges
Storage concerns:
Reliability: data is safe and trustworthy
Availability: data is accessible
Scalability: data operation performance does not degrade as data size grows
However, the CAP theorem is the bottleneck. No one-size-fits-all solution exists.

13 Challenges: Storage challenges
CAP Theorem: Consistency, Availability, Partition tolerance
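
The trade-off named on this slide can be illustrated with a toy model. The sketch below is not from the presentation: the class names and the "CP"/"AP" mode labels are hypothetical, and a real system detects partitions and replicates data in far more involved ways. It only shows that, once two replicas cannot talk to each other, a store either refuses writes (keeping consistency, losing availability) or accepts them (keeping availability, letting replicas diverge).

```python
class Replica:
    """One node holding a copy of the data (hypothetical toy class)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeStore:
    """Toy two-replica store; mode is 'CP' or 'AP' (illustrative labels)."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica("a"), Replica("b")
        self.partitioned = False

    def write(self, replica, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency preferred: refuse the write, so availability is lost.
                raise RuntimeError("unavailable during partition")
            # Availability preferred: accept locally; replicas may now disagree.
            replica.data[key] = value
            return
        # No partition: replicate synchronously to both nodes.
        self.a.data[key] = value
        self.b.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)


def demo():
    """Return {mode: (write_accepted, value_on_a, value_on_b)} under a partition."""
    results = {}
    for mode in ("CP", "AP"):
        store = TwoNodeStore(mode)
        store.write(store.a, "x", 1)
        store.partitioned = True
        try:
            store.write(store.a, "x", 2)
            accepted = True
        except RuntimeError:
            accepted = False
        results[mode] = (accepted, store.read(store.a, "x"), store.read(store.b, "x"))
    return results
```

In this toy, the CP store rejects the second write and both replicas still hold 1, while the AP store accepts it and the two replicas return different values until they are reconciled.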

14 Challenges: Storage challenges
ACID vs. BASE
RDBMS (ACID): Atomic, Consistent, Isolated, Durable
NoSQL (BASE): Basically Available, Soft-state, Eventually consistent
Systems placed on the CAP triangle (C, P, A):
BigTable, HyperTable, HBase, MongoDB, Redis, Scalaris, etc.
RDBMS
Dynamo, CouchDB, Cassandra, SimpleDB, Tokyo Cabinet, Riak, Voldemort, etc.
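
To make "eventually consistent" concrete, here is a minimal sketch assuming a last-write-wins merge keyed by timestamp. This is one common convergence strategy, not something the slides prescribe, and the function name and replica contents are made up for illustration.

```python
def merge(local, remote):
    """Merge two replica states of the form {key: (timestamp, value)}.

    The entry with the newest timestamp wins (last-write-wins assumption).
    """
    merged = dict(local)
    for key, (ts, value) in remote.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged


# Two replicas that accepted different writes while out of contact (illustrative data).
replica_a = {"cart": (1, ["book"])}
replica_b = {"cart": (2, ["book", "pen"]), "user": (1, "alice")}

# Anti-entropy exchange: each replica merges the other's state.
replica_a, replica_b = merge(replica_a, replica_b), merge(replica_b, replica_a)
```

After the exchange both replicas hold the same state. Until it happens, reads may return stale values, which is the "soft state" that BASE accepts in return for availability.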

15 Challenges: Management challenges
"Solving 'Big Data' Challenge Involves More Than Just Managing Volumes of Data" (Gartner, 2011)
Big data management:
Functionality: indexing & partition
Flexibility: adaptation to new requirements and new components

16 Challenges: Management challenges
E.g., indexing over big data
Volume: a large volume of data is captured every time unit. Requires: a distributed adaptive index. Leads to: significant cost of metadata exchange.
Variety: data is captured from different sources. Requires: a distributed adaptive index. Leads to: ambiguity in indexing the same object.
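
The indexing cost mentioned on this slide can be sketched with a hash-partitioned index. The example below assumes simple modulo placement with `zlib.crc32`, and the helper names are hypothetical. The slides do not specify any particular scheme. It counts how many keys change shard when one shard is added, as a rough proxy for the metadata that nodes would have to exchange when the index adapts.

```python
import zlib


def shard_of(key, num_shards):
    """Pick a shard for a key; crc32 is deterministic across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def build_index(keys, num_shards):
    """Distribute keys over num_shards sets, one set per shard."""
    shards = [set() for _ in range(num_shards)]
    for key in keys:
        shards[shard_of(key, num_shards)].add(key)
    return shards


def lookup(index, key):
    """Check membership by visiting only the shard responsible for the key."""
    return key in index[shard_of(key, len(index))]


def moved_keys(keys, old_shards, new_shards):
    """Count keys whose shard changes when the shard count changes."""
    return sum(1 for k in keys if shard_of(k, old_shards) != shard_of(k, new_shards))


keys = [f"object-{i}" for i in range(1000)]  # illustrative object identifiers
index = build_index(keys, 4)
moved = moved_keys(keys, 4, 5)  # growing from 4 to 5 shards
```

With plain modulo placement a large share of keys typically moves when the shard count changes. Schemes such as consistent hashing exist to reduce that movement.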

17 Challenges: Challenges on processing
New query language (algebra)
Desired | Sacrifices & Overhead
Flexibility | Complexity in data modeling
Relational support | Poor scalability
Uncertainty support | Poor scalability and significant computing overhead
Scalability | Less functionality
Efficiency & Effectiveness | Poor scalability

18 Challenges: Challenges on processing
New computing paradigm for processing
Distributed Computing Paradigm | Limitations
Message Passing | Poor scalability and fault tolerance
Unified Access | Invalidated efficiency over large computing nodes
MapReduce | Poor functionality
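
For readers unfamiliar with the MapReduce paradigm listed on this slide, the following single-process word count is a toy illustration of its three phases. The function names are hypothetical, and a real framework would distribute the map and reduce work across many nodes. The example also hints at the "poor functionality" limitation: every computation has to be phrased as a map step followed by a per-key reduce step.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data big challenges", "big opportunities"])
```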

19 Challenges: Challenges on processing
New optimization methodology
Load Balance | Data Locality
High Parallelism | Merging Cost
Less Network I/O | Replicated Computing
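
The first pairing on this slide, load balance against data locality, can be sketched with a small greedy scheduler. This is only an illustration under stated assumptions: the `schedule` function, the `max_skew` knob and the node names are hypothetical and do not come from the presentation.

```python
def schedule(tasks, nodes, max_skew):
    """Assign tasks to nodes, preferring the node that stores each task's data.

    tasks: list of (task_id, data_node) pairs.
    max_skew: how far the data node's load may exceed the least-loaded node
              before the task is sent elsewhere (hypothetical tuning knob).
    Returns (assignment, remote_reads, load).
    """
    load = {n: 0 for n in nodes}
    assignment, remote_reads = {}, 0
    for task_id, data_node in tasks:
        least = min(nodes, key=lambda n: load[n])
        if load[data_node] - load[least] < max_skew:
            target = data_node          # keep locality
        else:
            target = least              # rebalance, at the cost of a remote read
            remote_reads += 1
        assignment[task_id] = target
        load[target] += 1
    return assignment, remote_reads, load


nodes = ["n1", "n2", "n3"]
tasks = [(i, "n1") for i in range(6)]   # all input data lives on n1

_, remote_local, load_local = schedule(tasks, nodes, max_skew=6)        # locality first
_, remote_balanced, load_balanced = schedule(tasks, nodes, max_skew=1)  # balance first
```

With a large `max_skew` every task runs next to its data and a single node does all the work. With a small one the load evens out, but most tasks must read their input over the network.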

20 Opportunities: Why Big Data?
We are empowered to learn knowledge and process information more accurately, effectively and efficiently.
Big Data: Natural Science Study; Fundamental Scientific Research; Social Civilization; Daily Life

21 Opportunities: Big Data for natural science study
E.g., natural disaster forecasting and management
Disasters: flood, earthquake, extreme weather
Tasks: forecasting, management
Data: meteorological data; geographic data; population, transportation and urban design data; economic data

22 Opportunities: Big Data for fundamental scientific research
E.g., bioinformatics and medicine
The mutually reinforcing relationship between gene technology and clinical medicine

23 Opportunities: Big Data for social civilization
Light-speed information spreading & enormous knowledge
Quick event detection
Easy collaboration
Wondering where to get a really good cup of coffee? Just tweet your question!

24 Opportunities: Big Data for daily life
Our life can be much easier with more data …
E.g., trip planning
Request: travel to Beijing; 3-day stay; budget < $1000; Forbidden City; 10am meeting every day
Real-world incidents: traffic jam, luggage delay, bad weather
Predefine; updating; adaptive agenda

25 Opportunities: Opportunity highlights
Volume: capturing, storing and analyzing data helps us better understand the world
Velocity: guaranteed effective & efficient data processing
Variety: handling heterogeneous sources of data
Considering all the challenges and constraints, perhaps there is no one-size-fits-all solution.
However, application-dependent Big Data solutions are promising.

26 Applications: Heterogeneous data management
Search doctors; search universities (in progress) …
Components: Data Integration, Data Extraction, OLAP, Query Processing, Integrated Database
Data extraction: ~500,000 doctors & ~30,000 hospitals from a 50+ GB source
Sources: web pages on the Internet; hospital databases; search results from general-purpose search engines; news / rumors
Example: Search Doctors

