Understanding Big Data

Presentation on theme: "Understanding Big Data"— Presentation transcript:

1 Understanding Big Data
Mr. Sriram

2 Objectives
What is Big Data?
Applications of Big Data
Challenges in handling Big Data
Big Data Analytics
Application of BDA in real world
More use cases for Big Data
Analyze Limitations and Solutions of Existing Data Analytics Architecture
What is Distributed Data File System?

3 What is Big Data
What is Big Data?
Applications of Big Data
Challenges in handling Big Data
What comes under Big Data
Characteristics of Big Data / Types of Data
Unstructured data is exploding heavily
Big Data Analytics
Application of BDA in real world
More use cases for Big Data
Big Data Customers

4 What is Big Data?
Lots of data / huge data (more than 1 petabyte)
Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications.
Big data is a general term used to describe the voluminous amount of unstructured and semi-structured data a company creates.
The size of big data is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time.
The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization.

5 What is Big Data?

6 How big is Big Data?
Lots of data / huge data (more than 1 petabyte)
1 petabyte (PB) = 1,000 terabytes (TB)
1 zettabyte (ZB) = 1 billion terabytes
1 terabyte = 1,000 GB
Facebook: Facebook Insights provides developers and website owners with access to real-time analytics related to Facebook activity across websites with social plugins, Facebook Pages, and Facebook Ads. Using anonymized data, Facebook surfaces activity such as impressions, click-through rates, and website visits, i.e., 30+ PB of data per day.
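The unit relationships on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes decimal (SI) prefixes, where each unit is 1,000 times the previous one; the `UNITS` table and the `convert` helper are illustrative names, not part of the presentation.

```python
# Decimal (SI) storage units expressed in bytes (assumption: powers of 1,000).
UNITS = {
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
}


def convert(value, from_unit, to_unit):
    """Convert a quantity between two storage units from the UNITS table."""
    return value * UNITS[from_unit] / UNITS[to_unit]


if __name__ == "__main__":
    print("1 PB in TB:", convert(1, "PB", "TB"))
    print("1 ZB in TB:", convert(1, "ZB", "TB"))
    print("1 TB in GB:", convert(1, "TB", "GB"))
```

Under these prefixes a petabyte is a thousand terabytes and a zettabyte is a billion terabytes, so the zettabyte is the larger unit by a factor of one million.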

7 Applications of Big Data
A primary goal for looking at big data is to discover repeatable business patterns.
Every day, systems and enterprises around the world generate huge amounts of data, from terabytes to petabytes.
Big data examples:
Google processes about 24 petabytes of data per day.
The experiments in the Large Hadron Collider produce about 15 petabytes of data per year.
The 2009 movie Avatar is reported to have taken over 1 petabyte of local storage at Weta Digital for the rendering of the 3D CGI effects.
Amazon handles clickstream data from 20 million customer clicks per day to recommend products.
The stock market generates about one terabyte of new trade data per day, which is used for stock trading analytics to determine trends for optimal trades.
About 300 billion emails are sent every day. Email services analyze this data to find spam.

8 Challenges in handling Big Data
Difficulties – Capture, storage, search, sharing, analytics, visualizing data
Data Storage – Physical storage, acquisition, space and power costs
Data Management – Skills, people, time
Data Processing – Information and content management

9 What comes under Big Data?
Big data involves the data produced by different devices and applications. Given below are some of the fields that come under the umbrella of Big Data.
Black Box Data: The black box is a component of helicopters, airplanes, and jets. It captures the voices of the flight crew, recordings from microphones and earphones, and the performance information of the aircraft.
Social Media Data: Social media such as Facebook and Twitter hold information and views posted by millions of people across the globe.
Stock Exchange Data: Stock exchange data holds information about the 'buy' and 'sell' decisions that customers make on shares of different companies.
Power Grid Data: Power grid data holds information about the power consumed by a particular node with respect to a base station.
Transport Data: Transport data includes the model, capacity, distance, and availability of a vehicle.
Search Engine Data: Search engines retrieve lots of data from different databases.

10 Characteristics of Big Data
Volume – Size of data
Velocity – Speed of data
Variety – Different types of data
Veracity – Trustworthiness of data
Value – Business value (money) derived from the data
Variability – Data is not constant
Visualization – Report generation
Categories / Types of Data
Structured Data: Data from enterprise systems such as ERP and CRM. E.g., tables, relational data
Semi-Structured Data: XML files, email body. E.g., XML, documents with tables
Unstructured Data: Audio, video, images, archived documents. E.g., raw text, images, audio, and video
Streaming Data: E.g., YouTube, tweets
Temporal Data: E.g., OLAP/OLTP data, including trends and activities over time
Geospatial Data: E.g., regions, tracks, shapes

11 Unstructured Data is Exploding heavily
90% of the world's data was generated in the last few years.
Due to the advent of new technologies, devices, and communication means like social networking sites, the amount of data produced by mankind is growing rapidly every year.
IDC (International Data Corporation) predicts that by 2020 the number will have reached 40,000 exabytes (EB), or 40 zettabytes (ZB).
The world's information is doubling every two years.
By 2020, there will be 5,200 GB of data for every person on earth.

12 Big Data Analytics

13 Big Data Analytics..
In Big Data Analytics (BDA), the user is typically trying to discover new facts that no one in the enterprise knew before.
BDA helps in enterprise information management and decision making.
The characteristics common to the technologies identified with BDA:
The perception that traditional data warehousing processes are too slow and limited in scalability
The ability to converge data from multiple data sources, both structured and unstructured
The realization that time to information is critical to extract value from data sources that include mobile devices, RFID, sensors, etc.

14 Applications of Big Data Analytics in the real world

15 More Use Cases for Big Data
Research & Development: Use customer insights to eliminate unnecessarily costly features and add features that have a higher value for the customer. Improve gross margins.
After-Sales Support: Obtain real-time input on emerging defects and adjust the production process immediately. R&D operations could use this data for redesign and new product development.
Police departments: Target crime hotspots and prevent crime waves.
Public utilities: Use data from sensors on water and sewer usage. Detect leaks and reduce water consumption.
Electric power utilities: Use smart meters to better manage resources and avoid blackouts.

16 Big Data Customers

17 Big Data Customers..

18 Big Data Customers..

19 Hidden Treasure

20 Analyze Limitations And Solutions Of Existing Data Analytics Architecture

21 Limitations of Existing Data Analytics Architecture

22 Solutions of Existing Data Analytics Architecture

23 Distributed Data File System

24 What is Distributed Data File System?
Reading 1 TB from 1 machine (4 I/O channels at 100 MB/s each):
1 TB = 1024 * 1024 MB = 1,048,576 MB
1,048,576 MB / 4 channels = 262,144 MB per channel
262,144 MB / 100 MB/s = 2,621.44 seconds
2,621.44 s / 60 = about 43.7 minutes to read
Reading 1 TB from 10 machines:
43.7 minutes / 10 = about 4.4 minutes to read
To speed up reading, put 1/10th of the data on each of 10 different machines built from commodity hardware, each with 4 I/O channels at 100 MB/s.
Result: the data is processed in about 1/10th of the time.
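The read-time arithmetic on this slide can be reproduced with a short script. This is a minimal sketch that takes the slide's figures as given (1 TB counted as 1024 * 1024 MB, 4 I/O channels per machine, 100 MB/s per channel) and assumes the data splits evenly and all channels read in parallel with no coordination overhead; the function and variable names are illustrative.

```python
def read_time_minutes(total_mb, machines=1, channels_per_machine=4,
                      mb_per_sec_per_channel=100):
    """Minutes needed to read total_mb when every channel reads in parallel."""
    # Aggregate throughput across all machines and channels, in MB/s.
    throughput = machines * channels_per_machine * mb_per_sec_per_channel
    seconds = total_mb / throughput
    return seconds / 60


ONE_TB_IN_MB = 1024 * 1024  # 1,048,576 MB, as on the slide

single_machine = read_time_minutes(ONE_TB_IN_MB, machines=1)
ten_machines = read_time_minutes(ONE_TB_IN_MB, machines=10)

if __name__ == "__main__":
    print(f"1 machine:   {single_machine:.1f} minutes")
    print(f"10 machines: {ten_machines:.1f} minutes")
```

Because throughput scales linearly with the number of machines in this idealized model, ten machines finish in one tenth of the single-machine time. A real cluster would add network and scheduling overhead that this sketch ignores.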

25 File System
Types of file system in Hadoop
LFS: Local file system (local)
DFS: Distributed file system (server)
HDFS: Hadoop Distributed File System (cluster)

26 Logos Lab

27 Thank You !!!!!!!!!!!

