MIS 3500 Instructor: Bob Travica Newer DB Topics 2015.


1 MIS 3500 Instructor: Bob Travica Newer DB Topics 2015

2 Big Data
The 3 big Vs:
- Volume: terabytes (12 zeroes), petabytes (15 zeroes)
- Variety: social media, communications, sensors everywhere, Internet of Things, video feeds, GPS… Implication: various formats
- Velocity: continuous wired and wireless feeds
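The Volume figures can be checked with a few lines of arithmetic. This is a minimal sketch assuming decimal (SI) units, where each step up multiplies by 1,000; the names `UNITS` and `bytes_in` are illustrative only.

```python
# Decimal (SI) storage units, stored as the power of ten (number of zeroes).
UNITS = {
    "kilobyte": 3,
    "megabyte": 6,
    "gigabyte": 9,
    "terabyte": 12,
    "petabyte": 15,
    "exabyte": 18,
}


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one decimal unit of the given name."""
    return 10 ** UNITS[unit]


for name, zeroes in UNITS.items():
    print(f"1 {name} = 1 followed by {zeroes} zeroes = {bytes_in(name):,} bytes")
```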

3 Goals and Uses
Goals:
- Integrate data on the same object across sources (Customer, Citizen, etc.; spatial mashups)
- Analysis: existing patterns, predictive analysis
Application domains:
- Monitoring for business and other purposes (sensors)
- Marketing (relationship marketing, sentiment analysis in social media…)
- Energy grid management
- Transportation network management
- Health (analysis of cancer cell behavior and of patient vital signs)
- Science (human genome)
- Policy analysis (United Nations' system for predicting social problems)

4 Big Data Tasks

5
- Machine-generated data (sensors); automatic creation and transfer
- Home appliances (security, energy consumption, heating, food, entertainment)
- Monitoring/control (cars, athletic equipment, machinery, appliances)
- Example: smart power grid
Figure: smart meter; Internet and Wi-Fi connectivity

6 Technologies
- Hadoop (framework for a file system and for processing large datasets on server clusters)
- Machine learning: automated construction of models to fit data (instead of hypothesis testing, as with DW and analytics)
- Open source
- Notable developers: Yahoo!, Facebook, Google, Microsoft
Figure: Microsoft Azure-based Hadoop
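Hadoop's processing side follows the MapReduce model: a map step emits key/value pairs, the framework groups them by key, and a reduce step aggregates each group. The sketch below imitates those three steps in a single Python process to show the idea. It does not use the Hadoop API, and the function names and sample lines are illustrative assumptions.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map: emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by key (on a cluster, the framework does this)."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input; on a real cluster each line could live on a different server.
sample_lines = ["big data big clusters", "data on clusters"]
counts = word_count(sample_lines)
print(counts)
```

On a cluster, the map and reduce calls would run in parallel on the servers that hold the data, which is what lets the approach scale to large datasets.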

7 DATA PROCESSING

8
- A database for Big Data
- Distributed, non-relational, scalable
- Based on Google's BigTable

Row Key (reversed URL) | Time Stamp | Column Key: "Anchor" (Family) + URL part (Qualifier)
"com.cnn.www" | t9 | anchor:cnnsi.com = "CNN"
"com.cnn.www" | t8 | anchor:my.look.ca = "CNN.com"

Row Key | Time Stamp | Column Key: "Contents" + keyword in tagged content
"com.cnn.www" | t6 | contents:html = "…"
"com.cnn.www" | t5 | contents:html = "…"
"com.cnn.www" | t3 | contents:html = "…"

Anchor columns: the qualifiers are the referencing sites; the data are their cites of "CNN*".
Contents columns: the data are webpages, compressed. There can be any number of unbound contents columns.
All columns put together make a "BigTable".
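The table layout above can be read as a nested map: row key, then column key (family:qualifier), then time stamp, then value. The following is a minimal in-memory sketch of that data model only. `TinyBigTable` and `reversed_url_key` are hypothetical names, not part of any real product's API; integer time stamps stand in for t9, t8, and so on, and the page strings are placeholders for the elided contents.

```python
from collections import defaultdict


class TinyBigTable:
    """Toy model: row key -> "family:qualifier" -> {timestamp: value}."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column_key, timestamp, value):
        """Store one timestamped version of a cell."""
        self._rows[row_key][column_key][timestamp] = value

    def get(self, row_key, column_key, timestamp=None):
        """Return the version at `timestamp`, or the newest one if omitted."""
        versions = self._rows.get(row_key, {}).get(column_key, {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def columns(self, row_key, family):
        """List the column keys of one family; a row may hold any number."""
        prefix = family + ":"
        return sorted(c for c in self._rows.get(row_key, {}) if c.startswith(prefix))


def reversed_url_key(hostname):
    """Build a row key by reversing the host name parts."""
    return ".".join(reversed(hostname.split(".")))


table = TinyBigTable()
row = reversed_url_key("www.cnn.com")  # "com.cnn.www"

# Anchor family: the qualifier is the referencing site, the value is its cite.
table.put(row, "anchor:cnnsi.com", 9, "CNN")
table.put(row, "anchor:my.look.ca", 8, "CNN.com")

# Contents family: several timestamped versions of the same page (placeholders).
for ts in (6, 5, 3):
    table.put(row, "contents:html", ts, f"page version at t{ts}")

print(table.columns(row, "anchor"))
print(table.get(row, "contents:html"))
```

Reversing the host name keeps pages from the same domain next to each other when row keys are sorted, which is why the row key is written as "com.cnn.www".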

9 NoSQL – Not Only SQL

10 Modern Database environments

11 Modern Database environments

