Big Data

Presentation transcript:

+ Big Data

+ Chapter Objectives: Learn the basic concepts of Big Data, structured storage, and the MapReduce process; of data warehouses and data marts; of dimensional databases; of business intelligence (BI) systems; and of Online Analytical Processing (OLAP). KROENKE and AUER - DATABASE CONCEPTS (6th Edition) Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall 8-2

+ Big Data: The rapidly expanding amount of data being stored and used in enterprise information systems. Examples: search tools (Google, Bing) and Web 2.0 social networks (Facebook, LinkedIn, Twitter).

+ Storage Capacity Terms (Figure 8-1: Storage Capacity Terms)

+ Heather Sweeney Designs Review: Database Design

+ Business Intelligence Systems: Business intelligence (BI) systems are information systems that (1) assist managers and other professionals in the analysis of current and past activities and in the prediction of future events; (2) do not support operational activities, such as the recording and processing of orders, which are supported by transaction processing systems; and (3) support management assessment, analysis, planning, and control. BI systems fall into two broad categories: reporting systems, which sort, filter, group, and make elementary calculations on operational data; and data mining applications, which perform sophisticated analyses on data, usually involving complex statistical and mathematical processing.

+ The Relationship Among Operational and BI Applications (Figure 8-3: The Relationship Between Operational and BI Applications)

+ Characteristics of Business Intelligence Applications (Figure 8-4: Characteristics of Business Intelligence Applications)

+ Components of a Data Warehouse (Figure 8-5: Components of a Data Warehouse)

+ Problems with Operational Data: “Dirty data” (examples: “G” for Gender, “213” for Age); missing values; inconsistent data (example: data that has changed, such as a customer’s phone number).

+ Problems with Operational Data (Continued): Nonintegrated data (example: data from two or more sources that need to be combined); incorrect format (example: time data in hours when needed in minutes); too much data (example: an excess number of columns).
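
The "dirty data" and missing-value problems listed above can be illustrated with a small check run before loading rows into a warehouse. This is only a sketch: the field names, the set of valid gender codes, and the age limit are assumptions chosen for illustration, not part of the slides.

```python
from typing import Dict, List

VALID_GENDERS = {"F", "M"}   # assumed code set, for illustration only
MAX_PLAUSIBLE_AGE = 120      # assumed upper bound on a believable age


def find_problems(row: Dict) -> List[str]:
    """Return the data-quality problems found in one operational row."""
    problems = []
    # Missing values: a field is absent, None, or an empty string.
    for field in ("gender", "age", "phone"):
        if row.get(field) in (None, ""):
            problems.append(f"missing {field}")
    # Dirty data: a value is present but not believable.
    gender = row.get("gender")
    if gender not in (None, "") and gender not in VALID_GENDERS:
        problems.append(f"dirty gender {gender!r}")
    age = row.get("age")
    if isinstance(age, int) and not 0 <= age <= MAX_PLAUSIBLE_AGE:
        problems.append(f"dirty age {age}")
    return problems


# The slide's two examples: "G" for Gender and 213 for Age.
issues = find_problems({"gender": "G", "age": 213, "phone": "555-0100"})
```

Inconsistent and nonintegrated data are harder to detect row by row, because they only show up when sources are compared with one another.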

+ ETL Data Transformation: Data may need to be transformed for use in a data warehouse. Example: {CountryCode → CountryName}, “US” → “United States”. Example: an email address reduced to its domain, “somewhere.com”.
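
The two transformations on this slide can be sketched as plain functions. The lookup table holds only the pair shown on the slide; the function names, the "Unknown" fallback, and the local part "joe" in the usage line are hypothetical.

```python
COUNTRY_NAMES = {"US": "United States"}  # only the pair given on the slide


def country_name(code: str) -> str:
    """Map a country code to its name; unknown codes get a placeholder."""
    return COUNTRY_NAMES.get(code.strip().upper(), "Unknown")


def email_domain(address: str) -> str:
    """Return the part of an email address after the last '@'."""
    _, separator, domain = address.rpartition("@")
    if not separator or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return domain.lower()


name = country_name("US")
domain = email_domain("joe@somewhere.com")  # hypothetical address
```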

+ Characteristics of a Data Mart (Figure 8-6: Data Warehouses and Data Marts)

+ Enterprise Data Warehouse (EDW) Architecture: Combines the data warehouse structure and the data mart structures shown above. Expensive to create, staff, and operate. Smaller organizations use subsets of the EDW architecture.

+ Dimensional Databases: A non-normalized database structure used for data warehouses. May use slowly changing dimensions, whose values change infrequently (for example, phone number or address). Use a date or time dimension. (Figure 8-7: Characteristics of Operational and Dimensional Databases)

+ Star Schema (Figure 8-8: The Star Schema)
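
Since the figure itself is not part of this transcript, here is a minimal sketch of what a star schema looks like in practice: one fact table in the center whose key is made up of foreign keys to the surrounding dimension tables. The table names, column names, and sample rows are hypothetical and are not taken from the HSD-DW figures; SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: the points of the star.
CREATE TABLE timeline (time_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE product  (product_number TEXT PRIMARY KEY, product_name TEXT);
-- Fact table: the center of the star, keyed by its dimensions.
CREATE TABLE product_sales (
    time_id        INTEGER REFERENCES timeline(time_id),
    customer_id    INTEGER REFERENCES customer(customer_id),
    product_number TEXT    REFERENCES product(product_number),
    quantity       INTEGER,
    PRIMARY KEY (time_id, customer_id, product_number)
);
""")
conn.executemany("INSERT INTO timeline VALUES (?, ?)", [(1, 2012), (2, 2013)])
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "Dallas"), (2, "Seattle")])
conn.executemany("INSERT INTO product VALUES (?, ?)",
                 [("P1", "Video"), ("P2", "Book")])
conn.executemany("INSERT INTO product_sales VALUES (?, ?, ?, ?)",
                 [(1, 1, "P1", 2), (2, 1, "P1", 1), (1, 2, "P2", 3)])

# A typical analytical query: join the facts to two dimensions and total them.
totals = conn.execute("""
    SELECT c.city, s.product_number, SUM(s.quantity)
    FROM product_sales AS s
    JOIN customer AS c ON c.customer_id = s.customer_id
    GROUP BY c.city, s.product_number
    ORDER BY c.city, s.product_number
""").fetchall()
```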

+ HSD-DW Star Schema (Figure 8-9: The HSD-DW Star Schema)

+ Two-Dimensional Matrix (Figure 8-13: The Two-Dimensional ProductNumber–CustomerID Matrix)

+ Three-Dimensional Matrix (Figure 8-14: The Three-Dimensional Time–ProductNumber–CustomerID Cube)

+ Conformed Dimensions and the Extended HSD-DW Schema (Figure 8-15: The Extended HSD-DW Star Schema)

+ OnLine Analytical Processing (OLAP): OLAP is a technique for dynamically examining database data. OLAP uses arithmetic functions such as Sum and Average.

+ OLAP Reports: OLAP systems produce an OLAP report, also known as an OLAP cube. The OLAP report uses inputs called dimensions. The OLAP report calculates outputs called measures. Excel PivotTables can be used to create OLAP reports.
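
The relationship between dimensions (inputs) and measures (outputs) can be sketched without any OLAP tool. The function below totals one measure over two chosen dimensions, which is roughly what a PivotTable does; the function name and the sample rows are made up for illustration.

```python
from collections import defaultdict


def olap_report(rows, row_dim, col_dim, measure):
    """Sum `measure` for every (row_dim, col_dim) combination in `rows`."""
    report = defaultdict(lambda: defaultdict(int))
    for row in rows:
        report[row[row_dim]][row[col_dim]] += row[measure]
    # Convert to plain dicts so the result prints and compares cleanly.
    return {key: dict(cells) for key, cells in report.items()}


sales = [  # hypothetical fact rows
    {"product": "P1", "city": "Dallas", "year": 2012, "quantity": 2},
    {"product": "P1", "city": "Dallas", "year": 2013, "quantity": 1},
    {"product": "P1", "city": "Seattle", "year": 2013, "quantity": 4},
    {"product": "P2", "city": "Seattle", "year": 2012, "quantity": 3},
]

# Same data, same measure, different dimensions: the "dynamic" part of OLAP.
by_city = olap_report(sales, "product", "city", "quantity")
by_year = olap_report(sales, "product", "year", "quantity")
```

Swapping the dimension arguments changes the report without touching the underlying data, which is the sense in which OLAP examines data dynamically.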

+ SQL Query for OLAP Data

+ SQL View for OLAP Data

+ Excel PivotTable OLAP Report I (Figure 8-16: OLAP ProductNumber by City Report)

+ Excel PivotTable OLAP Report II (Figure 8-17: OLAP ProductNumber by City, Customer, and Year Report)

+ Excel PivotTable OLAP Report III (Figure 8-18: OLAP City by ProductNumber, Customer, and Year Report)

+ Distributed Database Processing: A database is distributed when it is partitioned, replicated, or both partitioned and replicated. Distribution is fairly straightforward for read-only replicas, but it can be very difficult for other installations.
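
The difference between partitioning and replication can be sketched with lists standing in for database nodes. This is an illustration only: the modulo rule for assigning rows to nodes is an assumption, and real systems must also keep writable replicas consistent, which is the hard part the slide alludes to.

```python
def partition(rows, key, node_count):
    """Partitioning: each row is stored on exactly one node."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[row[key] % node_count].append(row)
    return nodes


def replicate(rows, node_count):
    """Replication: every node holds its own full copy of the rows."""
    return [list(rows) for _ in range(node_count)]


customers = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}]  # hypothetical rows

partitioned = partition(customers, "id", 2)
replicated = replicate(customers, 2)
# Both: split the rows first, then keep two copies of each partition.
both = [replicate(part, 2) for part in partitioned]
```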

+ Types of Distributed Databases (Figure 8-19: Types of Distributed Databases)

+ Types of Distributed Databases (Cont’d) (Figure 8-19: Types of Distributed Databases (Cont’d))

+ Object-Relational Database Management: Object-oriented programming (OOP) is based on objects, and OOP is now used as the basis of many computer programming languages, including Java, Visual Basic .NET, C++, and C#.

+ Objects: Object classes have identifiers, properties, and methods. Properties are data items associated with the object. Methods are programs that allow the object to perform tasks. The only difference between entity classes and object classes is the methods.
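
A small class makes the three parts concrete. The class and attribute names are hypothetical; the point is that the identifier and properties correspond to what an entity class would hold, and the method is what an object class adds.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int  # identifier
    name: str         # property
    phone: str        # property

    def update_phone(self, new_phone: str) -> None:
        """Method: behavior attached to the object, absent from an entity."""
        self.phone = new_phone


customer = Customer(customer_id=1, name="Ann", phone="555-0100")
customer.update_phone("555-0199")
```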

+ Object Persistence: Object persistence means that values of the object properties are storable and retrievable. Object persistence can be achieved by various techniques. A main technique is database technology. Relational databases can be used, but require substantial programming.
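
The following sketch shows object persistence through a relational database and hints at why it takes programming effort: each class needs hand-written code to map its properties to columns and back. The class, table, and function names are hypothetical, and SQLite stands in for a relational DBMS.

```python
import sqlite3
from dataclasses import astuple, dataclass
from typing import Optional


@dataclass
class Customer:
    customer_id: int
    name: str
    phone: str


def save(conn: sqlite3.Connection, customer: Customer) -> None:
    """Store the object's property values as one table row."""
    conn.execute("INSERT OR REPLACE INTO customer VALUES (?, ?, ?)",
                 astuple(customer))


def load(conn: sqlite3.Connection, customer_id: int) -> Optional[Customer]:
    """Rebuild the object from its stored row, or return None if absent."""
    row = conn.execute(
        "SELECT customer_id, name, phone FROM customer WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return Customer(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)
save(conn, Customer(1, "Ann", "555-0100"))
restored = load(conn, 1)
```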

+ OODBMS: Object-oriented DBMSs (OODBMSs) have been developed but never achieved commercial success. It would be too expensive to transfer existing data from relational and other legacy databases. The OODBMSs were, therefore, not cost justifiable.

+ Object-Relational DBMSs: Some relational DBMS vendors have added object-oriented features to their products (example: Oracle). These products are known as object-relational DBMSs and support object-relational databases.

+ The NoSQL Movement I: The NoSQL movement is a movement toward using non-relational databases. These databases are often described as structured storage.

+ The NoSQL Movement II: One implementation is as a distributed, replicated database, as described in this chapter. Example: Apache Cassandra, used for Facebook and Twitter.

+ The NoSQL Movement III: Another implementation is based on XML document structures, as described in this chapter. Example: dbXML. XML databases typically support the W3C XQuery standard and the W3C XPath standard.

+ Generalized Structured Storage: A Column (Figure 8-20: A Generalized Structured Storage System, (a) A Column)

+ Generalized Structured Storage: A Super Column (Figure 8-20: A Generalized Structured Storage System (Cont’d), (b) A Super Column)

+ Generalized Structured Storage: A Column Family (Figure 8-20: A Generalized Structured Storage System (Cont’d), (c) A Column Family)
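
The three figures are not reproduced in this transcript, so the sketch below is an assumption about what they show, based on how column-oriented stores such as Cassandra are commonly described: a column is a name, a value, and a timestamp; a super column is a named group of columns; and a column family is a set of rows, each identified by a row key and holding its own columns. All class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Column:
    name: str
    value: str
    timestamp: int  # when the value was written


@dataclass
class SuperColumn:
    name: str
    columns: Dict[str, Column] = field(default_factory=dict)


@dataclass
class ColumnFamily:
    name: str
    rows: Dict[str, Dict[str, Column]] = field(default_factory=dict)

    def put(self, row_key: str, column: Column) -> None:
        """Add or overwrite one column in the row identified by row_key."""
        self.rows.setdefault(row_key, {})[column.name] = column


customers = ColumnFamily("Customer")
customers.put("row1", Column("FirstName", "Ann", timestamp=1))
customers.put("row1", Column("Phone", "555-0100", timestamp=1))
customers.put("row2", Column("FirstName", "Bob", timestamp=2))

contact = SuperColumn("Contact", {"Phone": Column("Phone", "555-0100", 1)})
```

Unlike rows in a relational table, the two rows here do not have the same set of columns, which is one reason these systems are described as structured storage and not as relational databases.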

+ The MapReduce Process (Figure 8-21: MapReduce)
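
As the figure is missing from the transcript, here is a single-process sketch of the MapReduce idea using the usual word-count example. In a real deployment the map calls would run in parallel on many nodes and a framework would do the grouping; the function names here are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: turn one input document into (key, value) pairs."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all values that share a key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine each key's values into one result."""
    return {key: sum(values) for key, values in groups.items()}


def map_reduce(documents):
    pairs = []
    for document in documents:  # each document could be mapped on its own node
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))


counts = map_reduce(["big data", "Big Data big"])
```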