1 Database and Data Warehouse June 27, 2012

2 Learning Goals
Explain basic concepts of data management.
Describe traditional file systems and identify their problems.
Define database management systems and describe their various functions.
Explain how the relational database model works.
Explain object-oriented databases.
Explain data warehouses and data marts.

3 What is a database?
A collection of related files containing records on people, places, or things.
Databases make data easy to access and manage.
[Diagram: Customers Info, Accounts Info, and Employees Info, together with access and management tools]

4 Basic Concepts of Data Management
Database: a collection of data organized in different containers.
[Diagram: a database holding Table 1, Table 2, Table 3, a report, and Form 1 with the fields "Acc #" and "Name"]

5 Basic Concepts of Database Systems
Table: a two-dimensional structure composed of rows and columns
Field: like a column in a spreadsheet
Field name: like a column name in a spreadsheet
– Examples: AccountID, Customer, Type, Balance
Field values: the actual data for the field
Record: a set of fields that describe an entity (a person, an account, etc.)
Primary key: a field, or group of fields, that uniquely identifies a record
[Accounts table: columns AccountID, Customer, Type, Balance; rows for John Smith (Checking), Linda Martin (Saving), and Paul Graham (Checking)]
Each table has fields, records, and one primary key.
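The terms on this slide can be tried out with Python's built-in sqlite3 module. This is only a sketch: the AccountID and Balance values are made up for illustration, because the slide's own values did not survive in the transcript.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The table: four fields, with AccountID as the primary key.
conn.execute(
    """
    CREATE TABLE Accounts (
        AccountID INTEGER PRIMARY KEY,
        Customer  TEXT NOT NULL,
        Type      TEXT NOT NULL,
        Balance   REAL NOT NULL
    )
    """
)

# Three records. The IDs and balances are illustrative assumptions.
records = [
    (1, "John Smith", "Checking", 100.0),
    (2, "Linda Martin", "Saving", 250.0),
    (3, "Paul Graham", "Checking", 75.0),
]
conn.executemany("INSERT INTO Accounts VALUES (?, ?, ?, ?)", records)

# Field names come from the table definition (second item of each info row).
field_names = [col[1] for col in conn.execute("PRAGMA table_info(Accounts)")]

# The primary key must be unique, so reusing AccountID 1 is rejected.
try:
    conn.execute("INSERT INTO Accounts VALUES (1, 'Someone Else', 'Saving', 5.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

record_count = conn.execute("SELECT COUNT(*) FROM Accounts").fetchone()[0]
```

After this runs, field_names holds the four field names and the table still contains exactly three records, because the duplicate key was refused.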

6 Basic Concepts in Data Management
A primary key can be a single field, such as AccountID in the Accounts table.
A primary key can also be a composite key, i.e. one made up of multiple fields.
[Accounts table: columns AccountID, Customer, Type, Balance, with the AccountID column marked as the primary key]
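A composite key is easiest to see with a table where no single field is unique. The Enrollments table below is a hypothetical example, not taken from the slides: neither StudentID nor CourseID is unique alone, but the pair is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the primary key is the pair (StudentID, CourseID).
conn.execute(
    """
    CREATE TABLE Enrollments (
        StudentID INTEGER NOT NULL,
        CourseID  TEXT NOT NULL,
        Grade     TEXT,
        PRIMARY KEY (StudentID, CourseID)
    )
    """
)

conn.execute("INSERT INTO Enrollments VALUES (1, 'DB101', 'A')")
# Same student, different course: allowed, the pair is new.
conn.execute("INSERT INTO Enrollments VALUES (1, 'DW201', 'B')")
# Different student, same course: also allowed.
conn.execute("INSERT INTO Enrollments VALUES (2, 'DB101', 'B')")

# Repeating an existing (StudentID, CourseID) pair violates the composite key.
try:
    conn.execute("INSERT INTO Enrollments VALUES (1, 'DB101', 'C')")
    repeated_pair_rejected = False
except sqlite3.IntegrityError:
    repeated_pair_rejected = True

enrollment_count = conn.execute("SELECT COUNT(*) FROM Enrollments").fetchone()[0]
```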

7 Traditional File Systems
A system of files that store groups of records used by a particular software application
Simple, but with a cost:
– Inability to share data
– Inadequate security
– Difficulties in maintenance and expansion
– Allows data duplication (i.e. redundancy)
[Diagram: Application 1 and Application 2 each contain Program 1 and Program 2, and every program keeps its own File 1, File 2, and File 3]

8 Traditional File System Anomalies
Insertion anomaly
– Data needs to be entered more than once if it is located in multiple file systems
Modification anomaly
– Redundant data in separate file systems
– Inconsistent data in your system
Deletion anomaly
– Failure to simultaneously delete all copies of redundant data
– Deletion of critical data
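The three anomalies can be reproduced with two plain dictionaries standing in for the files of two separate applications. The file names, customer IDs, and phone numbers are hypothetical; the point is only that each copy has to be maintained by hand.

```python
# Two applications each keep their own copy of the same customer record.
billing_file = {"C1": {"name": "John Smith", "phone": "555-0100"}}
shipping_file = {"C1": {"name": "John Smith", "phone": "555-0100"}}

# Modification anomaly: the phone number is updated in one file only,
# so the two copies now disagree.
billing_file["C1"]["phone"] = "555-0199"
inconsistent = billing_file["C1"]["phone"] != shipping_file["C1"]["phone"]

# Deletion anomaly: the customer is deleted from one file,
# but a stale copy survives in the other.
del billing_file["C1"]
orphaned_copy = "C1" in shipping_file and "C1" not in billing_file

# Insertion anomaly: a new customer has to be entered once per file.
new_customer = {"name": "Linda Martin", "phone": "555-0111"}
entries_needed = 0
for data_file in (billing_file, shipping_file):
    data_file["C2"] = dict(new_customer)  # separate copy in each file
    entries_needed += 1
```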

9 Database Advantages
Database advantages from a business perspective include:
– Increased flexibility: handling changes quickly and easily
– Increased scalability and performance (scalability: how well the database can adapt to increased demand)
– Reduced information redundancy and inconsistency
– Increased information integrity (quality)
– Increased information security

10 Database Management System (DBMS)
A combination of software and data for collecting, storing, and managing data in a database environment.
A DBMS includes:
– Database
– Database engine (for accessing and modifying the database content)
– Data Manipulation Language
[Diagram: Application 1 and Application 2, each with Program-1 and Program-2, reach the data through the DBMS]

11 Database Management System (DBMS)
Software through which users and application programs interact with a database

12 DBMS Functions
Store data (in tables) on secondary storage
Transform data into information (reports, etc.)
Provide users with different logical views of the actual database content
Provide security: password authentication, access control
– DBMSs control who can add, view, change, or delete data in the database
[Diagram: one physical view (a table with columns ID, Name, Amt and rows for John, Linda, and Paul) and several logical views, each showing only some of its rows or columns]
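One common way a DBMS provides logical views is the SQL CREATE VIEW statement. The sketch below uses sqlite3; the table name Payments, the view names, and the Amt values are assumptions, since the slide's diagram only preserves the column names and the three customer names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The stored table plays the role of the physical view.
conn.execute("CREATE TABLE Payments (ID TEXT PRIMARY KEY, Name TEXT, Amt REAL)")
conn.executemany(
    "INSERT INTO Payments VALUES (?, ?, ?)",
    [("01", "John", 1.0), ("02", "Linda", 2.0), ("03", "Paul", 3.0)],
)

# Logical view 1: hides the Amt column.
conn.execute("CREATE VIEW NamesOnly AS SELECT ID, Name FROM Payments")

# Logical view 2: hides the ID column and the smaller payments.
conn.execute(
    "CREATE VIEW LargePayments AS SELECT Name, Amt FROM Payments WHERE Amt >= 2.0"
)

names_only = conn.execute("SELECT * FROM NamesOnly ORDER BY ID").fetchall()
large_payments = conn.execute("SELECT * FROM LargePayments ORDER BY Amt").fetchall()
```

Both views read from the same stored rows, so a change to Payments shows up in every view without any copying.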

13 DBMS Functions (cont.)
Allow multi-user access
– Control concurrency of access to data
– Prevent one user from accessing data that has not been completely updated
Example: when selling tickets online, Ticketmaster lets you hold a ticket for only 2 minutes while you make your purchase decision; the ticket is then released for sale to someone else. That is concurrency control.
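The ticket-hold example can be sketched as a small class. This is a single-threaded illustration of the idea only; the class, method names, and seat ID are hypothetical, the clock is passed in explicitly so the behaviour is easy to follow, and a real DBMS would enforce this with locks and transactions rather than application code.

```python
HOLD_SECONDS = 120  # the 2-minute hold from the slide


class TicketInventory:
    """Tracks which tickets are held or sold (illustrative only)."""

    def __init__(self, ticket_ids):
        # ticket_id -> (user, expires_at), or None when nobody holds it
        self._holds = {ticket_id: None for ticket_id in ticket_ids}
        self._sold = set()

    def hold(self, ticket_id, user, now):
        """Try to reserve a ticket; fails while another hold is still active."""
        if ticket_id in self._sold:
            return False
        current = self._holds[ticket_id]
        if current is not None and current[1] > now:
            return False  # someone holds it and the hold has not expired
        self._holds[ticket_id] = (user, now + HOLD_SECONDS)
        return True

    def purchase(self, ticket_id, user, now):
        """Complete the sale only for the current, unexpired holder."""
        current = self._holds[ticket_id]
        if ticket_id in self._sold or current is None:
            return False
        holder, expires_at = current
        if holder != user or expires_at <= now:
            return False
        self._sold.add(ticket_id)
        return True


inventory = TicketInventory(["A1"])
alice_hold = inventory.hold("A1", "alice", now=0)        # Alice gets the hold
bob_early = inventory.hold("A1", "bob", now=60)          # still held by Alice
bob_late = inventory.hold("A1", "bob", now=121)          # Alice's hold expired
alice_buy = inventory.purchase("A1", "alice", now=122)   # too late for Alice
bob_buy = inventory.purchase("A1", "bob", now=130)       # Bob holds it now
```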

14 Types of DBMSs
Three types: Desktop, Server / Enterprise, Handheld
Desktop
– Designed to run on desktop computers
– Used by individuals or small businesses
– Requires little or no formal training
– Does not have all the capabilities of larger DBMSs
– Examples: Microsoft Access, FileMaker, Paradox

15 Types of DBMSs (cont.)
Server / Enterprise
– Designed for managing larger, more complex databases in large organizations
– Typically operate in a client/server setup
– Either centralized or distributed
Centralized: all data on one server
– Easy to maintain
– Prone to run slowly with many simultaneous users
– No access if the one server goes down
Distributed: each location has part of the database
– Very complex database administration
– Usually faster than centralized
– If one server crashes, the others can continue to operate
Examples: Oracle Enterprise, DB2, Microsoft SQL Server

16 Types of DBMSs (cont.)
Handheld
– Designed to run on handheld devices
– Less complex, with fewer capabilities than Desktop or Server DBMSs
– Examples: Oracle Database Lite, IBM's DB2 Everyplace

17 Database Models
Database model: a representation of the relationships between structures (e.g. tables) in a database
Common database models
– Flat file model
– Relational model (the most common)
– Object-oriented database model
– Hierarchical model
– Network model

18 Flat File Database
Stores data in basic table structures
No relationships between tables
Used on PDAs for address books

19 Relational Model
Multiple tables related by common fields
Uses controlled redundancy to create fields that link tables in the database
– These fields are called foreign keys, the secret to a relational database
– A foreign key is a field, or group of fields, in one table that is the primary key of another table
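A foreign key in practice, sketched with sqlite3. The Customers table and all ID values are assumptions added for illustration; the slides only show a single Accounts table. Note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT NOT NULL)"
)
# Accounts.CustomerID is a foreign key: it is the primary key of Customers.
conn.execute(
    """
    CREATE TABLE Accounts (
        AccountID  INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID),
        Type       TEXT NOT NULL
    )
    """
)

conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [(1, "John Smith"), (2, "Linda Martin")],
)
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?, ?)",
    [(10, 1, "Checking"), (11, 1, "Saving"), (12, 2, "Saving")],
)

# The common field links the two tables in a join.
joined = conn.execute(
    """
    SELECT c.Name, a.Type
    FROM Accounts AS a
    JOIN Customers AS c ON c.CustomerID = a.CustomerID
    ORDER BY a.AccountID
    """
).fetchall()

# An account pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO Accounts VALUES (13, 99, 'Checking')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
```

The customer's name is stored once in Customers; only the key is repeated in Accounts, which is the "controlled redundancy" the slide refers to.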

20 Object-Oriented DBMS
Needed for multimedia applications that manage images, voice, video, graphics, etc. in addition to numbers and characters
Popular in Web applications
Slower than a relational DBMS at processing large numbers of transactions
Hybrid object-relational DBMSs are emerging

21 Hierarchical Database Model
Data is organized into a tree-like structure using parent-child relationships
Created in the 1960s by IBM
Limited to storing data in one-to-many relationships
– One parent segment to many child segments
Not very flexible
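The one-to-many restriction of the hierarchical model can be sketched as a tree in which every segment has at most one parent. The Segment class and the sample names are hypothetical and do not reflect any particular product's API.

```python
class Segment:
    """A node in a hierarchical database: one parent, many children."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add_child(self, child):
        # One-to-many only: a child segment cannot have a second parent.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def path(self):
        """Walk up to the root; the only way to reach a segment is via its parent."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


bank = Segment("Bank")
customer = bank.add_child(Segment("John Smith"))
checking = customer.add_child(Segment("Checking"))
saving = customer.add_child(Segment("Saving"))

# A joint account shared by two customers does not fit the model.
other_customer = bank.add_child(Segment("Linda Martin"))
try:
    other_customer.add_child(checking)
    second_parent_allowed = True
except ValueError:
    second_parent_allowed = False
```

The rejected joint account is an example of the inflexibility the slide mentions: a many-to-many relationship needs a different model.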

22 Network Model
Developed in 1969
Many-to-many relationships between entities
Any record may be linked to any other record
Highly flexible but also highly complex
Rarely used

23 Data Warehouse
Many organizations need internal, external, current, and historical data
Data warehouses are typically designed to store and manage data from operational transaction systems, Web site transactions, etc.

24 Data Warehouse Fundamentals
Data warehouse: a logical collection of information, gathered from many different operational databases, that supports business analysis activities and decision-making tasks
The primary purpose of a data warehouse is to aggregate information throughout an organization into a single repository for decision-making purposes

25 Data Warehouse Fundamentals
Extraction, transformation, and loading (ETL): a process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse
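A minimal ETL sketch, assuming two hypothetical source systems that record the same kind of sale with different field names, units, and date formats. The "common set of enterprise definitions" here is simply: title-cased customer name, amount in dollars, ISO date. All names and values are illustrative.

```python
import sqlite3

# Extract: rows as they might arrive from two different source systems.
branch_a_rows = [
    {"cust": "John Smith", "amount_usd": "120.50", "date": "2012-06-01"},
]
branch_b_rows = [
    {"customer_name": "LINDA MARTIN", "amount_cents": 9900, "day": "06/02/2012"},
]


def transform_a(row):
    """Branch A already uses dollars and ISO dates; only the types change."""
    return (row["cust"].title(), float(row["amount_usd"]), row["date"])


def transform_b(row):
    """Branch B stores cents and MM/DD/YYYY dates; convert both."""
    month, day, year = row["day"].split("/")
    return (
        row["customer_name"].title(),
        row["amount_cents"] / 100,
        f"{year}-{month}-{day}",
    )


# Transform: bring every row into the common definition.
unified = [transform_a(r) for r in branch_a_rows]
unified += [transform_b(r) for r in branch_b_rows]

# Load: write the unified rows into a single warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE Sales (Customer TEXT, AmountUSD REAL, SaleDate TEXT)")
warehouse.executemany("INSERT INTO Sales VALUES (?, ?, ?)", unified)

loaded = warehouse.execute("SELECT * FROM Sales ORDER BY SaleDate").fetchall()
```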

26 Data Mart
A subset of a data warehouse that is highly focused and isolated for a specific population of users
Examples: Marketing data mart, Sales data mart, etc.

27 Database vs. Data Warehouse
Databases contain information in a series of two-dimensional tables
In a data warehouse or data mart, information is multidimensional: it contains layers of columns and rows
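One way to picture "layers of columns and rows" is a cube keyed by several dimensions at once. The sketch below uses three hypothetical dimensions (product, region, quarter) and invented sales figures, and shows how the same cube can be summed along different dimensions.

```python
from collections import defaultdict

# Each fact is (product, region, quarter, amount); values are illustrative.
sales = [
    ("Widget", "East", "Q1", 100),
    ("Widget", "West", "Q1", 150),
    ("Gadget", "East", "Q1", 80),
    ("Widget", "East", "Q2", 120),
]

# The cube: one cell per combination of the three dimensions.
cube = defaultdict(int)
for product, region, quarter, amount in sales:
    cube[(product, region, quarter)] += amount


def roll_up(cube, keep):
    """Sum the cube over every dimension whose index is not in `keep`."""
    totals = defaultdict(int)
    for key, amount in cube.items():
        totals[tuple(key[i] for i in keep)] += amount
    return dict(totals)


by_product = roll_up(cube, keep=(0,))               # collapse region and quarter
by_region_quarter = roll_up(cube, keep=(1, 2))      # collapse product only
```

A two-dimensional table can answer each of these questions too, but only one at a time; the cube keeps all dimensions available so the analyst can choose which ones to collapse.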

28 Multidimensional Analysis
Data mining: the process of analyzing data to extract information not offered by the raw data alone
Data-mining tool: uses a variety of techniques to find patterns and relationships in large volumes of information, and infers rules that predict future behavior and guide decision making
Data-mining tools include query tools, statistical tools, intelligent agents, etc.

29 Information Cleansing or Scrubbing
An organization must maintain high-quality data in the data warehouse
Information cleansing or scrubbing: a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
Information cleansing or scrubbing first occurs during ETL, and then again once the data is in the data warehouse, using information cleansing or scrubbing tools
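A small sketch of what "fixes or discards" can mean in code. The rules are assumptions chosen for illustration: names are trimmed and title-cased, emails are lowercased, records missing a name or a plausible email are discarded, and a repeated email counts as a duplicate.

```python
def cleanse(records):
    """Return (cleaned, discarded) lists of customer records."""
    cleaned, discarded = [], []
    seen_emails = set()
    for record in records:
        # Fix inconsistent formatting first.
        name = (record.get("name") or "").strip().title()
        email = (record.get("email") or "").strip().lower()

        # Discard incomplete or obviously incorrect records.
        if not name or "@" not in email:
            discarded.append(record)
            continue

        # Discard duplicates of a record already kept.
        if email in seen_emails:
            discarded.append(record)
            continue

        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned, discarded


raw = [
    {"name": "  john smith ", "email": "John@Example.com"},  # fixable
    {"name": "John Smith", "email": "john@example.com"},     # duplicate
    {"name": "Linda Martin", "email": None},                 # incomplete
    {"name": "", "email": "paul@example.com"},               # incomplete
]
cleaned, discarded = cleanse(raw)
```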

30 Summary Questions
1) What is a database, a table, a field, a record, a primary key, a composite key?
2) What are the problems with traditional file systems?
3) What are the major functions of a DBMS?
4) (a) Name some Desktop DBMSs. (b) Name some Enterprise DBMSs. (c) Name some Handheld DBMSs.
5) Describe the hierarchical database model and the network model.
6) What are the differences between the Flat File, Relational, and Object-oriented database models?
7) What is a data warehouse? A data mart?
8) What is extraction, transformation, and loading (ETL)? What is data mining? What is information cleansing or scrubbing?

