Managing Data Resources


ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT
File Organization Terms and Concepts
Bit: Smallest unit of data; binary digit (0 or 1)
Byte: Group of bits that represents a single character
Field: Group of characters forming a word, a group of words, or a complete number
Record: Group of related fields
File: Group of records of the same type

File Organization Terms and Concepts (Continued)
Database: Group of related files
Entity: Person, place, thing, or event about which information is maintained
Attribute: Description of a particular entity
Key field: Identifier field used to retrieve, update, or sort a record
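The terms above can be sketched in a few lines of Python. This is only an illustration: the `StudentRecord` type, its fields, and the sample values are invented for the example and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:      # record: a group of related fields
    student_id: int       # key field: identifies the record
    name: str             # field (an attribute of the STUDENT entity)
    course: str           # field


# file: a group of records of the same type (hypothetical sample data)
student_file = [
    StudentRecord(1001, "Ana Ortiz", "MIS 101"),
    StudentRecord(1002, "Ben Kaya", "MIS 101"),
]

# byte: a group of bits representing one character; "A" is stored as 01000001
bits_of_A = format(ord("A"), "08b")


def find_by_key(records, student_id):
    """Retrieve a record through its key field; return None if it is absent."""
    for record in records:
        if record.student_id == student_id:
            return record
    return None
```

Calling `find_by_key(student_file, 1002)` returns the record for Ben Kaya, which is the kind of retrieval a key field exists to support.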

Figure 7-1: The Data Hierarchy

Figure 7-2: Entities and Attributes

Problems with the Traditional File Environment
Data redundancy and inconsistency:
Data redundancy: The presence of duplicate data in multiple data files, so that the same data are stored in more than one place
Data inconsistency: The same attribute may have different values in different files
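A small sketch can make redundancy and inconsistency concrete. It assumes two hypothetical departmental files, `sales_file` and `billing_file`, that each keep their own copy of the same customer data; all names and values are made up for illustration.

```python
# Two hypothetical departmental files holding redundant copies of customer data.
sales_file = {
    101: {"name": "Ana Ortiz", "city": "Izmir"},
    102: {"name": "Ben Kaya", "city": "Bursa"},
}
billing_file = {
    101: {"name": "Ana Ortiz", "city": "Ankara"},  # updated here, but not in sales_file
    102: {"name": "Ben Kaya", "city": "Bursa"},
}


def find_inconsistencies(file_a, file_b):
    """Return (key, attribute, value_a, value_b) for each shared attribute that differs."""
    conflicts = []
    for key in sorted(file_a.keys() & file_b.keys()):
        for attr in sorted(file_a[key].keys() & file_b[key].keys()):
            if file_a[key][attr] != file_b[key][attr]:
                conflicts.append((key, attr, file_a[key][attr], file_b[key][attr]))
    return conflicts


conflicts = find_inconsistencies(sales_file, billing_file)
```

Here `conflicts` reports that customer 101 has a different city in each file: the redundancy is the duplicate storage, and the inconsistency is the disagreement it allowed.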

Problems with the Traditional File Environment (Continued)
Program-data dependence: The coupling of data stored in files and the specific programs required to update and maintain those files, such that changes in programs require changes to the data
Lack of flexibility: A traditional file system can deliver routine scheduled reports after extensive programming efforts, but it cannot deliver ad hoc reports or respond to unanticipated information requirements in a timely fashion

Problems with the Traditional File Environment (Continued)
Poor security: Because there is little control or management of data, management may not know who is accessing or even changing the organization's data
Lack of data sharing and availability: Information cannot flow freely across different functional areas or different parts of the organization. Users who find different values for the same piece of information in two different systems may stop using those systems because they cannot trust the accuracy of the data

Figure 7-3: Traditional File Processing

THE DATABASE APPROACH TO DATA MANAGEMENT
Database Management System (DBMS)
Software for creating and maintaining databases
Permits firms to rationally manage data for the entire firm
Acts as an interface between application programs and physical data files
Separates logical and physical views of data
Solves many problems of the traditional data file approach

Figure 7-4: The Contemporary Database Environment

Components of DBMS:
Data definition language: Specifies the content and structure of the database and defines each data element
Data manipulation language: Used to process data in a database
Data dictionary: Stores definitions of data elements and data characteristics
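The three components can be seen side by side in a minimal sketch using Python's built-in `sqlite3` module. SQLite is chosen only because it needs no setup; the `supplier` table and its contents are hypothetical, and `PRAGMA table_info` stands in here for a much richer data dictionary report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language: specify the structure of the database.
conn.execute(
    "CREATE TABLE supplier ("
    "supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)

# Data manipulation language: add and retrieve data.
conn.execute("INSERT INTO supplier VALUES (1, 'Acme Parts', 'Boston')")
rows = conn.execute("SELECT name, city FROM supplier").fetchall()

# Data dictionary: SQLite exposes the stored column definitions via PRAGMA table_info,
# whose rows are (cid, name, type, notnull, default, pk).
dictionary = [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(supplier)")]
```

After this runs, `rows` holds the inserted supplier and `dictionary` lists each column name with its declared type.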

Figure 7-5: Sample Data Dictionary Report

Types of Databases:
Relational DBMS
Hierarchical and network DBMS
Object-oriented databases

Relational DBMS:
Represents data as two-dimensional tables called relations
Relates data across tables based on a common data element
Examples: DB2, Oracle, MS SQL Server

Figure 7-6: The Relational Data Model

Three Basic Operations in a Relational Database:
Select: Creates a subset of rows that meet specific criteria
Join: Combines relational tables to provide users with more information than any single table holds
Project: Creates a subset of columns, letting users build new tables containing only the relevant information
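The three operations map directly onto SQL. The sketch below uses Python's `sqlite3` module with made-up `supplier` and `part` tables; the table names, columns, and values are assumptions for the example, not the data from the slide's figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE part (part_id INTEGER PRIMARY KEY, description TEXT,
                   unit_price REAL, supplier_id INTEGER);
INSERT INTO supplier VALUES (1, 'Acme Parts'), (2, 'Bolt Co');
INSERT INTO part VALUES (137, 'Door latch', 22.5, 1),
                        (150, 'Door seal', 6.0, 2),
                        (152, 'Hinge', 3.0, 1);
""")

# Select: keep only the rows that meet a criterion.
selected = conn.execute(
    "SELECT * FROM part WHERE unit_price > 5 ORDER BY part_id"
).fetchall()

# Project: keep only the columns of interest.
projected = conn.execute(
    "SELECT part_id, description FROM part ORDER BY part_id"
).fetchall()

# Join: combine two tables through their common data element (supplier_id).
joined = conn.execute(
    "SELECT part.description, supplier.name "
    "FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id "
    "ORDER BY part.part_id"
).fetchall()
```

In practice the operations are usually combined in one query, for example joining two tables, selecting some rows, and projecting a few columns at once.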

Figure 7-7: The Three Basic Operations of a Relational DBMS

Hierarchical and Network DBMS
Hierarchical DBMS:
Organizes data in a tree-like structure
Supports one-to-many parent-child relationships
Prevalent in large legacy systems
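A tree-like structure with one-to-many parent-child relationships can be sketched as a plain dictionary. The segment names below are hypothetical, loosely modelled on a human resources system, and the code illustrates only the data model, not how any hierarchical DBMS product stores or navigates data.

```python
# Each parent segment may have many children; each child has exactly one parent.
children = {
    "Employee": ["Compensation", "Benefits"],
    "Compensation": ["Salary History"],
    "Benefits": ["Pension", "Health"],
}


def path_from_root(target, node="Employee", path=()):
    """Return the single path from the root down to target, or None if absent."""
    path = path + (node,)
    if node == target:
        return list(path)
    for child in children.get(node, []):
        found = path_from_root(target, child, path)
        if found:
            return found
    return None
```

Because every segment has exactly one parent, there is only one access path to each piece of data, which is part of why such systems are less flexible for unplanned queries.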

Figure 7-8: A Hierarchical Database for a Human Resources System

Hierarchical and Network DBMS (Continued)
Network DBMS:
Depicts data logically as many-to-many relationships

Figure 7-9: The Network Data Model

Hierarchical and Network DBMS (Continued)
Disadvantages:
Outdated
Less flexible than relational DBMS
Lack support for ad hoc and English-language-like queries

Object-Oriented Databases:
Object-oriented DBMS: Stores data and procedures as objects that can be retrieved and shared automatically
Object-relational DBMS: Provides capabilities of both object-oriented and relational DBMS

CREATING A DATABASE ENVIRONMENT
Designing Databases:
Conceptual design: Abstract model of the database from a business perspective
Physical design: Detailed description of how the database is actually arranged on storage devices

Designing Databases: (Continued)
Entity-relationship diagram: Methodology for documenting databases that illustrates relationships between database entities
Normalization: Technique for designing relational database tables to minimize duplication of information and, in so doing, to safeguard the database against certain types of logical or structural problems
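As a rough illustration of normalization, the sketch below splits a flat, repetitive order list into three smaller tables. The column names and sample rows are invented for the example and are not taken from the ORDER relation in the slide's figures.

```python
# Unnormalized: order and part details are repeated on every line (hypothetical data).
unnormalized_orders = [
    {"order_number": 1, "order_date": "2024-01-05",
     "part_number": 137, "part_description": "Door latch", "quantity": 2},
    {"order_number": 1, "order_date": "2024-01-05",
     "part_number": 150, "part_description": "Door seal", "quantity": 1},
    {"order_number": 2, "order_date": "2024-01-06",
     "part_number": 137, "part_description": "Door latch", "quantity": 5},
]


def normalize(rows):
    """Split flat rows into ORDER, PART, and LINE_ITEM tables."""
    orders, parts, line_items = {}, {}, []
    for row in rows:
        orders[row["order_number"]] = row["order_date"]       # one row per order
        parts[row["part_number"]] = row["part_description"]   # one row per part
        line_items.append(                                     # links order to part
            (row["order_number"], row["part_number"], row["quantity"])
        )
    return orders, parts, line_items


orders, parts, line_items = normalize(unnormalized_orders)
```

After the split, each order date and each part description is stored exactly once, so a correction needs to be made in only one place.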

Figure 7-10: An Unnormalized Relation for ORDER

Figure 7-11: Normalized Tables Created from ORDER

Figure 7-12: An Entity-Relationship Diagram

Distributing Databases
Centralized database: Used by a single central processor or by multiple processors in a client/server network
There are advantages and disadvantages to having all corporate data in one location. Security is higher and risks are lower in centralized environments. If data demands are highly decentralized, then a decentralized design is less costly and more flexible.

Distributed database:
Databases can be decentralized either by partitioning or by replicating
Partitioned database: The database is divided into segments or regions. For example, a customer database can be divided into Eastern customers and Western customers, with two separate databases maintained in the two regions. This approach is also called sharding.
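The Eastern/Western example can be sketched with two in-memory dictionaries standing in for the regional databases. The function names and customer data are hypothetical, and a real partitioned system would route requests over a network rather than through a local dictionary.

```python
# Hypothetical regional databases; in practice each would run on its own server.
partitions = {"East": {}, "West": {}}


def store_customer(customer_id, name, region):
    """Write the customer to the one partition that owns their region."""
    if region not in partitions:
        raise ValueError(f"unknown region: {region}")
    partitions[region][customer_id] = name


def find_customer(customer_id):
    """Look the customer up across partitions; return (region, name) or None."""
    for region, database in partitions.items():
        if customer_id in database:
            return region, database[customer_id]
    return None


store_customer(1, "Ana Ortiz", "East")
store_customer(2, "Ben Kaya", "West")
```

Each customer lives in exactly one partition, so a lookup that does not know the region has to ask every partition, which is one of the coordination costs of this design.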

Duplicated database: The database is completely duplicated at two or more locations. The separate databases are synchronized in off hours on a batch basis.
Regardless of which method is chosen, data administrators and business managers need to understand how the data in different databases will be coordinated and how business processes might be affected by the decentralization.

Figure 7-13: Distributed Databases

Ensuring Data Quality:
Corporate and government databases have unexpectedly poor levels of data quality.
National consumer credit reporting databases have error rates of 20-35%.
32% of the records in the FBI's Computerized Criminal History file are inaccurate, incomplete, or ambiguous.
Gartner Group estimates that consumer data in corporate databases degrades at the rate of 2% a month.

Ensuring Data Quality: (Continued)
The quality of decision making in a firm is directly related to the quality of data in its databases.
Data quality audit: Structured survey of the accuracy and level of completeness of the data in an information system
Data cleansing: Consists of activities for detecting and correcting data in a database or file that are incorrect, incomplete, improperly formatted, or redundant
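A very small audit can be sketched as a function that flags incomplete, improperly formatted, and redundant records. The records, the deliberately simple e-mail pattern, and the function names are assumptions made for this example; the `drop_flagged` step only sets problem records aside, whereas real cleansing would try to correct them.

```python
import re

# Hypothetical customer records with three deliberate problems.
records = [
    {"id": 1, "name": "Ana Ortiz", "email": "ana@example.com"},
    {"id": 2, "name": "", "email": "ben@example.com"},            # incomplete
    {"id": 3, "name": "Cem Demir", "email": "cem.example.com"},   # improperly formatted
    {"id": 4, "name": "Ana Ortiz", "email": "ana@example.com"},   # redundant copy of id 1
]

# Simplified pattern for illustration only; real e-mail validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def audit(rows):
    """Return (record id, problem) pairs found in a single pass over the rows."""
    issues, seen = [], set()
    for row in rows:
        if not row["name"] or not row["email"]:
            issues.append((row["id"], "incomplete"))
        if row["email"] and not EMAIL_PATTERN.match(row["email"]):
            issues.append((row["id"], "improperly formatted"))
        identity = (row["name"], row["email"])
        if identity in seen:
            issues.append((row["id"], "redundant"))
        seen.add(identity)
    return issues


def drop_flagged(rows):
    """Crude cleansing step: keep only the records the audit did not flag."""
    flagged = {record_id for record_id, _ in audit(rows)}
    return [row for row in rows if row["id"] not in flagged]


issues = audit(records)
clean = drop_flagged(records)
```

The audit measures how bad the data are; the cleansing step acts on what the audit found.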

DATABASE TRENDS
Data Warehousing and Data Mining
Data warehouse:
Supports reporting and query tools
Stores current and historical data
Consolidates data for management analysis and decision making

Data mart:
Subset of a data warehouse
Contains a summarized or highly focused portion of data for a specified function or group of users
Data mining:
Tools for analyzing large pools of data
Find hidden patterns and infer rules to predict trends

Graph Databases

NEO4J (Graphbase)
A graph is a collection of nodes (things) and edges (relationships) that connect pairs of nodes.
Properties (key-value pairs) can be attached to nodes and to relationships.
Each relationship connects two nodes, and both nodes and relationships can hold an arbitrary number of key-value pairs.
A graph database can be thought of as a key-value store with full support for relationships.
http://neo4j.org/
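The property-graph idea can be sketched with ordinary Python data structures. This is not the Neo4j API; the node names, the `KNOWS` relationship type, and the helper functions are all invented to show nodes, relationships, and key-value properties working together.

```python
from collections import deque

# Nodes: id -> properties (key-value pairs). All data here is hypothetical.
nodes = {
    1: {"name": "Alice"},
    2: {"name": "Bob"},
    3: {"name": "Carol"},
}

# Relationships: (start node, type, end node, properties).
relationships = [
    (1, "KNOWS", 2, {"since": 2010}),
    (2, "KNOWS", 3, {"since": 2015}),
    (3, "KNOWS", 1, {"since": 2018}),  # closes a cycle
]


def neighbors(node_id, rel_type):
    """Nodes reached by following one outgoing relationship of the given type."""
    return [end for start, kind, end, _ in relationships
            if start == node_id and kind == rel_type]


def reachable(start_id, rel_type):
    """Breadth-first traversal; returns the nodes reached, in visiting order."""
    visited, order = {start_id}, []
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for nxt in neighbors(current, rel_type):
            if nxt not in visited:
                visited.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


friends_of_friends = [nodes[n]["name"] for n in reachable(1, "KNOWS")]
```

The traversal follows relationships directly from node to node, which is the access pattern graph databases are built around; a relational system would express the same question as a series of joins.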

NEO4J Properties

NEO4J Features
Dual license: open source and commercial
Well suited for many web use cases such as tagging, metadata annotations, social networks, wikis, and other network-shaped or hierarchical data sets
Intuitive graph-oriented model for data representation. Instead of static and rigid tables, rows, and columns, you work with a flexible graph network consisting of nodes, relationships, and properties.
Claimed performance improvements on the order of 1000x or more compared to relational databases for graph-style queries
A disk-based, native storage manager optimized for storing graph structures for maximum performance and scalability
Massive scalability: Neo4j can handle graphs of several billion nodes, relationships, and properties on a single machine and can be sharded to scale out across multiple machines
Fully transactional, like a conventional database
Claimed to traverse depths of 1000 levels and beyond at millisecond speed (many orders of magnitude faster than relational systems)

Transactions
A transaction is simply a number of individual queries that are grouped together. For example, paying 100 TL for groceries from a checking account takes two entries:
1. Debit 100 TL to Groceries Expense Account
2. Credit 100 TL to Checking Account
In SQL, with account1 as the checking account and account2 (a name assumed here) as the expense account:
UPDATE account1 SET balance=balance-100;
UPDATE account2 SET balance=balance+100;
Transactions provide an "all-or-nothing" proposition, stating that each work-unit performed in a database must either complete in its entirety or have no effect whatsoever.
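The all-or-nothing behavior can be demonstrated with Python's built-in `sqlite3` module. This is a minimal sketch under assumptions: the `account` table, its `CHECK` constraint forbidding negative balances, the account names, and the `transfer` helper are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('checking', 900), ('groceries', 0)")
conn.commit()


def transfer(conn, source, target, amount):
    """Apply both updates as one transaction; return False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account").fetchall())


first_ok = transfer(conn, "checking", "groceries", 100)    # succeeds
second_ok = transfer(conn, "checking", "groceries", 5000)  # would overdraw checking
final = balances(conn)
```

In the second call the first update succeeds and the second violates the constraint, so the rollback also undoes the first update; the balances end up exactly as the first transfer left them.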

Transactions
Transactions need to adhere to four conditions (ACID):
1. Atomicity: The queries that make up the transaction must either all be carried out, or none at all should be carried out.
2. Consistency: Refers to the rules of the data. During the transaction, rules may be broken, but this state of affairs should never be visible from outside of the transaction.
3. Isolation: Simply put, data being used for one transaction cannot be used by another transaction until the first transaction is complete. Consider two connections working on a balance of 900:
Connection 1: SELECT balance FROM account1;
Connection 2: SELECT balance FROM account1;
Connection 1: UPDATE account1 SET balance = 900+100;
Connection 2: UPDATE account1 SET balance = 900-100;
Without isolation, both connections read 900 and the second update overwrites the first, so the final balance is 800 instead of 900.
4. Durability: Once a transaction has completed, its effects should remain, and not be reversible.
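The isolation example can be replayed in a short simulation. To keep it self-contained, a single `sqlite3` connection stands in for both connections, and the interleaving is written out by hand; this is an assumption of the sketch, not a demonstration of how any particular DBMS schedules concurrent work.

```python
import sqlite3


def fresh_db():
    """Create an in-memory database with account1 holding a balance of 900."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account1 (balance INTEGER)")
    conn.execute("INSERT INTO account1 VALUES (900)")
    conn.commit()
    return conn


def read_balance(conn):
    return conn.execute("SELECT balance FROM account1").fetchone()[0]


# Interleaved: both "connections" read 900 before either one writes.
conn = fresh_db()
seen_by_1 = read_balance(conn)
seen_by_2 = read_balance(conn)
conn.execute("UPDATE account1 SET balance = ?", (seen_by_1 + 100,))
conn.execute("UPDATE account1 SET balance = ?", (seen_by_2 - 100,))
interleaved_result = read_balance(conn)  # the +100 deposit has been lost

# Isolated: the second unit of work starts only after the first has finished.
conn = fresh_db()
conn.execute("UPDATE account1 SET balance = ?", (read_balance(conn) + 100,))
conn.execute("UPDATE account1 SET balance = ?", (read_balance(conn) - 100,))
isolated_result = read_balance(conn)
```

The interleaved run ends at 800 and the isolated run at 900, which is the difference the isolation condition is meant to guarantee.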