Introduction to Data Management

Presentation on theme: "Introduction to Data Management"— Presentation transcript:

1 Introduction to Data Management
Ed Green, Senior Lecturer, IST
© Ed Green, Penn State University. All Rights Reserved.

2 Topics
History
Data structures
File structures
Definitions
Database types
Why database?
Processing overview
4/23/2017 Introduction to Data management

3 History
Earliest computing
  No files
  All data on cards
First files
  Introduction of magnetic tape storage
  Sequential
Disk files
  Random access
Databases
  Form of random access disk files
  Invented to solve a specific business problem
  Many types and many companies

4 General Electric
Very first database management system
  1963
  Switchgear Division, Philadelphia
  3-person team: Charlie Bachman, John Lyon, Ted Seiter
Critical business success
Identified by Computer Products Department as potential commercial opportunity
Initial marketing – 1965 as IDS (Integrated Data Store)
Second marketing – 1974 as IDS/II
Currently supported by Bull Computers
  Successor to original developer

5 IBM (1)
Parallel work to GE
  Santa Teresa Labs
  Similar business problem
Recognized business opportunity
Followed GE to market
  Circa 1966
  Initial product – IMS (Information Management System)
Currently supported by IBM
Most widely used database management system – ever!

6 IBM (2)
Investigative work at Santa Teresa – Dr. Ted Codd
IMS
  Sufficient for database storing and updating
  Insufficient for database retrieval
  Data redundancy leads to significant inaccuracies and less than optimal decision making
New theoretical approach to database architecture
  Elimination of redundancy
  Establishment of logical relationships (versus physical relationships)
  Two-dimensional tabular structure
  Critical requirements to define database
Relational database management product
  Initially marketed as SQL/DS – 1981
  Emerged as DB2 – early 1980’s
  Acquired and integrated Informix – 2001
  DB2 currently in 9th major release (Version 9)
  Pre-eminent DBMS currently on market

7 Ingres
First independent relational database management company
First query language (QUEL)
  Independent; out of sync with ANSI and FIPS standards groups
  Eventually adopted standard query language (SQL)
Initial implementation – late 1970’s
Remains supported by Computer Associates

8 Oracle
Emerged in early days of relational database management products
  Slightly lagged Ingres and IBM
Aggressive management and marketing stance
  Customer-oriented
  Visionary leadership (Larry Ellison)
11th major release (Oracle 11g) to general availability – 2007
Current version – Oracle 11g Release 2
Market leader – installed licenses

9 Sybase
Introduced in mid-1980’s
Alternative architecture to DB2, Ingres, and Oracle
  Intended to support real-time processing needs
Visionary leaders (Bob Epstein and Mark Hoffman)
  Database integration
  Distributed and replicated databases
DBMS products in 8th major release (Version 15)
3rd among relational database management installations

10 Microsoft
DBMS products
  Access – personal database management product
  SQL Server – for shared database management
Not an original developer
  Contracted (from Sybase)
  SQL Server based on Sybase DBMS
SQL Server
  Current release – SQL Server 2008
  Strategic product in Microsoft business strategy
  Windows 2000 and beyond

11 CCA (Computer Corporation of America)
Initially, small research lab at MIT
Federal government contract in early 1960’s
  Manage intelligence data – satellite, signal, electronic
  Very large volumes of data
  Significant data correlation
  Rapid retrieval
  Non-standard queries
Productized
  In mid-1960’s as Model 104
  Later version in late-1970’s as Model 204
Obsolete as relational database management emerged

12 Cullinet (Cullinane Database Systems)
First independent database management system software company
  Founded mid-1970’s
  Competed with IBM for DBMS software marketplace
Product – IDMS (Integrated Database Management System)
  Based on IDS
  Modified to run on IBM OS/360 architecture
  Leading DBMS for IBM mainframe environment
Failed
  Inability to recognize emergence of relational database management
Currently supported by Computer Associates in a “maintenance mode”

13 And, yes, there are others
Access
Adabas
Progress
Empress
MySQL

14 Data Structures

15 Basic Data Structures
Array – two-dimensional construct consisting of rows and columns (a/k/a table)
  Homogeneous – all entries of same type
  Heterogeneous – entries of different types
List – a collection of entries arranged sequentially
Queue – a list arranged in such a way that the first item placed in the list is the first item to be removed (FIFO algorithm)
Stack – a list arranged in such a way that the last item placed on the list is the first item to be removed (LIFO algorithm)
Tree – a collection of entries arranged in such a way as to have a clearly defined hierarchy
Ring – a collection of entries arranged in such a way as to continuously point to the next entry
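
The FIFO and LIFO definitions above can be tried out directly. This is a minimal sketch that assumes Python's standard `collections.deque` for the queue and a plain list for the stack; the item values are made up for illustration.

```python
from collections import deque

# Queue: first in, first out (FIFO).
queue = deque()
for item in ("a", "b", "c"):
    queue.append(item)        # new item is added after the last entry
first_out = queue.popleft()   # the oldest item is the first to be removed

# Stack: last in, first out (LIFO).
stack = []
for item in ("a", "b", "c"):
    stack.append(item)        # new item is placed on top of the stack
last_in = stack.pop()         # the newest item is the first to be removed

print(first_out, last_in)
```

Both structures receive the same three items in the same order; only the removal rule differs.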

16 Data Structures Representation
[Diagram labels: Queue, Stack, List, Array, Hierarchy, Ring]

17 Storage based on data structures - arrays
A one-dimensional array extends along x. Each cell has an (x,1) address.
A two-dimensional array extends along x and y, with expansion in the y-direction as new rows are added. Each cell has an (x,y) address.

18 Arrays in memory
An array data structure with (x,y) addressing exists in memory as consecutive rows: Row 1, Row 2, Row 3, etc.
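
Storing a table row after row means an (x,y) address can be turned into a single position in memory. The sketch below assumes 1-based addresses with x as the column and y as the row; the helper name `row_major_offset` and the sample cell labels are hypothetical.

```python
def row_major_offset(x, y, columns):
    """Return the zero-based position of cell (x, y) in a row-by-row layout.

    Assumes x is the 1-based column and y the 1-based row;
    `columns` is the number of cells in each row.
    """
    if not (1 <= x <= columns) or y < 1:
        raise ValueError("cell address out of range")
    return (y - 1) * columns + (x - 1)

# A 3-row, 4-column table held as one flat list: Row 1, Row 2, Row 3.
columns = 4
memory = [f"r{y}c{x}" for y in range(1, 4) for x in range(1, columns + 1)]

# Look up the cell in column 2 of row 3 through its computed offset.
cell = memory[row_major_offset(2, 3, columns)]
print(cell)
```

Adding a new row only appends to the end of the flat list, which matches the expansion in the y-direction described for two-dimensional arrays.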

19 Storage based on data structures – queues and stacks
Queue – list structure based on a “first in, first out” algorithm
  New item is added after the last entry
  Aged items move to top of list for service
  Item is removed from the top
Stack – list structure based on a “last in, first out” algorithm
  New item is added above the first entry
  Aged items move to bottom of list for service
  Item is removed from the top

20 Linked lists as a data structure concept
Generalized name applicable to
  Hierarchies
  Rings
Utilizes a pointer to establish the order of the list
  Pointer – address
  Identifies next in sequence

21 Storage based on data structures - hierarchies
Hierarchy – unidirectional data structure based on higher-order precedence
Characterized by the concept of persistence
  Attributes of higher level elements are retained by lower level dependents
  Explicit dependency
[Diagram: a root and its leaves; each node holds addr, data, and subordinate addresses]
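
Persistence in a hierarchy can be illustrated with a small tree that is walked downward only. This is a sketch under assumptions: the `TreeNode` class, the `effective_attributes` helper, and the sample company data are all hypothetical, and object references stand in for stored addresses.

```python
class TreeNode:
    """One hierarchy entry: data plus downward pointers to its subordinates."""

    def __init__(self, name, attributes=None):
        self.name = name
        self.attributes = attributes or {}
        self.subordinates = []   # unidirectional: no pointer back to the parent

    def add(self, child):
        self.subordinates.append(child)
        return child


def effective_attributes(node, inherited=None):
    """Walk downward from `node`, yielding (name, attributes) pairs in which
    each dependent retains the attributes of the levels above it."""
    merged = {**(inherited or {}), **node.attributes}
    yield node.name, merged
    for child in node.subordinates:
        yield from effective_attributes(child, merged)


root = TreeNode("company", {"currency": "USD"})
dept = root.add(TreeNode("sales", {"region": "East"}))
dept.add(TreeNode("order-17", {"amount": 250}))

resolved = dict(effective_attributes(root))
print(resolved["order-17"])
```

The leaf never stores the currency or region itself; it picks them up from the root and the intermediate level during the downward walk.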

22 Storage based on data structures - rings
Ring – elliptical data structure representation
May or may not include concepts of precedence and persistence
May be either unidirectional or bidirectional
May include a head (if headed)
[Diagram: each node holds addr and data, plus next addr (always present), prior addr (present only if bidirectional), and head addr (present only if headed)]

23 Storage with linked lists
CASE 1: Insert a new record between the head and first record
CASE 2: Insert a new record between the head and first record with back pointers
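
Both cases come down to rewriting a few pointers. The following is a minimal sketch of a headed, bidirectional ring, assuming Python object references stand in for stored addresses; the `Node` class and the helper names are hypothetical.

```python
class Node:
    """One ring entry: data plus next and prior pointers."""

    def __init__(self, data):
        self.data = data
        self.next = self    # a lone node points at itself
        self.prior = self


def insert_after(node, new):
    """Link `new` into the ring immediately after `node`."""
    new.next = node.next     # CASE 1: forward pointers
    new.prior = node         # CASE 2: back pointer of the new record
    node.next.prior = new    # CASE 2: back pointer of the old successor
    node.next = new          # CASE 1: the head now points at the new record


def walk(head):
    """Return the data of every entry, following next pointers once around."""
    out, current = [], head.next
    while current is not head:
        out.append(current.data)
        current = current.next
    return out


head = Node("HEAD")
insert_after(head, Node("first"))
insert_after(head, Node("new"))   # lands between the head and the first record
order = walk(head)
print(order)
```

In a unidirectional ring only the two lines marked CASE 1 are needed; the back-pointer updates are the extra work that CASE 2 adds.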

24 Storage with linked lists
CASE 1: Insert a new record between the root and first record
[Diagram: each node holds addr, data, and subordinate addresses]

25 Custom data types
Specific items of data
Support problem solution
User-defined
Represents a processing variable
Programmatic structures aligned with data structure types

26 File Structures

27 Files and File Structures
Sequential
ISAM – Indexed Sequential Access Method
IDAM – Indexed Direct Access Method
HISAM – Hierarchical Indexed Sequential Access Method
HIDAM – Hierarchical Indexed Direct Access Method

28 Files and Databases
File
Database – provides the “template” that describes the organization within the file
Databases exist within files

29 Definitions
Data – a set of facts
  Structured or not
  Ordered or not
  Organized or not
  Valuable or not
Information – organized presentation of facts that supports analysis and/or decision making
Database – structured collection of (enterprise) data used for analysis and/or decision making that is required to be maintained
Database Management System – the (executive) software that manages the interface between a user’s application and the database
File Management System – that component of the operating system that manages the programmatic interface to a computer’s file system

30 Components and Architectures

31 Database System Components
Application
Database Management System

32 Database System Functions
Application
  Create/process forms
  Create/transmit queries
  Create/process reports
  Execute application logic
  Control application
SQL (DDL, DML, DCL) passes between the application and the Database Management System
Database Management System
  Create database
  Create tables
  Create supporting structures
  Read data (stored in database)
  Update data (stored in database)
  Maintain database structure
  Enforce rules
  Control concurrency
  Provide security
  Perform backup and recovery
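
The split between application and DBMS can be seen in a few SQL statements. This is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database; the `employee` table and its contents are made up for illustration. SQLite has no user accounts, so DCL statements such as GRANT are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database

# DDL: create a table and a supporting structure (an index).
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")

# DML: store, update, and read data.
conn.execute("INSERT INTO employee (id, name) VALUES (?, ?)", (1, "Ada"))
conn.execute("UPDATE employee SET name = ? WHERE id = ?", ("Ada L.", 1))
rows = conn.execute("SELECT id, name FROM employee").fetchall()

# The DBMS, not the application, enforces rules: a duplicate key is rejected.
try:
    conn.execute("INSERT INTO employee (id, name) VALUES (?, ?)", (1, "Bob"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

conn.close()
print(rows, rejected)
```

The Python script plays the application role (it creates and transmits the requests), while everything on the DBMS list happens inside the database engine.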

33 Database Components
User data
Metadata
Indexes
Overhead data
Application Metadata

34 Data Dictionary
Metadata – or, data about the data
Comprehensive information source about enterprise data
Documents data characteristics and employment
  Data type
  Data size
  Allowable values
  Where used
  Authorizations
  Relationships
Ancillary structure
  May or may not be DBMS provided

35 Database Models
Network
Hierarchical
Inverted List
Relational
Object
Object Relational

36 Network Model
Characterized by establishing a network of data components
  Master
  Detail
Physical addresses and pointers
  Next
  Prior
Linkages
  “Forward”
  “Backward”
Retrieval
  Random based on “primary” key
  Direct based on a known value

37 Network Model – basic structure
[Diagram labels: Current Address, Data, Next, Prior, Master]

38 Hierarchical Model
Characterized by establishing a hierarchy of data components
  Root
  Leaf
Physical addresses and pointers
Linkages
  Downward from root
Retrieval
  Sequential search from root
  Random based on “primary” key value
  Direct based on a known value

39 Hierarchical Model
[Diagram labels: Current Address, Data, Next Address(es), Root, Leaves]

40 Inverted List Model
Components
  Coarse Index
  Fine Index
  Data sets of files containing records
  Runtime Data Dictionary
    Describes data items
    Identifies location
    Implied relationships
Index entries identify records with specified key values
  Few records with a given value
  Many records with a given value – bit map
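
The core idea of an inverted list is an index that maps each key value to the records holding that value. The sketch below is a simplification that assumes a plain in-memory dictionary; it does not model the coarse/fine index split or bit maps, and the sample records and the `build_inverted_list` helper are hypothetical.

```python
from collections import defaultdict

# Records keyed by an identifier that stands in for a storage location.
records = {
    101: {"name": "Ada", "dept": "Sales"},
    102: {"name": "Bob", "dept": "Sales"},
    103: {"name": "Cy", "dept": "Audit"},
}


def build_inverted_list(records, field):
    """Map each value of `field` to the list of record ids holding that value."""
    index = defaultdict(list)
    for record_id, record in records.items():
        index[record[field]].append(record_id)
    return dict(index)


dept_index = build_inverted_list(records, "dept")

# Retrieval consults the index first, then fetches only the matching records.
sales_ids = dept_index.get("Sales", [])
sales_names = [records[record_id]["name"] for record_id in sales_ids]
print(sales_ids, sales_names)
```

The relationship between records is implied by shared values in the index, not by pointers stored in the records themselves.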

41 Relational Model (1)
Tabular format
  Columns
  Rows
  Cell
    Intersection of a row and column
Relationships based on data values
  Points to specific row(s)

42 Relational Model (2)
[Diagram: Table 1 and Table 2 joined by a relationship; each table has an index that identifies the row with specific key value(s)]
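
A relationship based on data values can be shown with two small tables. This sketch uses Python's built-in `sqlite3` module; the `customer` and `purchase_order` tables and their rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (customer_id),
        total       REAL
    );
    -- Index that identifies order rows with a specific customer_id value.
    CREATE INDEX idx_order_customer ON purchase_order (customer_id);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO purchase_order VALUES (10, 1, 99.5), (11, 1, 12.0), (12, 2, 40.0);
""")

# The relationship is a matching value, not a stored physical address.
acme_orders = conn.execute(
    """SELECT o.order_id, o.total
       FROM customer AS c
       JOIN purchase_order AS o ON o.customer_id = c.customer_id
       WHERE c.name = ?
       ORDER BY o.order_id""",
    ("Acme",),
).fetchall()
conn.close()
print(acme_orders)
```

Nothing in an order row points at a customer row; the two are related only because they carry the same `customer_id` value.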

43 Relational Model
Responsibilities divided between Programs and Databases/DBA
  Presentation of data
  Preparation and processing of data
  Enforcement of business and integrity rules
  Management of access to data

44 Object Model
Responsibilities divided between Programs and Databases/DBA
  Presentation of data
  Preparation and processing of data
  Enforcement of business and integrity rules
  Management of access to data

45 Relational versus Object
[Diagram labels: Purchase Order (shown twice), Employee, Materials, Contract]
In an object database, the data is stored as the business object itself, encapsulated from the software so that access is via an interface language.
In a relational database, the data is stored in several structures and related to the business object via software.

46 Object-Relational
Hybrid
  Relational model
  Object model
Evolved from competing camps
  Technology state-of-the-practice
  Marketplace realities

47 Why database?
Storage management and efficiency – cost containment
Retrieval speed
Information accuracy
Redundancy reduction

48 Database Processing Overview

49 DBMS Interfacing
[Diagram labels: Application Program Work Area, DBMS Work Area, DBMS Executive, Buffer Pool, Database Schema, Operating System I/O, Database]
Physical I/O performed by OS
Schema provides view of enterprise data

50 Retrieving a Logical Record from Disk Storage
Application requires data stored on database
Application issues a data retrieval request to DBMS
DBMS determines data exists; identifies location
DB buffer within DBMS is assigned
DBMS issues I/O interrupt to OS
OS determines physical location of data
OS issues “get I/O” instruction
Data retrieved from disk & transferred to DB buffer via OS
DBMS is re-engaged by OS
Application program notified that data is available
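
The retrieval sequence can be mimicked with two toy classes, one for the operating system and one for the DBMS. This is a sketch under heavy assumptions: `ToyOS`, `ToyDBMS`, the dictionary that stands in for the disk, and the reuse of an already-buffered record are illustrative simplifications, not part of the slide.

```python
class ToyOS:
    """Stands in for the operating system: it alone performs physical I/O."""

    def __init__(self, disk):
        self.disk = disk     # physical location -> stored record
        self.reads = 0

    def get_io(self, location):
        self.reads += 1      # count each physical read
        return self.disk[location]


class ToyDBMS:
    """Stands in for the DBMS: directory lookup, buffer assignment, hand-off."""

    def __init__(self, os_layer, directory):
        self.os = os_layer
        self.directory = directory   # logical key -> location
        self.buffer_pool = {}        # location -> record image

    def retrieve(self, key):
        if key not in self.directory:        # DBMS determines data exists
            return None
        location = self.directory[key]       # ...and identifies its location
        if location not in self.buffer_pool:
            # Assign a buffer and ask the OS to fill it from disk.
            self.buffer_pool[location] = self.os.get_io(location)
        return self.buffer_pool[location]    # application is handed the data


os_layer = ToyOS({0: {"id": "emp-1", "name": "Ada"}})
dbms = ToyDBMS(os_layer, {"emp-1": 0})

record = dbms.retrieve("emp-1")    # first request triggers one physical read
again = dbms.retrieve("emp-1")     # second request is served from the buffer
missing = dbms.retrieve("emp-9")   # unknown key: no I/O is attempted
print(record, os_layer.reads)
```

The application never touches the disk or the OS directly; it only calls `retrieve`, which mirrors the layering in the steps above.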

51 Storing a New Logical Record
Application needs to store data on database
Application issues a data retrieval request to DBMS
DBMS determines data does not exist
DBMS creates “template” of logical record
DB buffer within DBMS is assigned
DBMS returns control to application program
Application program builds new logical record
Application indicates new logical record can be written
Application issues data storage request to DBMS
DBMS determines appropriate storage location
DBMS issues I/O interrupt to OS
OS determines physical location of data
OS issues “put I/O” instruction
Data is transferred from buffer; physically written on disk
DBMS is re-engaged by OS
Application program notified that data has been stored

52 Changing Data in an Existing Logical Record
Application determines that data in a record already existing on the database is to have some of its data changed
Executes the retrieval process
Changes are made using image in buffer
Application determines readiness to store this data
Application issues data storage request to DBMS
DBMS re-validates appropriate storage location
DBMS issues I/O interrupt to OS
OS determines physical location of data
OS issues “put I/O” instruction
Data is transferred from buffer; physically written on disk
DBMS is re-engaged by OS
Application program notified that data has been stored

