What’s “File Processing”? The “old” way of doing things; still often used in practice. Separate information stored on separate files.
File Processing Example: SalesProductionMarketing Knows how many of Products A, B, and C have been sold. File stores Prod. Name, Production Schedule, and Sales. Knows how much of Products A, B, and C have been produced. File stores Prod. Name, Production Schedule, and Number Produced. Knows the price of Products A, B, and C. File stores Prod. Name and Product Price.
Any problems here? Duplication (redundancy). Inconsistency. Does anyone know how much money we made? No integration. Set format. Data dependence. Y2K!!
Database Management Database Management System (DBMS) Provides one integrated repository for data to be stored and queried. Standards for data can be defined and enforced. Reports and queries are easy (er). SQL, etc.
Database Management Ex.: Database Prod. Name Production Schedule Sales Number Produced Product Price DBMS SalesProductionMarketing (App. Progs)
Another Look ( thanks to John Gallaugher, Boston College) Database a collection of related data. Usually organized according to topics: e.g. customer info, products, transactions Database Management System (DBMS) –a program for creating & managing databases; ex. Oracle, MS- Access, Sybase DBMS - the program. Manages interaction with databases. database - the collection of data. Created and defined to meet the needs of the organization. Client - makes requests of the DBMS server request response Server - responds to client requests
A Simple Database File/Table Customers Field/Column 5 shown: CUSTID, FIRST, LAST, CITY, STATE Record/Row 5 shown: one for each customer
A More Complex Example Entry & Maintenance is complicated redundant data exists, increases chance of error, complicates updates/changes, takes up space
Normalize Data: Remove Redundancy One Many Customer Table Transaction Table
Key Terms Relational DBMS manages databases as a collection of files/tables in which all data relationships are represented by common values in related tables (referred to as keys). a relational system has the flexibility to take multiple files and generate a new file from the records that meet the matching criteria (join). SQL - Structured Query Language Most popular relational database standard. Includes a language for creating & manipulating data.
Using SQL for Querying SQL (Structured Query Language) Data language English-like, nonprocedural, very user friendly language Free format Example: SELECTName, Salary FROMEmployees WHERESalary >2000
Structures Hierarchical: The old way. “Tree”. Access elements by moving down tree. One-to-many. Network: Criss-cross patterns. Many-to-many. Relational: a common element relates “tables” to one another. Permits “ad hoc”. Object-oriented: “objects” have data, processes, and properties “encapsulated” in them.
Database Structures Dept A B C EmpnoDept 1A 2B 3C Relational Structure Network Structure Hierarchical Structure Relation
Pros and Cons Speed ==> Ad Hoc Flexibility ==> Relat. Net. Hier. Obj.
Data Dictionaries The Data Dictionary A reference work of data about data (metadata) compiled by the systems analyst to guide analysis and design. As a document, the data dictionary collects, coordinates, and confirms the meaning of data terms to various users throughout the organization. Documentation, Elimination of data redundancy Validate the data flow diagram for completeness and accuracy Provide a starting point for developing screens and reports Determine contents of data stored in files Develop the logic for data flow diagram processes Uses of the Data Dictionary
Data Flow Diagrams (“DFD”) Data Flow Process File or Data Store Source or Entity
1 2 3 Tenant New Tenant Process Collection Process Delinquent Process Lease D1Tenant File Tenant InfoDFD Example: Apartment Rental Payments Bank Deposit Receipt Ext. Mgr Cash Report D1Tenant File Unpaid Charges Delinquency Report Tenant Info Delinquencies Copy of lease Notice
Dept. ProjectsDept. Employee Entity Relationship Diagrams works on “one” “many”“zero” * Name Title Address * Project Deadline Resources
New Names, Same Ideas Data Mining, OLAP Data Warehousing
Data Mining automated information discovery process, uncovers important patterns in existing data can use neural networks or other approaches. Requires ‘clean’, reliable, consistent data. Historical data must reflect the current environment. e.g. “What are the characteristics that identify when we are likely to lose a customer?” OLAP is user-driven discovery
Warehouses & Marts Data Warehouse a database designed to support decision-making in an organization. It is batch-updated and structured for fast online queries and exploration. Data warehouses may aggregate enormous amounts of data from many different operational systems. Data Mart –a database focused on addressing the concerns of a specific problem or business unit (e.g. Marketing, Engineering). Size doesn’t define data marts, but they tend to be smaller than data warehouses.
Data Warehouses & Data Marts TPS & other operational systems Data Warehouse Data Mart (Marketing) Data Mart (Engineering) 3rd party data = query, OLAP, mining, etc. = operational clients