Introduction to Databases

Slides:



Advertisements
Similar presentations
Chapter 10: Designing Databases
Advertisements

Introduction to Databases
Introduction to Databases
Introduction to Databases
Ch1: File Systems and Databases Hachim Haddouti
Introduction to Databases
Beyond data modeling Model must be normalised – purpose ? Outcome is a set of tables = logical design Then, design can be warped until it meets the realistic.
Introduction to Databases Transparencies
Chapter 1 INTRODUCTION TO DATABASE.
WFM-6103: Hydrologic Information System (HIS) Akm Saiful Islam Lecture-5: Database Management System April-October, 2006 Institute of Water and Flood Management.
Chapter 4 Relational Databases Copyright © 2012 Pearson Education, Inc. publishing as Prentice Hall 4-1.
Databases and Database Management Systems
Chapter 1 Introduction to Databases
Chapter 4 Relational Databases Copyright © 2012 Pearson Education 4-1.
Introduction to Databases
Introduction to Databases Transparencies 1. ©Pearson Education 2009 Objectives Common uses of database systems. Meaning of the term database. Meaning.
IST Databases and DBMSs Todd S. Bacastow January 2005.
PHASE 3: SYSTEMS DESIGN Chapter 7 Data Design.
Chapter 6 Physical Database Design. Introduction The purpose of physical database design is to translate the logical description of data into the technical.
1 DATABASE TECHNOLOGIES BUS Abdou Illia, Fall 2007 (Week 3, Tuesday 9/4/2007)
Information storage: Introduction of database 10/7/2004 Xiangming Mu.
CS370 Spring 2007 CS 370 Database Systems Lecture 2 Overview of Database Systems.
Copyright © 2003 by Prentice Hall Module 4 Database Management Systems 1.What is a database? Data hierarchy and data organization Field, record, file,
Chapter 1 Introduction to Databases Pearson Education ©
Web-Enabled Decision Support Systems
Database Management Systems Accounting Information Systems, 5 th edition James A. Hall COPYRIGHT © 2007 Thomson South-Western, a part of The Thomson Corporation.
1 Introduction to Database Systems. 2 Database and Database System / A database is a shared collection of logically related data designed to meet the.
Concepts and Terminology Introduction to Database.
Databases and Database Management Systems
Chapter 6 1 © Prentice Hall, 2002 The Physical Design Stage of SDLC (figures 2.4, 2.5 revisited) Project Identification and Selection Project Initiation.
I Information Systems Technology Ross Malaga 4 "Part I Understanding Information Systems Technology" Copyright © 2005 Prentice Hall, Inc. 4-1 DATABASE.
Lecturer: Gareth Jones. How does a relational database organise data? What are the principles of a database management system? What are the principal.
Lecture2: Database Environment Prepared by L. Nouf Almujally & Aisha AlArfaj 1 Ref. Chapter2 College of Computer and Information Sciences - Information.
Announcements. Data Management Chapter 12 Traditional File Approach  Structure Field  Record  File  Fixed All records have common fields, and a field.
Lecture # 3 & 4 Chapter # 2 Database System Concepts and Architecture Muhammad Emran Database Systems 1.
Chapter 4c, Database H Definition H Structure H Parts H Types.
DataBase Management System What is DBMS Purpose of DBMS Data Abstraction Data Definition Language Data Manipulation Language Data Models Data Keys Relationships.
INFORMATION MANAGEMENT Unit 2 SO 4 Explain the advantages of using a database approach compared to using traditional file processing; Advantages including.
Chapter 1 Introduction to Databases © Pearson Education Limited 1995, 2005.
INFO1408 Database Design Concepts Week 15: Introduction to Database Management Systems.
FILES AND DATABASES. A FILE is a collection of records with similar characteristics, e.g: A Sales Ledger Stock Records A Price List Customer Records Files.
1 Introduction to Databases. 2 Examples of Database Applications u Purchases from the supermarket u Purchases using your credit card u Booking a holiday.
1 Chapter 1 Introduction to Databases Transparencies.
Programming Logic and Design Fourth Edition, Comprehensive Chapter 16 Using Relational Databases.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
IS 4424: Developing the Entreprise Database Frederic Adam.
Introduction to Databases Transparencies © Pearson Education Limited 1995, 2005.
1 Geog 357: Data models and DBMS. Geographic Decision Making.
LECTURE TWO Introduction to Databases: Data models Relational database concepts Introduction to DDL & DML.
1 Management Information Systems M Agung Ali Fikri, SE. MM.
Structured Query Language used for defining and manipulating data in Relational DBs aimed at: –reducing training costs –increasing productivity –improve.
Introduction to Databases Transparencies
Introduction to Databases
Introduction to Databases Transparencies
Databases and DBMSs Todd S. Bacastow January
An Introduction to database system
Database Management System
Introduction to Databases Transparencies
Introduction to Databases
Introduction to Databases
Chapter 4 Relational Databases
Introduction to Database Management System
Basic Concepts in Data Management
Physical Database Design
Introduction to Databases
MANAGING DATA RESOURCES
Data Model.
Introduction to Databases
Introduction to Databases Transparencies
Chapter 17 Designing Databases
Presentation transcript:

Introduction to Databases Data Organisation Definition Data modelling SQL DBMS functions

Basics of data Organisation: DATA HIERARCHY (four categories) Fields = represent a single data item Records = made up of a related set of fields describing one instance of an entity File / Table = a set of related records - as many as instances (occurrence) in the set Database = a collection of related files

Example of data structure Fields Name First name Telephone Sampras Pete 45 25 65 65 Healy Margaret 25 58 96 63 Clinton Bill 12 25 28 89 Henry Thierry 25 78 85 85 Records + Other files =>complete data Structure = DB File / Table

Database: Definition. "A collection of interrelated data stored together with controlled redundancy, to serve one or more applications in an optimal fashion; the data is stored so that it is independent of the application programs which use it; a common and controlled approach is used in adding new data and in modifying existing data within the database."

Definition - closer look A collection of interrelated data stored together with controlled redundancy to serve one or more applications in an optimal fashion the data is stored so that it is independent of the application programs which use it a common and controlled approach is used in adding new data and in modifying existing data within the database.

Advantages of Databases: data are independent from applications - stored centrally data repository accessible to any new program data are not duplicated in different locations programmers do not have to write extensive descriptions of the files These save enough money and time to offset the extra costs of setting and maintaining DBs

Disadvantages of DBs: Data are more accessible so more easily abused Large DBs require expensive hardware and software specialised / scarce personnel is required to develop and maintain large DBs People / business units may object to “their” data being widely available in a DB

Characteristics of DBs… High concurrency (high performance under load) Multi-user (read does not interfere with write) Data consistency – changes to data don’t affect running queries + no phantom data changes High degree of recoverability (pull the plug test)

ACID test Atomicity Consistency Isolation Durability All or nothing Preserve consistency of database Transactions are independent Once committed data is preserved

DataBase Management System (DBMS): program that makes it possible to: create use maintain a database It provides an interface / translation mechanism between the logical organisation of the data stored in the DB and the physical organisation of the data

Using a database: Two main functions of the DBMS : Query language - for people who are not programmer (greatest advantage of DB) Data manipulation language - for programmers who want to modify the links between data elements within the DB Also, Host Language - the language used by programmers to develop the rest of the application - eg: Visual Basic for Applications (VBA) / Oracle developer 2000

Different types of DBs: creating the DB = specifying the links between data items different types of relationships can be specified - ie different logical views they correspond to three main types of DBMSs: Hierarchical DBs Network DBs Relational DBs Object Oriented DBs

Hierarchical DBs: data item are related as “Parent” and “Child” in a tree-like structure “parent” means data item is higher in the tree than “child” and connected to it one “parent” can have more than one “child”, but one “child” can only have one “parent” most common platform = IBM’s Information Management System (IMS)

Example Customers Payments Orders Currency Items Unit of packaging Substitution Product Very fast retrieval

Undesirable side effects: Insertion of record: dependent record cannot be added without a parent eg: units of packaging cannot be added without linkage to an existing item Deletion of record: deletion of a parent deletes all children deleting an existing item will delete its replacement items Impossible to have two parents = trouble

Network DBs: same as parent and children in Hierarchical DB, but children can have more than one parent It is also possible to link items upwards to other items parents practically, it means that the DBMS is more flexible for data retrieval

Example Suppliers Customers Payments Orders Currency Items Unit of packaging Substitution Product

Relational DBs: Data items stored in tables Specific fields in tables related to other field in other tables (joint) infinite number of possible viewpoints on the data (queries) Highly flexible DB but overly slow for complex searches Oracle, SyBase, Ingres, Access, Paradox for Windows...

Describing relationships Attempt at modelling the business elements (entities) and their relationships (links) Can be based on users’ descriptions of the business processes Specifies dependencies between the data items Coded in an Entity-Relationship Diagram (ERD)

Types of Relationships one-to-one: one instance of one data item corresponds to one instance of another one-to-many: one instance to many instances many-to-many: many instance correspond to many instances Also some relationships may be: compulsory optional

Example Student registering system What are the entities? What type of relationship do they have? Draw the diagram

Entity Relationship Diagram

Next step - creating the data structure Few rules - a lot of experience Can get quite complex (paramount for the speed of the DB) Tables must be normalised - ie redundancy is limited to the strict minimum by an algorithm In practice, normalisation is not always the best

Data Structure Diagrams Describe the underlying structure of the DB: the complete logical structure Data items are stored in tables linked by pointers attribute pointers: data fields in one table that will link it to another (common information) logical pointers: specific links that exist between tables Tables have a key If an attribute seems to belong to a relationship rather than an attribute, it may mean an associative entity must be added

* compulsory attributes 0 optional attributes ORDER order number Item description Item Price Quantity ordered Customer number Item number Customer Customer number Customer name Customer address Customer balance Customer special rate 1 2 3 4 Item Item number Item description Item cost Quantity on hand * compulsory attributes 0 optional attributes

Definitions Entity Attributes Instance(s) Domain Key (candidate primary and foreign)

Definitions Relationship Ordinality Cardinality Associative Entity

Some test questions Is it a bird is it a plane? Is it an entity or an attribute?

Normalisation Process of simplifying the relationships amongst data items as much as possible (see example provided - handout) Through an iterative process, structure of data is refined to 1NF, 2NF, 3NF etc. Reasons for normalisation: to simplify retrieval (speed of response) to simplify maintenance (updates, deletion, insertions) to reduce the need to restructure the data for each new application

First Normal Form design record structure so that each record looks the same (same length, no repeating groups) repetition within a record means one relation was missed = create new relation elements of repeating groups are stored as a separate entity, in a separate table normalised records have a fixed length and expanded primary key

Second Normal Form Record must be in first normal form first each item in the record must be fully dependent on the key for identification Functional dependency means a data item’s value is uniquely associated with another’s only on-to-one relationship between elements in the same file otherwise split into more tables

Third normal form to remove transitive dependencies when one item is dependent on an item which is dependent from the key in the file relationship is split to avoid data being lost inadvertently this will give greater flexibility for the design of the application + eliminate deletion problems in practice, 3 NF not used all the time - speed of retrieval can be affected

Beyond data modeling Model must be normalised – purpose ? Outcome is a set of tables = logical design Then, design can be warped until it meets the realistic constraints of the system Eg: what business problem are we trying to solve? – see handout [riccardi p. 113, 127] Update anomalies Each item should appear only once + you ask many good questions

Realistic constraints Users cannot cope with too many tables Too much development required in hiding complex data structure Too much administration Optimisation is impossible with too many tables Actually: RDBs can be quite slow!

Key practical questions What are the most important tasks that the DB MUST accomplish efficiently? How must the DB be rigged physically to address these? What coding practices will keep the coding clean and simple? What additional demands arise from the need for resilience and security?

Analysis - Three Levels of Schema External Schema 1 External Schema 2 External Schema … Tables Logical Schema Disk Array Internal Schema

4 way trade-off Security Performance Ease of use Clarity of code

Key decisions Oracle offers many different ways to do things Indexes Backups… Good analysis is not only about knowing these => understanding whether they are appropriate Failure to think it through => unworkable model Particularly, predicting performance must be done properly Ok on the technical side, tricky on the business side

Design optimisation Sources of problems: Network traffic Excess CPU usage But physical I/O is greatest threat (different from physical I/O) Disks still the slowest in the loop Solution: minimise or re-schedule access Also try to minimise the impact of Q4 (e.g. mirroring, internal consistency checks…)

Creating links between the tables use common fields to join tables / queries very easy when data is properly normalised Gives total flexibility in terms of data retrieval Main strength of RDBs (SQL)

Structured Query Language used for defining and manipulating data in Relational DBs aimed at: reducing training costs increasing productivity improve application portability increase application longevity reduce dependency on single vendors enable cross systems communication In practice, SQLs can be a bit different

Querying RDBs with SQL use a form of pseudo english to retrieve data in a view (which looks like a table) syntax is based on a number of “clauses” Select: specifies what data elements will be included in the view From: lists the tables involved Where: specifies conditions to filter the data specific values sought links between tables

Example with one table find the name and address of customer number 1217

Example with a range find the items which are priced between £50 and £15000

Example with two tables find the rep name of all customers

Example with two tables same for customer Robson only

Use of a Search Condition - nested queries find the name and address of the customer who ordered order # 110

Additional syntax Add computation in the “select” statement: select SUM(price) select AVG(price), MAX, MIN, COUNT Simplify comparisons with a BETWEEN clause and LIKE clause (with *, ?) Add sorting instruction after the where clause ORDER BY name (alphabetical) ORDER BY price (ascending) Provide aggregate information by grouping data: GROUP BY customer

find contents (item# and description) of order 110:

find the average price of the cars for sale find the average price of all orders taken so far by customer “Jones”

find how much cash customer “Barry” has generated in total

find the average price of all orders taken so far

Sales Order Processing Oracle Demo Set - Sales Order Processing CUSTOMER TABLE SALES_ORDER TABLE PRODUCT TABLE ITEM TABLE PRICE TABLE

Oracle Demo Set - Employee Data DEPARTMENT TABLE EMPLOYEE TABLE LOCATION TABLE JOB TABLE Oracle Demo Set - Employee Data