WFM 5201: Data Management and Statistical Analysis © Dr. Akm Saiful Islam

Slide 1: Lecture-07: Database Management System
Akm Saiful Islam
June, 2008
Institute of Water and Flood Management (IWFM), Bangladesh University of Engineering and Technology (BUET)

Slide 2: Outline
Database Management System
- Introduction to Databases
- File System vs. Databases
- Advantages of using databases
- Data Models: hierarchical, network, relational, object oriented
- Overview of Relational Database

Slide 3: Introduction to Databases
- Information systems process and manage data.
- Data management involves the capture, retrieval and storage of data.
- Database Management Systems (DBMSs) are computer systems that manage data in databases.
- Today's DBMSs are based on sophisticated software and powerful computer hardware.
- Well-known DBMS software includes ORACLE, Microsoft SQL Server, Sybase and MySQL (free download), among others.

Slide 4: File Organisation
Sequential files
- Records are stored in a fixed sequence.
- Records can only be read in that sequence, starting from the first record.
- Records can only be added at the end of the file (append).
- Sequential files are inefficient when a single record must be located, because every preceding record has to be read first.
Indexed files
- Use an index to access records directly (random access).
- Records can be sorted according to an attribute or preference (e.g. alphabetically, ascending, descending).
- Indexed files are efficient and faster to access.
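The difference between the two file organisations can be sketched in a few lines of Python. This is only an illustration: the records, the `sequential_find` and `indexed_find` helpers and the in-memory list standing in for a file are all assumptions, not part of the lecture.

```python
# Hypothetical records (student_id, name) held in the order they were appended.
records = [(3, "Mr. Y"), (1, "Mr. X"), (2, "Mr. Z")]


def sequential_find(recs, key):
    """Read records in stored order, starting from the first, until the key matches."""
    reads = 0
    for rec in recs:
        reads += 1
        if rec[0] == key:
            return rec, reads
    return None, reads


# Build the index once: key -> position of the record in the "file".
index = {rec[0]: pos for pos, rec in enumerate(records)}


def indexed_find(recs, idx, key):
    """Jump straight to the record through the index, without scanning."""
    pos = idx.get(key)
    return None if pos is None else recs[pos]


found, reads = sequential_find(records, 2)   # has to read every record before it
direct = indexed_find(records, index, 2)     # one index lookup
```

The sequential search cost grows with the position of the record in the file, while the indexed lookup does not depend on it; the price is that the index must be built and kept up to date.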

Slide 5: The File Systems Approach
- Redundant data storage.
- One file is used in each application.
- No data sharing.
- Cross-application transfers are difficult to manage and achieve.
- File systems are rarely used for data processing anymore.
Example files (one per application): General Ledger File, Personnel File, Production Planning File, Inventory File, Despatch File, Order Entry File, Invoicing File, Payroll File

Slide 6: The Database Approach
- Compactness.
- Data is stored in a single logical "place."
- Data can be shared and related between applications.
- Data transfer between applications is easier.
- Used for a wide range of applications.
Applications sharing the database: General Ledger, Personnel, Production Planning, Inventory, Despatch, Order Entry, Invoicing, Payroll

Slide 7: Database Characteristics
- Amount: database size depends on the number of records or files it contains.
- Complexity: database complexity depends on the number of relations between the files.
- Volatility: a measure of the changes typically required in a given period of time.
- Immediacy: a measure of how rapidly changes must be made to data.

Slide 8: Advantages of using a Database Approach
Flexible data access. DBMSs have various tools to manipulate, query, or report data, such as Structured Query Language (SQL) and report generators. Hence:
- Selected data is easily retrieved.
- A DBMS can accommodate different data views for different users.
Improved data integrity. Modern DBMSs include various tools and methods to:
- ensure that data is correct, consistent, and current
- verify data input and check whether data is 'reasonable'.

Slide 9: Advantages of DBs (continued)
Improved data security. Tools such as password access and encryption ensure that data is not:
- deliberately or accidentally damaged or changed
- accessed without proper authorisation
Data independence.
- Problems arising from the interdependence of data and programs are kept to a minimum.
Reduced data redundancy.
- Single version of the truth.
- Efficient data storage.
- Efficient use of the time of hardware (CPU), programmer(s), analyst(s) and user(s).
- Relational DBs use normalisation to reduce data redundancy.

Slide 10: Advantages of DBs (continued)
Ability to share and relate data.
- Different user groups can use the same data.
- Data in different (physical or logical) parts of the system can be related for a certain application.
Standardisation of data.
- In general, data items have common names and storage formats.
Increased productivity.
- The various tools reduce the complexity that is otherwise associated with DB maintenance when changes are required to the system, for example changes in law, in the economy, or in users.

Slide 11: Costs of Database Approach
The implementation and use of DBMSs is normally associated with various costs, such as:
- Initial expenses, including planning costs and consultancy fees.
- Computer hardware costs.
- Software costs.
- Database administrator costs and staff training costs.
- Conversion costs of an existing system.
- Various operational costs.

Slide 12: Data Models
1. Hierarchical
2. Network
3. Relational
4. Object

Slide 13: 1. Hierarchical Model
Stores data as records that are hierarchically related to each other. The records are arranged in a tree structure.
BUET
  Faculty of Civil Engineering
    CE
    WRE
  Faculty of Architecture
    URP
    Archit.

Slide 14: Hierarchical Database Model

Slide 15: Hierarchical Database Model
Logically represented by an upside-down tree:
- Each parent can have many children.
- Each child has only one parent.

Slide 16: Hierarchical Model
Several records or files are hierarchically related to each other. For example, an organization has several departments, each of which has attributes such as name of director, number of staff, annual products, etc. Each department has several divisions, with attributes such as name of manager, number of staff, annual products, etc. Each division in turn has several sections, with attributes such as name of head, number of staff, number of PCs, etc.

Slide 17: Advantages and Disadvantages of the Hierarchical Model
Advantages
- High-speed access to large databases
- Easy to update (to add or delete nodes)
Disadvantages
- Links are only possible in the vertical direction (from top to bottom), not horizontally or diagonally, unless the records share the same parent.
- For example, it is hard to find the relation between URP and DCE from this data model.
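The "vertical links only" limitation can be made concrete with a small sketch. It stores the Slide 13 tree as child-to-parent pointers; the node names follow that diagram (lightly normalised), and the helper names `path_to_root` and `common_ancestor` are assumptions made for this illustration. Relating two records on different branches means walking up from both until a shared ancestor is reached, since no horizontal link exists.

```python
# Each child has exactly one parent, as required by the hierarchical model.
parent = {
    "Faculty of Civil Engineering": "BUET",
    "Faculty of Architecture": "BUET",
    "CE": "Faculty of Civil Engineering",
    "WRE": "Faculty of Civil Engineering",
    "URP": "Faculty of Architecture",
    "Archit.": "Faculty of Architecture",
}


def path_to_root(node):
    """Follow the single parent link upwards until the root is reached."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def common_ancestor(a, b):
    """The only way to relate two nodes: the first node shared by both upward paths."""
    ancestors_a = path_to_root(a)
    for node in path_to_root(b):
        if node in ancestors_a:
            return node
    return None


same_faculty = common_ancestor("CE", "WRE")    # siblings share a parent
cross_faculty = common_ancestor("URP", "CE")   # related only through the root
```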

Slide 18: 2. Network Database Model
- Doesn't force data into hierarchical levels
- Owner/member relationships: owner record type and member record type
- Each owner may have one or more member types
- Each member type and its corresponding owner record type form a set, which represents a relationship

Slide 19: Network Database Model

Slide 20: Network Database Model
- Each record can have multiple parents
- Composed of sets (relationships)
- Each set has an owner record and a member record
- A member may have several owners
- A set represents a 1:M relationship between the owner and the member
Figure 1.10

Slide 21: 3. Relational Model
Based on two important concepts:
- Key of relation: one to one, one to many, many to many
- Primary attribute: its values cannot be duplicated

Student table
StudentID  Name   CourseID
1          Mr. X  001
2          Mr. X  002
3          Mr. Y  003

Course table
CourseID  Title                Credit
001       RS & GIS in WM       3
002       Watershed Hydrology  3
003       Risk Management      3

Relationship between the Student table and the Course table: many to many (* to *).

Slide 22: Relational Database
The relational database is the most popular model for GIS. For example, the following relational database software is widely used:
- INFO in ARC/INFO
- DBASE III for several PC-based GIS
- ORACLE for several GIS uses
In a relational model, the following two concepts should be defined.
Key of relation: a subset of attributes with two properties:
- Unique identification: e.g. in a phone directory, the key is the set of last name, first name and address.
- Non-redundancy: no attribute can be dropped from the key without losing uniqueness; e.g. address cannot be dropped from the phone directory key, because many people may have the same name.
Prime attribute: an attribute listed in at least one key.
The most important point of relational database design is to build a set of key attributes with a prime attribute, so as to allow dependence between attributes as well as to avoid loss of general information when records are inserted or deleted.

Slide 23: Relational Database Model

Slide 24: Relational Database Model
Figure 1.11

Slide 25: SQL
What is it?
- Structured Query Language
- Used in ORACLE and other DB systems
- Non-procedural, i.e. you specify what you want, not how to get it
- SQL (also pronounced "SEQUEL") is directly related to the development of the relational model by E. F. Codd.

Slide 26: SQL
SQL is used to query relational databases. For example, find the names of the students who took 6 or more credit hours this term. Every course in the Slide 21 Course table carries 3 credits, so the credits have to be summed per student before they are compared with 6:
SELECT Student.Name, SUM(Course.Credit) AS TotalCredit
FROM Student, Course
WHERE Student.CourseID = Course.CourseID
GROUP BY Student.Name
HAVING SUM(Course.Credit) >= 6
The answer is: Mr. X 6
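The query can be tried out with Python's built-in `sqlite3` module. The sketch below loads the Student and Course rows from Slide 21 into an in-memory database; the column types and the `StudentID` column name are assumptions. It runs two versions of the query: one that filters individual joined rows on `Credit >= 6`, which finds nothing because no single course carries 6 credits, and one that sums the credits per student, which yields the stated answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Course  (CourseID TEXT PRIMARY KEY, Title TEXT, Credit INTEGER);
CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Name TEXT,
                      CourseID TEXT REFERENCES Course(CourseID));
INSERT INTO Course VALUES ('001', 'RS & GIS in WM', 3);
INSERT INTO Course VALUES ('002', 'Watershed Hydrology', 3);
INSERT INTO Course VALUES ('003', 'Risk Management', 3);
INSERT INTO Student VALUES (1, 'Mr. X', '001');
INSERT INTO Student VALUES (2, 'Mr. X', '002');
INSERT INTO Student VALUES (3, 'Mr. Y', '003');
""")

# Row-by-row filter: each joined row holds one course, so no row reaches 6 credits.
per_row = conn.execute("""
    SELECT Student.Name, Course.Credit
    FROM Student, Course
    WHERE Student.CourseID = Course.CourseID AND Course.Credit >= 6
""").fetchall()

# Aggregate per student, then filter the totals with HAVING.
per_student = conn.execute("""
    SELECT Student.Name, SUM(Course.Credit) AS TotalCredit
    FROM Student, Course
    WHERE Student.CourseID = Course.CourseID
    GROUP BY Student.Name
    HAVING SUM(Course.Credit) >= 6
""").fetchall()
```

Grouping by `Student.Name` works here only because the sample table identifies a student by name; a real design would group by a student identifier.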

Slide 27: Find the relationship between these two tables in the BUET Library

Book table
ISBN  Title              Author
050   Applied Hydrology  David Maidment
060   Irrigation         Cheng

Borrow table
ID  Name   ISBN
1   Mr. P  050
2   Mr. Q  060
3   Mr. R  070

One to one? One to many? Many to many?
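One way to explore the exercise, without settling it, is to load the two tables and count how the rows line up. The sketch below uses Python's `sqlite3` with the sample rows from the slide; the column types are assumptions. It counts borrow records per book and lists borrow records whose ISBN has no matching book, which is worth noticing before any relationship is declared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book   (ISBN TEXT PRIMARY KEY, Title TEXT, Author TEXT);
CREATE TABLE Borrow (ID INTEGER PRIMARY KEY, Name TEXT, ISBN TEXT);
INSERT INTO Book VALUES ('050', 'Applied Hydrology', 'David Maidment');
INSERT INTO Book VALUES ('060', 'Irrigation', 'Cheng');
INSERT INTO Borrow VALUES (1, 'Mr. P', '050');
INSERT INTO Borrow VALUES (2, 'Mr. Q', '060');
INSERT INTO Borrow VALUES (3, 'Mr. R', '070');
""")

# How many borrow records point at each book in the sample data?
borrows_per_book = conn.execute("""
    SELECT Book.ISBN, COUNT(Borrow.ID)
    FROM Book LEFT JOIN Borrow ON Borrow.ISBN = Book.ISBN
    GROUP BY Book.ISBN
    ORDER BY Book.ISBN
""").fetchall()

# Borrow records whose ISBN does not appear in the Book table.
unmatched = conn.execute("""
    SELECT Borrow.Name, Borrow.ISBN
    FROM Borrow LEFT JOIN Book ON Book.ISBN = Borrow.ISBN
    WHERE Book.ISBN IS NULL
""").fetchall()
```

The counts only describe this sample; the relationship type is a design decision about what the library allows (for instance, whether the same book may be borrowed again later), not something three rows can prove.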

Slide 28: Normalization of an Un-normalized Table to a Relational Database

Slide 29: Advantages and Disadvantages of Relational Database
Advantages
- There is no redundancy.
- The type of building of an owner can be changed without destroying the relation between type and rate.
- A new type of building, for example "Clay", can be inserted (row insertion is easy).
Disadvantages
- Requires a number of tables and relationships.
- It is difficult to add a new column to the table.

Slide 30: 4. Object Databases
Current-generation systems need to handle complex data for complex applications such as:
- computer aided design
- computer aided software engineering
- geographic information systems
- interactive web sites
Relational systems are inadequate for these systems.
- Why do you think this is?

Slide 31: Object Database Types
Object-oriented
- extend a programming language such as Java with persistency and a query language
Object-relational
- extend a current RDBMS (e.g. Oracle) with object-oriented extensions

Slide 32: Object Oriented Model
Diagram (flattened in this transcript): BUET; Departments; Institutes; CE, WRE, DCE, IWFM, URP, AIT
Attributes: Faculty, Staff, Students
Legend: "Is a" = inheritance; "Part of" = association

Slide 33: Object Oriented Database
An object oriented model uses functions to model spatial and non-spatial relationships of geographic objects and their attributes. An object is an encapsulated unit which is characterized by attributes, a set of operations and rules. An object oriented model has the following characteristics:
- Generic properties: there should be an inheritance relationship.
- Abstraction: objects, classes and super classes are to be generated by classification, generalization, association and aggregation.
- Ad hoc queries: users can order spatial operations to obtain spatial relationships of geographic objects using a special language.
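The "is a" and "part of" relationships can be sketched with ordinary Python classes. Everything here is illustrative: the class names `AcademicUnit`, `Department`, `Institute`, `University` and `Person`, and the choice of WRE as a department and IWFM as an institute, are assumptions layered on the BUET example rather than the lecture's own design.

```python
class Person:
    def __init__(self, name, role):
        self.name = name
        self.role = role  # "Faculty", "Staff" or "Student"


class AcademicUnit:
    """Generic class: attributes and operations shared by all units."""

    def __init__(self, name):
        self.name = name
        self.people = []  # encapsulated attribute

    def add(self, person):
        self.people.append(person)

    def count(self, role):
        return sum(1 for p in self.people if p.role == role)


class Department(AcademicUnit):  # a Department "is a" AcademicUnit (inheritance)
    pass


class Institute(AcademicUnit):   # an Institute "is a" AcademicUnit (inheritance)
    pass


class University:
    def __init__(self, name):
        self.name = name
        self.units = []  # units are "part of" the university (association)

    def add_unit(self, unit):
        self.units.append(unit)


buet = University("BUET")
iwfm = Institute("IWFM")
wre = Department("WRE")
buet.add_unit(iwfm)
buet.add_unit(wre)
iwfm.add(Person("Hypothetical Student", "Student"))
iwfm.add(Person("Hypothetical Lecturer", "Faculty"))
```

Both subclasses inherit `add` and `count` without redefining them, which is the generic-property idea from the slide; the university holds references to its units rather than inheriting from them.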

Slide 34: Example of Object Oriented Model

Slide 35: 5. Object-Relational Database Model
Object-relational database management systems (ORDBMS):
- Combine the ability of object technology to handle advanced relationship types with the data integrity, reliability, and recovery features of relational models
- Are the most popular and powerful of modern database system applications, e.g. Oracle, Microsoft SQL Server

Slide 36: Object-Relational Database Table

Slide 37: Overview of Relational Database

Slide 38: What is a Relational Database?
- A database is more than just a collection of information, such as student and course information, faculty and grades.
- A database is a representation of the people and things your business needs to operate, and the way those people and things relate to each other.
- A database system supports the business rules defined by the customer.

Slide 39: Logical to Physical Database Design
- The entities in the logical data model are translated into tables in the physical database design.
- The entity attributes become columns of each table in the database, each with:
  - a data type (numeric, character, date)
  - business rules for the legal values of the column (the domain of the column)

Slide 40: Data Models
- A data model is a collection of concepts for describing data.
- A schema is a description of a particular collection of data, using the given data model.
- The relational model of data is the most widely used model today.
  - Main concept: the relation, basically a table with rows and columns.
  - Every relation has a schema, which describes the columns, or fields.

Slide 41: Example: University Database
Conceptual schema:
- Students(sid: string, name: string, login: string, age: integer, gpa: real)
- Courses(cid: string, cname: string, credits: integer)
- Enrolled(sid: string, cid: string, grade: string)
Physical schema:
- Relations stored as unordered files.
- Index on first column of Students.
External schema (view):
- Course_info(cid: string, enrollment: integer)

Slide 42: Instance of Students Relation
Students(sid: string, name: string, login: string, age: integer, gpa: real)

sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@ee    18   3.2
53650  Smith  smith@math  19   3.8
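The university example can be written out as a small `sqlite3` sketch. The table and column names and the three Students rows come from the slides; the course `C101`, its name, the two enrolment rows, and the definition of `Course_info` as a count of enrolments per course are assumptions added so that the view has something to show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema definition: the conceptual schema as tables, the external schema as a view.
conn.executescript("""
CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT,
                       age INTEGER, gpa REAL);
CREATE TABLE Courses  (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT REFERENCES Students(sid),
                       cid TEXT REFERENCES Courses(cid),
                       grade TEXT);
CREATE VIEW Course_info AS
    SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
""")

# Data manipulation: load the instance of the Students relation.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@ee", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)
conn.execute("INSERT INTO Courses VALUES ('C101', 'Databases', 3)")  # hypothetical course
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("53666", "C101", "A"), ("53688", "C101", "B")],                # hypothetical enrolments
)

# A user of the external schema sees only the view, not the base tables.
course_info = conn.execute("SELECT cid, enrollment FROM Course_info").fetchall()
smiths = conn.execute(
    "SELECT sid FROM Students WHERE name = 'Smith' ORDER BY sid"
).fetchall()
```

Declaring `sid` as the primary key lets SQLite maintain an index on the first column of Students, which loosely mirrors the physical schema on Slide 41. The two students named Smith show why `name` could not serve as the key.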

Slide 43: Levels of Abstraction
Many external schemata, a single conceptual (logical) schema and a single physical schema.
- External schemata describe how users see the data.
- The conceptual schema defines the logical structure.
- The physical schema describes the files and indexes used.
- Schemas are defined using DDL; data is modified and queried using DML.
Diagram: External Schema 1, External Schema 2 and External Schema 3 sit on top of the Conceptual Schema, which sits on top of the Physical Schema.

Slide 44: Database Terminology
- Tables within a relational database hold sets of data using rows and columns.
- Rows (records) appear horizontally in a report, and contain one or more columns.
- Columns (fields) are named data elements and appear vertically in a report.
- Primary keys identify uniqueness in a row.
- Indexes are created for faster access to the data in the database.

Slide 45: Basic Database Concepts
- Table: a set of related records
- Record: a collection of data about an individual item
- Field: a single item of data common to all records
Example record: Name: Barry Harris; College: Medicine; Tel: 392-5555
Example field: Name: Barry Harris

Slide 46: An Example of a Table
Each row is a record; each column is a field.

Name     GatorLink  Phone     College
Graff    rgraff     392-3900  Pharmacy
Harris   bharris    392-5555  Medicine
Ipswich  zipswich   846-5656  PHHP

Slide 47: Different parts of a database
- Fields: different types of data (number or text)
- Records
- Queries
- Reports

Slide 48: Concepts of Relational Database
Based on two important concepts:
- Key of relation: one to one, one to many, many to many
- Primary attribute: its values cannot be duplicated

Slide 49: Primary Key
- The column or set of columns that provides the uniqueness for the row.
- A table can have only one primary key.
- Existing values in primary key columns should not be modified in place; instead, insert a row with the new value and then delete the row with the old value.
- The table of a relationship containing the primary key is called the parent table.

Slide 50: Foreign Keys
- A primary key referenced from another table is called a foreign key.
- For each foreign key value, there must be a row in a table whose primary key has the same value.
- The foreign key can be made up of one or more columns of a table, but must match the primary key it is referencing.
- A table can have any number of foreign keys.

Slide 51: Primary Keys & Foreign Keys

Name     User      Phone     College
Graff    rgraff    392-3900  Pharmacy
Harris   bharris   392-5555  Medicine
Ipswich  zipswich  846-5656  PHHP

- To ensure that each record is unique in each table, we can set one field to be a primary key field.
- A primary key is a field that will contain no duplicates and no blank values.
- Foreign keys link to data in other tables.
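The "no duplicates and no blank values" rule can be checked with a short `sqlite3` sketch. The table name `Person`, the column name `Username` (standing in for the slide's User column) and the helper `try_insert` are assumptions. The `NOT NULL` is spelled out because SQLite, unlike most other systems, accepts NULL in a non-integer primary key column unless told otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Person (
        Name     TEXT,
        Username TEXT NOT NULL PRIMARY KEY,  -- no blanks, no duplicates
        Phone    TEXT,
        College  TEXT
    )
""")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?, ?)",
    [
        ("Graff", "rgraff", "392-3900", "Pharmacy"),
        ("Harris", "bharris", "392-5555", "Medicine"),
        ("Ipswich", "zipswich", "846-5656", "PHHP"),
    ],
)


def try_insert(row):
    """Return True if the row was stored, False if a key rule rejected it."""
    try:
        conn.execute("INSERT INTO Person VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert(("Harris", "bharris", "000-0000", "Medicine"))  # key already used
blank_ok = try_insert(("Nobody", None, "000-0000", "PHHP"))               # key left blank
new_ok = try_insert(("Harris", "bharris2", "000-0000", "Medicine"))       # same name, new key
```

A repeated Name is accepted as long as the key differs, which is the reason the key is placed on the user name and not on the person's name.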

Slide 52: Relationship Types
One-to-one: the relationship is single valued in both directions.
- A manager manages one department; a department has only one manager.
One-to-many: the relationship is multi-valued in one direction; one row in the parent table is associated with many rows in the dependent table.
- One department has many employees.
Many-to-many: the relationship is multi-valued in both directions. This type of relationship can be expressed in a table with a column for each entity (a crosswalk table).
- An employee can work on more than one project, and a project can have more than one employee assigned: Employee, Project, and Employee/Project tables.
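A crosswalk table for the employee and project example might look like the following `sqlite3` sketch. The table names follow the slide; the column names and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Project  (proj_id INTEGER PRIMARY KEY, title TEXT);
-- Crosswalk table: one column per entity, one row per assignment.
CREATE TABLE EmployeeProject (
    emp_id  INTEGER REFERENCES Employee(emp_id),
    proj_id INTEGER REFERENCES Project(proj_id),
    PRIMARY KEY (emp_id, proj_id)
);
INSERT INTO Employee VALUES (1, 'A');
INSERT INTO Employee VALUES (2, 'B');
INSERT INTO Project VALUES (10, 'Flood map');
INSERT INTO Project VALUES (20, 'Rain gauge');
INSERT INTO EmployeeProject VALUES (1, 10);
INSERT INTO EmployeeProject VALUES (1, 20);
INSERT INTO EmployeeProject VALUES (2, 10);
""")

# One employee, many projects.
projects_of_a = [row[0] for row in conn.execute("""
    SELECT p.title
    FROM Project p JOIN EmployeeProject ep ON ep.proj_id = p.proj_id
    WHERE ep.emp_id = 1
    ORDER BY p.proj_id
""")]

# One project, many employees.
staff_on_flood_map = [row[0] for row in conn.execute("""
    SELECT e.name
    FROM Employee e JOIN EmployeeProject ep ON ep.emp_id = e.emp_id
    WHERE ep.proj_id = 10
    ORDER BY e.emp_id
""")]
```

The many-to-many relationship is thus stored as two one-to-many relationships that meet in the crosswalk table, and the composite primary key keeps the same assignment from being recorded twice.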

Slide 53: Data Integrity
For a table to have domain integrity:
- the value of each column of data is meaningful and acceptable in the business environment, and passes all the edits we impose on it.
For a table to have association integrity:
- the relationship between two or more columns in that table satisfies a pre-defined business association.
For a table to have referential integrity:
- referential constraints between tables must be enforced at all times by the Relational Database Management System.
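Domain and association integrity are commonly expressed as `CHECK` constraints. The `sqlite3` sketch below is an assumption-laden illustration: the `Offering` table, its columns, the 1 to 6 credit range and the rule that a course cannot end before it starts are all invented to show the mechanism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Offering (
        CourseID   TEXT PRIMARY KEY,
        Credit     INTEGER NOT NULL CHECK (Credit BETWEEN 1 AND 6),       -- domain rule
        StartMonth INTEGER NOT NULL CHECK (StartMonth BETWEEN 1 AND 12),  -- domain rule
        EndMonth   INTEGER NOT NULL CHECK (EndMonth BETWEEN 1 AND 12),    -- domain rule
        CHECK (EndMonth >= StartMonth)        -- association rule between two columns
    )
""")


def accepted(row):
    """Return True if the DBMS stored the row, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO Offering VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "valid": accepted(("001", 3, 1, 6)),
    "bad_domain": accepted(("002", 40, 1, 6)),      # 40 credits is outside the domain
    "bad_association": accepted(("003", 3, 7, 2)),  # ends before it starts
}
```

Because the rules live in the table definition, every application that writes to the table is checked in the same way.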

Slide 54: Relational Database Referential Integrity
Tables (* marks a primary key column):
- Student: *Student_ID
- Student_Course: *Student_ID (FK), *Course_Number, *Course_Ind
For referential integrity, the foreign key must match a value in the primary key of the parent table at all times.
In this example, the Student table has the primary key Student_ID. The Student_Course table has a three-column primary key, and also has a foreign key (FK), Student_ID, that references the Student table.
There must never be a Student_ID in the Student_Course table that does not already exist in the Student table.
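The Student and Student_Course example can be reproduced with `sqlite3`. Table and column names follow the slide; the column types, the sample values and the `rejected` helper are assumptions. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY);
CREATE TABLE Student_Course (
    Student_ID    INTEGER NOT NULL REFERENCES Student(Student_ID),
    Course_Number TEXT NOT NULL,
    Course_Ind    TEXT NOT NULL,
    PRIMARY KEY (Student_ID, Course_Number, Course_Ind)
);
INSERT INTO Student VALUES (1);
INSERT INTO Student_Course VALUES (1, 'WFM5201', 'A');
""")


def rejected(sql, params=()):
    """Return True if the statement was refused by an integrity constraint."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# A child row for a student that does not exist in the parent table.
orphan_rejected = rejected(
    "INSERT INTO Student_Course VALUES (?, ?, ?)", (99, "WFM5201", "A")
)
# Deleting a parent row that still has child rows would leave orphans behind.
parent_delete_rejected = rejected("DELETE FROM Student WHERE Student_ID = ?", (1,))
# A child row for an existing student is fine.
valid_rejected = rejected(
    "INSERT INTO Student_Course VALUES (?, ?, ?)", (1, "WFM6202", "A")
)
```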

Slide 55: Database Options
Consumer
- Flat files
- Microsoft Excel (limit of 65,536 rows in Excel 2003 and earlier)
- Microsoft Access
- FileMaker Pro
- MySQL (open source)
- Postgres (open source)
Enterprise RDBMS
- Oracle
- IBM DB2
- MS SQL Server
- Sybase
- Informix
- Lotus Notes
- MySQL (open source)
- Postgres (open source)

