UNIT-1 Introduction to DBMS


1 UNIT-1 Introduction to DBMS
Purpose of Database System Views of data Data Models Database Languages Database System Architecture Database users and Administrator Entity-Relationship model (E-R model ) E-R Diagrams Introduction to relational databases

2 Database Management System (DBMS)
Collection of interrelated data. Set of programs to access the data. A DBMS contains information about a particular enterprise. A DBMS provides an environment that is both convenient and efficient to use.

3 Purpose of Database Systems
Database management systems were developed to handle the following difficulties of typical file-processing systems supported by conventional operating systems: data redundancy and inconsistency; difficulty in accessing data; data isolation (multiple files and formats); integrity problems; atomicity of updates; concurrent access by multiple users; security problems.

4 Purpose of Database Systems
In the early days, database applications were built directly on top of file systems. Drawbacks of using file systems to store data: Data redundancy and inconsistency: multiple file formats, duplication of information in different files. Difficulty in accessing data: need to write a new program to carry out each new task. Data isolation: multiple files and formats. Integrity problems: integrity constraints (e.g. account balance > 0) become “buried” in program code rather than being stated explicitly; hard to add new constraints or change existing ones.
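As a minimal sketch of stating an integrity constraint explicitly rather than burying it in program code, the example below declares the slide's "balance > 0" rule once in the schema. It uses Python's built-in sqlite3 module; the table name, column names, and account values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The rule "balance > 0" is declared once, in the schema,
# instead of being re-coded in every program that writes accounts.
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance > 0))"
)


def try_insert(account_number, balance):
    """Return True if the DBMS accepts the row, False if it rejects it."""
    try:
        conn.execute(
            "INSERT INTO account VALUES (?, ?)", (account_number, balance)
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("A-102", 250.0)   # satisfies the constraint
rejected = try_insert("A-103", -40.0)   # violates balance > 0
row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```

Changing the rule later means altering one schema definition, not hunting through every application that touches the table.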

5 DATABASE APPLICATIONS
Banking: all transactions. Airlines: reservations, schedules. Universities: registration, grades. Sales: customers, products, purchases. Online retailers: order tracking, customized recommendations. Manufacturing: production, inventory, orders, supply chain. Human resources: employee records, salaries, tax deductions.

6 Purpose of Database Systems (Cont.)
Drawbacks of using file systems (cont.): Atomicity of updates: failures may leave the database in an inconsistent state with partial updates carried out. Example: a transfer of funds from one account to another should either complete or not happen at all. Concurrent access by multiple users: concurrent access is needed for performance, but uncontrolled concurrent accesses can lead to inconsistencies. Example: two people reading a balance and updating it at the same time. Security problems: it is hard to provide user access to some, but not all, data. Database systems offer solutions to all the above problems.
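The funds-transfer example can be sketched with Python's built-in sqlite3 module. The account names, amounts, and the non-negative-balance rule are assumptions for illustration. The credit is deliberately applied before the debit so that a failing debit would otherwise leave a partial update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id TEXT PRIMARY KEY,"
    " balance REAL CHECK (balance >= 0))"   # assumed rule: no overdrafts
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100.0), ("B", 50.0)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Current balance of every account, keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM account"))


ok = transfer("A", "B", 30.0)        # succeeds
failed = transfer("A", "B", 500.0)   # debit would overdraw A, so the credit is undone
final = balances()
```

After the failed transfer the credit to B has been rolled back, so the balances are the same as after the first, successful transfer.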

7 Levels of Abstraction
Physical level: describes how a record (e.g., customer) is stored. Logical level: describes data stored in database, and the relationships among the data.
type customer = record
    customer_id : string;
    customer_name : string;
    customer_street : string;
    customer_city : string;
end;
View level: application programs hide details of data types. Views can also hide information (such as an employee’s salary) for security purposes.
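A view that hides a salary column might look like the sketch below, written with Python's built-in sqlite3 module. The employee table, its columns, and the sample row are hypothetical. SQLite has no user accounts, so this shows only the hiding itself; in a multi-user DBMS the view would be combined with access privileges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full record, including the sensitive column.
conn.execute(
    "CREATE TABLE employee (emp_id TEXT PRIMARY KEY, name TEXT, salary REAL)"
)
conn.execute("INSERT INTO employee VALUES ('E1', 'Jones', 52000.0)")

# View level: applications that query the view never see salary.
conn.execute("CREATE VIEW employee_public AS SELECT emp_id, name FROM employee")

cur = conn.execute("SELECT * FROM employee_public")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
```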

8 Views of Data
An architecture for a database system. [Figure: three levels of abstraction, top to bottom: view level (View 1 to View n), logical level, physical level.]

9 Instances and Schemas
Similar to types and variables in programming languages. Schema: the logical structure of the database. Example: the database consists of information about a set of customers and accounts and the relationship between them. Analogous to type information of a variable in a program. Physical schema: database design at the physical level. Logical schema: database design at the logical level. Instance: the actual content of the database at a particular point in time. Analogous to the value of a variable. Physical data independence: the ability to modify the physical schema without changing the logical schema. Applications depend on the logical schema. In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.

10 Data Independence
Ability to modify a schema definition in one level without affecting a schema definition in the other levels. The interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others. Two levels of data independence: physical data independence and logical data independence.

11 Data Models
A collection of tools for describing data, data relationships, data semantics, and data constraints. Relational model. Entity-Relationship data model (mainly for database design). Object-based data models (object-oriented and object-relational). Semistructured data model (XML). Other older models: network model, hierarchical model.

12 Database Languages: Data Definition Language
Specification notation for defining the database schema. The DDL compiler generates a set of tables stored in a data dictionary. The data dictionary contains metadata (data about data). Data storage and definition language: a special type of DDL in which the storage structure and access methods used by the database system are specified.
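To make the data dictionary concrete, the sketch below runs one DDL statement and then reads back the metadata the system recorded about it. It assumes SQLite (through Python's built-in sqlite3 module), whose catalog table is named sqlite_master. Other systems expose their dictionaries under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema for the customer record.
conn.execute(
    "CREATE TABLE customer ("
    " customer_id TEXT PRIMARY KEY,"
    " customer_name TEXT,"
    " customer_street TEXT,"
    " customer_city TEXT)"
)

# The data dictionary is itself queryable: it stores data about data.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info lists one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
```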

13 Data Definition Language-DDL
Data Definition Language (DDL) statements are used to define the database structure or schema. Some examples: CREATE - create objects in the database. ALTER - alter the structure of the database. DROP - delete objects from the database. TRUNCATE - remove all records from a table and release the space allocated for them. COMMENT - add comments to the data dictionary. RENAME - rename an object.
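A short sketch of CREATE, ALTER, rename, and DROP, run through Python's built-in sqlite3 module. The table and column names are made up for illustration. SQLite is only one dialect: it spells renaming as ALTER TABLE ... RENAME TO and has no TRUNCATE or COMMENT statement, so those two are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names():
    """Names of all user tables currently defined, in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]


conn.execute("CREATE TABLE student (sid TEXT PRIMARY KEY, name TEXT)")  # CREATE
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")              # ALTER
conn.execute("ALTER TABLE student RENAME TO students")                  # rename
after_rename = table_names()
student_columns = [r[1] for r in conn.execute("PRAGMA table_info(students)")]

conn.execute("DROP TABLE students")                                     # DROP
after_drop = table_names()
```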

14 Data Manipulation Language (DML)
Language for accessing and manipulating the data organized by the appropriate data model. Two classes of languages: Procedural - user specifies what data is required and how to get those data. Nonprocedural - user specifies what data is required without specifying how to get those data.
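The difference between the two classes can be sketched by answering the same question both ways. The example uses Python's built-in sqlite3 module; the account numbers, balances, and the 450 threshold are assumed values. The loop spells out how to find the rows, while the SQL query states only what is wanted and leaves the how to the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500.0), ("A-215", 700.0), ("A-102", 400.0)],
)

# Procedural style: the program says how to get the data
# (scan every row, test it, collect the matches, sort them).
procedural = []
for number, balance in conn.execute("SELECT account_number, balance FROM account"):
    if balance > 450:
        procedural.append(number)
procedural.sort()

# Nonprocedural style: the query says only what data is required.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT account_number FROM account"
        " WHERE balance > 450 ORDER BY account_number"
    )
]
```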

15 Data Manipulation Language (DML)
Data Manipulation Language (DML) statements are used for managing data within schema objects. Some examples: SELECT - retrieve data from a database. INSERT - insert data into a table. UPDATE - update existing data within a table. DELETE - delete records from a table (all of them if no condition is given); the space for the records remains allocated. MERGE - UPSERT operation (insert or update). CALL - call a PL/SQL or Java subprogram. EXPLAIN PLAN - explain the access path to data. LOCK TABLE - control concurrency.
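The four core DML statements can be tried in a few lines with Python's built-in sqlite3 module. The rows are illustrative values, not data from the slides. MERGE, CALL, EXPLAIN PLAN, and LOCK TABLE are dialect-specific and are left out of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (sid TEXT PRIMARY KEY, name TEXT, age INTEGER)")

# INSERT: add rows.
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("S1", "Jones", 18), ("S2", "Smith", 18)],
)

# UPDATE: change existing rows that match a condition.
conn.execute("UPDATE students SET age = 19 WHERE sid = 'S1'")

# DELETE: remove rows that match a condition (all rows if there is none).
conn.execute("DELETE FROM students WHERE sid = 'S2'")

# SELECT: retrieve what is left.
rows = conn.execute("SELECT sid, name, age FROM students").fetchall()
```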

16 Data Control Language (DCL)
Data Control Language (DCL) statements control access to the database. Some examples: GRANT - give users access privileges to the database. REVOKE - withdraw access privileges given with the GRANT command.

17 Transaction Management
A transaction is a collection of operations that performs a single logical function in a database application. Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g. power failures and operating system crashes) and transaction failures. Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.

18 Transaction Control (TCL)
Transaction Control (TCL) statements are used to manage the changes made by DML statements. They allow statements to be grouped together into logical transactions. Some examples: COMMIT - save work done. SAVEPOINT - identify a point in a transaction to which you can later roll back. ROLLBACK - restore the database to its state as of the last COMMIT. SET TRANSACTION - change transaction options such as isolation level and which rollback segment to use.
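COMMIT, SAVEPOINT, and ROLLBACK can be sketched with Python's built-in sqlite3 module. The connection is opened with isolation_level=None so that the module does not start transactions on its own and every BEGIN, COMMIT, and ROLLBACK below is explicit. The log table and savepoint name are assumptions for illustration. SET TRANSACTION is not shown because SQLite does not provide it.

```python
import sqlite3

# isolation_level=None: no implicit transactions, we issue BEGIN ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (entry TEXT)")


def entries():
    """Current contents of the log table."""
    return [r[0] for r in conn.execute("SELECT entry FROM log")]


conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('first')")
conn.execute("SAVEPOINT after_first")                # mark a point to return to
conn.execute("INSERT INTO log VALUES ('second')")
conn.execute("ROLLBACK TO SAVEPOINT after_first")    # undo only 'second'
conn.execute("COMMIT")                               # save work done
committed = entries()

conn.execute("BEGIN")
conn.execute("DELETE FROM log")
conn.execute("ROLLBACK")                             # back to the last COMMIT
after_rollback = entries()
```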

19 Storage Management
A storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. The storage manager is responsible for the following tasks: interaction with the file manager; efficient storing, retrieving, and updating of data.

20 Codd's 12 Rules for RDBMS
The Information rule; the Guaranteed Access rule; the Systematic Treatment of Null Values rule; the Dynamic Online Catalog Based on the Relational Model rule; the Comprehensive Data Sublanguage rule; the View Updating rule; the High-level Insert, Update, and Delete rule; the Physical Data Independence rule; the Logical Data Independence rule; the Integrity Independence rule; the Distribution Independence rule; the Nonsubversion rule.

21 Database System Architecture

22 Database Administrator
Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise’s information resources and needs. Database administrator’s duties include: schema definition; storage structure and access method definition; schema and physical organization modification; granting user authority to access the database; specifying integrity constraints; acting as liaison with users; monitoring performance and responding to changes in requirements.

23 Database Users
Users are differentiated by the way they expect to interact with the system. Application programmers: interact with system through DML calls. Specialized users: write specialized database applications that do not fit into the traditional data processing framework. Sophisticated users: form requests in a database query language. Naive users: invoke one of the permanent application programs that have been written previously.

24 The Entity-Relationship Model
The slides for this text are organized into chapters. This lecture covers Chapter 2, on the Entity-Relationship approach to database design. The important issue of how to map from ER diagrams to relational tables is deferred until the relational model and the integrity constraints it supports have been introduced. ER to relational mapping, together with the related SQL commands, is discussed in Chapter 3.

25 Database Design Process
Requirement collection and analysis: DB requirements and functional requirements. Conceptual DB design using a high-level model: easier to understand and communicate with others. Logical DB design (data model mapping): the conceptual schema is transformed from a high-level data model into an implementation data model. Physical DB design: internal data structures and file organizations for the DB are specified.

26 Overview of Database Design
Conceptual design: (ER Model is used at this stage.) What are the entities and relationships in the enterprise? What information about these entities and relationships should we store in the database? What are the integrity constraints or business rules that hold? A database “schema” in the ER Model can be represented pictorially (ER diagrams). An ER diagram can be mapped into a relational schema.

27 ER Model Basics
Entity: real-world object distinguishable from other objects. An entity is described (in DB) using a set of attributes. Entity set: a collection of similar entities, e.g., all employees. All entities in an entity set have the same set of attributes. (Until we consider ISA hierarchies, anyway!) Each entity set has a key. Each attribute has a domain. [Figure: entity set Employees with attributes ssn, name, lot.]
The slides for this text are organized into several modules. Each lecture contains about enough material for a 1.25 hour class period. (The time estimate is very approximate; it will vary with the instructor, and lectures also differ in length, so use this as a rough guideline.) This covers Lectures 1 and 2 (of 6) in Module (5). Module (1): Introduction (DBMS, Relational Model). Module (2): Storage and File Organizations (Disks, Buffering, Indexes). Module (3): Database Concepts (Relational Queries, DDL/ICs, Views and Security). Module (4): Relational Implementation (Query Evaluation, Optimization). Module (5): Database Design (ER Model, Normalization, Physical Design, Tuning). Module (6): Transaction Processing (Concurrency Control, Recovery). Module (7): Advanced Topics.

28 ER Model Basics: Key and Key Attributes
Key: a unique value for an entity. Key attributes: a group of one or more attributes that uniquely identify an entity in the entity set. Super key, candidate key, and primary key: Super key: a set of attributes that identifies an entity uniquely in the entity set. Candidate key: a minimal super key; there can be many candidate keys. Primary key: a candidate key chosen by the designer, denoted by underlining in ER attributes. [Figure: entity set Employees with attributes ssn, name, lot.]
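The super key and candidate key definitions can be checked mechanically against a sample instance, as in the sketch below. The employee rows are invented for illustration. A key is a property of the schema, that is, of every legal instance, so a check like this can only rule candidates out. It cannot prove that an attribute set is a key.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, all_attrs):
    """Minimal superkeys of this particular instance, smallest first."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            if any(set(key) <= set(attrs) for key in found):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found


# Hypothetical instance of the Employees entity set.
employees = [
    {"ssn": "111", "name": "Ann", "lot": 4},
    {"ssn": "222", "name": "Bob", "lot": 4},
    {"ssn": "333", "name": "Ann", "lot": 7},
]

keys = candidate_keys(employees, ("ssn", "name", "lot"))
```

In this instance (name, lot) happens to be unique as well as ssn, which is exactly why the designer, not the data, chooses the primary key.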

29 ER Model Basics (Contd.)
Relationship: association among two or more entities, e.g., Jack works in Pharmacy department. Relationship set: collection of similar relationships. An n-ary relationship set R relates n entity sets E1 ... En; each relationship in R involves entities e1 in E1, ..., en in En. The same entity set could participate in different relationship sets, or in different “roles” in the same set. [Figure: Employees (ssn, name, lot) related to Departments (did, dname, budget) by Works_In (since); Employees related to itself by Reports_To, in the roles supervisor and subordinate.]

30 Key Constraints
Consider Works_In: an employee can work in many departments; a dept can have many employees. In contrast, each dept has at most one manager, according to the key constraint on Manages. Relationship types: 1-to-1, 1-to-many, many-to-1, many-to-many. [Figure: Employees (ssn, name, lot) related to Departments (did, dname, budget) by Manages (since).]

31 Example ER
An ER diagram represents several assertions about the real world. What are they? When attributes are added, more assertions are made. How can we ensure they are correct? A DB is judged correct if it captures the ER diagram correctly. [Figure: ER diagram with entity sets Department, Courses, Professor, Students and relationships major, offers, faculty, teaches, advisor, enrollment.]

32 Participation Constraints
Does every department have a manager? If so, this is a participation constraint: the participation of Departments in Manages is said to be total (vs. partial). Every Departments entity must appear in an instance of the Manages relationship. [Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) related by Manages (since) and Works_In (since).]

33 Weak Entities
A weak entity can be identified uniquely only by considering the primary key of another (owner) entity. Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities). Weak entity set must have total participation in this identifying relationship set. [Figure: Employees (ssn, name, lot) related to the weak entity set Dependents (pname, age) by Policy (cost).]
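One common way to realize a weak entity set in tables is sketched below with Python's built-in sqlite3 module: the weak entity's key combines the owner's primary key with its own partial key, and deleting the owner removes its dependents. The sample rows are made up, and folding the Policy relationship's cost attribute into the dependents table is one possible mapping, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("CREATE TABLE employees (ssn TEXT PRIMARY KEY, name TEXT, lot INTEGER)")
conn.execute(
    "CREATE TABLE dependents ("
    " pname TEXT NOT NULL,"
    " age INTEGER,"
    " cost REAL,"
    " ssn TEXT NOT NULL,"                       # total participation: owner required
    " PRIMARY KEY (ssn, pname),"                # owner key + partial key
    " FOREIGN KEY (ssn) REFERENCES employees (ssn) ON DELETE CASCADE)"
)

conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("111", "Ann", 4), ("222", "Bob", 7)],
)
# Two dependents named Joe are distinguishable only through their owners.
conn.executemany(
    "INSERT INTO dependents VALUES (?, ?, ?, ?)",
    [("Joe", 8, 120.0, "111"), ("Joe", 5, 90.0, "222")],
)

conn.execute("DELETE FROM employees WHERE ssn = '111'")  # owner goes, dependent follows
remaining = conn.execute("SELECT ssn, pname FROM dependents").fetchall()
```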

34 ISA (“is a”) Hierarchies
As in C++, or other PLs, attributes are inherited. If we declare A ISA B, every A entity is also considered to be a B entity. Overlap constraints: can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Default: disallowed; allowed when stated as “A overlaps B”.) Covering constraints: does every Employees entity also have to be an Hourly_Emps or a Contract_Emps entity? (Default: no; required when stated as “A AND B COVER C”.) Reasons for using ISA: to add descriptive attributes specific to a subclass; to identify entities that participate in a relationship. [Figure: Employees (ssn, name, lot) with ISA subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid).]

35 Aggregation
Used when we have to model a relationship involving (entity sets and) a relationship set. Aggregation allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships. Aggregation vs. ternary relationship: Monitors is a distinct relationship, with a descriptive attribute. Also, can say that each sponsorship is monitored by at most one employee. [Figure: Employees (ssn, name, lot) related by Monitors (until) to the aggregation of Sponsors (since) between Projects (pid, started_on, pbudget) and Departments (did, dname, budget).]

36 Conceptual Design Using the ER Model
Design choices: Should a concept be modeled as an entity or an attribute? Should a concept be modeled as an entity or a relationship? Identifying relationships: binary or ternary? Aggregation? Constraints in the ER Model: a lot of data semantics can (and should) be captured, but some constraints cannot be captured in ER diagrams.

37 Entity vs. Attribute
Should address be an attribute of Employees or an entity (connected to Employees by a relationship)? Depends upon the use we want to make of address information, and the semantics of the data: If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued). If the structure (city, street, etc.) is important, e.g., we want to retrieve employees in a given city, address must be modeled as an entity (since attribute values are atomic).

38 Entity vs. Attribute (Contd.)
Works_In4 does not allow an employee to work in a department for two or more periods. Similar to the problem of wanting to record several addresses for an employee: we want to record several values of the descriptive attributes for each instance of this relationship. Accomplished by introducing a new entity set, Duration. [Figure: two diagrams. First: Employees (ssn, name, lot) related to Departments (did, dname, budget) by Works_In4 with attributes from and to. Second: the same, with from and to moved to a new entity set Duration that also participates in Works_In4.]

39 Entity vs. Relationship
First ER diagram OK if a manager gets a separate discretionary budget for each dept. What if a manager gets a discretionary budget that covers all managed depts? Redundancy: dbudget stored for each dept managed by manager. Misleading: suggests dbudget is associated with the department-mgr combination. [Figure: two diagrams. First: Employees (ssn, name, lot) related to Departments (did, dname, budget) by Manages2 (since, dbudget). Second: Manages2 keeps only since, and dbudget moves to Managers, an ISA subclass of Employees; this fixes the problem.]

40 Binary vs. Ternary Relationships
If each policy is owned by just 1 employee, and each dependent is tied to the covering policy, first diagram is inaccurate. What are the additional constraints in the 2nd diagram? [Figure: two diagrams. Bad design: ternary relationship Covers among Employees (ssn, name, lot), Policies (policyid, cost), and Dependents (pname, age). Better design: Purchaser between Employees and Policies, Beneficiary between Dependents and Policies.]

41 Binary vs. Ternary Relationships (Contd.)
Previous example illustrated a case when two binary relationships were better than one ternary relationship. An example in the other direction: a ternary relation Contracts relates entity sets Parts, Departments and Suppliers, and has descriptive attribute qty. No combination of binary relationships is an adequate substitute: S “can-supply” P, D “needs” P, and D “deals-with” S does not imply that D has agreed to buy P from S. How do we record qty?
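A small counterexample makes the argument concrete. In the sketch below, two different sets of contracts produce exactly the same can-supply, needs, and deals-with facts, so those binary relationships cannot tell us which contracts exist. The supplier, department, and part identifiers are hypothetical.

```python
def binary_projections(contracts):
    """Derive the three binary relationships from (supplier, dept, part) triples."""
    can_supply = {(s, p) for s, d, p in contracts}
    needs = {(d, p) for s, d, p in contracts}
    deals_with = {(d, s) for s, d, p in contracts}
    return can_supply, needs, deals_with


# Hypothetical contracts: (supplier, department, part).
contracts_a = {
    ("S1", "D1", "P2"),
    ("S1", "D2", "P1"),
    ("S2", "D1", "P1"),
}

# Same as above, plus one more agreement: D1 buys P1 from S1.
contracts_b = contracts_a | {("S1", "D1", "P1")}

same_binary_facts = binary_projections(contracts_a) == binary_projections(contracts_b)
different_contracts = contracts_a != contracts_b
```

In contracts_a, S1 can supply P1, D1 needs P1, and D1 deals with S1, yet D1 has not agreed to buy P1 from S1. Only the ternary relationship records that, and it is also the only place where qty for a specific contract can be attached.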

42 Summary of Conceptual Design
Conceptual design follows requirements analysis and yields a high-level description of data to be stored. The ER model is popular for conceptual design: its constructs are expressive and close to the way people think about their applications. Basic constructs: entities, relationships, and attributes (of entities and relationships). Some additional constructs: weak entities, ISA hierarchies, and aggregation. Note: there are many variations on the ER model.

43 Summary of ER (Contd.)
Several kinds of integrity constraints can be expressed in the ER model: key constraints, participation constraints, and overlap/covering constraints for ISA hierarchies. Some foreign key constraints are also implicit in the definition of a relationship set. Some constraints (notably, functional dependencies) cannot be expressed in the ER model. Constraints play an important role in determining the best database design for an enterprise.

44 Summary of ER (Contd.)
ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise. Common choices include: entity vs. attribute, entity vs. relationship, binary or n-ary relationship, whether or not to use ISA hierarchies, and whether or not to use aggregation. Ensuring good database design: the resulting relational schema should be analyzed and refined further. FD information and normalization techniques are especially useful.

45 Entity-Relationship Model
Example of entity-relationship model. [Figure: entity sets customer (customer-name, social-security, customer-street, customer-city) and account (account-number, balance) related by depositor.]

46 Relational Model Example of tabular data in the relational model:

47 Entity Relationship(E-R) Diagram
A graphical representation of the entities and the relationships between them. Entity relationship diagrams are a useful medium to achieve a common understanding of data among users and application developers. In data modeling, an entity-relationship model (ERM) is a representation of structured data; entity-relationship modeling is the process of generating these models. The end-product of the modeling process is an entity-relationship diagram (ERD), a type of Conceptual Data Model or Semantic Data Model.

48 Symbols for E-R Diagram

49 Example for E-R diagram

50 E-R Diagram for Banking System

51 Introduction to Relational Databases
Database: collection of persistent data. Database Management System (DBMS): software system that supports creation, population, and querying of a database. Relational Database Management System (RDBMS): consists of a number of tables and a single schema (definition of tables and attributes). Example: Students (sid, name, login, age, gpa). Students identifies the table; sid, name, login, age, gpa identify attributes; sid is the primary key.

52 Relational Database A relational database is a collection of data items organized as a set of formally described tables from which data can be accessed easily. A relational database is created using the relational model. The software used in a relational database is called a relational database management system (RDBMS). A relational database is the predominant choice in storing data, over other models like the hierarchical database model or the network model.

53 Relational Database The relational database was first defined in June 1970 by Edgar Codd, of IBM's San Jose Research Laboratory. Terminology

54 Example Table
Students (sid: string, name: string, login: string, age: integer, cgpa: real)
sid    name     login  age  cgpa
50000  Aravind         19   8.3
53666  Jones           18   8.4
53688  Smith                8.2
53650                       8.8
53831  Miller          11   6.8
53832  Scott           12   7.0

55 Keys
Primary key: a minimal subset of fields that is a unique identifier for a tuple. sid is the primary key for Students; cid is the primary key for Courses. Foreign key: a field in one table that refers to the primary key of another table, connecting the two. Courses (cid, instructor, quarter, dept). Students (sid, name, login, age, cgpa).

56 Many to Many Relationships
In general, a many-to-many relationship needs a new table: Enrolled (cid, grade, studid). studid is a foreign key that references sid in the Student table.
Student
sid    name     login
50000  Dave
53666  Jones
53688  Smith
53650
53831  Madayan
53832  Guldu
Enrolled
cid          grade  studid
Carnatic101  C      53831
Reggae203    B      53832
Topology112  A      53650
History 105         53666
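The Enrolled table can be sketched with Python's built-in sqlite3 module, using two of the rows from the slide. The composite primary key, the omitted login column, and the non-existent student id 99999 are assumptions for illustration. The foreign key makes the DBMS reject an enrollment for a student who does not exist, and a join follows the key back to the student's name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("CREATE TABLE students (sid TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE enrolled ("
    " cid TEXT,"
    " grade TEXT,"
    " studid TEXT,"
    " PRIMARY KEY (cid, studid),"                       # assumed key
    " FOREIGN KEY (studid) REFERENCES students (sid))"
)

conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("53831", "Madayan"), ("53832", "Guldu")],
)
conn.executemany(
    "INSERT INTO enrolled VALUES (?, ?, ?)",
    [("Carnatic101", "C", "53831"), ("Reggae203", "B", "53832")],
)

# An enrollment that points at no student violates the foreign key.
try:
    conn.execute("INSERT INTO enrolled VALUES ('Topology112', 'A', '99999')")
    dangling_allowed = True
except sqlite3.IntegrityError:
    dangling_allowed = False

# Follow the foreign key to list each student's courses.
report = conn.execute(
    "SELECT s.name, e.cid, e.grade"
    " FROM enrolled e JOIN students s ON s.sid = e.studid"
    " ORDER BY s.name"
).fetchall()
```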

