Chapter 2 DB Environment

1 Chapter 2 DB Environment
Ahmed M. Zeki ITIS 216 ITBIS 385

2 Objectives
Purpose of three-level database architecture. Contents of external, conceptual, and internal levels. Purpose of external/conceptual and conceptual/internal mappings. Meaning of logical and physical data independence. Distinction between DDL and DML. A classification of data models. Purpose/importance of conceptual modeling. Typical functions and services a DBMS should provide. Function and importance of system catalog. Software components of a DBMS. Meaning of client-server architecture and advantages of this type of architecture for a DBMS. Function and uses of Transaction Processing Monitors.

3 Three-Level Architecture
An early proposal for a standard terminology and general architecture for DBSs was produced in 1971. It recognized a 2-level approach, with a system view called the schema and user views called subschemas. The American National Standards Institute (ANSI) Standards Planning and Requirements Committee (SPARC) produced a similar architecture but recognized 3 levels, together with a system catalog.

4 Three-Level Architecture
3 distinct levels at which data items can be described. External level: the way the users perceive the data. Conceptual level: provides both the mapping and the desired independence between the external and the internal levels. Internal level: the way the DBMS and the OS perceive the data; this is where the data is actually stored, using the data structures and file organizations. Figure: ANSI-SPARC 3-level architecture.

5 Objectives of the 3-level Architecture
The objective is to separate each user’s view of the DB from the way the DB is physically represented. Why? Each user should be able to access the same data, but have a different customized view. Each user should be able to change the way they view the data, and this should not affect other users. Users should not have to deal directly with physical DB storage details (e.g. indexing, hashing). The DBA should be able to change the DB storage structure without affecting the users’ views. The DBA should be able to change the conceptual structure of the DB without affecting all users.

6 External Level
The users’ view of the DB. It describes the part of the DB that is relevant to a particular user. Consists of a number of different external views of the DB. Each user has a view of the real world. It includes those entities, attributes, and relationships in the real world that the user is interested in; others may exist, but the user will be unaware of them.

7 External Level
Different views may have different representations of the same data, e.g. a date shown as (day, month, year) or as (year, month, day). Some views might include derived or calculated data, which is not actually stored in the DB but created when needed, e.g. age is calculated from the DOB. Views may also include data combined from different entities.
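The idea of two views over one stored record can be sketched in a few lines of Python. This is only an illustration: the record contents, the staff number SG5, and the function names are hypothetical, not taken from the slides.

```python
from datetime import date

# One stored record; the staff number and date are made-up sample values.
stored = {"staffNo": "SG5", "dob": date(1990, 6, 15)}

def age_on(dob, today):
    """Derive age in whole years; the age itself is never stored."""
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)

def view_dmy(record, today):
    """External view 1: date as day/month/year, plus the derived age."""
    return {
        "staffNo": record["staffNo"],
        "dob": record["dob"].strftime("%d/%m/%Y"),
        "age": age_on(record["dob"], today),
    }

def view_ymd(record):
    """External view 2: the same date as year/month/day, no age."""
    return {
        "staffNo": record["staffNo"],
        "dob": record["dob"].strftime("%Y/%m/%d"),
    }
```

Both functions read the same stored DOB; only the presentation differs, and the age is recomputed each time it is requested.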

8 Conceptual Level
The community view of the DB. It describes what data is stored in the DB and the relationships among the data. Contains the logical structure of the entire DB as seen by the DBA. It is a complete view of the data requirements of the organization that is independent of any storage considerations.

9 Conceptual Level
Represents: all entities, their attributes, and their relationships; the constraints on the data; semantic information about the data; security and integrity information. Supports each external view, such that any data available to a user must be contained in, or derivable from, the conceptual level.

10 Conceptual Level
Must not contain any storage-dependent details. Example: the description of an entity should contain only the data types of attributes and their lengths, but not any storage considerations such as the number of bytes occupied.

11 Internal Level
Physical representation of the DB on the computer. Describes how the data is stored in the DB. Covers the physical implementation of the DB to achieve optimal runtime performance and storage space utilization. It covers the data structures and file organizations used to store data on storage devices.

12 Internal Level
It interfaces with the OS access methods (i.e. file management techniques for storing and retrieving data records) to place the data on the storage devices, build the indexes, and retrieve the data. It is concerned with: storage space allocation for data and indexes; record descriptions for storage (with stored sizes for data items); record placement; data compression and data encryption techniques.

13 The Physical Level
Managed by the OS under the direction of the DBMS. But the functions of the OS and DBMS may vary from system to system. Some DBMSs take advantage of many of the OS access methods, while others use only the most basic ones and create their own file organizations.

14 The Physical Level
The physical level consists of items only the OS knows, such as exactly how the sequencing is implemented and whether the fields of internal records are stored as contiguous bytes on the disk.

15 Schema, Mappings, & Instances
The overall description of the DB is called the DB schema.

16 Schema, Mappings, & Instances
Types of schema in the DB, defined according to the 3-level architecture. External schemas (or subschemas): correspond to different views of the data. Conceptual schema: describes all the entities, attributes, and relationships together with integrity constraints. Internal schema: complete description of the internal model, containing the definitions of stored records, the methods of representation, the data fields, and the indexes and storage structures used. There is only one conceptual schema and one internal schema per DB.

17 Schema, Mappings, & Instances
The DBMS is responsible for mapping between these 3 types of schema. The DBMS checks the schemas for consistency, i.e. that each external schema is derivable from the conceptual schema, and it must use the information in the conceptual schema to map between each external schema and the internal schema.

18 Schema, Mappings, & Instances
The conceptual schema is related to the internal schema through a conceptual/internal mapping. This enables the DBMS to find the actual record or combination of records in physical storage that constitute a logical record in the conceptual schema, together with any constraints to be enforced on the operations for that logical record. The DBMS maintains the conceptual/internal mapping.

19 Schema, Mappings, & Instances
The DBMS allows any differences in entity names, attribute names, attribute order, data types, and so on to be resolved. Each external schema is related to the conceptual schema by the external/conceptual mapping. This enables the DBMS to map names in the user’s view on to the relevant part of the conceptual schema.

20 Schema, Mappings, & Instances
The 2 external views are merged into one conceptual view. The age field has been changed into a DOB. The DBMS maintains the external/conceptual mapping.

21 Schema, Mappings, & Instances
Example: the DBMS maps the sNo field of the 1st external view to the staffNo field of the conceptual record. The conceptual level is then mapped to the internal level, which contains a physical description of the structure for the conceptual record. At this level, we see a definition of the structure in a high-level language. The structure contains a pointer (next) which allows the list of staff records to be physically linked together to form a chain. The order of fields at the internal level is different from that at the conceptual level.
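A minimal sketch of the two mappings, written in Python. Only sNo, staffNo, and the next pointer come from the slide; the other field names (fName, lName), the sample records, and the function name read_external are assumptions made for illustration.

```python
# External/conceptual mapping: name used in the user's view -> conceptual name.
EXTERNAL_TO_CONCEPTUAL = {"sNo": "staffNo", "fName": "fName", "lName": "lName"}

# Conceptual/internal mapping: conceptual name -> slot in the stored record.
# The stored field order (staffNo, lName, fName, next) deliberately differs
# from the conceptual order (staffNo, fName, lName).
CONCEPTUAL_TO_INTERNAL = {"staffNo": 0, "lName": 1, "fName": 2}
NEXT_SLOT = 3  # index of the "next" pointer that chains the stored records

# Hypothetical stored records; "next" holds the position of the next record.
internal_records = [
    ("SL21", "White", "John", 1),
    ("SG37", "Beech", "Ann", None),
]

def read_external(field, position):
    """Resolve an external field name down to a stored value."""
    conceptual_name = EXTERNAL_TO_CONCEPTUAL[field]
    slot = CONCEPTUAL_TO_INTERNAL[conceptual_name]
    return internal_records[position][slot]

def chain(start=0):
    """Follow the next pointers and return staff numbers in chain order."""
    result, position = [], start
    while position is not None:
        record = internal_records[position]
        result.append(record[CONCEPTUAL_TO_INTERNAL["staffNo"]])
        position = record[NEXT_SLOT]
    return result
```

A user asking for sNo never sees the slot numbers or the pointer; reordering the stored fields would only require changing CONCEPTUAL_TO_INTERNAL.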

22 Schema, Mappings, & Instances
It is important to distinguish between the description of the DB and the DB itself. The description of the DB is the DB schema (or intension). It is specified during the DB design process and is not expected to change frequently. The actual data in the DB may change frequently. Ex: it changes every time we insert details of a new member of staff. The data in the DB at any particular point in time is called a DB instance (or extension). Therefore, many DB instances can correspond to the same DB schema.

23 Data Independence
A major objective for the 3-level architecture is to provide data independence, which means that upper levels are unaffected by changes to lower levels. Kinds of data independence: logical data independence and physical data independence.

24 Logical Data Independence
Refers to the immunity of the external schemas to changes in the conceptual schema. Ex: the addition or removal of entities, attributes, or relationships should be possible without having to change existing external schemas or having to rewrite application programs. Users for whom the changes have been made need to be aware of them, but other users should not be.
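Logical data independence can be sketched with a view in SQLite, using Python's standard sqlite3 module. The table, view, and column names here are hypothetical; the point is only that a view naming its columns explicitly keeps returning the same rows after an attribute is added to the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staffNo TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO staff VALUES ('S1', 'Ann')")

# External schema: a view that exposes only the columns this user needs.
conn.execute("CREATE VIEW staff_directory AS SELECT staffNo, name FROM staff")
before = conn.execute("SELECT * FROM staff_directory").fetchall()

# Conceptual schema change: a new attribute is added to the entity.
conn.execute("ALTER TABLE staff ADD COLUMN dob TEXT")
after = conn.execute("SELECT * FROM staff_directory").fetchall()
```

Programs that read staff_directory need no change; only users who want the new dob attribute have to know about it.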

25 Physical Data Independence
Refers to the immunity of the conceptual schema to changes in the internal schema. Ex: using different file organizations or storage structures, using different storage devices, or modifying indexes or hashing algorithms should be possible without having to change the conceptual or external schemas.
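A small sketch of physical data independence, again with Python's sqlite3 module and hypothetical table and index names: adding an index changes how the data can be accessed, but the query and its result stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staffNo TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("S1", 30000), ("S2", 12000), ("S3", 18000)],
)

QUERY = "SELECT staffNo FROM staff WHERE salary > 15000 ORDER BY staffNo"
before = conn.execute(QUERY).fetchall()

# Internal schema change: add an index on salary. The query text is untouched.
conn.execute("CREATE INDEX idx_staff_salary ON staff (salary)")
after = conn.execute(QUERY).fetchall()
```

The only difference a user could observe is in performance, not in the rows returned.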

26 Physical Data Independence
From the users’ point of view, the only effect that may be noticed is a change in performance. Deterioration in performance is the most common reason for internal schema changes.

27 Physical Data Independence
Figure: where each type of data independence occurs in relation to the 3-level architecture.

28 Physical Data Independence
The 2-stage mapping in the ANSI-SPARC architecture may be inefficient, but provides greater data independence. For more efficient mapping, the ANSI-SPARC model allows the direct mapping of external schemas on to the internal schema, thus bypassing the conceptual schema. This reduces data independence, so that every time the internal schema changes, the external schema and any dependent application programs may also have to change.

29 DB Languages
A data sublanguage consists of 2 parts. Data Definition Language (DDL): used to specify the DB schema; allows the DBA or user to describe and name entities, attributes, and relationships required for the application, plus any associated integrity and security constraints. Data Manipulation Language (DML): used to both read and update the DB; provides basic data manipulation operations on data held in the database. They are called sublanguages because they do not include constructs for all computing needs, such as conditional or iterative statements, which are provided by high-level programming languages.

30 DB Languages
Many DBMSs have a facility for embedding the sublanguage in a high-level language (e.g. C++), which is then called the host language. To compile the embedded file: the commands in the data sublanguage are first removed from the host-language program and replaced by function calls; the preprocessed file is then compiled, placed in an object module, linked with a DBMS-specific library containing the replaced functions, and executed when required.

31 DB Languages
Most data sublanguages also provide non-embedded or interactive commands that can be input directly from a terminal.

32 Data Definition Language (DDL)
A language that allows the DBA or user to describe and name the entities, attributes, and relationships required for the application, together with any associated integrity and security constraints. The DB schema is specified by the DDL. The DDL is used to define a new schema or modify an existing one. It can’t be used to manipulate data.

33 Data Definition Language (DDL)
The result of the compilation of the DDL statements is a set of tables stored in special files collectively called the system catalog (or data dictionary, or data directory). The system catalog integrates the metadata, that is, data that describes objects in the DB and makes it easier for those objects to be accessed or manipulated. Metadata contains definitions of records, data items, and other objects that are of interest to users or are required by the DBMS.
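As one concrete illustration, SQLite keeps its catalog in a table called sqlite_master, which can be queried like any other table. The sketch below uses Python's sqlite3 module; the branch table and its columns are hypothetical, and other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statement: defines part of the schema.
conn.execute("CREATE TABLE branch (branchNo TEXT PRIMARY KEY, city TEXT)")

# The DDL left a description of the table in SQLite's catalog.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'branch'"
).fetchall()

# PRAGMA table_info lists column metadata: (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(branch)")]
```

Here catalog and columns hold metadata only; no row of branch data has been inserted or read.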

34 Data Definition Language (DDL)
The DBMS normally consults the system catalog before the actual data is accessed in the DB. Theoretically, different DDLs can be identified for each schema in the 3-level architecture: a DDL for the external schemas, a DDL for the conceptual schema, and a DDL for the internal schema. But in practice, there is one comprehensive DDL that allows specification of at least the external and conceptual schemas.

35 Data Manipulation Language (DML)
A language that provides a set of operations to support the basic data manipulation operations: insertion of new data into the DB; modification of data stored in the DB; retrieval of data stored in the DB; deletion of data from the DB. Hence, one of the main functions of the DBMS is to support a DML in which the user can construct statements that will cause such data manipulation to occur.
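The four basic operations can be sketched with SQL statements issued through Python's sqlite3 module. The staff table and its values are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# DDL (not DML): define the table the DML statements will work on.
conn.execute("CREATE TABLE staff (staffNo TEXT PRIMARY KEY, salary INTEGER)")

# Insertion of new data.
conn.execute("INSERT INTO staff VALUES ('S1', 12000)")
conn.execute("INSERT INTO staff VALUES ('S2', 18000)")

# Modification of stored data.
conn.execute("UPDATE staff SET salary = 13000 WHERE staffNo = 'S1'")

# Deletion of data.
conn.execute("DELETE FROM staff WHERE staffNo = 'S2'")

# Retrieval of stored data.
rows = conn.execute("SELECT staffNo, salary FROM staff").fetchall()
```

After these statements, rows contains the single remaining record with its updated salary.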

36 Data Manipulation Language (DML)
Data manipulation applies to the external level, the conceptual level, and the internal level. At higher levels, emphasis is placed on ease of use, and effort is directed at providing efficient user interaction with the system. At the internal level, we must define rather complex low-level procedures that allow efficient data access.

37 Data Manipulation Language (DML)
The part of a DML that involves data retrieval is called a query language: a high-level, special-purpose language used to satisfy diverse requests for the retrieval of data held in the DB. The terms "query language" and "DML" are commonly used interchangeably, although this is technically incorrect.

38 Data Manipulation Language (DML)
Types of DMLs (based on their underlying retrieval constructs). Procedural DML: specifies how the output of a DML statement is to be obtained; treats records individually. Non-procedural DML: describes only what output is to be obtained; operates on sets of records.

39 1. Procedural DML
A language that allows the user to tell the system what data is needed and exactly how to retrieve the data. The user must express all the data access operations that are to be used by calling appropriate procedures to obtain the information required.

40 1. Procedural DML
Procedural DMLs are embedded in a high-level programming language that contains constructs to facilitate iteration and handle navigational logic. Network and hierarchical DMLs are procedural.

41 2. Non-Procedural DMLs
Also called declarative languages. A language that allows the user to state what data is needed rather than how it is to be retrieved. Allows the required data to be specified in a single retrieval or update statement. The user specifies what data is required without specifying how it is to be obtained.

42 2. Non-Procedural DMLs
The DBMS translates a DML statement into one or more procedures that manipulate the required sets of records. Frees users from knowing how data structures are internally implemented and what algorithms are required. Easier to learn than a procedural language.

43 2. Non-Procedural DMLs
SQL and QBE (Query By Example) include some form of non-procedural language.
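The contrast between the two styles can be sketched side by side. In this Python example the loop stands in for a procedural DML (record-at-a-time, the user spells out how), and the SQL statement is the non-procedural version (set-at-a-time, the user states only what). The records and the salary threshold are hypothetical.

```python
import sqlite3

records = [("S1", 30000), ("S2", 12000), ("S3", 18000)]

# Procedural style: visit each record and decide what to keep.
procedural = []
for staff_no, salary in records:
    if salary > 15000:
        procedural.append(staff_no)

# Non-procedural style: one statement describes the wanted set of records;
# the DBMS decides how to find them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staffNo TEXT, salary INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?)", records)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT staffNo FROM staff WHERE salary > 15000 ORDER BY staffNo"
    )
]
```

Both approaches return the same staff numbers; only the declarative one leaves the choice of access path to the DBMS.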

44 4th Generation Languages (4GLs)
An operation that requires hundreds of lines in a 3GL requires significantly fewer lines in a 4GL. A 3GL is procedural; a 4GL is non-procedural. In a 4GL the user does not define steps, but defines parameters. A 4GL relies on 4GL tools. A 4GL can improve productivity by a factor of ten.

45 4th Generation Languages (4GLs)
4GLs encompass: presentation languages, such as query languages and report generators; specialty languages, such as spreadsheet and DB languages; application generators that define, insert, update, and retrieve data from the DB to build applications; very high-level languages that are used to generate application code. SQL and QBE are examples of 4GLs.

46 4GLs: Form Generators
Interactive facility for rapidly creating data input and display layouts for screen forms. Allows users to define what the screen is to look like, what information is to be displayed, colors for screen elements, etc. The better forms generators allow the creation of derived attributes, perhaps using arithmetic operators or aggregates, and the specification of validation checks for data input.

47 4GLs: Report Generators
Facility for creating reports from data stored in the DB. Similar to a query language in that it allows the user to ask questions of the DB and retrieve information from it for a report. Provides much greater control over what the output looks like. We can let the report generator automatically determine how the output should look, or we can create our own customized output reports using special report-generator command instructions.

48 4GLs: Report Generators
Types. Language-oriented: we enter a command in a sublanguage to define what data is to be included in the report and how the report is to be laid out. Visually oriented: we use a facility similar to a forms generator to define the same information.

49 4GLs: Graphics Generators
Facility to retrieve data from the DB and display it as a graph showing trends and relationships in the data: bar charts, pie charts, line charts, scatter charts.

50 4GLs: Application Generators
Facility for producing a program that interfaces with the DB. Reduces the design time. Consists of pre-written modules that comprise fundamental functions that most programs use. The modules are written in a high-level language and constitute a library of functions to choose from. The user specifies what the program is supposed to do; the application generator determines how to perform the tasks.

51 Data Models and Conceptual Modeling
A schema is written using a data definition language. Unfortunately, this language is too low-level to describe the data requirements of an organization in a way that is readily understandable by a variety of users. What we require is a higher-level description of the schema, which is a data model. Data model: an integrated collection of concepts for describing and manipulating data, relationships between data, and constraints on the data in an organization.

52 Data Models and Conceptual Modeling
A model is a representation of real-world objects and events and their associations. It is an abstraction that concentrates on the essential, inherent aspects of an organization. It should provide the basic concepts and notations that will allow DB designers and end users to communicate their understanding of the organizational data unambiguously and accurately.

53 Data Model Components
Structural part: construction rules of the DB. Manipulation part: defines the types of operation that are allowed on the data (i.e. operations for updating, retrieving, and changing the structure of the DB). Set of integrity constraints: ensures the data is accurate.

54 Purpose of Data Model
To represent data and to make the data understandable, so that the DB can easily be designed.

55 Levels of Data Models According to the ANSI SPARC Architecture
External data model: to represent each user’s view of the organization, sometimes called the universe of discourse (UoD). Conceptual data model: to represent the logical view that is DBMS-independent. Internal data model: to represent the conceptual schema in such a way that it can be understood by the DBMS.

56 Categories of Data Models
Object-based data models and record-based data models: used to describe data at the conceptual and external levels. Physical data models: describe data at the internal level.

57 1. Object-based Data Models
Use concepts such as entities, attributes, and relationships. Entity: a distinct object (person, event) in an organization that is to be represented in the DB. Attribute: a property that describes some aspect of the object that we wish to record. Relationship: an association between entities.

58 Common Types of Object-based Data Model
Entity-Relationship; Semantic; Functional; Object-Oriented. The object-oriented model extends the definition of an entity to include not only the attributes that describe the state of the object but also the actions that are associated with the object, i.e. its behavior. The object is said to encapsulate both state and behavior.

59 2. Record-based Data Models
The DB consists of a number of fixed-format records, possibly of different types. Each record type defines a fixed number of fields, each typically of fixed length. Types: relational data model, network data model, hierarchical data model. Relational: used by the majority of modern commercial systems; provides a substantial amount of data independence; adopts a declarative approach to DB processing, i.e. specifying what data is to be retrieved. Network and hierarchical: early DB systems were based on one of these two models; they require the user to have knowledge of the physical DB being accessed; they adopt a navigational approach, i.e. specifying how the data is to be retrieved.

60 2. Record-based Data Models
Used to specify the overall structure of the DB and a higher-level description of the implementation. Drawback: they do not provide adequate facilities for explicitly specifying constraints on the data, whereas the object-based models lack the means of logical structure specification but provide more semantic substance by allowing the user to specify constraints on the data.

61 A. Relational Data Model
Based on the concept of mathematical relations. Data and relationships are represented as tables. Each table has a number of columns, each with a unique name.

62 A. Relational Data Model
There is a relationship between the two tables: a branch office has staff.

63 A. Relational Data Model
It requires that the DB be perceived by the user as tables. This perception applies only to the logical structure of the DB, i.e. the external and conceptual levels of the ANSI-SPARC architecture. It does not apply to the physical structure of the DB, which can be implemented by a variety of storage structures.

64 B. Network Data Model
Data is represented as collections of records, and relationships are represented by sets. Records appear as nodes (or segments), and sets as edges in the graph. The most popular network DBMS is Computer Associates’ IDMS/R.

65 C. Hierarchical Data Model
Restricted type of network model. Data is represented as collections of records, and relationships are represented by sets. Allows a node to have only one parent. Can be represented as a tree graph, with records appearing as nodes (or segments) and sets as edges. The most popular hierarchical DBMS is IBM’s IMS.

67 Physical Data Models
Describe how data is stored in the computer, representing information such as record structures, record orderings, and access paths. Most common ones: the unifying model and the frame memory.

68 Conceptual Modeling
The conceptual schema is the heart of the DB. It supports all the external views and is supported by the internal schema. The internal schema is merely the physical implementation of the conceptual schema. The conceptual schema should be a complete and accurate representation of the data requirements of the enterprise. Otherwise, some information will be missing or incorrectly represented, and it will be difficult to fully implement some external views.

69 Conceptual Modeling
Conceptual modeling (or conceptual DB design) is the process of constructing a model of the information use in an enterprise that is independent of implementation details, such as the target DBMS, application programs, programming languages, or any other physical considerations. The conceptual model is independent of all implementation details, whereas the logical model assumes knowledge of the underlying data model of the target DBMS.

70 Functions of a DBMS Data Storage, Retrieval, and Update
Ability to store, retrieve, and update data in the DB. Benefit: the DBMS hides the internal physical implementation details, such as file organization and storage structure, from users.

71 Functions of a DBMS A User-Accessible Catalog
A catalog in which descriptions of data items are stored and which is accessible to users. It typically stores: names, types, and sizes of data items; names of relationships; integrity constraints on the data; names of authorized users who have access to data; data items that each user can access and the types of access allowed, e.g. insert, update, delete, or read access; external, conceptual, and internal schemas and the mappings between the schemas; usage statistics, e.g. frequencies of transactions and counts of the number of accesses made to objects in the DB.

72 Functions of a DBMS
The system catalog is a fundamental component of the DBMS; many software components rely on it for information. Benefits of a system catalog: information about data can be collected and stored centrally, which helps to maintain control over the data as a resource. The meaning of data can be defined, which will help other users understand the purpose of the data. Communication is simplified, since exact meanings are stored. The system catalog may also identify the user or users who own or access the data.

73 Functions of a DBMS Benefits of a system catalog (cont.):
Redundancy and inconsistencies can be identified more easily, since the data is centralized. Changes to the DB can be recorded. The impact of a change can be determined before it is implemented, since the system catalog records each data item, all its relationships, and all its users. Security can be enforced. Integrity can be ensured. Audit information can be provided.

74 Functions of a DBMS Transaction Support
A mechanism which will ensure either that all the updates corresponding to a given transaction are made or that none of them is made. A transaction is a series of actions, carried out by a single user or application program, which accesses or changes the contents of the DB. Ex: add a new member of staff to the DB; update the salary of a member of staff; delete a property from the register.

75 Functions of a DBMS
Ex: delete a member of staff from the DB and reassign the properties that he or she managed to another member of staff. If the transaction fails during execution (e.g. a computer crash), the DB will be in an inconsistent state, i.e. some changes will have been made and others not. Hence, the changes that have been made will have to be undone to return the DB to a consistent state again.
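The all-or-nothing behavior can be sketched with Python's sqlite3 module, where using the connection as a context manager commits on success and rolls back if an exception escapes. The table layout, the sample rows, and the fail_midway switch that simulates the crash are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staffNo TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE property (propertyNo TEXT PRIMARY KEY, staffNo TEXT)")
conn.executemany("INSERT INTO staff VALUES (?)", [("S1",), ("S2",)])
conn.execute("INSERT INTO property VALUES ('P1', 'S1')")
conn.commit()

def remove_staff(conn, leaving, successor, fail_midway=False):
    """Delete a member of staff and reassign their properties as one unit."""
    try:
        with conn:  # commit if the block finishes, roll back if it raises
            conn.execute("DELETE FROM staff WHERE staffNo = ?", (leaving,))
            if fail_midway:
                raise RuntimeError("simulated crash between the two updates")
            conn.execute(
                "UPDATE property SET staffNo = ? WHERE staffNo = ?",
                (successor, leaving),
            )
        return True
    except RuntimeError:
        return False

failed = remove_staff(conn, "S1", "S2", fail_midway=True)
staff_after_failure = conn.execute(
    "SELECT staffNo FROM staff ORDER BY staffNo"
).fetchall()

succeeded = remove_staff(conn, "S1", "S2")
staff_after_success = conn.execute("SELECT staffNo FROM staff").fetchall()
manager_after_success = conn.execute(
    "SELECT staffNo FROM property WHERE propertyNo = 'P1'"
).fetchone()[0]
```

After the simulated failure the deleted staff record is back, so the property never points at a member of staff who no longer exists.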

76 Functions of a DBMS Concurrency Control Services
To ensure that the DB is updated correctly when multiple users are updating the DB concurrently. Concurrent access is relatively easy if all users are only reading data, as there is no way that they can interfere with one another. But when 2 or more users are accessing the DB simultaneously and at least one of them is updating data, there may be interference that can result in inconsistencies.

77 Functions of a DBMS
T1: withdrawing $10. T2: depositing $100.

Time  T1                 T2                   balx
t1                       read(balx)           100
t2    read(balx)         balx = balx + 100    100
t3    balx = balx - 10   write(balx)          200
t4    write(balx)                             90

If T1 and T2 were executed serially, the final balance would be $190 regardless of which was performed first. But they start at nearly the same time and both read $100. T2 then increases balx by $100 to $200 and stores the update in the DB. Meanwhile, T1 decrements its copy of balx by $10 to $90 and stores this value in the DB, overwriting the previous update and thereby losing $100.
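The lost update can be reproduced with plain Python variables standing in for each transaction's private copy of balx. This is a single-threaded simulation of the interleaving described above, not real concurrent code; the function names are made up for the example.

```python
def run_interleaved(balance):
    """Replay the interleaved schedule: both transactions read before either writes."""
    db = {"balx": balance}
    t2_copy = db["balx"]      # T2 reads balx
    t1_copy = db["balx"]      # T1 reads the same, still unchanged, balx
    t2_copy = t2_copy + 100   # T2 deposits $100 into its private copy
    db["balx"] = t2_copy      # T2 writes its result
    t1_copy = t1_copy - 10    # T1 withdraws $10 from its stale copy
    db["balx"] = t1_copy      # T1 writes, overwriting T2's update
    return db["balx"]

def run_serial(balance):
    """Run T1 to completion, then T2; each sees the other's result."""
    db = {"balx": balance}
    db["balx"] = db["balx"] - 10    # T1: read, subtract, write
    db["balx"] = db["balx"] + 100   # T2: read, add, write
    return db["balx"]

interleaved_result = run_interleaved(100)
serial_result = run_serial(100)
```

The $100 difference between the two results is exactly the deposit that T1's write wiped out; concurrency control exists to rule out such schedules.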

78 Functions of a DBMS Recovery Services
To recover the DB in the event of damage. Causes of damage: system crash; media failure; hardware or software error causing the DBMS to stop; power failure; the user detecting an error during the transaction and aborting the transaction before it completes.

79 Functions of a DBMS Authorization Services.
Only authorized users can access the DB. Ex: only the branch manager is allowed to see salary-related information for staff. The aim is to protect the DB from unauthorized access, intentional or unintentional (security).

80 Functions of a DBMS Support for Data Communication
The DBMS must be capable of integrating with communication software. Most users access the DB from workstations connected either directly to the computer hosting the DBMS or remotely through a network. The DBMS receives requests as communications messages and responds in a similar way. All transmissions are handled by a Data Communication Manager (DCM). The DCM is not part of the DBMS but is integrated with it.

81 Functions of a DBMS Integrity Services
Both the data in the DB and the changes to the data follow certain rules. DB integrity: correctness and consistency of stored data. It is another type of data protection. Integrity is concerned with the quality of data. Integrity is usually expressed in terms of constraints (consistency rules that the DB is not permitted to violate).
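Constraints of this kind can be declared in the schema and left to the DBMS to enforce. The sketch below uses Python's sqlite3 module; the staff table, the non-negative salary rule, and the try_insert helper are hypothetical examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    " staffNo TEXT PRIMARY KEY,"                        # no duplicate staff numbers
    " salary INTEGER NOT NULL CHECK (salary >= 0))"     # no negative salaries
)
conn.execute("INSERT INTO staff VALUES ('S1', 12000)")

def try_insert(staff_no, salary):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO staff VALUES (?, ?)", (staff_no, salary))
        return True
    except sqlite3.IntegrityError:
        return False

negative_salary_ok = try_insert("S2", -5)     # violates the CHECK constraint
duplicate_key_ok = try_insert("S1", 20000)    # violates the primary key
valid_row_ok = try_insert("S3", 18000)        # satisfies every constraint
row_count = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
```

The rules live in the schema rather than in each application, so every program that updates the table is held to them.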

82 Functions of a DBMS Services to Promote Data Independence.
To support the independence of programs from the actual structure of the DB. Data independence is achieved through a view or subschema mechanism. Physical data independence is easier to achieve. Logical data independence is more difficult (in some systems, any type of change to the logical structure is prohibited). It is easy to add a new entity, attribute, or relationship, but difficult to remove one.

83 Functions of a DBMS Utility Services
Help the DBA to administer the DB. Some utilities work at the external level and consequently can be produced by the DBA. Others work at the internal level and can be provided by the DBMS vendor. Examples: import facilities (from flat files) and export facilities; monitoring facilities, to monitor DB usage; statistical analysis, to examine performance; index reorganization facilities; garbage collection and reallocation, to remove deleted records physically from the storage devices and consolidate the space released.

