Data Modeling
Bayu Adhi Tama, ST., MTI.
Thanks to Ponniah, Elmasri, and Silberschatz.

Review
- DDL is usually used by the database administrator (DBA) to retrieve or update data in a DBMS. [T/F]
- In a “two-tier architecture”, an application server is used alongside the database server in order to increase transaction processing. [T/F]
- If the database system cannot handle the complexity of the data because of modeling limitations, it is acceptable not to use a DBMS. [T/F]

- A model serves two primary purposes:
  - As a true representation of some aspects of the real world, a model enables clearer communication about those aspects of the real world.
  - A model serves as a blueprint to shape and construct the proposed structures in the real world.
- Data modeling is an integral part of the process of designing and developing a data system.
- A data model is a device that: [Ponniah]
  - helps the users or stakeholders understand clearly the database system that is being implemented based on the information requirements of an organization, and
  - enables the database practitioners to implement the database system exactly conforming to the information requirements.
- Data models [Silberschatz]
  - A collection of tools for describing data, data relationships, data semantics, and data constraints

Data model: communication tool and database blueprint.

Classification of Information Levels:
- Conceptual Level. The highest level, consisting of general ideas about the information content
- External Level. The data model represents the information requirements for the entire set of user groups in the organization

- Process of creating a data model:
  - Identify business objects
  - Identify relationships
  - Add attributes
  - Assign identifiers
  - Incorporate business rules
  - Validate the data model
- Significance of data model quality
  - Data model completeness
  - Data model correctness
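The modeling steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a modeling tool: the `Entity`, `Relationship`, and `validate` names and the Customer/Order example are hypothetical and do not come from the source.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """A business object with its attributes and identifier (hypothetical helper)."""
    name: str
    attributes: List[str] = field(default_factory=list)
    identifier: Optional[str] = None  # attribute that uniquely identifies an instance


@dataclass
class Relationship:
    """A named association between two business objects (hypothetical helper)."""
    name: str
    source: str
    target: str
    cardinality: str  # e.g. "1:N"; a simple stand-in for a business rule


def validate(entities: List[Entity], relationships: List[Relationship]) -> List[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    names = {e.name for e in entities}
    for e in entities:
        if e.identifier is None:
            problems.append(f"{e.name}: no identifier assigned")
        elif e.identifier not in e.attributes:
            problems.append(f"{e.name}: identifier {e.identifier!r} is not an attribute")
    for r in relationships:
        for end in (r.source, r.target):
            if end not in names:
                problems.append(f"{r.name}: unknown entity {end!r}")
    return problems


# Identify objects and relationships, add attributes, assign identifiers,
# and record one business rule (the cardinality).
customer = Entity("Customer", ["customer_id", "name"], identifier="customer_id")
order = Entity("Order", ["order_id", "order_date"])  # identifier deliberately missing
places = Relationship("places", "Customer", "Order", cardinality="1:N")

# Validate the data model.
problems = validate([customer, order], [places])
print(problems)
```

The checks here only touch completeness (every entity has an identifier) and a narrow slice of correctness (relationships refer to known entities); a real validation step would also involve review with users.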

- Data model characteristics
  - Involves users
  - Covers the proper enterprise segments
  - Uses accepted standard rules and conventions
  - Produces high-quality design
- Data system development life cycle (DDLC)
  - Starting the process
    - Data-oriented approach
    - Development framework
    - Initiation report
  - Planning
  - Feasibility study
  - Requirements definition
    - Study overall business operations
    - Observe business processes
    - Understand business units
    - Interview users
    - Determine information requirements
    - Identify data to be collected and stored
    - Establish data access patterns and estimate data volume

- Design
- Implementation

History of Data Models
- Network Model
- Hierarchical Model
- Relational Model
- Object-oriented Data Models
- Object-Relational Models

History of Data Models
- Network Model:
  - The first network DBMS (the IDS system) was implemented by Honeywell.
  - Adopted heavily due to the support of CODASYL (Conference on Data Systems Languages) (CODASYL-DBTG report of 1971).
  - Later implemented in a large variety of systems: IDMS (Cullinet, now Computer Associates), DMS 1100 (Unisys), IMAGE (Hewlett-Packard), VAX-DBMS (Digital Equipment Corp., next COMPAQ, now H.P.).

Example of Network Model Schema

Network Model: Basic Concepts
- Data are represented by collections of records.
- Records and their fields are represented as record types.
- Relationships among data are represented by links.
  - Restrictions on links depend on whether the relationship is many-to-many, many-to-one, or one-to-one.
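As a rough illustration of records, links, and cardinality restrictions on links, the sketch below uses hypothetical `Record` and `Link` classes and an invented account/customer example. It is not how any particular network DBMS stores data; it only shows how the kind of relationship constrains which links may be created.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One record occurrence: a record type name plus its field values."""
    record_type: str
    values: dict


@dataclass
class Link:
    """A link between two record types with a cardinality restriction."""
    name: str
    kind: str  # "many-to-many", "many-to-one", or "one-to-one"
    pairs: list = field(default_factory=list)  # (left, right) record pairs

    def connect(self, left: Record, right: Record) -> None:
        # many-to-one and one-to-one: a left record may be linked only once.
        if self.kind in ("many-to-one", "one-to-one"):
            if any(l is left for l, _ in self.pairs):
                raise ValueError(f"{self.name}: left record is already linked")
        # one-to-one: a right record may also be linked only once.
        if self.kind == "one-to-one":
            if any(r is right for _, r in self.pairs):
                raise ValueError(f"{self.name}: right record is already linked")
        self.pairs.append((left, right))


customer = Record("customer", {"name": "Hayes"})
second_customer = Record("customer", {"name": "Johnson"})
a1 = Record("account", {"number": "A-101"})
a2 = Record("account", {"number": "A-102"})

depositor = Link("depositor", "many-to-one")
depositor.connect(a1, customer)
depositor.connect(a2, customer)  # allowed: many accounts, one customer

try:
    depositor.connect(a1, second_customer)  # violates many-to-one
    rejected = False
except ValueError:
    rejected = True

print(len(depositor.pairs), rejected)
```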

Network Model
- Advantages:
  - The network model is able to model complex relationships and represents the semantics of add/delete on the relationships.
  - Can handle most situations for modeling using record types and relationship types.
  - Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET, etc. Programmers can do optimal navigation through the database.

Network Model
- Disadvantages:
  - Navigational and procedural nature of processing
  - Database contains a complex array of pointers that thread through a set of records
  - Little scope for automated “query optimization”

History of Data Models
- Hierarchical Data Model:
  - Initially implemented in a joint effort by IBM and North American Rockwell; this resulted in the IMS family of systems.
  - IBM’s IMS product had (and still has) a very large customer base worldwide.
  - The hierarchical model was formalized based on the IMS system.
  - Other systems based on this model: System 2k (SAS Inc.)

Hierarchical Model
- Advantages:
  - Simple to construct and operate
  - Corresponds to a number of natural hierarchically organized domains, e.g., organization (“org”) chart
  - Language is simple: uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT, etc.
- Disadvantages:
  - Navigational and procedural nature of processing
  - Database is visualized as a linear arrangement of records
  - Little scope for “query optimization”
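The idea of a hierarchy visualized as a linear arrangement of records, navigated one record at a time, can be sketched as a preorder traversal. The `Node`, `hierarchical_sequence`, and `within_parent` names and the org-chart data are hypothetical, and the functions are only a loose analogy to GET NEXT and GET NEXT WITHIN PARENT, not an implementation of any real hierarchical DBMS language.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """A record in the hierarchy with its child records."""
    name: str
    children: List["Node"] = field(default_factory=list)


def hierarchical_sequence(root: Node) -> Iterator[Node]:
    """Yield records in preorder: each parent before its children, left to right."""
    yield root
    for child in root.children:
        yield from hierarchical_sequence(child)


def within_parent(parent: Node) -> Iterator[Node]:
    """Yield only the descendants of `parent`, in the same preorder sequence."""
    for child in parent.children:
        yield from hierarchical_sequence(child)


sales = Node("Sales", [Node("Alice"), Node("Bob")])
engineering = Node("Engineering", [Node("Carol")])
company = Node("Company", [sales, engineering])

# Stepping through the whole database, one record after another.
all_names = [n.name for n in hierarchical_sequence(company)]
# Stepping only through the records under one parent.
sales_names = [n.name for n in within_parent(sales)]

print(all_names)
print(sales_names)
```

The program, not the system, decides the order in which records are visited, which is why this style leaves little room for automatic optimization.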

History of Data Models
- Relational Model:
  - Proposed in 1970 by E.F. Codd (IBM); the first commercial systems followed later.
  - Now in several commercial products (e.g., DB2, ORACLE, MS SQL Server, SYBASE, INFORMIX).
  - Several free open source implementations, e.g., MySQL, PostgreSQL
  - Currently the most dominant model for developing database applications.
  - SQL relational standards: SQL-89 (SQL1), SQL-92 (SQL2), SQL-99 (SQL3), …

History of Data Models
- Object-oriented Data Models:
  - Several models have been proposed for implementing in a database system.
  - One set comprises models of persistent O-O programming languages such as C++ (e.g., in OBJECTSTORE or VERSANT) and Smalltalk (e.g., in GEMSTONE).
  - Additionally, systems like O2, ORION (at MCC, then ITASCA), IRIS (at H.P., used in Open OODB).
  - Object Database Standard: ODMG-93, ODMG version 2.0, ODMG version 3.0.

Object-Oriented Data Model
- An object corresponds to an entity.
- The object-oriented paradigm is based on encapsulating code and data related to an object into a single unit.
- The object-oriented data model is a logical data model.
- Object structure:
  - A set of variables that contain the data for the object. The value of each variable is itself an object.
  - A set of messages to which the object responds; each message may have zero, one, or more parameters.
  - A set of methods, each of which is a body of code to implement a message; a method returns a value as the response to the message.
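The object structure described above (variables, messages, methods) maps naturally onto a class. The sketch below is illustrative only: the `Account` class and the `send` helper are hypothetical names, and `send` simply makes the message-passing step explicit rather than reflecting how any object database dispatches messages.

```python
class Account:
    """An object: variables hold its data, methods implement its messages."""

    def __init__(self, number: str, balance: int) -> None:
        # Variables: the data for this object (each value is itself an object).
        self.number = number
        self.balance = balance

    # Methods: each is a body of code that implements one message.
    def get_balance(self) -> int:
        return self.balance

    def deposit(self, amount: int) -> int:
        self.balance += amount
        return self.balance


def send(obj, message: str, *params):
    """Deliver a message with zero or more parameters; return the method's response."""
    method = getattr(obj, message)
    return method(*params)


account = Account("A-101", 500)
response_one = send(account, "deposit", 250)  # message with one parameter
response_two = send(account, "get_balance")   # message with zero parameters
print(response_one, response_two)
```

Code and data stay together in one unit: callers interact with the account only through its messages, not by reaching into its variables.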

History of Data Models
- Object-Relational Models:
  - Most recent trend. Started with Informix Universal Server.
  - Relational systems incorporate concepts from object databases, leading to object-relational systems.
  - Exemplified in the latest versions of Oracle 10g, DB2, SQL Server, and other DBMSs.
  - Standards included in SQL-99 and expected to be enhanced in future SQL standards.

