Presentation on theme: "UFCE8V-20-3 Information Systems Development 3 (SHAPE HK) Lecture 5 Database Theory & Practice (1) : Data Modelling."— Presentation transcript:
UFCE8V-20-3 Information Systems Development 3 (SHAPE HK) Lecture 5 Database Theory & Practice (1) : Data Modelling
What is data modelling? o Data modelling is concerned with the design of the data content and structure of the database. o Data modelling gives us a formal model of an organisation which is achieved through the consolidation of the user requirements specification. o The information gathered from fact-finding (analysis) is appraised and the basic data and data relationships are established. o The result of the data analysis is a representation of the user's view of the data. It is independent of any DBMS software or hardware considerations. o The model documents the structure of and interrelationships between the data. It is presented as a combination of simple diagrams and written definitions.
Why is the data model important? o Leverage - A small change to a data model may have a major impact on the system as a whole. Although in commercial information systems, the programs are far more complex and take much longer to specify and construct than the database, their content and structure are heavily influenced by the database design. Their structure will therefore need to reflect the way the data is organized... in other words, the data model. - A well designed data model can make programming simpler and cheaper. Even a small change to the model may lead to significant savings in total programming cost. o Conciseness - The data model is formal and concise and the time required to review a data model is considerably less than that needed to review functional specifications which could amount to hundreds of pages. o Data Quality - Data is a valuable organizational asset. The data model plays a key role in ensuring good data quality by establishing a common understanding of what data is to be held and how to interpret it.
What makes a good data model? (1) o Completeness - Does the model support all the necessary data? Do we need to record something that is currently omitted? o Non-redundancy - Does the model specify a database in which the same fact could be recorded more than once? Recording the same data more than once (duplication) increases the amount of space needed to store the database, requires extra processes (and processing) to keep the various copies in step, and leads to consistency problems if the copies get out of step. o Enforcement of Business Rules - How accurately does the model reflect and enforce the rules that apply to the business' data? If rules correctly reflect the business requirement and are correctly enforced, the resulting database will be a powerful tool in enforcing correct practice, and in maintaining data quality. o Data Reusability - Will the data stored in the database be reusable for purposes beyond those anticipated in the process model? This requirement is often expressed in terms of its solution: as far as possible, data should be organised independently of any specific application.
What makes a good data model? (2) o Stability and Flexibility - How well will the model cope with possible changes to the business requirements? Can any new data required to support such changes be accommodated in existing tables? Alternatively, will simple extensions suffice? Or will major structural changes be required, with corresponding impact on the rest of the system? - A data model is stable in the face of a change to requirements if it requires little or no modification. Models are more or less stable, depending on the level of change required. - A data model is flexible if it can be readily extended to accommodate likely new requirements with only minimal impact on the existing structure. o Elegance - Does the data model provide a reasonably neat and simple classification of the data? Elegant models are typically simple, consistent, and easily described and summarized. - The difference in development cost between systems based on simple, elegant data models and those based on highly complex ones can be considerable There is a risk that a simple model ends up being complex and brittle as result of incremental business changes over a long period without any rethinking of processes and supporting data.
What makes a good data model? (3) o Communication - How effective is the model in supporting communication among the various stakeholders in the design of a system? Do the tables and columns represent business concepts that the users and business specialists are familiar with and can easily verify? Will programmers interpret the model correctly? o Integration - How will the proposed database fit with the organization's existing and future databases? Even when individual databases are well designed, it is common for the same data to appear in more than one database and for problems to arise in drawing together data from multiple databases. Are the coding schemes and definitions consistent? How easy is it to keep the different versions in step, or to assemble a complete picture?
What makes a good data model? (4) o Conflicting Objectives -The above aims will often conflict with one another. An elegant but radical solution may be difficult to communicate. An elegant model may exclude requirements that do not fit. A model that accurately enforces a large number of business rules will be unstable if some of those rules change. A model that is easy to understand because it reflects the perspectives of the immediate system users may not support reusability or integrate well with other databases. - The goal is to develop a model that provides the best balance among these possibly conflicting objectives. As in other design disciplines, achieving this is a process of proposal and evaluation, rather than a stepby-step progression to the ideal solution. We may not realize that a better solution or trade-off is possible until we see it.
(some) consequences of bad model/design slow data access specific data is hard to find data is contradictory fields hold meaningless data records are hard to update transactions go wrong transactions corrupt data data gets lost and cannot be recovered database can’t scale database design is complex and confusing users have to download much more data than needed security is flawed leading to damage or theft queries cannot be optimized
Database Design Stages & Deliverables (1) o Conceptual, Logical and Physical Data Models - The conceptual data model is a (relatively) technology independent specification of the data to be held in the database. It is the focus of communication between the data modeler and business stakeholders, and it is usually presented as a diagram with supporting documentation. - The logical data model is a translation of the conceptual model into structures that can be implemented using a database management system (DBMS) – usually relational. - The physical data model incorporates any performance considerations and is presented in terms of tables, columns, indexes etc. It will include a specification of physical storage and access mechanisms.
Database Design Stages & Deliverables (2) o The Three-Schema Architecture and Terminology -The internal schema describes how the data will be physically stored and accessed, using the facilities provided by a particular DBMS. it represents the foundations, electrical wiring, and hidden plumbing of the database. -The conceptual schema describes the organization of the data into tables and columns. - The external schemas specify views that enable different users of the data to see it in different ways. It is usual to provide one external schema that covers the entire conceptual and then to provide a number of external schemas that meet specific user requirements.
Where do data models fit in? o Data-Driven Approaches for example Information Engineering (IE) appeared in the late 1970’s and have since evolved into parallel or “blended” approaches. The emphasis was on developing the data model before the detailed process model. o Parallel (Blended) Approaches provides simultaneous modelling of the data and the process models. Supported by CASE products. o Object-Oriented Approaches uses conventional (relational) data models as OO databases are not commonly used. Use of UML. o Prototyping Approaches Rapid Application Development (RAD) have in many cases replaced the traditional waterfall approaches to systems development. Use of data-driven approach to data-modelling. o Agile Methods Backlash against “heavy” methodologies – values software over documentation; shared understanding; pair-programming etc. Data model is developed early in the development process.
Who should be involved in data modelling? o The system users, owners, and/or sponsors will need to verify that the model meets their requirements. o Business specialists (sometimes called subject matter experts or SMEs) may be called upon to verify the accuracy and stability of business rules incorporated in the model. o The data modeler has overall responsibility for developing the model and ensuring that other stakeholders are fully aware of its implications for them. o Process modelers and program designers will need to specify programs to run against the database. They will want to verify that the data model supports all the required processes. o The physical database designer (often an additional role given to the database administrator) will need to assess whether the physical data model needs to differ substantially from the logical data model to achieve adequate performance, and, if so, propose and negotiate such changes. o The systems integration manager (or other person with that responsibility, possibly an enterprise architect, data administrator, information systems planner, or chief information officer) will be interested in how the new database will fit into the bigger picture: are there overlaps with other databases; does the coding of data follow organizational or external standards.
Bibliography / Readings / Home based activities Bibliography -Data Modeling Essentials (3 rd ed.), GC Simpson & GC Witt, Morgan Kaufmann, 2005Data Modeling Essentials -Beginning Database Design Solutions, R Stephens, Wrox, 2009Beginning Database Design Solutions -Data Modeling - A Beginner's Guide, A Oppel, McGraw-Hill Osborne, 2010Data Modeling - A Beginner's Guide Readings -