Topic 3 – Data Modeling Techniques Unit 1 – Database Analysis and Design Advanced Higher Information Systems St Kentigern’s Academy.


1 Topic 3 – Data Modeling Techniques Unit 1 – Database Analysis and Design Advanced Higher Information Systems St Kentigern’s Academy

2 What I need to know Modelling Techniques –Description and exemplification of normalisation. Creation of un-normalised form (UNF) from complex source document(s). First normal form: identifying repeating groups, dealing with repeating groups, problems with first normal form (1NF). Second normal form: functional dependency. Third normal form: transitive dependency.

3 What I need to know Modelling Techniques cont’d –Translation of third normal form (3NF) into E/R diagrams: entities (weak entity, strong entity), relationships (mandatory, optional, strong/weak). –Description and exemplification of the use of a Data Dictionary. –Description and exemplification of entity/event modelling.

4 What I need to know Modelling Techniques cont’d –Entity/Event Matrix: add, modify, delete, read. –Entity Life Histories: sequence, iteration, selection. –Description and exemplification of dataflow modelling using level 0 and level 1 data flow diagrams: system boundary, environment, data flow, physical flow, data store, external entity, process.

5 Normalisation Normalisation aims to remove data redundancy by applying rules in a series of stages, splitting tables within the existing database system and creating relationships between them to ensure that the structure of the database is efficient and data can be accurately manipulated. The structure produced by the normalisation process will be efficient and allow data to be easily updated and maintained. The structure will contain a number of related tables.

6 Normalisation Each table will have a number of properties: – –The order of the rows is not significant. The rows do not have to be in any particular order and the order may be changed without loss of information. – –The order of the columns is not significant. The columns of a table may be interchanged without loss of information. – –Each row/column intersection contains only one value. – –Each row of the table must be capable of being uniquely identified by a single attribute or a combination of attributes. This attribute or combination of attributes is referred to as the primary key. The primary key cannot contain a null value. – –Each table is linked to at least one other table in the system. The attribute used to link one table to another is called the foreign key.

7 Normalisation Normalisation is a complex mathematical process. To fully understand it, you must first understand the terms repeating group and functional dependency. A repeating group is a set of one or more attributes that can hold several values for a single occurrence of the entity. For example: – –a single book title may have several authors – –a single pupil may sit exams in several different subjects A functional dependency occurs when one attribute - or combination of attributes - in a relation uniquely determines another attribute.
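The idea of a functional dependency can be tested against sample data. The sketch below is only an illustration: the exam results, the attribute names and the helper `holds` are all made up for this example and are not part of the course material. Note that sample data can show that a dependency does not hold, but it can never prove that one always will.

```python
def holds(rows, determinant, dependent):
    """Return True if the attributes in `determinant` uniquely determine
    `dependent` in every row of the (hypothetical) sample data."""
    seen = {}
    for row in rows:
        key = tuple(row[attribute] for attribute in determinant)
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it;
        # a different value for the same key breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical exam results: one pupil sits several subjects.
results = [
    {"pupil_id": 1, "pupil_name": "Ann", "subject": "Maths", "grade": "A"},
    {"pupil_id": 1, "pupil_name": "Ann", "subject": "Physics", "grade": "B"},
    {"pupil_id": 2, "pupil_name": "Raj", "subject": "Maths", "grade": "B"},
]

name_depends_on_id = holds(results, ["pupil_id"], "pupil_name")
grade_depends_on_id = holds(results, ["pupil_id"], "grade")
grade_depends_on_both = holds(results, ["pupil_id", "subject"], "grade")
print(name_depends_on_id, grade_depends_on_id, grade_depends_on_both)
```

In this sample, pupil_id determines pupil_name, but grade is only determined by the combination of pupil_id and subject.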

8 Normalisation When carried out correctly, the normalisation process identifies an efficient structure for the entities in the database system. Normalisation ensures that: – –all redundant data has been removed from the entities This means that all duplicated values have been removed and the database system will require less memory. As a result, any processing carried out on the data will be more efficient. – –all data anomalies have been removed This means that whenever the data is to be updated, it only needs to be updated once. New details can be added to and deleted from individual tables without affecting details held in other tables. This ensures that the data is easy to maintain.

9 Normalisation All entities created as a result of applying the normalisation process are said to demonstrate: – –entity integrity. This means that the entity has a non-null primary key which has no repeating values. – –referential integrity This means that a foreign key in one entity cannot contain a value that doesn’t already exist as a primary key in the other entity.
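Both kinds of integrity can be seen in action with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the DEPARTMENT and PROJECT tables and their columns are invented for the example, and SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE department ("
    " dept_code TEXT PRIMARY KEY NOT NULL,"  # entity integrity: unique and non-null
    " name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE project ("
    " project_id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " dept_code TEXT NOT NULL REFERENCES department(dept_code))"  # referential integrity
)
conn.execute("INSERT INTO department VALUES ('RS', 'Research')")


def try_insert(sql):
    """Run one INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


valid_project = try_insert("INSERT INTO project VALUES (1, 'Survey', 'RS')")
duplicate_key = try_insert("INSERT INTO department VALUES ('RS', 'Sales')")
null_key = try_insert("INSERT INTO department VALUES (NULL, 'Sales')")
unknown_dept = try_insert("INSERT INTO project VALUES (2, 'Audit', 'ZZ')")
print(valid_project, duplicate_key, null_key, unknown_dept)
```

The duplicate and null primary keys break entity integrity, and the project that points at a department code that does not exist breaks referential integrity.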

10 UNF The most important step in creating UNF is analysing the source document or documents (there may be more than one) in order to determine the attributes that are held in the system. If there is more than one source document, it is best to deal with each document separately. Once the attributes in each document have been fully normalised, any links between the sets of normalised tables can be easily identified. This process of linking sets of normalised entities to create a single normalised system is called consolidation.

11 UNF Creating UNF for a single document – –analyse the details held on the source document carefully – –write down a single list of all attributes that appear in the source document – –identify any attributes that can be calculated using other values in the system – –identify the primary key of the un-normalised attributes by underlining it – –identify any repeating groups of data – –name each UNF entity

12 1NF The most important step in creating 1NF is the removal of repeating groups of data. By removing repeating groups of data to form separate entities, we reduce data duplication and therefore make the database more efficient. We also remove any values that can be calculated since this again reduces the storage that is necessary and improves the efficiency of the database.

13 1NF Creating 1NF from UNF – –Write out list of attributes in main entity leaving behind any repeating group(s). Also leave out any attributes that can be calculated. – –Create a separate list of attributes for each repeating group. If one set of attributes repeats inside another set, take the outer set of repeating attributes out from the main entity before removing the inner set of repeating attributes. – –Identify a primary key of each new entity by underlining it. – –Copy the primary key of any outer repeating group into the inner repeating group and asterisk it to indicate that this is the foreign key link to the outer repeating group. – –Take a copy of the primary key of the main entity into each new entity and asterisk it to indicate that it is also a foreign key. – –Decide whether or not this foreign key is needed to form a unique value for the primary key of each new entity. – –Name each 1NF entity.
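The steps above can be sketched in code. The order form, its attribute names and the function `to_1nf` are assumptions made up for this illustration; the repeating group is the list of items on each order.

```python
# Hypothetical UNF data: each order holds a repeating group of items.
unf_orders = [
    {"order_no": 1, "customer_no": "C7", "customer_name": "Ann Lee",
     "items": [
         {"item_code": "P1", "description": "Pen", "unit_price": 0.50, "quantity": 10},
         {"item_code": "P2", "description": "Pad", "unit_price": 1.20, "quantity": 2},
     ]},
    {"order_no": 2, "customer_no": "C9", "customer_name": "Raj Patel",
     "items": [
         {"item_code": "P1", "description": "Pen", "unit_price": 0.50, "quantity": 5},
     ]},
]


def to_1nf(unf):
    """Split the repeating group out of the main entity.

    ORDER keeps the non-repeating attributes (primary key: order_no).
    ORDER_LINE holds one row per item, with order_no copied in as a
    foreign key; here its primary key is (order_no, item_code).
    """
    orders, order_lines = [], []
    for record in unf:
        orders.append({k: v for k, v in record.items() if k != "items"})
        for item in record["items"]:
            order_lines.append({"order_no": record["order_no"], **item})
    return orders, order_lines


orders, order_lines = to_1nf(unf_orders)
for line in order_lines:
    print(line)
```

In this example the copied foreign key is needed as part of the primary key of ORDER_LINE, because the same item code can appear on more than one order.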

14 2NF The most important step in creating 2NF is the removal of non-key attributes that only depend on part of the primary key. – –This means that 2NF is only of concern when the primary key of an entity is a compound primary key - in other words, the primary key is composed of two or more attributes. Where the primary key is a single attribute, the entity is already in second normal form. Attributes that depend on part of the primary key (one entity only) are said to have a partial dependency on the primary key. The aim of 2NF is to create entities where every non-key attribute is fully dependent on the primary key of the entity.

15 2NF Creating 2NF from 1NF – –Copy out any entities with a single primary key. These entities are already in 2NF. – –In entities with a compound primary key, identify any attributes that depend on part of the compound primary key. – –Write out the attributes in the entity with the compound primary key, leaving behind any attributes that have a partial dependency. – –Mark the part of the primary key that caused the partial dependency with an asterisk. This is a foreign key link to the new entity you are about to create. – –Create a new entity by copying the part of the compound primary key that is responsible for the partial dependency. This becomes the primary key of the new entity. – –Add the non-key attributes that have a partial dependency to the new entity. – –Name each 2NF entity.
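As a rough sketch of these steps, take a hypothetical ORDER_LINE entity with the compound primary key (order_no, item_code). The data and the helper `split_partial` are invented for the example; description and unit_price are assumed to depend on item_code alone.

```python
# Hypothetical 1NF entity with compound primary key (order_no, item_code).
order_lines_1nf = [
    {"order_no": 1, "item_code": "P1", "description": "Pen", "unit_price": 0.50, "quantity": 10},
    {"order_no": 1, "item_code": "P2", "description": "Pad", "unit_price": 1.20, "quantity": 2},
    {"order_no": 2, "item_code": "P1", "description": "Pen", "unit_price": 0.50, "quantity": 5},
]


def split_partial(rows, part_key, dependents):
    """Move attributes that depend only on `part_key` into a new entity.

    Returns (remaining_rows, new_entity_rows). `part_key` stays behind in
    the original entity as a foreign key and becomes the primary key of
    the new entity.
    """
    new_entity = {}
    remaining = []
    for row in rows:
        # One row per distinct value of the part key in the new entity.
        new_entity[row[part_key]] = {
            part_key: row[part_key],
            **{attribute: row[attribute] for attribute in dependents},
        }
        remaining.append({k: v for k, v in row.items() if k not in dependents})
    return remaining, list(new_entity.values())


order_lines_2nf, items = split_partial(
    order_lines_1nf, "item_code", ["description", "unit_price"]
)
print(order_lines_2nf)
print(items)
```

The details of item P1 are now stored once in the new ITEM entity rather than once for every order line that mentions it.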

16 3NF The most important step in creating 3NF is the removal of non-key attributes that depend on other non-key attributes. Non-key attributes that depend on other non-key attributes are said to have a transitive dependency.

17 3NF Creating 3NF from 2NF – –Write out a list of all attributes in any entities that have one or fewer non-key attributes. These are already in 3NF. – –Examine other entities carefully in order to identify transitive dependencies. – –Write out the attributes in any entity found to have a transitive dependency, leaving behind any non-key attributes that have a transitive dependency. – –Mark the non-key attribute that is responsible for the transitive dependency with an asterisk to indicate a foreign key. – –Copy this attribute and create a new entity. Underline the attribute to indicate that the attribute is the primary key. – –Add the non-key attributes involved in the transitive dependency into this new entity. – –Name each 3NF entity.
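A small illustration of why removing a transitive dependency helps, using a made-up ORDER entity in which customer_name depends on customer_no (a non-key attribute) rather than on order_no. All names and values here are assumptions for the example only.

```python
# Hypothetical 2NF entity: customer_name depends on customer_no, not on order_no.
orders_2nf = [
    {"order_no": 1, "order_date": "2024-03-01", "customer_no": "C7", "customer_name": "Ann Lee"},
    {"order_no": 2, "order_date": "2024-03-02", "customer_no": "C9", "customer_name": "Raj Patel"},
    {"order_no": 3, "order_date": "2024-03-05", "customer_no": "C7", "customer_name": "Ann Lee"},
]

# ORDER(order_no, order_date, customer_no*): customer_no stays as a foreign key.
orders_3nf = [
    {"order_no": r["order_no"], "order_date": r["order_date"], "customer_no": r["customer_no"]}
    for r in orders_2nf
]

# CUSTOMER(customer_no, customer_name): one row per customer, keyed by customer_no.
customers = {
    r["customer_no"]: {"customer_no": r["customer_no"], "customer_name": r["customer_name"]}
    for r in orders_2nf
}

# The name is now held once, so a change of name is a single update.
customers["C7"]["customer_name"] = "Ann Lee-Smith"


def order_with_customer(order_no):
    """Follow the foreign key to rebuild the full order details."""
    order = next(o for o in orders_3nf if o["order_no"] == order_no)
    return {**order, "customer_name": customers[order["customer_no"]]["customer_name"]}


print(order_with_customer(1))
print(order_with_customer(3))
```

Both orders placed by customer C7 pick up the new name, even though only one value was changed.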

18 ERD’s Entity Relationship Diagrams (ERDs) are used to illustrate the logical structure of a database system. An ERD is a graphical representation of the entities within the database system and shows how individual entities are related to other entities in the system. Benefits of using an ERD to represent the structure of the database system include: – –It is non-technical and can be easily understood by non-experts. This is important since the analyst must confirm that the representation of the system is correct. – –It is unambiguous as there is only one way of interpreting a well-drawn Entity Relationship Diagram.

19 ERD’s The main components of an ERD are: – –Entities; – –Attributes; – –Relationships; – –Relationship Optionality; – –Barred Relationship; – –Cardinality; – –Transferability.

20 ERD’s Entities – –An entity is a person, object, place or event about which information is collected. It is equivalent to a database table. An entity can be described by its properties or attributes. For example, a STUDENT entity may have attributes such as surname, address, date of birth. – –A single entity represents a group of objects with the same properties. Each single object within an entity is called an entity occurrence.

21 ERD’s There are two different types of entity: strong entities and weak entities. – –A strong entity does not rely on another entity for identification. Instead, it has enough attributes of its own to make a unique primary key. – –A weak entity cannot form a unique primary key on its own. Instead, it must make use of the primary key of another entity to form a unique identifier. In other words, a weak entity depends on another entity to exist. The entity that supports a weak entity by providing a foreign key is often referred to as the owner entity. A weak entity cannot stand alone and would not be queried on its own.

22 ERD’s Attributes – –Attributes are the properties of entities and represent everything that we know about the entity. – –Attributes can be identifiers of the entity (in other words, forming all or part of the primary key of the entity) or descriptors that describe a non-unique characteristic of an entity occurrence.

23 ERD’s Relationships – –A relationship represents an association between two or more entities. Each relationship in a system is given a name that describes its function. – –The cardinality of a relationship specifies the number of entity occurrences that take part in a particular relationship. – –Relationship cardinality can be: one-to-one (1:1) one-to-many (1:M) – many is shown with crow’s feet many-to-many (M:M) – try not to have any of these!

24 ERD’s Relationship Optionality – –A relationship in an Entity Relationship Diagram can be either mandatory or optional. If an instance of an entity must always occur for an entity to be included in a relationship, then it is mandatory. An example of mandatory relationship is the statement "every project must be managed by a single department". Shown with a full line. – –However, if at least one instance of an entity is not required, the relationship is optional. An example of optional relationship is the statement, "employees may be assigned to work on projects". Shown with a dashed line.

25 ERD’s Barred relationships – –In our ERD we do not show foreign keys. Instead we draw a line through our relationship to show that a primary key will be placed in the other entity as a foreign key. Non-transferability – –We use a diamond on our relationship to show that once a relation is made between our two entities it cannot be transferred. Rules for both – –Always show both of these on our intersection entities (hidden).

26 ERD’s Entity Relationship Diagrams can be produced by following the steps listed below: – –Identify all possible entities in the system. Remember that entities are used to store information. – –Identify the attributes in each entity - remember not to include attributes of one entity in another entity. Identify the primary keys of each entity using # - this may not be possible in weak entities (history logs are generally weak entities!) Mandatory attributes have *, and optional attributes o. – –Identify the relationships between entities and define the cardinality and optionality of these relationships. – –Resolve any many-to-many relationships. This may mean that the list of entities and attributes may need to be revised and updated. – –Include barred relationships if a foreign key is within an entity. – –Include your diamonds for non-transferability.
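Resolving a many-to-many relationship usually means adding a link entity between the two original entities. The sketch below assumes a PUPIL/SUBJECT example with a link entity called ENTRY; the table and column names are invented for illustration and the code uses Python's built-in `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE pupil (pupil_id INTEGER PRIMARY KEY, surname TEXT NOT NULL);
CREATE TABLE subject (subject_code TEXT PRIMARY KEY NOT NULL, title TEXT NOT NULL);
-- ENTRY resolves PUPIL M:M SUBJECT into two 1:M relationships.
-- Its primary key is made up of the two foreign keys.
CREATE TABLE entry (
    pupil_id INTEGER NOT NULL REFERENCES pupil(pupil_id),
    subject_code TEXT NOT NULL REFERENCES subject(subject_code),
    PRIMARY KEY (pupil_id, subject_code)
);
INSERT INTO pupil VALUES (1, 'Lee'), (2, 'Patel');
INSERT INTO subject VALUES ('MA', 'Maths'), ('PH', 'Physics');
INSERT INTO entry VALUES (1, 'MA'), (1, 'PH'), (2, 'MA');
""")

# One pupil, many subjects.
subjects_for_pupil_1 = [row[0] for row in conn.execute(
    "SELECT s.title FROM entry e JOIN subject s ON s.subject_code = e.subject_code "
    "WHERE e.pupil_id = 1 ORDER BY s.title")]

# One subject, many pupils.
pupils_taking_maths = [row[0] for row in conn.execute(
    "SELECT p.surname FROM entry e JOIN pupil p ON p.pupil_id = e.pupil_id "
    "WHERE e.subject_code = 'MA' ORDER BY p.surname")]

print(subjects_for_pupil_1)
print(pupils_taking_maths)
```

Each row of ENTRY belongs to exactly one pupil and exactly one subject, so neither of the remaining relationships is many-to-many.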

27 Data Dictionary A data dictionary is a catalogue of all data items in a system. The data dictionary stores details and descriptions of all data items. The data dictionary is usually developed after normalisation has been carried out and helps the analyst in determining system requirements. A data dictionary is used to fully describe all data items that are held in a system.

28 Data Dictionary. The ERD does not give any indication of the type or size of each attribute nor does it show the restrictions that apply to an attribute or where else in the system the attribute is being used. These details are held in a properly developed data dictionary. The details of each data item that must be included in a data dictionary are: – –attribute name – –entity – –type – –size – –validation – –index/key

29 Data Dictionary. Item Name – –It is good practice to make sure that all item names in a data dictionary are unique. One way to do this is to incorporate the name of the entity within the item name. This ensures that any attributes that have more than one entry in the data dictionary (for example, foreign keys) are easily distinguishable. Data Type – –Common data types indicated in a data dictionary include: number (used to store any value that consists of numbers only) text (used to store any value that is made up of characters or characters combined with numbers) date (used to store any value that represents a calendar date) time (used to store any value that represents a time of the day) auto (used to indicate that the value is a number that the system generates automatically)

30 Data Dictionary. Data Size – –The size of a data item refers to the maximum number of characters that will be allowable. This only needs to be considered when the data type is text. – –It is important to consider the size of each data item since the defaults set by most development tools are often far larger than the sizes necessary. By setting an appropriate maximum size, the analyst is reducing the amount of wasted memory that the system will use when implemented. – –When a data item requires no more than a few characters, it is extremely inefficient to allow the database software to set a default size of, say, 50.

31 Data Dictionary. Foreign Keys – –Note that the data type and size of any foreign keys must match the type and size set for the original primary value. In particular, if a foreign key has a primary value that has been set as an auto value, then the foreign key type must be set as number.

32 Data Dictionary. Validation – –The validation columns of the data dictionary should indicate any restrictions that should be applied to the data item when values are being entered into the system. – –Validation checks automatically check any input values to make sure that they are sensible. Common validation checks that are indicated in a data dictionary include: presence check - can a null value be used range check - values entered are within certain pre-defined upper and lower limits restricted choice - only certain values can be entered lookup - existing (primary) value of any foreign key data item
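One way to picture a data dictionary entry is as a record holding the details listed on the earlier slides, together with its validation rules. The class below is a hypothetical sketch: the name `DictionaryEntry`, its fields and the sample attributes are assumptions for illustration, and the lookup check is left out because it would need access to the related entity.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class DictionaryEntry:
    """One row of a (hypothetical) data dictionary."""
    attribute: str
    entity: str
    data_type: str                          # "number", "text", "date", "time" or "auto"
    size: Optional[int] = None              # maximum length, for text items only
    required: bool = False                  # presence check
    value_range: Optional[Tuple] = None     # (low, high) for a range check
    choices: Optional[Sequence] = None      # allowed values for a restricted choice
    key: str = ""                           # "PK", "FK", "PK/FK" or empty

    def problems(self, value):
        """Return a list of the validation checks that `value` fails."""
        found = []
        if value is None:
            if self.required:
                found.append("presence check failed")
            return found
        if self.data_type == "text" and self.size is not None and len(value) > self.size:
            found.append("size check failed")
        if self.value_range is not None:
            low, high = self.value_range
            if not low <= value <= high:
                found.append("range check failed")
        if self.choices is not None and value not in self.choices:
            found.append("restricted choice failed")
        return found


year = DictionaryEntry("pupil_year", "PUPIL", "number", required=True, value_range=(1, 6))
grade = DictionaryEntry("entry_grade", "ENTRY", "text", size=1, choices=("A", "B", "C", "D"))

print(year.problems(7))
print(year.problems(None))
print(grade.problems("AB"))
```

Incorporating the entity name in each attribute name, as in `pupil_year` and `entry_grade`, keeps the item names unique across the dictionary.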

33 Data Dictionary. Index and Key – –The Index/Key column of the data dictionary is used to indicate whether or not the data item is to be indexed and if it is to be indexed, whether or not the data item is a primary key or a foreign key. – –Indexing Database systems take advantage of indexing to increase their speed. A database index can speed up a query by hundreds or thousands of times. Indexing is the notion of storing data on a hard disk in a particular way in order to locate and retrieve the data as efficiently as possible. – –Keys When a data item is marked as indexed, the analyst must indicate if indexing is due to the fact that the data item is a primary key (PK) or if it is because the data item is a foreign key (FK). Where a foreign key is part of a primary key, both of these facts should be indicated in the data dictionary.
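The effect of an index can be observed by asking the database how it plans to run a query. This sketch uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` statement; the PUPIL table, the index name and the generated surnames are made up for the example, and the exact wording of the plan differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pupil (pupil_id INTEGER PRIMARY KEY, surname TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO pupil (surname) VALUES (?)",
    [(f"Name{i}",) for i in range(1000)],
)


def plan(sql):
    """Return SQLite's description of how it would run the query.

    The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    """
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT pupil_id FROM pupil WHERE surname = 'Name500'"

before = plan(query)   # no index on surname: every row is scanned
conn.execute("CREATE INDEX idx_pupil_surname ON pupil (surname)")
after = plan(query)    # the plan now mentions idx_pupil_surname

print(before)
print(after)
```

Without the index the database has to read every row to find a surname; with it, the matching rows are located directly. The primary key is indexed automatically, which is why only the extra index on surname had to be created.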

34 Entity Event modelling. There are two types of entity event modelling: –Entity Event Matrix & –Entity Life History.

35 Entity event modeling. Purpose of an Entity Event Matrix – –An Entity Event Matrix is used to record a list of all permissible events that can occur and shows the effect that these events have on the entities within the system. – –In this context, an event is something that triggers a process into updating data within the system. An effect is the change caused by the event such as the creation, deletion or modification of an entity occurrence.

36 Entity event modelling. One single Entity Event Matrix is created for the entire system. This matrix lists all entities in the system across the top of the matrix. The events that occur are listed on separate rows of the matrix. Within the matrix, the effect caused by an event is recorded as follows: – –C - This is used to indicate that the described event causes a new entity occurrence (or new row of the table) to be created within the entity indicated. – –D - This is used to indicate that the described event causes an existing entity occurrence (or existing row of the table) to be deleted from the entity indicated. – –M - This is used to indicate that the described event causes existing data values within the entity indicated to be modified. – –R - This is used to indicate that the described event causes data values held within the entity indicated to be read by the process that is triggered by the event.

37 Entity Event modelling. Creating an Entity Event Matrix – –To create an Entity Event Matrix for a particular system, the following steps should be followed: Create the headings for the matrix remembering that one column is needed to list the events that can occur and a separate column is needed for each entity. Write down a list of events that can occur in the real world and will trigger a process to be carried out within the system. In carrying out the process, entities within the system are affected in some way. Work through the events in the list one at a time and consider how each entity will be affected by the event described. Use the symbols C, D, M and R to record the effect of each event on individual entities. Remember that one single event can cause several entities to be altered in some way.
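An Entity Event Matrix maps naturally onto a table held in code. The library system below, with its entities and events, is invented purely for illustration, as are the helper functions `render` and `never_created`.

```python
# Hypothetical library system: entities across the top, events down the side.
entities = ["MEMBER", "BOOK", "LOAN"]

# Each event records its effect (C, D, M or R) on the entities it touches.
matrix = {
    "Member joins library":   {"MEMBER": "C"},
    "Member changes address": {"MEMBER": "M"},
    "Book is borrowed":       {"MEMBER": "R", "BOOK": "M", "LOAN": "C"},
    "Book is returned":       {"BOOK": "M", "LOAN": "M"},
    "Member leaves library":  {"MEMBER": "D"},
}


def render(matrix, entities):
    """Lay the matrix out as text, one row per event."""
    width = max(len(event) for event in matrix)
    lines = ["Event".ljust(width) + "  " + "  ".join(e.ljust(6) for e in entities)]
    for event, effects in matrix.items():
        cells = "  ".join(effects.get(e, "").ljust(6) for e in entities)
        lines.append(event.ljust(width) + "  " + cells)
    return "\n".join(lines)


def never_created(matrix, entities):
    """List entities that no event creates, which suggests a missing event."""
    return [
        e for e in entities
        if not any(effects.get(e) == "C" for effects in matrix.values())
    ]


print(render(matrix, entities))
print(never_created(matrix, entities))
```

Here the check reports BOOK, because the sample list of events never creates a book. That would prompt the analyst to add an event such as a book being purchased.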

38 Entity Life Histories – –An Entity Life History Diagram is used to record the permissible sequence of events that can occur within any one entity. It also indicates which events are repeated and those events that are alternatives. – –Each entity must have a separate Entity Life History Diagram. This means that there may be several Entity Life History Diagrams for any one information system. – –The events shown on an Entity Life History Diagram must match those recorded in the Entity Event Matrix for the system.

39 Entity Life Histories – –Events shown in an Entity Life History (ELH) Diagram are read from left to right. Events on the left of the diagram must occur before those on the right. – –Each branch of an ELH diagram shows a separate category of events. Three different categories of event can be shown in an ELH: creation events modification events deletion events – –Events that cause read activity are not shown in an Entity Life History Diagram. Each level of the diagram shows additional detail of an event or an event category.

40 Entity Life Histories To create an Entity Life History Diagram for a particular system, the following steps should be followed: – –Create the Entity Event Matrix for the system. – –Consider each entity within the matrix separately. – –Draw a box to represent the entity at the top level of the diagram. – –Look down the column of the Entity Event Matrix that represents the entity and decide which categories of events are to be shown on the second level of the ELH. Remember that only 3 categories of events can be shown: Creation, Modification and Deletion. Draw a box for each category present. – –Complete each branch of the ELH separately. Remember to pay attention to the use made of * to indicate repetition and the use of o to indicate alternative events. – –Repeat steps 3 - 5 for each entity in the system.
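The step of looking down one column of the matrix to find the second-level branches can be sketched as follows. The matrix is a small hypothetical library example and `elh_branches` is a made-up helper; the sketch only groups events into categories and does not decide which events are iterations (*) or alternatives (o).

```python
# Hypothetical Entity Event Matrix: event -> {entity: effect}.
matrix = {
    "Member joins library":   {"MEMBER": "C"},
    "Member changes address": {"MEMBER": "M"},
    "Book is borrowed":       {"MEMBER": "R", "LOAN": "C"},
    "Book is returned":       {"LOAN": "M"},
    "Member leaves library":  {"MEMBER": "D"},
}

# Read (R) effects are deliberately absent: they are not shown in an ELH.
CATEGORY = {"C": "Creation", "M": "Modification", "D": "Deletion"}


def elh_branches(matrix, entity):
    """Group the events in one entity's column into ELH branches.

    Only categories that actually occur are returned, in the left-to-right
    order Creation, Modification, Deletion.
    """
    branches = {}
    for event, effects in matrix.items():
        category = CATEGORY.get(effects.get(entity))
        if category:
            branches.setdefault(category, []).append(event)
    return {name: branches[name] for name in CATEGORY.values() if name in branches}


member_branches = elh_branches(matrix, "MEMBER")
loan_branches = elh_branches(matrix, "LOAN")
print(member_branches)
print(loan_branches)
```

In this sample, MEMBER gets all three branches, while LOAN has no Deletion branch because no event in the matrix deletes a loan.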

41 Data Flow Diagrams A data flow diagram is a graphical model of the system that shows the movement of data between the components of the system. There are two levels to a DFD: –Level 0; –Level 1.

42 What is a Level 0 DFD? Level 0 shows the main system and what data or objects are passed to/from external entities. –If the system is a whole company, external entities are people who do not work in the company being modelled or other companies. –If the system is a department, external entities are other departments or people who do not work in the department being modelled.

43 What is a Level 1 DFD? A level 1 DFD is a diagrammatic way of showing the main processes, data flows and data stores within the system.
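A level 1 DFD can be recorded as a list of data flows between named components. The sketch below is hypothetical: the library components, the flow names and the function `invalid_flows` are invented, and the rule it checks (every data flow must start or end at a process) is a commonly taught DFD convention rather than something stated on these slides.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    source: str
    target: str
    data: str


# Hypothetical components of a level 1 DFD for a library system.
kinds = {
    "Member": "external entity",
    "1 Issue loan": "process",
    "2 Return book": "process",
    "D1 Loans": "data store",
}

flows = [
    Flow("Member", "1 Issue loan", "loan request"),
    Flow("1 Issue loan", "D1 Loans", "new loan"),
    Flow("Member", "2 Return book", "returned book details"),
    Flow("D1 Loans", "2 Return book", "loan record"),
    # Deliberate mistake: an external entity writing straight to a data store.
    Flow("Member", "D1 Loans", "loan request"),
]


def invalid_flows(flows, kinds):
    """Return the flows that do not have a process at either end."""
    return [
        flow for flow in flows
        if "process" not in (kinds[flow.source], kinds[flow.target])
    ]


bad = invalid_flows(flows, kinds)
print(bad)
```

Only the last flow is reported, since it links an external entity directly to a data store without passing through a process.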

