1 Information Retrieval and Use Data Analysis & Data Modeling, Relational Data Analysis and Logical Data Modeling Geoff Leese September 2009.


1 Information Retrieval and Use
Data Analysis & Data Modeling, Relational Data Analysis and Logical Data Modeling
Geoff Leese, September 2009

2 Relational Data Analysis
- Captures the detailed knowledge of the meaning of the data.
- Ensures that the data is logically easy to maintain and extend.
  - Data inter-dependencies have been identified.
  - Ambiguities have been resolved.
  - Eliminates unnecessary duplication of data.
  - Forms the data into optimum groups.
  - Validates the Logical Data Model (LDM).

3 Logical Data Modelling
- Basic rules for converting 3NF to an LDM:
  - Create an entity type for each data relation.
  - Mark qualifying foreign keys.
  - Check compound key relations.
  - Make foreign/primary key relations.

4 Guidelines for logical modelling
- Entity type names are singular nouns, descriptive, concise and organisation specific.
- Attribute names are unique descriptive nouns of standard format.
- Relationship names are descriptive, precise verb phrases.

5 Simple Master - Detail relationships
- A simple master - detail relationship exists where a single foreign key of one relation corresponds to the primary key of another relation.
- See next slide for example.

6 Simple Master - Detail relationships
[Diagram: shows a SINGLE primary key at the MASTER entity (Organisation) connected to a SINGLE foreign key at the DETAIL entity (Contact people)]
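
The Organisation / Contact people example can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names, column names and sample rows below are hypothetical and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE organisation (            -- MASTER entity
    organisation_id   INTEGER PRIMARY KEY,
    organisation_name TEXT NOT NULL
);
CREATE TABLE contact_person (          -- DETAIL entity
    contact_id      INTEGER PRIMARY KEY,
    contact_name    TEXT NOT NULL,
    organisation_id INTEGER NOT NULL
        REFERENCES organisation(organisation_id)  -- single foreign key
);
""")

conn.execute("INSERT INTO organisation VALUES (1, 'Example Org')")
conn.execute("INSERT INTO contact_person VALUES (1, 'A Contact', 1)")

# A detail row that points at a non-existent master should be rejected.
try:
    conn.execute("INSERT INTO contact_person VALUES (2, 'Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

contact_count = conn.execute("SELECT COUNT(*) FROM contact_person").fetchone()[0]
```

The foreign key sits on the detail side, so one organisation can have many contact people while each contact person belongs to exactly one organisation.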

7 Multiple level Master - Detail Relationships
- Example: five entities
[Diagram not reproduced in transcript]

8 Identifying Recursive (Unary) Relationships
- Is a relation where a foreign key references the same relation.
- Example: Employee (Employee-number, Employee-name, Employee-manager-number), where Employee-manager-number references Employee.
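
A minimal sketch of the Employee example, again using sqlite3. The attribute names follow the slide; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
CREATE TABLE employee (
    employee_number         INTEGER PRIMARY KEY,
    employee_name           TEXT NOT NULL,
    -- The foreign key references the same relation (NULL = no manager).
    employee_manager_number INTEGER REFERENCES employee(employee_number)
)""")

# Hypothetical sample data: a three-level reporting chain.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Top Manager", None), (2, "Team Lead", 1), (3, "Analyst", 2)],
)

# Join the relation to itself to pair each employee with their manager.
rows = conn.execute("""
SELECT e.employee_name, m.employee_name
FROM employee AS e
LEFT JOIN employee AS m
       ON e.employee_manager_number = m.employee_number
ORDER BY e.employee_number
""").fetchall()
```

The self-join treats the one table as two roles, employee and manager, which is what the recursive relationship on the diagram expresses.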

9 Relationships: Student/Module
- At this point we need to identify the data items that describe or identify each entity.
- Entity attributes are also known as data items.
- What are the data items associated with the following LDS diagram?
[Diagram: Student "Takes" Module; Module "Is taken by" Student]

10 The Student
Entity Type   Attribute Name   Attribute
Student       Student Name     Jones
              Street Address   Leek Road
              Town             Stoke-on-Trent
              Post Code        ST4 2DE
              Telephone        294303
[Diagram: Student "Takes" Module; Module "Is taken by" Student]

11 The Module
Entity Type   Attribute Name   Attribute
Module        Module Number    CM5111-1
              Module Name      SSAT
              Module Leader    A Lecturer
              Level            1
              Cats Points      10
[Diagram: Student "Takes" Module; Module "Is taken by" Student]

12 The Data Items
[Diagram: Student "Takes" Module; Module "Is taken by" Student]
- Module: Module Number, Module Name, Module Leader, Level, Cats Points
- Student: Student Name, Street Address, Town, Post Code (e.g. ST4 2DE), Telephone

13 Identifying occurrences of entities
- Each occurrence of an entity must be uniquely identified in some way.
- Imagine a British Gas database that used only surnames to identify account holders.
- There would be 100,000 account holders called Jones in this country.
- Even if we used the given names there would still be considerable duplication.
- It would be impossible to find the right account by name alone.
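
The problem with identifying by name can be shown in a few lines of Python. The account numbers and names below are invented for illustration.

```python
# Hypothetical account holders; two of them share surname AND given name.
accounts = [
    {"account_number": 1001, "surname": "Jones", "given_name": "David"},
    {"account_number": 1002, "surname": "Jones", "given_name": "David"},
    {"account_number": 1003, "surname": "Jones", "given_name": "Sian"},
]

# Looking up by name is ambiguous: more than one occurrence matches.
by_name = [
    a for a in accounts
    if (a["surname"], a["given_name"]) == ("Jones", "David")
]

# Looking up by a unique identifier returns exactly one occurrence.
by_key = {a["account_number"]: a for a in accounts}
one_account = by_key[1002]
```

Here `by_name` holds two candidate accounts, while `by_key[1002]` picks out a single one, which is the job a primary key does.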

14 Adding a Primary Key
[Diagram: Student "Takes" Module; Module "Is taken by" Student]
- Module: Module Number, Module Name, Module Leader, Level, Cats Points
- Student: Student Number (primary key added), Student Name, Street Address, Town, Post Code (e.g. ST4 2DE), Telephone

15 Relationships: Getting it right
[Diagram: Student "Takes" Module; Module "Is taken by" Student]
Is this right? The real situation is surely:
[Diagram: Student "Takes" Module; Module "Is taken by" Student]

16 Putting it right: Intersection entity
[Diagram: Student linked to Module through the intersection entity Stud/Mod]
- Stud/Mod: Student Number, Module Number
- Module: Module Number, Module Name, Module Leader, Level, Cats Points
- Student: Student Number, Student Name, Street Address, Town, Post Code (e.g. ST4 2DE), Telephone
We need a link entity - less ambiguity
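
As a sketch, the link entity becomes a third table whose compound primary key is made of the two foreign keys. The table and column names are assumptions; student number 1 and the second student and module are hypothetical sample data (only Jones, CM5111-1 and SSAT appear on the earlier slides).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (
    student_number INTEGER PRIMARY KEY,
    student_name   TEXT NOT NULL
);
CREATE TABLE module (
    module_number TEXT PRIMARY KEY,
    module_name   TEXT NOT NULL
);
CREATE TABLE student_module (          -- the Stud/Mod intersection entity
    student_number INTEGER NOT NULL REFERENCES student(student_number),
    module_number  TEXT    NOT NULL REFERENCES module(module_number),
    PRIMARY KEY (student_number, module_number)
);
""")

conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Jones"), (2, "Smith")])
conn.executemany("INSERT INTO module VALUES (?, ?)",
                 [("CM5111-1", "SSAT"), ("CM0000-0", "Hypothetical Module")])
conn.executemany("INSERT INTO student_module VALUES (?, ?)",
                 [(1, "CM5111-1"), (1, "CM0000-0"), (2, "CM5111-1")])

# One student takes many modules...
modules_for_student_1 = [r[0] for r in conn.execute(
    "SELECT module_number FROM student_module "
    "WHERE student_number = 1 ORDER BY module_number")]
# ...and one module is taken by many students.
students_on_ssat = [r[0] for r in conn.execute(
    "SELECT student_number FROM student_module "
    "WHERE module_number = 'CM5111-1' ORDER BY student_number")]
```

Each side now has a simple master - detail relationship with the link table, and the compound key stops the same student being recorded on the same module twice.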

17 Normalisation - steps
- Start with a set of un-normalised tables
  - Entity/attribute list
- Step 1 - remove ambiguity and repeating data
- Step 2 - remove shared data

18 Normalisation - step 1
- Break down ALL attributes into their smallest meaningful parts
  - e.g. student name becomes student surname, student firstname, student title
- Remove REPEATED information to form a new table
  - e.g. a course may be composed of MANY modules (but assume that each module is only on one course!) - so form a MODULE table

19 Normalisation - step 2
- Remove SHARED data to form new tables
  - e.g. modules may share tutors - so form a TUTORS table.
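
The two steps can be sketched on a tiny un-normalised record. Everything here is hypothetical sample data; in particular the `tutor_room` attribute is invented so that the tutor has some shared data of its own to move.

```python
# Un-normalised: one course record carrying a repeating group of modules,
# with the tutor's details repeated inside every module.
unnormalised = [
    {
        "course": "Computing",
        "modules": [
            {"module_number": "M1", "module_name": "Databases",
             "tutor": "Tutor A", "tutor_room": "Room 1"},
            {"module_number": "M2", "module_name": "Programming",
             "tutor": "Tutor A", "tutor_room": "Room 1"},
        ],
    },
]

course_table = []
module_table = []
tutor_table = {}  # keyed by tutor so shared details are stored once

for record in unnormalised:
    course_table.append({"course": record["course"]})
    for m in record["modules"]:
        # Step 1: the repeating group becomes rows of a MODULE table,
        # each carrying the course it belongs to as a foreign key.
        module_table.append({
            "module_number": m["module_number"],
            "module_name": m["module_name"],
            "course": record["course"],
            "tutor": m["tutor"],  # foreign key into the TUTORS table
        })
        # Step 2: data shared between modules moves to a TUTORS table.
        tutor_table[m["tutor"]] = {"tutor": m["tutor"],
                                   "tutor_room": m["tutor_room"]}
```

After the split, the tutor's room is held once rather than once per module, so changing it means updating a single row.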

20 Normalisation
- FIRST NORMAL FORM - a relation (table) is in 1NF if it contains atomic values and all repeating groups have been removed

21 Normalisation
- SECOND NORMAL FORM - a relation (table) is in 2NF if it is in 1NF and every non-key attribute is fully dependent on the primary key

22 Normalisation
- THIRD NORMAL FORM - a relation (table) is in 3NF if it is in 2NF and no non-key attribute is dependent on any other non-key attribute
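
The 2NF and 3NF definitions both talk about which attributes depend on which. A rough way to explore this on sample data is to test whether one set of attributes determines another. The helper `determines` and all rows below are hypothetical, and a check on sample rows can only suggest a dependency; whether it really holds is a question about the meaning of the data.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these sample rows, every combination of lhs
    values is paired with exactly one combination of rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# Compound key (student_number, module_number). If module_name depends
# on module_number alone, it is only partially dependent on the key,
# which breaks 2NF.
enrolment = [
    {"student_number": 1, "module_number": "M1", "module_name": "Databases"},
    {"student_number": 2, "module_number": "M1", "module_name": "Databases"},
    {"student_number": 1, "module_number": "M2", "module_name": "Programming"},
]
partial_dependency = determines(enrolment, ["module_number"], ["module_name"])

# Key module_number. If tutor_room depends on tutor, one non-key
# attribute depends on another, which breaks 3NF.
module = [
    {"module_number": "M1", "tutor": "Tutor A", "tutor_room": "Room 1"},
    {"module_number": "M2", "tutor": "Tutor A", "tutor_room": "Room 1"},
    {"module_number": "M3", "tutor": "Tutor B", "tutor_room": "Room 2"},
]
transitive_dependency = determines(module, ["tutor"], ["tutor_room"])
tutor_identifies_module = determines(module, ["tutor"], ["module_number"])
```

In this sample both `partial_dependency` and `transitive_dependency` come out true, suggesting that module names belong in a MODULE relation and tutor rooms in a TUTORS relation.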

23 Relational Data Analysis Form
- Validates the LDM against the relations.
- Consists of:
  - Unnormalised Form
    - Attributes
  - First Normal Form (1NF)
  - Second Normal Form (2NF)
  - Third Normal Form (3NF)
    - Relations
    - Attributes

24 RDA Form
[Form layout: header fields Name and Date; columns UNF, 1NF, 2NF, 3NF and Result; entries are relations and their attributes]

25 Data Dictionary
- Lists, for every field in every table:
  - Tablename
  - Fieldname
  - Field Type
  - Field size (if variable)
  - Decimal places (if applicable)
  - Description (if required)
  - Other significant field properties

26 Data Dictionary example
[Example not reproduced in transcript]
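
As a stand-in illustration (not the example from the original slide), part of a data dictionary can be generated from an existing SQLite database using `PRAGMA table_info`. The `student` table and the `data_dictionary` helper are hypothetical. SQLite does not record field sizes or decimal places, so those columns of the dictionary would still have to be filled in by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    student_number INTEGER PRIMARY KEY,
    student_name   TEXT NOT NULL,
    post_code      TEXT
)""")


def data_dictionary(connection):
    """List table name, field name, type and key properties per field."""
    entries = []
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, field_type, notnull, _default, pk in connection.execute(
                f'PRAGMA table_info("{table}")'):
            entries.append({
                "table": table,
                "field": name,
                "type": field_type,
                "required": bool(notnull),
                "primary_key": bool(pk),
            })
    return entries


dictionary = data_dictionary(conn)
for entry in dictionary:
    print(entry)
```

Deriving the dictionary from the schema keeps the two in step, though descriptions and other field properties still need a human author.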

27 The domain
- Is the “set” of items, together with its definition, to which an attribute belongs.
- Define a domain once; this saves time when defining the attributes that belong to it.
- For example, Date of Birth, Course Start Date and Enrolment Date all belong to the DATE domain: data type is date/time, format dd/mm/yyyy, non-unique, non-null.
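
One way to picture "define once, reuse" is a small Python class holding the domain's definition, shared by every attribute that belongs to it. The `Domain` class and its `accepts` method are hypothetical; the DATE properties are taken from the slide, with dd/mm/yyyy written as the `strptime` pattern `%d/%m/%Y`.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Domain:
    name: str
    data_type: str
    date_format: str
    unique: bool
    nullable: bool

    def accepts(self, value):
        """Check a single value against the domain's null and format rules."""
        if value is None:
            return self.nullable
        try:
            datetime.strptime(value, self.date_format)
        except ValueError:
            return False
        return True


# The domain is defined once...
DATE = Domain("DATE", "date/time", "%d/%m/%Y", unique=False, nullable=False)

# ...and every attribute that belongs to it simply refers to it.
attributes = {
    "date_of_birth": DATE,
    "course_start_date": DATE,
    "enrolment_date": DATE,
}

valid = attributes["enrolment_date"].accepts("21/09/2009")
wrong_format = attributes["date_of_birth"].accepts("2009-09-21")
missing = attributes["course_start_date"].accepts(None)
```

If the format ever changes, only the single `DATE` definition needs editing and all three attributes pick up the change.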

28 Further reading
- Rolland chapters 3 and 4
- Hoffer chapters 10 and 12
- Kendall & Kendall chapter 17

