Logical Data Modeling – Attributes, Primary Keys, and Identity

1 Logical Data Modeling – Attributes, Primary Keys, and Identity
BCHB697

2 Outline
- The Data Modeling Process
- Entity Attributes & Types
- Determinants and Primary Keys
BCHB697 - Edwards

3 The Data Modeling Process
- Conceptual Data Modeling
  - Define the entities and their relationships
- Logical Data Modeling
  - Define each entity's attributes, incl. types
  - Choose primary key(s) for each entity
  - Relationship details (cardinality, optional)
- Physical Data Modeling
  - Implementation

4 Data-Model Properties
- Completeness
- Non-redundancy
- "Business" Logic
- Data Reusability
- Stability and Flexibility
- Elegance
- Communication
- Integration
Source: Data Modeling Essentials (§1.6)

5 Entity Attributes
- Any (and all) details about an entity
- Will become columns in database table.
- Needs definition and type: string, integer, float, (date, ...)
- Optional? More than one value?
- One fact per attribute, no derivable values, implicit values
- Person: name, family name, date of birth, age, GU ID #, NetID, address, email, phone
- Course: course number, department, meeting time(s), day(s), start date, end date, director, syllabus, credits, prerequisites, semester, year
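
The "no derivable values" guideline can be illustrated with the Person attributes above: age follows from date of birth, so a model would typically store only the latter. A minimal Python sketch, assuming a hypothetical `Person` class and `age_on` helper (neither appears in the slides):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Person:
    # One fact per attribute; optional attributes default to None.
    name: str
    family_name: str
    date_of_birth: date
    net_id: str
    phone: Optional[str] = None  # optional attribute, may be missing

    def age_on(self, today: date) -> int:
        # Age is derived from date_of_birth instead of being stored.
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


# Made-up instance for illustration only.
example = Person("Alex", "Doe", date(1990, 6, 15), "ad1")
```

Storing `age` as well would create two attributes that can disagree; computing it on demand keeps a single source of truth.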

6 Course Database Attributes
- Person: name, family name, date of birth, age, GU ID #, NetID, address, email, phone
- Course: course number, department, meeting time(s), day(s), start date, end date, syllabus, credits, prerequisites, semester, year
- CourseParticipant: person, course, role

7 Blast Database Entities and Attributes
- Protein: accession, gi, species, description, length, (sequence)
- Alignment: bit score, E-value, query protein, reference protein
- HighScoringPair: ordinal, bit score, E-value, identities, positives, gaps, start position, end position, (aligned sequences), query protein, reference protein, (alignment)

8 Attribute Types
- Basics: integer, float, string
  - Sometimes: boolean, date, point (lat, long), …
  - Missing values of optional attributes: NULL
  - Multiple values → multiple attributes
- Semantic types:
  - Identifier – unordered, test for equality only
  - Category – few unordered, discrete values
  - Numeric – ordered, arithmetic, precision (?)
  - Text – ordered (sortable), no arithmetic
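
The semantic types can be made concrete with a small Python sketch. The `CourseNumber`, `Role`, and `CourseInfo` names, the role values, and the credit counts are assumptions made for illustration, not part of the course database described in the slides:

```python
from dataclasses import dataclass
from enum import Enum
from typing import NewType, Optional

# Identifier: an opaque label; only equality tests are meaningful.
CourseNumber = NewType("CourseNumber", str)


class Role(Enum):
    # Category: a small set of unordered, discrete values (assumed here).
    STUDENT = "student"
    INSTRUCTOR = "instructor"


@dataclass
class CourseInfo:
    number: CourseNumber            # identifier
    credits: int                    # numeric: ordered, supports arithmetic
    syllabus: Optional[str] = None  # text; None plays the role of NULL


a = CourseInfo(CourseNumber("BCHB697"), credits=3)
b = CourseInfo(CourseNumber("BCHB580"), credits=3)
same_course = a.number == b.number      # equality is the only sensible test
total_credits = a.credits + b.credits   # arithmetic makes sense for numerics
```

Even though an identifier may be stored as a string or an integer, sorting it or adding to it carries no meaning; the semantic type, not the storage type, decides which operations are sensible.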

11 Identifier Attributes
- May be integers or strings:
  - How many identifiers do we need?
  - How many digits/characters in an identifier?
- System generated:
  - Sequential integer automatically generated by the database for each instance.
- Administrator assigned:
  - Manual designation (OK for a few)
- Externally defined:
  - Explicitly provided as entity attribute value
  - Managed by external organization / authority
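
Two of the points above lend themselves to a short sketch: sizing an identifier, and letting the database generate sequential integers. The example uses Python's standard `sqlite3` module; the `digits_needed` helper, the `person` table, and the NetID values are hypothetical and chosen only for illustration:

```python
import sqlite3


def digits_needed(n_identifiers: int, alphabet_size: int = 10) -> int:
    """Smallest fixed width that yields at least n_identifiers distinct values."""
    width = 1
    while alphabet_size ** width < n_identifiers:
        width += 1
    return width


# System-generated identifiers: in SQLite, an INTEGER PRIMARY KEY column is
# filled in automatically when no value is supplied on insert.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, net_id TEXT)")

ids = []
for net_id in ("abc1", "xyz2", "def3"):  # made-up NetIDs
    cur = conn.execute("INSERT INTO person (net_id) VALUES (?)", (net_id,))
    ids.append(cur.lastrowid)  # the identifier the database assigned
```

For instance, `digits_needed(1000, 36)` asks how many characters are required if each position may hold a digit or a letter (36 symbols) and 1000 distinct identifiers are needed.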

14 Determinants
- A determinant is any identifier attribute (or set of identifier attributes) of an entity that determines other attributes' values.
  - This should be true conceptually, not just for the current set of instances
- A candidate key of an entity is a determinant that determines all of the entity's other attribute values.
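
A quick check against sample instances can rule a determinant out, although, as noted above, it cannot establish one: the dependency has to hold conceptually. The following Python sketch uses hypothetical helper names (`determines`, `is_candidate_key`) and made-up rows:

```python
from typing import Dict, Hashable, List, Sequence

Row = Dict[str, Hashable]


def determines(rows: List[Row], determinant: Sequence[str],
               dependents: Sequence[str]) -> bool:
    """True if no two rows agree on `determinant` but differ on `dependents`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependents)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_candidate_key(rows: List[Row], attributes: Sequence[str]) -> bool:
    """True if `attributes` determine every other attribute in these rows."""
    if not rows:
        return True
    others = [a for a in rows[0] if a not in attributes]
    return determines(rows, attributes, others)


# Made-up instances for illustration only.
rows = [
    {"net_id": "aaa1", "name": "Alex", "department": "BCHB"},
    {"net_id": "bbb2", "name": "Alex", "department": "BCHB"},
    {"net_id": "ccc3", "name": "Sam", "department": "CHEM"},
]
net_id_is_key = is_candidate_key(rows, ["net_id"])
name_is_key = is_candidate_key(rows, ["name"])
```

Here `name` fails because two people share a name but have different NetIDs. A passing result only says the sample data does not contradict the dependency.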

17 Primary Keys
- Every entity (that might be referenced) requires that one candidate key be designated the primary key.
  - Surrogate for instance identity
  - Must be universal, unique, and stable.
- Primary key values are used to define entity / instance relationships
  - Foreign keys are identifier attributes whose values are another entity's primary key values.
- Single-attribute, integer primary keys are usually best.
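
The role of primary and foreign keys can be sketched with Python's standard `sqlite3` module. The table layout and values below are assumptions for illustration; note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,      -- single-attribute integer key
    net_id    TEXT NOT NULL UNIQUE      -- another candidate key
);
CREATE TABLE course_participant (
    person_id INTEGER NOT NULL REFERENCES person(person_id),  -- foreign key
    role      TEXT NOT NULL
);
""")

conn.execute("INSERT INTO person (person_id, net_id) VALUES (1, 'abc1')")
conn.execute("INSERT INTO course_participant VALUES (1, 'student')")

# A foreign key value must match an existing primary key value.
try:
    conn.execute("INSERT INTO course_participant VALUES (99, 'student')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

# Stability: the NetID can change without touching any referencing row,
# because relationships are expressed through person_id.
conn.execute("UPDATE person SET net_id = 'abc2' WHERE person_id = 1")
linked = conn.execute(
    "SELECT p.net_id, cp.role FROM course_participant cp "
    "JOIN person p ON p.person_id = cp.person_id"
).fetchall()
```

Had `net_id` been chosen as the primary key, the update would have required changing every referencing row as well, which is one reason a stable integer key is often preferred.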

20 Logical data model
- Entities: Course, Person, CourseParticipant
  - Course: course_id, department, number, semester, year
  - Person: person_id, name, dob, GU ID #, NetID
  - CourseParticipant: course_id, person_id, role
- Relationships: Course ← CourseParticipant, Person ← CourseParticipant
  - Each CourseParticipant refers to 1 course and 1 person
- Example rosters (course: participant NetIDs):
  - BCHB594: nje5, rcf57, sg1386, kg737, yk625, yl1009, bm999, ls1340, sls358, ss4218, zsw6
  - BCHB580: nje5, ker25, sg1386, kg737, bh658, yk625, yl1009, bm999, war36, ls1340, sls358, ss4218, zsw6, mdw83, yw575, my511
  - BCHB697: nje5, sg1386, bh658, xh61, yk625, yl1009, bm999, ls1340, ss4218, lmw116, zsw6, yw575
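
One possible physical realization of this logical model, sketched with Python's standard `sqlite3` module. The column types, the composite primary key on `course_participant`, and the `courses_for` helper are assumptions; only a subset of the rosters is loaded, and attributes not given in the slide (name, semester, role, ...) are left NULL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE course (
    course_id  INTEGER PRIMARY KEY,
    department TEXT NOT NULL,
    number     TEXT NOT NULL,
    semester   TEXT,
    year       INTEGER
);
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT,
    dob       TEXT,
    gu_id     TEXT,
    net_id    TEXT UNIQUE
);
CREATE TABLE course_participant (
    course_id INTEGER NOT NULL REFERENCES course(course_id),
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    role      TEXT,
    PRIMARY KEY (course_id, person_id)  -- assumed: one row per person per course
);
""")

# Subset of the rosters, keyed by course number within department BCHB.
rosters = {
    "594": ["nje5", "rcf57", "sg1386"],
    "580": ["nje5", "sg1386"],
    "697": ["nje5", "sg1386"],
}

person_ids = {}
for number, net_ids in rosters.items():
    course_id = conn.execute(
        "INSERT INTO course (department, number) VALUES ('BCHB', ?)", (number,)
    ).lastrowid
    for net_id in net_ids:
        if net_id not in person_ids:  # each person is stored once
            person_ids[net_id] = conn.execute(
                "INSERT INTO person (net_id) VALUES (?)", (net_id,)
            ).lastrowid
        conn.execute(
            "INSERT INTO course_participant (course_id, person_id) VALUES (?, ?)",
            (course_id, person_ids[net_id]),
        )


def courses_for(net_id: str) -> list:
    """Course codes a person participates in, via the two relationships."""
    rows = conn.execute(
        "SELECT c.department || c.number FROM course c "
        "JOIN course_participant cp ON cp.course_id = c.course_id "
        "JOIN person p ON p.person_id = cp.person_id "
        "WHERE p.net_id = ? ORDER BY c.number",
        (net_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

Each person appears once in `person` regardless of how many courses they take; the repeated NetIDs in the rosters become rows of `course_participant`, which is the non-redundancy the model is after.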

21 Exercise
- Navigate to a bioinformatics knowledgebase
  - UniProt, dbSNP, ClinVar, PDB, …
- Identify the entities and relationships
- Identify the attributes of each entity
- Choose a primary key for each entity
- Reading: Chapters (4), 5, 6 (DME)

