2 Data Modeling
Data modeling – a technique for organizing and documenting a system’s data. Sometimes called database modeling.
Entity relationship diagram (ERD) – a data model utilizing several notations to depict data in terms of the entities and relationships described by that data.
3 ERD Sample
In an entity-relationship diagram, entities are labelled with singular nouns and relationships are labelled with verbs. The relationship is interpreted as a simple English sentence.
4 ERD: Entity
Entity – a class of persons, places, objects, events, or concepts about which we need to capture and store data.
Persons: agency, contractor, customer, department, division, employee, instructor, student, supplier.
Places: sales region, building, room, branch office, campus.
Objects: book, machine, part, product, raw material, software license, software package, tool, vehicle model, vehicle.
Events: application, award, cancellation, class, flight, invoice, order, registration, renewal, requisition, reservation, sale, trip.
Concepts: account, block of time, bond, course, fund, qualification, stock.
Teaching Notes:
Prompt the students for additional examples. Have them classify their example(s).
Obtain a data model from a source other than the textbook. Ask the students to classify the entities.
5 ERD: Relationships
Relationship – a natural business association that exists between one or more entities.
The relationship may represent an event that links the entities or merely a logical affinity that exists between the entities.
Teaching Notes:
Explain that there may be more than one relationship between two entities. You may reinforce this by adding additional relationships to the example, such as “transferred from” (to reflect a relationship where students changed from one curriculum to another).
6 ERD: Relationships
Types of Relationships
Three types of relationships can exist between entities:
One-to-one relationship (1:1)
One-to-many relationship (1:M)
Many-to-many relationship (M:N)
10 ERD: Cardinality
Cardinality – the minimum and maximum number of occurrences of one entity that may be related to a single occurrence of the other entity.
Because all relationships are bidirectional, cardinality must be defined in both directions for every relationship.
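The idea of defining cardinality in both directions can be sketched in Python. This is only an illustration: the class names, the STUDENT and CURRICULUM entities, and the verb phrase are assumptions chosen for the example, and a maximum of `None` is used here to stand for "many".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cardinality:
    """Minimum (0 or 1) and maximum (1 or many) occurrences.

    maximum=None is this sketch's way of writing "many".
    """
    minimum: int
    maximum: Optional[int]

    def describe(self) -> str:
        low = "zero" if self.minimum == 0 else "one"
        high = "many" if self.maximum is None else "one"
        return f"exactly {low}" if low == high else f"{low} or {high}"


@dataclass(frozen=True)
class Relationship:
    """A named association with a cardinality for each direction."""
    left: str
    verb: str
    right: str
    left_to_right: Cardinality  # occurrences of `right` per one `left`
    right_to_left: Cardinality  # occurrences of `left` per one `right`

    def sentences(self) -> List[str]:
        # Read the relationship once in each direction.
        return [
            f"Each {self.left} {self.verb} {self.left_to_right.describe()} {self.right}.",
            f"Each {self.right} is related to {self.right_to_left.describe()} {self.left}.",
        ]


# Hypothetical example: both directions must be stated.
enrolled = Relationship(
    left="STUDENT",
    verb="is enrolled in",
    right="CURRICULUM",
    left_to_right=Cardinality(1, None),
    right_to_left=Cardinality(0, None),
)

for sentence in enrolled.sentences():
    print(sentence)
```

Restricting the minimum to 0 or 1 and the maximum to 1 or many keeps the sketch to four combinations per direction.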
11 Cardinality Notations
Teaching Notes:
Although this figure shows five different options, help students see that there are really only two options for minimum cardinality (0 or 1) and two options for maximum cardinality (1 or many).
12 JRP and Interview Questions for Data Modeling
Purpose: Candidate Questions (see Table 8-4 in text for a more complete list)
Discover system entities: What are the subjects of the business?
Discover entity keys: What unique characteristic (or characteristics) distinguishes an instance of each subject from other instances of the same subject?
Discover entity subsetting criteria: Are there any characteristics of a subject that divide all instances of the subject into useful subsets?
Discover attributes and domains: What characteristics describe each subject?
Discover security and control needs: Are there any restrictions on who can see or use the data?
Discover data timing needs: How often does the data change?
Discover generalization hierarchies: Are all instances of each subject the same?
Discover relationships: What events occur that imply associations between subjects?
Discover cardinalities: Is each business activity or event handled the same way, or are there special circumstances?
Teaching Notes:
Regardless of whether you use JRP, interviewing, or any other approach for information gathering, these are good questions to ask.
Ask students to suggest other questions they could ask. It will help them and you make sure they understand the concepts and their real-world application.
Students can ask themselves these questions as they walk through data modeling for class assignments.
13 What is a Good Data Model?
A good data model is simple.
  Data attributes that describe any given entity should describe only that entity.
  Each attribute of an entity instance can have only one value.
A good data model is essentially nonredundant.
  Each data attribute, other than foreign keys, describes at most one entity.
  Look for the same attribute recorded more than once under different names.
A good data model should be flexible and adaptable to future needs.
14 Data Analysis & Normalization
Data analysis – a technique used to improve a data model for implementation as a database.
The goal is a simple, nonredundant, flexible, and adaptable database.
Normalization – a data analysis technique that organizes data into groups to form nonredundant, stable, flexible, and adaptive entities.
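As a rough illustration of organizing data into nonredundant groups, the Python sketch below splits a flat list of order rows, in which the customer name repeats on every row, into a CUSTOMER group and an ORDER group linked by a foreign key. The field names and sample rows are assumptions made up for this example, and the sketch shows only the general idea rather than a formal normalization procedure.

```python
# Hypothetical flat records: the customer name is repeated on every order.
flat_orders = [
    {"order_id": 1, "customer_id": "C1", "customer_name": "Ann", "total": 40},
    {"order_id": 2, "customer_id": "C1", "customer_name": "Ann", "total": 15},
    {"order_id": 3, "customer_id": "C2", "customer_name": "Bob", "total": 99},
]


def split_customers(rows):
    """Separate customer attributes from order attributes.

    Returns (customers, orders). Each customer is stored once, keyed by
    customer_id; each order keeps customer_id only as a foreign key.
    """
    customers = {}
    orders = []
    for row in rows:
        customers[row["customer_id"]] = {"customer_name": row["customer_name"]}
        orders.append(
            {
                "order_id": row["order_id"],
                "customer_id": row["customer_id"],  # foreign key to customers
                "total": row["total"],
            }
        )
    return customers, orders


customers, orders = split_customers(flat_orders)
print(customers)
print(orders)
```

After the split, a change to a customer's name is made in one place instead of on every order row.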
15 More Issues on ERD
Degree: binary & ternary relationships
Foreign Key
Primary Key (Identification Entity)
Normalization
16 Process Modeling and DFDs
Process modeling – a technique used to organize and document a system’s processes:
  Flow of data through processes
  Logic
  Policies
  Procedures
Data flow diagram (DFD) – a process model used to depict the flow of data through a system and the work or processing performed by the system. Synonyms are bubble chart, transformation graph, and process model.
Teaching Notes:
Many, if not most, students have drawn or seen process models in the form of program flowcharts.
Unfortunately, flowcharts are control-flow process models as opposed to data flow process models. This can cause some students trouble because they want to illustrate structured flow of control (nonparallel processing) in their early DFDs.
Most introductory information systems books at least introduce DFDs with one or two examples.
17 Data Flow Diagrams
A data flow diagram (DFD) shows how data moves through an information system but does not show program logic or processing steps.
A set of DFDs provides a logical model that shows what the system does, not how it does it.
18 Differences Between DFDs and Flowcharts
Processes on DFDs can operate in parallel (at the same time).
  Processes on flowcharts execute one at a time.
DFDs show the flow of data through a system.
  Flowcharts show the flow of control (sequence and transfer of control).
Processes on a DFD can have dramatically different timing (daily, weekly, on demand).
  Processes on flowcharts are part of a single program with consistent timing.
19 Data Flow Diagrams
DFD Symbols
DFDs use four basic symbols that represent processes, data flows, data stores, and entities.
  Gane and Sarson symbol set
  Yourdon symbol set
Symbols are referenced by using all capital letters for the symbol name.
21 Process Concepts
Process – work performed by a system in response to incoming data flows or conditions. A synonym is transform.
Teaching Notes:
The nebulous “system environment” was intended to represent the constantly changing reality that characterizes all systems. The trick is to design systems to adapt to such change, or to be easily adapted to such change.
Feedback and control is included to monitor the system and adapt to change.
22 Data Flows
Data flow – data that is input to or output from a process. A data flow is data in motion.
A data flow may also be used to represent the creation, reading, deletion, or updating of data in a file or database (called a data store).
Conversion Notes:
Most books do not teach “control flows.” They were initially proposed by Paul Ward in his books that extended structured analysis techniques to cover real-time systems. They are especially useful in contemporary information systems analysis because they are as close as structured analysis gets to illustrating “messages” in an object-oriented world.
Teaching Notes:
Make sure students do not confuse data flows with flowchart arrows. Flowchart arrows are not named because they merely indicate “the next step.” Data flows pass actual data attributes to and from processes.
CRUD is a useful acronym from the database world to remember the basic data flows as they relate to data stores: Create, Read, Update (or change), and Delete.
One of the most common uses of composite data flows is to combine many reports into a single data flow on a high-level DFD. They can also be used to combine similar transactions on a higher-level DFD before differentiating between those flows on lower-level DFDs.
Use case diagrams, an object-oriented analysis tool that also describes interfaces, are taught in Chapter 7.
23 External Agents
External agent – an outside person, organization unit, system, or organization that interacts with a system. Also called an external entity.
External agents define the “boundary” or scope of a system being modeled.
As scope changes, external agents can become processes, and vice versa.
Almost always one of the following:
  Office, department, division.
  An external organization or agency.
  Another business or another information system.
  One of your system’s end-users or managers.
Named with a descriptive, singular noun.
Conversion Notes:
Most books refer to external agents by the name of external entities. Eventually, we expect to borrow the object-oriented term “actors.”
Teaching Notes:
It is very important to emphasize that external agents on DFDs are not the same as entities on ERDs (from Chapter 7), especially if the instructor prefers the more traditional term “external entity.”
This is true even though you could have both an entity (on an ERD) with the same name as an external agent/entity on a DFD. Consider the entity CUSTOMER and the external agent CUSTOMER:
  The entity CUSTOMER indicates the requirement to store data about customers.
  The external agent CUSTOMER indicates the requirement for an interaction (inputs and/or outputs) with customers.
It is very important for students to understand that external agents are “processes” outside of the scope of the system or business. As such, as scope “increases,” external agents can become processes. Conversely, if scope “decreases,” processes can become external agents.
24 Data Stores
Data store – stored data intended for later use. Synonyms are file and database.
Frequently implemented as a file or database.
A data store is “data at rest” compared to a data flow that is “data in motion.”
Almost always one of the following:
  Persons (or groups of persons)
  Places
  Objects
  Events (about which data is captured)
  Concepts (about which data is important)
Data stores depicted on a DFD store all instances of data entities (depicted on an ERD).
Named with a plural noun.
Teaching Notes:
Emphasize that a data store contains all instances of a data entity (from the data model). That is why data store names are plurals (as contrasted to data entity names that are singular).
Although we don’t prefer it, some analysts designate a data store to contain all instances of several entities and relationships from a data model. For example, an ORDERS data store might include all instances of the data entities ORDER and ORDERED PRODUCT, and all instances of the relationship between ORDER and ORDERED PRODUCT. We prefer the simplicity of representing each data entity from the data model as its own data store on the process models.
Emphasize that because data stores are shared resources available to many processes, it is acceptable to duplicate them on several DFDs. The duplication does NOT indicate redundant storage (on logical DFDs); it merely represents the sharing of the data store by several processes.
25 Process Decomposition
Decomposition – the act of breaking a system into sub-components. Each level of abstraction reveals more or less detail.
26 Decomposition Diagrams
Decomposition diagram – a tool used to depict the decomposition of a system. Also called hierarchy chart.
Teaching Notes:
Decomposition is a top-down problem-solving approach.
It might be useful to point out the numbering scheme. This scheme is common, but we do not like it because if the system is restructured, it forces renumbering all processes.
Some instructors like to do a quick example using a small but realistic problem.
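The renumbering concern can be illustrated with a short Python sketch. The slide's figure is not reproduced in this transcript, so the dotted numbering (1, 1.1, 1.2, ...) and the process names below are assumptions about a commonly used scheme, not the figure's actual content.

```python
def number_processes(node, prefix=""):
    """Yield (number, name) pairs for a decomposition tree, depth first.

    `node` is a (name, children) tuple. The root is labelled "0" and each
    child gets its parent's number plus its position among its siblings.
    """
    name, children = node
    yield (prefix or "0"), name
    for index, child in enumerate(children, start=1):
        child_prefix = f"{prefix}.{index}" if prefix else str(index)
        yield from number_processes(child, child_prefix)


# Hypothetical system used only for illustration.
system = (
    "Order System",
    [
        ("Take Orders", [("Validate Order", []), ("Record Order", [])]),
        ("Ship Orders", []),
    ],
)

numbered = list(number_processes(system))
for number, name in numbered:
    print(number, name)

# Restructuring: a new first-level process inserted at the front shifts
# the number of every process that follows it.
restructured = (system[0], [("Register Customers", [])] + system[1])
renumbered = dict((name, number) for number, name in number_processes(restructured))
```

Because each number encodes a position in the hierarchy, inserting one process changes the numbers of its later siblings and of all their descendants.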
27 Data Flow Diagrams
Context Diagrams
A context diagram is a top-level view of an information system that shows the system’s boundaries and scope.
Do not show any data stores in a context diagram because data stores are internal to the system.
Begin by reviewing the system requirements to identify all external data sources and destinations.
28 Data Flow Diagrams
Context Diagrams
Record the names of the entities, the names and content of the data flows, and the direction of the data flows.
What makes one system more complex than another is the number of components, the number of levels, and the degree of interaction among its processes, entities, data stores, and data flows.
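One way to picture a context diagram is as a detailed DFD with every process and data store collapsed into a single system process, leaving only the flows that cross the system boundary. The Python sketch below does that for a small set of flows; the flow tuples, the "Billing System" name, and most of the process and flow names are hypothetical and serve only to make the example concrete.

```python
def context_flows(flows, external_entities, system_name):
    """Reduce detailed DFD flows to context-diagram flows.

    `flows` is a list of (source, flow_name, target) tuples. Flows between
    internal components (processes, data stores) are dropped; flows that
    touch an external entity are redirected to the single system process.
    """
    result = []
    for source, name, target in flows:
        if source in external_entities and target not in external_entities:
            result.append((source, name, system_name))
        elif target in external_entities and source not in external_entities:
            result.append((system_name, name, target))
    return result


# Hypothetical detailed DFD: one external entity, two processes, one data store.
flows = [
    ("CUSTOMER", "Payment", "Apply Payment"),
    ("Apply Payment", "Posted Payment", "ACCOUNTS RECEIVABLE"),
    ("ACCOUNTS RECEIVABLE", "Account Balance", "Prepare Statement"),
    ("Prepare Statement", "Statement", "CUSTOMER"),
]

context = context_flows(flows, {"CUSTOMER"}, "Billing System")
for flow in context:
    print(flow)
```

The data store and the internal flows disappear from the result, which matches the convention that data stores are not shown at the context level.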
29 Data Flow Diagrams
Conventions for DFDs
Each context diagram must fit on one page.
The process name in the context diagram should be the name of the information system.
Use unique names within each set of symbols.
30 Data Flow Diagrams
Conventions for DFDs
Do not cross lines.
Use a unique reference number for each process symbol.
31 Data Flow Diagrams
Strategies for Developing DFDs
A set of DFDs is a graphical, top-down model.
With a bottom-up strategy, you first identify all functional primitives, data stores, entities, and data flows.
The main objective is to ensure that your model is accurate and easy to understand.
32 Data Flow Diagrams
Strategies for Developing DFDs
A general rule of thumb is that a diagram should have no more than nine process symbols.
To construct a logical model of a complex system, you might use a combination of top-down and bottom-up strategies.
The best approach depends on the information system you are modeling.
33 Illegal Processes
Spontaneous generation: a process with no inputs.
Black hole: a process with no outputs.
Gray hole: a process whose inputs are insufficient to generate its outputs.
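Two of these three errors can be found from the structure of the diagram alone. The Python sketch below flags processes with no incoming flows (spontaneous generation) and processes with no outgoing flows (black holes). A gray hole cannot be detected this way, since it depends on whether the content of the inputs is enough to produce the outputs. The flow tuples and most names are hypothetical.

```python
def find_illegal_processes(processes, flows):
    """Return {process: [problems]} for structurally illegal processes.

    `flows` is a list of (source, flow_name, target) tuples. Gray holes
    are not checked here because they depend on the data content.
    """
    problems = {}
    for process in processes:
        has_input = any(target == process for _, _, target in flows)
        has_output = any(source == process for source, _, _ in flows)
        if not has_input:
            problems.setdefault(process, []).append("spontaneous generation")
        if not has_output:
            problems.setdefault(process, []).append("black hole")
    return problems


# Hypothetical diagram with one sound process and two illegal ones.
processes = ["Apply Payment", "Prepare Statement", "Archive Records"]
flows = [
    ("CUSTOMER", "Payment", "Apply Payment"),
    ("Apply Payment", "Posted Payment", "ACCOUNTS RECEIVABLE"),
    ("Prepare Statement", "Statement", "CUSTOMER"),        # no input feeds this process
    ("ACCOUNTS RECEIVABLE", "Old Records", "Archive Records"),  # nothing leaves this process
]

illegal = find_illegal_processes(processes, flows)
print(illegal)
```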
37 More Issues on Process Model
Natural English
Structured English
Decision Table
Use Cases
38 Test Yourself
Data flow diagrams show what a system does, not how it does it (T/F).
39 Test Yourself
Data flow diagrams show what a system does, not how it does it (T/F).
Answer: True
40 Test Yourself
The following symbols are from the _____________ set. Name them:
[Figure: DFD symbols to be named]
41 Test Yourself
The following symbols are from the Gane and Sarson set. Name them:
Data Store
Process
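42 Test Yourself
Select the correct example below.
[Figure: two DFD fragments, A) and B), involving the external entity Customer, the data flow Payment, the process Apply Payment, and the data store Accounts Receivable]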
43 Test Yourself
Select the correct example below.
A) is correct. An external entity can’t be directly connected to a data store.
[Figure: DFD fragments A) and B) with Customer, Apply Payment, and Accounts Receivable]
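The rule behind this answer, that every data flow needs a process at one end or the other, can be checked mechanically. In the Python sketch below, the two flow lists are an assumed reconstruction of examples A and B (the original figure is not available in this transcript), and the function name is made up for the illustration.

```python
def invalid_flows(flows, processes):
    """Return the flows that have no process at either end.

    `flows` is a list of (source, flow_name, target) tuples. A flow that
    links an external entity straight to a data store is reported.
    """
    return [
        flow for flow in flows
        if flow[0] not in processes and flow[2] not in processes
    ]


processes = {"Apply Payment"}

# Assumed reconstruction of example A: the payment passes through a process.
example_a = [
    ("Customer", "Payment", "Apply Payment"),
    ("Apply Payment", "Payment", "Accounts Receivable"),
]

# Assumed reconstruction of example B: the entity writes to the store directly.
example_b = [
    ("Customer", "Payment", "Accounts Receivable"),
]

print(invalid_flows(example_a, processes))
print(invalid_flows(example_b, processes))
```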
44 Test Yourself
Match the terms in the left column to the proper definitions in the right column.
1. Black Hole
2. Spontaneous Generation Process
3. Gray Hole
a. A process with at least 1 input and output, but the input is insufficient to generate the shown output.
b. A process that has no output.
c. Used to describe an unexplained generation of data or information.
45 Test Yourself
Match the terms in the left column to the proper definitions in the right column.
1. Black Hole: b. A process that has no output.
2. Spontaneous Generation Process: c. Used to describe an unexplained generation of data or information.
3. Gray Hole: a. A process with at least 1 input and output, but the input is insufficient to generate the shown output.