Presentation is loading. Please wait.

Presentation is loading. Please wait.

Organizing Data and Information for Use in Decision Making

Similar presentations


Presentation on theme: "Organizing Data and Information for Use in Decision Making"— Presentation transcript:

1 Organizing Data and Information for Use in Decision Making
(MIS) 22

2 Views of Data The Physical View The Logical View Concerned with
Concerned with how data is represented so that it will be meaningful to users.

3 TABLE in Access FILE in Access
It is helpful to consider data as organized in a hierarchy, from the smallest unit (a bit) to the largest (a database). There is a difference between physical units of data, such as bits, which is how data exists in the computer or in storage, and logical units of data, which is how data is represented so that it will be meaningful to users. Data physically exists in bits, bytes, and words. The smallest logical representation of data used a character, such as an “A” or a “$”. A character usually consists of one byte. A character may be an upper case or a lower case letter, a number,or a symbol. Characters are combined to make a field. A field is a combination of characters describing a characteristic of a business object or activity. For example “employee name” and “social security number” are fields. They are characteristics or attributes describing “employee”. Fields are combined to form a record. A record is a collection of related fields. For example, a student record would contain fields about one student, such as student name, address, SSN, and date of birth. Records are combined to form a file. A file is a collection of related records. For example, a STUDENT file would contain student records. Continuing the previous example, each student record would contain the student’s name, address, SSN, and date of birth. In order to distinguish one record from another, each record contains a record key. A record key is a field that uniquely identifies a record. In our student file, the record key would be social security number. Files are combined to create a database. At the top of the data hierarchy is the largest of logical unit of data, a database. A database is a collection of integrated files that are organized in such a way as to be accessible to multiple applications and users. A database includes data and records and the relationships among them. Bit: a single zero or one

4 E (usually called simply Entity)
It is a concept that relates to the items about which you wish to store information--these “items” can include conceptual ideas such as “philosophers’ thoughts,” people such as “student,” things such as “refrigerators” or places such as “city.” G EMPLOYEES Table

5 Instance An of an Entity Class is called an of that class. Manhattan is an instance of the entity class city. A More A record in Access

6 Attribute A PRIMARY KEY is a field in a database file that C
Employee #, Last Name, First Name, Hire Date, Dept. Number The columns in Access A PRIMARY KEY is a field in a database file that ID Number Social Security Number

7 Four main database models
Hierarchical Network Relational (the ) Object-Oriented

8 Relational Database Model
Uses a series of two-dimensional called to store information relating to a Tables = Relation: A specific entity class Rows = or individual file folders. Columns = Attributes or The relational database model is the widely used database model at the present time.

9 Spreadsheet Data Used to create a database

10 Steps to Create a Relational Database
Define entity classes and primary keys Define relationships among entity classes Define attributes (fields) for each relation Use a data definition language to create the database.

11 Step One: Need a database that keeps track of the following information about your health club: The first name, last name, and id number for each of your members The address (street, city, state) where your members live The gender of each member and what class(es) they are taking The course name, number, and time that each of course meets The first name, last name, and id number for each of your employees. The home phone number for each employee, as well as their salary amount and the date that they were hired.

12 Data Assumptions/Traits
Member Number is unique. Each member will have only one Member Number. There is one Member for each Member Number. At this time, each member can enroll in only one course (at a time) that is offered by your health club. Therefore, each member will have only one class number recorded at any time. (At some point in your assignment, you will make it possible for your members to take more than one course at a time) Class Number is the unique identifier for each class that your health club offers. There is only one course for each class number. Employee Number is the unique identifier for each employee that works at your health club. There is only one employee for each employee number.

13 Data Assumptions/Traits
A member can take only one course at a time. Only one instructor can teach a particular course. Only one section of a course is taught at any one time. A course may be taken by many different students. An instructor can teach more than one course at any one time. THEREFORE WE HAVE THE FOLLOWING ENTITY CLASSES: MEMBERSHIP - primary key is Member Number COURSES - primary key is Class Number EMPLOYEES – primary key is Employee Number

14 Step 2: Define between entity classes.
Using an Entity Relationship (E-R) Model Diagram, we can represent the entity classes and their relationships: Key: Member Number Key: Class Number MEMBERSHIP COURSES 1:M Key: Class Number Key: Employee Number COURSES EMPLOYEES 1:M STEP 3: Define the attributes in each Entity which has been converted into a

15 Normalization A technique used to make complex databases more efficient Break one large Eliminate all repeating groups in records Eliminate Assure that each field in the relation depends only on the primary key of that relation

16

17 Normalized Data: are together by a found in

18 A. of this current example is that each member can
A of this current example is that each member can That is highly inefficient and it is not a good way to run your business.

19 Creating a allows you to store information in that table related to all of the different courses that your members are enrolling in. Class Number from the Membership table and placed in the Enrollment table. Neither Member Number nor Class Number is unique by itself in the Enrollment table. However, you can create a key that together, and that can be your unique identifier.

20 You need a database management system (DBMS) to do this
Step #4: Use a Data Definition Language to create the database You need a database management system (DBMS) to do this

21 Microsoft is a software. (database management system) that. a
Microsoft is a software (database management system) that a that follows the principles found in the Model

22 What Is a Database Management System
An that provides all the necessary capabilities for b files, ex required for making decisions, and formatting the information into structured reports. DIFFERENT FROM A DATABASE 30

23 DATABASE a collection of information that you
A database is actually composed of two parts: 1. The information itself / the files that are logically associated 2. The logical structure of the information which is called the data dictionary. The data dictionary contains the logical properties that describe information in a database.

24 DATABASE MANAGEMENT SYSTEM (DBMS)
the software you use to specify the logical organization for a database and access it. A DBMS contains 5 software components:

25 Database Management System Engine
accepts logical requests from the various other DBMS subsystems, converts them to their physical equivalent, and actually accesses the database and data dictionary as they exist on a storage device. You can work with the information

26 DATA DEFINITION SUBSYSTEM
helps you create and maintain the data dictionary and define the structure of the files in a database. A data dictionary serves the as an . You use this subsystem to define the information logical when you first create a database. Once you’ve created a database, you use this subsystem to fields, fields, or field properties.

27 What might be found in a data dictionary in Access?
F and their , as well as the I are rules that help assure the quality of the information in a database. (Validation Rules in Access) Setting a for a field A birth date can be now or in the past: not in the future. A registration database at your school includes integrity constraints concerning for certain classes. Validation Text: seen in Access.

28 What might be found in a data dictionary in Access?
Input Masks and Field Sizes Formats, Default Values, and Captions With Referential Integrity, you create a feature that will When two different tables are linked/joined by a relationship where referential integrity is enforced, you cannot type data into a linked field if that same data does not already exist in the original table. You

29

30 DATA MANIPULATION SUBSYSTEM
helps you add, change, and delete information in a database and mine it for valuable information. This subsystem is most often the between you as a user and the information contained in a database. Tools in this subsystem include views, report generators, query-by-example tools, and structured query language.

31 DATA MANIPULATION TOOLS
- allows you to see the content of a database file, make whatever changes you want, perform simple sorting, and search to find the location of specific information. Add records. Delete records Sort records. Search for records.

32

33 DATA MANIPULATION TOOLS
REPORT GENERATOR - helps you quickly define formats of reports and what information you want to see in a report. You can specify exactly what you want to see and where you want to see it. Report Wizard Report Design Screen

34

35 DATA MANIPULATION TOOLS
QUERY-BY-EXAMPLE (QBE) TOOL - helps you graphically design the answer to a question. You create this query You specify that you want the database searched for, and the query tool will return records that match the conditions or criteria that you specified. Queries are used to useful

36 Query Design screen (above) and Query Result (below)

37 DATA MANIPULATION TOOLS
STRUCTURED QUERY LANGUAGE (SQL) - a standardized fourth-generation language found in most database environments. SQL is the same as QBE, except that you perform a query by creating a statement instead of pointing, clicking, dragging. SQL is a Uses SELECT FROM WHERE

38 APPLICATION GENERATION SUBSYSTEM
contains facilities to help you develop transaction-intensive applications. This subsystem includes: Tools for creating Application buttons that are used to execute a certain task. Uses a programming language specific to the database management system that you are using.

39

40 DATA ADMINISTRATION SUBSYSTEM
helps you manage the overall database environment by providing facilities for: B Security management W Concurrency control ensures the validity of database updates. Change management allows you to assess the impact of proposed structural changes.

41 One of the goals of a database management system is to provide easy , while at the same time If you have multiple files, each containing the same information, then you will most likely find redundant elements and erroneous data.

42 Relational Database Model
Uses a series of two-dimensional Connects or relates data in different files through the use of a Based on mathematical principles which allow for more logical manipulation of data. Most flexible type of organization.

43 THE CONCEPT OF KEYS A KEY is a field or combination of fields used to identify records so they can be easily retrieved and processed. A PRIMARY KEY is a field in a database file that. I Social You can create relationships between tables/files through /file.

44 Student and Department
Student Entity Student Number Address . Department ID Department Entity Department ID Address . This way, if we wanted to know all the students that belong to a particular department, we could get the listing by joining the two entities on the Department ID value.

45 Relational Database Model
State DMV Database Driver’s license file/table license no. Expiration date Zip State City Street address Name J.Doe 1XYZ234 Car owner file/table Car License No. Name Zip St. City Street address Model year make A405261 J. Doe Moving violation citation file/table Driver’s License No. Fines paid/ not paid Date cited type Citation number 1XYZ234 (MIS) Parking violation citation file/table Fines paid/ not paid Car License Number Date cited type Citation number A405261 62

46 Creating between database files.
C /file. You might have to physically draw the connection between two tables/files. This is usually achieved by drawing a connecting line between two fields.

47 INFORMATION STORED SEPARATELY BUT CAN BE RELATED THROUGH KEY JOINS

48 Another example: which movies are provided by which distributors?

49 Recent Database Developments
The Data Warehouse and Data Mining Object-Oriented Database Management Systems

50 Organizational Databases
Organizational databases are constantly changing as the organizational systems are constantly updating and changing the contents of the organizational database to reflect current business activities. Organizational databases support Constantly being updated and changed. Not very useful for decision making involving the analysis of Organizational databases rarely maintain historical data, just information related to current operations.

51 A logical collection of information gathered from
A logical collection of information gathered from operational databases. Contains historical data that has been extracted from many different operational databases. Historical data is used for decision making.

52 Operational databases support transaction processing (OLTP).
Data warehouses support (OLAP) because historical data is analyzed in order to make D are the software tools used to query information in data warehouse. Use advanced statistical techniques to search for patterns and anomalies in the data. Attempts to find answers to questions the user did not even think to ask.

53 A relational database stores information in a series of .
Data warehouses are , containing Each dimension is an of information. The data warehouse modeled in the above figure compiles information from internal databases or transactional/operational databases and external databases through ETL It then send subsets of information to the data marts through the ETL process Ask your students to distinguish between a data warehouse and a data mart? Ans: A data warehouse has an enterprisewide organizational focus, while a data mart focuses on a subset of information for a given business unit such as finance

54 D perform analysis in data warehouses
C – common term for the representation of multi-dimensional information ( ) Users can slice and dice the cube to drill down into the information Cube A represents store information (the layers), product information (the rows), and promotion information (the columns) Cube B represents a slice of information displaying promotion II for all products at all stores Cube C represents a slice of information displaying promotion III for product B at store 2 CLASSROOM EXERCISE Analyzing Multiple Dimensions of Information Jump! is a company that specializes in making sports equipment, primarily basketballs, footballs, and soccer balls. The company currently sells to four primary distributors and buys all of its raw materials and manufacturing materials from a single vendor. Break your students into groups and ask them to develop a single cube of information that would give the company the greatest insight into its business (or business intelligence) given the following choices: Product A, B, C, and D Distributor X, Y, and Z Promotion I, II, and III Sales Season Date/Time Salesperson Karen and John Vendor Smithson Remember you can pick only 3 dimensions of information for the cube, they need to pick the best 3 Product Promotion These give the three most business-critical pieces of information

55 Applications of Data Mining
M : identifying common characteristics of customers who buy the same products from your company. Customer Churn: predicting which customers are likely to leave your company and go to a competitor. F : identifying which transactions are most likely to be fraudulent. Direct Marketing: identifying the best prospects in order to obtain the highest response rate. Market : trying to understand which products are commonly purchased together. T : trying to reveal differences between one period and another period.

56 By adding a , you can add : 3-D (rows and columns and layers).
Info in an Excel spreadsheet and a relational database (Access) appears in the form of a two dimensional table of rows and columns. By adding a , you can add : 3-D (rows and columns and layers). Creating a 3-dimensional Pivot Table in Excel is a means of conceptually building a data warehouse. Page fields represent the depth layer Pivot Tables can help you see relationships in the data

57

58 The following are summary slides that I don’t plan on covering in class, but I am making available for you to review, if you wish. The Interpreting SQL slides will be seen at a later date (after everyone has made queries in lab)

59 TO SUMMARIZE How we view information:
The physical view of information deals with how information is physically arranged, stored, and accessed on some type of secondary storage device. The logical view of information focuses on how you need to arrange and access information to meet your particular business needs. A database is a collection of information that you organize and access according to the logical structure of that information. The data dictionary contains the logical structure of information in a database.

60 Database Models Relational Database Model
All the data is arranged in a series of related tables. Student, Class, Instructor Tables are linked together by common fields. Object-Oriented Database Model Allows you to store the data and the procedures used to manipulate that data together. Can be used to store text, sound, video, and images. Stores data, computes GPA, and creates a transcript.

61 Normalization Used to create tables in a relational database.
Break one large table into several smaller tables Eliminate redundant data (duplicate copies) With redundancy removed, you only need to make a change once and all linkages are automatically updated.

62 Proper Database Design
Field: Name: Brian Kovar Field: Address: 123 North Main, Manhattan, KS Field: Last Name: Kovar Field: First Name: Brian Field: Address: 123 North Main Field: City: Manhattan Field: State: KS Field: Zip Code:

63 Interpreting SQL Code

64 SQL (The language used to query a database)
S is used to specify the you want to include. F is used to specify the the selected fields are coming from. W is used to used to narrow down the data prior to being displayed. O is used to specify how the records (in ascending or descending order), as well as the sort order. I specifies that two or more tables are and it also specifies the field that forms the between the tables.

65

66 S the following fields from the Employees table: Last Name, Title and Salary.
The used to narrow down the records is where the title is “Sales Representative.” The results should be in descending order (high to low number order) based on salary. All of the fields in the query come from the Employees table.

67 SELECT Max(Employees. Salary) AS MaxOfSalary, Min(Employees
SELECT Max(Employees.Salary) AS MaxOfSalary, Min(Employees.Salary) AS MinOfSalary, Avg(Employees.Salary) AS AvgOfSalary, StDev(Employees.Salary) AS StDevOfSalary FROM Employees; Select the Salary field from the Employees table (4 times). Find the maximum salary, the minimum salary, the average of salaries and the standard deviation of salaries.

68 Select the following fields from the Customers table: Company Name, City and Country.
The criteria used to narrow down the records is where the country begins with U (it does not matter what comes after the U as long as the first letter of the country is a U). All of the fields in the query come from the Customers table.

69 All of the fields in the query come from either the Customers table or the Orders table. The Customers and Orders tables are joined together by the common field of Customer ID (Inner Join signifies this). Select the following fields from the Customers table: Customer ID and Company Name. Select the following fields from the Orders table: Order Date and Shipped Date. The criteria used to narrow down the records is that the company name should match “B’s Beverages”.

70 O is used to specify how the records should be sorted (in ascending or descending order), as well as the sort order. F uses the city field and the records are sorted in alphabetical order by city (order by defaults to ascending order) If there happen to be multiple entries from the same city, a (descending order by last name)


Download ppt "Organizing Data and Information for Use in Decision Making"

Similar presentations


Ads by Google