Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 10 Data and Knowledge Management. Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data.

Similar presentations


Presentation on theme: "Chapter 10 Data and Knowledge Management. Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data."— Presentation transcript:

1 Chapter 10 Data and Knowledge Management

2 Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining

3 Data Set of discrete, objective facts about events Business - structured records of transactions Little relevance or purpose

4 Information Message with sender and receiver Meant to change way receiver perceives something Have an impact on his judgment / behavior

5 Data Processing Contextualize - why was data gathered? Categorize - what are its key components? Calculate - analyze mathematically Condense - summarize in more concise form

6 Information Processing Compare - in kind and in time Consequences - how used in decisions / actions Connections - relation to other information Conversation - what other people think about this information

7 Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining

8 Database Element Types Structure Models Creation Topology

9 Element Bit, byte, field, record, file, database Entity, attribute, key field Relation Class, object

10 Database Types Business database Geographical information database Knowledge database / deductive database Multimedia database Data warehouse Data marts Multimedia and hypermedia database Object-oriented database

11 Database Structure Data definition language –Schema & subschema Data Manipulation language –Structured Query Language (SQL) –Query By Example (QBE) Data dictionary

12 Database Models Hierarchical –One to many –TPS or routine MIS Network –Many to many –TPS or routine MIS Relational –Normalization –Ad hoc reports or DSS Object-oriented –E-commerce

13 Database Creation Conceptual design –Logical view –Entity-relationship (ER) diagram –Normalization

14 Entity Relationship Diagram Entity: object or concept Relationship: meaning association between objects Attribute: property of an object –Simple & Composite –Single-valued & multi-valued –Derived Key –Primary key –Foreign key

15 Normalization A technique for identifying a true primary key for a relation Types –First normal form: not repeating group –Second normal form: every non-primary-key attribute is fully functionally dependent on the entire primary key –Third normal form: no transit dependency

16 Structured Query Language Select Join

17 SQL DML - SELECT SELECT [DISTINCT|ALL] {* | [col- expr [AS newname]][,...] FROM table-name [alias] [,...] [WHERE condition] [GROUP by colm [, colm] [HAVING condition]] ORDER BY colm [, colm]

18 SQL DML - SELECT SELECT attributes (or calculations: +, -, /, *) FROM relation SELECT DISTINCT attributes FROM relation

19 Examples SELECT stuname FROM student; SELECT stuid, stuname, credit FROM student; SELECT stuid, stuname, credit+10 FROM student; SELECT DISTINCT major FROM student;

20 SQL DML - SELECT SELECT attributes (or * wild card) FROM relation WHERE condition

21 Examples SELECT * FROM student; SELECT stuname, major, credit FROM student WHERE stuid = ‘S114’; SELECT * FROM faculty WHERE dept = ‘MIS’;

22 SELECT - WHERE condition AND OR NOT IN NOT INBETWEEN IS NULLIS NOT NULL LIKE '%' multiple characters LIKE ‘_’ single characters

23 Examples SELECT * FROM faculty WHERE dept = ‘MIS’ AND rank = ‘full professor’; SELECT * FROM faculty WHERE dept = ‘MIS’ OR rank = ‘full professor’; SELECT * FROM faculty WHERE dept = ‘MIS’ NOT rank = ‘full professor’;

24 SELECT * FROM class WHERE room LIKE ‘B_S%’; SELECT * FROM class WHERE room NOT LIKE ‘BUS%’; SELECT productid, productname FROM inventory WHERE onhand BETWEEN 50 and 100;

25 SELECT companyid, companyname FROM company WHERE companyname BETWEEN ‘G’ AND ‘K’; SELECT productid, productname FROM inventory WHERE onhand NOT BETWEEN 50 and 100; SELECT companyid, companyname FROM company WHERE companyname NOT BETWEEN ‘G’ AND ‘K’;

26 SELECT facname FROM faculty WHERE dept IN (‘MIS’, ‘ACT’); SELECT facname FROM faculty WHERE rank NOT IN (‘assistant’, ‘lecture’); SELECT customername FROM customer WHERE emailadd IS NOT NULL;

27 SELECT customername FROM customer WHERE creditlimit IS NULL;

28 SELECT - aggregate functions COUNT (*) COUNT SUM AVG MIN MAX

29 Examples SELECT COUNT(*) FROM student; SELECT COUNT(major) FROM student; SELECT COUNT(DISTINCT major) FROM student;

30 SELECT COUNT(stuid), SUM(credit), AVG(credit), MAX(credit), MIN(credit) FROM student;

31 SELECT - GROUP GROUP BY HAVING

32 Examples SELECT major, AVG(credit) FROM student GROUP BY major HAVING COUNT(*) > 2; SELECT course#, COUNT(stuid) FROM enrollment GROUP BY course# HAVING COUNT(*) > 2;

33 SELECT major, AVG(credit) FROM student WHERE major IN (‘MIS’, ‘ACT’) GROUP BY major HAVING COUNT(*) > 2;

34 SELECT - ORDER BY ORDER BY ORDER BY... DESC

35 Examples SELECT facname, rank FROM faculty ORDER BY facname; SELECT facname, rank FROM faculty ORDER BY rank DESC, facname;

36 SELECT - JOIN Tables Multiple tables in FROM clause MUST have join conditions!!!

37 Examples SELECT stuname, grade FROM student, enrollment WHERE student.stuid = enrollment.stuid;

38 SELECT enrollment.course#, stuname, major FROM class, enrollment, student WHERE class.course# = enrollment.course# AND enrollment.stuid = student.stuid AND facid = ‘F114’ ORDER BY enrollment.course#;

39 SUBQUERY, EXIST, NOT EXIST SELECT s.stuname, major FROM student s WHERE EXIST (SELECT * FROM enrollment e WHERE s.stuid = e.stuid);

40 SELECT s.stuname, major FROM student s WHERE NOT EXIST (SELECT * FROM enrollment e WHERE s.stuid = e.stuid);

41 Database Creation Physical design –Physical view –Data topology (organization) Centralized Distributed database –Replicated database –Partitioned Organization & access method –Sequential file –Indexed sequential file –Direct or random file Security –Logical, physical, and transmitting

42 Selection Criteria User’ needs (type of application) Compatibility Portability Reliability Cost Feature Performance Vendor’s support Others?

43 Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining

44 Data Administrator Clean up data definitions Control shared data Manage distributed data Maintain data quality

45 Clean Up Definitions Synonyms / aliases Standard data definitions –Names and formats Data Dictionary –Active –Integrated

46 Control Shared Data Local - used by one unit Shared - used by two or more activities Impact of proposed program changes on shared data Program-to-data element matrix

47 Manage Distributed Data Geographically dispersed –Whether shared data or not Different levels of detail –Different management levels

48 Maintain Data Quality Put owners in charge of data –Verify data accuracy and quality Purge old data

49 Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining

50 The DBMS Data Base Management System: software that permits a firm to: –Centralize data –Manage them efficiently –Provide access to applications Such as payroll, inventory

51 DBMS Components Data Definition Language (DDL) Data Manipulation Language (DML) Inquiry Language (IQL) Teleprocessing Interface (TP)

52 Definitions Views: –Physical - how stored –Logical - how viewed and used by users Schema - Overall logical layout of records and fields in a database Subschema: Individual user’s logical portion of database (view)

53 Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining

54 Distributing Data Centralized files Fragemented files –Distribute data without duplication –Users unaware of where data located

55 Distributing Data Replicated files –Data duplicated –One site has master file –Problem with data synchronization Decentralized files –Local data autonomy

56 Distributing Data Distributed files –Client / server systems –Stored centrally –Portion downloaded to workstation –Workstation can change data –Changes uploaded to central computer

57 Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining

58 Data Warehousing Collect large amounts of data from multiple sources over several years Classify each record into multiple categories –Age –Location –Gender

59 Data Warehousing Rapidly select and retrieve by multiple dimensions –All females in Chicago under 25 years of age Provide tailored, on-demand reports Data mart: a replicated subset of the data warehouse –A functional or regional area

60 Data Mining Fitting models to, or determining patterns from, warehoused data Purposes: –Analyze large amount of data –Find critical points of knowledge –Perform automatic analyses

61 Data Mining Terms Data Visualization Drill-down Analysis –Hierarchical structure –Leads to increasing level of detail Expert System (ES) methodology –e.g., neural networks

62 Applications Finance - fraud detection Stock Market - forecasting Real estate - property evaluation Airlines - customer retention Retail - customer targeting

63 Data Mining Example What type customers are buying specific products? When are the times customers will most likely shop? What types of products can be sold together?

64 Points to Remember Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining

65 Discussion Questions How can a database help an organization? Why normalization is very important for building a database? Do you see any problem of the database in your organization?

66 Discussion Questions What kind of database model is most suitable for –School? –Department store? –Police? Some organizations are hesitant to distribute data. These organizations feel that they may lose control. –Do they lose control? Why? –Could you suggest a “good” tactic? Could Data Mining pose a threat to individual privacy? –Why or why not? –If so, how can we mitigate that threat? –Do the advantages outweigh the disadvantages?

67 Assignment Review chapters 10 Read chapter 8, 9, and 11 Group assignment Research paper


Download ppt "Chapter 10 Data and Knowledge Management. Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data."

Similar presentations


Ads by Google