Databases From A to Boyce Codd. What is a database? It depends on your point of view. For Manovich, a database is a means of structuring information in.

Slides:



Advertisements
Similar presentations
Information Systems Today: Managing in the Digital World
Advertisements

Chapter Information Systems Database Management.
Chapter Information Systems Database Management.
Database BA471 – Fall 2007 Dr. V.T. Raja Oregon State University.
Chapter 3 Database Management
Chapter 6 Methodology Conceptual Databases Design Transparencies © Pearson Education Limited 1995, 2005.
Database Design Conceptual –identify important entities and relationships –determine attribute domains and candidate keys –draw the E-R diagram Logical.
Entity-Relationship Model and Diagrams (continued)
Lecture Fourteen Methodology - Conceptual Database Design
Information Technology in Organizations
Chapter 4 Relational Databases Copyright © 2012 Pearson Education, Inc. publishing as Prentice Hall 4-1.
Methodology Conceptual Database Design
Chapter 4 Relational Databases Copyright © 2012 Pearson Education 4-1.
IST Databases and DBMSs Todd S. Bacastow January 2005.
Introduction to Data bases concepts
PHASE 3: SYSTEMS DESIGN Chapter 7 Data Design.
Chapter 5 Database Processing.
Databases From A to Boyce Codd. What is a database? It depends on your point of view. For Manovich, a database is a means of structuring information in.
1 C omputer information systems Design Instructor: Mr. Ahmed Al Astal IGGC1202 College Requirement University Of Palestine.
CSC2012 Database Technology & CSC2513 Database Systems.
5.1 © 2007 by Prentice Hall 5 Chapter Foundations of Business Intelligence: Databases and Information Management.
Systems analysis and design, 6th edition Dennis, wixom, and roth
Web-Enabled Decision Support Systems
Chapter 1 Overview of Database Concepts Oracle 10g: SQL
1 Chapter 1 Overview of Database Concepts. 2 Chapter Objectives Identify the purpose of a database management system (DBMS) Distinguish a field from a.
Methodology Conceptual Databases Design
Databases. Database A database is an organized collection of related data.
Information Systems: Databases Define the role of general information systems Describe the elements of a database management system (DBMS) Describe the.
1 Chapter 15 Methodology Conceptual Databases Design Transparencies Last Updated: April 2011 By M. Arief
Database Organization and Design
MIS 301 Information Systems in Organizations Dave Salisbury ( )
Lecture 2 An Overview of Relational Database IST 318 – DB Admin.
MIS 301 Information Systems in Organizations Dave Salisbury ( )
CSC 240 (Blum)1 Introduction to Database. CSC 240 (Blum)2 Data versus Information When people distinguish between data and information, –Data is simply.
Normalization (Codd, 1972) Practical Information For Real World Database Design.
GUS: 0262 Fundamentals of GIS Lecture Presentation 3: Relational Data Model Jeremy Mennis Department of Geography and Urban Studies Temple University.
I Information Systems Technology Ross Malaga 4 "Part I Understanding Information Systems Technology" Copyright © 2005 Prentice Hall, Inc. 4-1 DATABASE.
Lecture2: Database Environment Prepared by L. Nouf Almujally & Aisha AlArfaj 1 Ref. Chapter2 College of Computer and Information Sciences - Information.
Chapter 4 Database Processing Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall 4-1.
Lecture2: Database Environment Prepared by L. Nouf Almujally 1 Ref. Chapter2 Lecture2.
Chapter 1Introduction to Oracle9i: SQL1 Chapter 1 Overview of Database Concepts.
Databases Shortfalls of file management systems Structure of a database Database administration Database Management system Hierarchical Databases Network.
6.1 © 2010 by Prentice Hall 6 Chapter Foundations of Business Intelligence: Databases and Information Management.
MANAGING DATA RESOURCES ~ pertemuan 7 ~ Oleh: Ir. Abdul Hayat, MTI.
1 CS 430 Database Theory Winter 2005 Lecture 2: General Concepts.
INFO1408 Database Design Concepts Week 15: Introduction to Database Management Systems.
Prepared By Prepared By : VINAY ALEXANDER ( विनय अलेक्सजेंड़र ) PGT(CS),KV JHAGRAKHAND.
ITGS Databases.
The University of Akron Dept of Business Technology Computer Information Systems The Relational Model: Concepts 2440: 180 Database Concepts Instructor:
Relational Theory and Design
The relational model A data model (in general) : Integrated collection of concepts for describing data (data requirements). Relational model was introduced.
BSA206 Database Management Systems Lecture 2: Introduction to Oracle / Overview of Database Concepts.
© 2003 Prentice Hall, Inc.3-1 Chapter 3 Database Management Information Systems Today Leonard Jessup and Joseph Valacich.
Instructor: Pavlos Pavlikas1 How Data is Stored Chapter 8.
6.1 © 2007 by Prentice Hall Chapter 6 (Laudon & Laudon) Foundations of Business Intelligence: Databases and Information Management.
ASET 1 Amity School of Engineering & Technology B. Tech. (CSE/IT), III Semester Database Management Systems Jitendra Rajpurohit.
Chapter 1: Introduction. 1.2 Database Management System (DBMS) DBMS contains information about a particular enterprise Collection of interrelated data.
LECTURE TWO Introduction to Databases: Data models Relational database concepts Introduction to DDL & DML.
Database Planning Database Design Normalization.
1 Management Information Systems M Agung Ali Fikri, SE. MM.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
Rationale Databases are an integral part of an organization. Aspiring Database Developers should be able to efficiently design and implement databases.
IT 5433 LM3 Relational Data Model. Learning Objectives: List the 5 properties of relations List the properties of a candidate key, primary key and foreign.
Databases Chapter 16.
Information Systems Today: Managing in the Digital World
 DATAABSTRACTION  INSTANCES& SCHEMAS  DATA MODELS.
Databases and Information Management
Entity-Relationship Model and Diagrams (continued)
Databases and Information Management
Presentation transcript:

Databases From A to Boyce Codd

What is a database? It depends on your point of view. For Manovich, a database is a means of structuring information in which multiple trajectories of interaction with the information are possible. When thinking about “the database” on this level of abstraction, the difference between a collection of discrete objects (e.g., a bunch of books) and a particular structured representation of those objects (e.g., a set of catalog records) are, for example, not so different. Manovich says exactly this.

What is a database? In contrast, in the computational universe, the difference between a set of files and a set of relational tables is significant, in terms of the operations that can be performed on each. For computational purposes, a database is a set of structured data; Brookshear includes the requirement that the data be structured in a multidimensional way, so that it can be presented “from a variety of perspectives.”

What is a database? Are these databases? In what way? The books in my office. The PCL. YouTube. The Internet Movie Database. The Web. Expedia. Epicurious. What is useful about calling any of these things databases or not?

What is a database? Even in the computational universe, a “database” can exist at multiple levels of abstraction: a conceptual model of a database (as presented through entity-relationship diagrams) can be implemented via different logical models (e.g., as different tables in a relational database, or potentially as objects in an object database, or...), and these can be stored and accessed differently (for example, distributed over many servers)...

Data modeling with ER diagrams Entity-relationship diagrams depict a conceptual model (schema) of data by specifying entities (things), attributes (properties of the things) and relationships between the things. ERDs document semantics, not implementation. There are many forms of notation for ERDs (Chen, the optional reading, is an early one).

Data modeling with ER diagrams There is no “right answer” when determining entities, attributes, and relationships; different entities might be defined, or entities might be defined as relationships instead. It can be difficult to predict consequences of one choice vs. another. Similar decisions and consequences occur when translating a conceptual data model to a logical model for implementation (and then in the actual implementation).

Data modeling with ER diagrams Let’s take a simple example to play with ERDs. We want to model the idea that students take courses and instructors teach courses. One way to do this is to have three entities: students, courses, and instructors.

Simple ER example #1

Data modeling with ER diagrams We could also do this by having two entities: courses and people. In this model, people would have two roles, instructor and student. Chen has an example of a marriage as a relationship between two people, with roles for husband and wife (Chen was writing in 1976). Chen notes that a marriage can also be modeled as an entity.

Simple ER example #2

Data modeling with ER diagrams If we are interested in capturing other information about students and instructors besides their names, then we probably want to separate them (e.g., we might want to track how many credits students have and how many requirements they have completed, and these don’t apply to instructors). If there was overlap between students and instructors, we might have some redundancy going on. We might want to keep this in mind.

Back to simple ER example #1

Data modeling with ER diagrams Say we want to add the idea of grades to this model. Instructors assess student performance in courses by assigning them grades. How might we model grades? Are they entities? Are they attributes of some existing entity? Are they are relationship between entities? Let’s think about it.

Does this model work?

From ERD to database Databases can be implemented in different ways. Brookshear describes relational databases and object-oriented databases. In each case, the database would be implemented in a database management system (DBMS), which would hide details of the actual data composition and storage and such from application programs that use the data.

Relational databases In a relational database, entities are described as rows in a table. The table is called a relation. The table columns are attributes. Each table row is also called a tuple.

A relation Here is a relation that contains information about peer reviewers for an academic conference. There are some problems with this relation.

A relation One immediate problem is that we have multiple values for a single attribute. That’s going to make it hard to figure out how many papers are really assigned to Barb Chen (is that one number or two numbers), or to reassign papers, etc.

Revised relation We can fix this by making additional tuples so that no cell has multiple values. But then we have some redundancy in the relation; every time we assign a new paper, we need to input that name, school and city information as well. Redundancy can cause problems with data integrity.

Revised relations To fix this redundancy problem, we need to split up the table into multiple tables and relate each table with a “foreign key” that links back to the original table. Then we can also include more information about the papers, which is probably necessary anyway.

Revised relations

There’s still a problem with that original table, though; the city attribute isn’t really about the reviewer, is it? It’s about the school. It might not be likely that schools will change cities, but it’s still good practice to keep all the attributes in a table about the entity specified by the primary key (in this case, the reviewer).

Revised relations

So that’s good. But there was still redundancy in those Paper and Expertise tables. Also we have deletion problems...if a paper isn’t assigned to a reviewer, does that mean it can’t exist in the database? How would we fix that?

Revised relations

Normalization Wowee, that’s a lot of tables there! Do we really want to do that? Yes. And no. On the one hand, it’s good to minimize data redundancy, because this can lead to problems with updating. On the other hand, performing lots of Join operations to put information together again can be inefficient, from a performance perspective.

Database operations Databases are powerful because we can reassemble data in myriad ways. Basic relational database operations include Select, Project, and Join.

Database operations Select extracts rows (tuples) from a relation. Project extracts columns (attribute values). Join combines multiple relations into a new relation.

Query languages Database query languages, such as SQL, may implement the basic operations in different ways. A SQL statement may perform all three operations—Join, Select, and Project.

Very large databases For tremendous datasets such as Facebook, Twitter, and so on (which include not only things like posts but also relationships between items and user behavior), it’s hard to develop comprehensive models that align with the dynamic data environment. Plus, queries with lots of tables are slow.

Very large databases Hadoop is a set of technologies to manage extremely large, hetereogeneous, dynamic data environments. Primary Hadoop technologies include HDFS, a system for distributing data across multiple computers, and MapReduce, software for distributing processing on the distributed data. There are a variety of associated technologies emerging to facilitate processing of Hadoop data.

So...what then? What if I learn MySQL and it becomes obsolete? Of course it will become obsolete! But that doesn’t mean it’s foolish to learn it. Also, it’s important to remember where MySQL lies in the realm of “the database.” For example, conceptual models may be stable where logical models and implementation details change.