CS 440: Database Management Systems 1: Introduction.


1 CS 440: Database Management Systems 1: Introduction

2 Welcome to CS440! Arash Termehchy – Assistant professor in the school of EECS – Just moved here from Illinois – Usable data exploration systems. Your turn: – Name, field, DB background

3 Data management Modeling a large number of entities and relationships. – Called structured data – Formal (logical) model. Maintaining them on computational devices – Servers in the cloud, sensor networks, … – Keep them organized according to model – Cope with failures – …

4 Data management Exploring entities and relationships efficiently, easily, and effectively – Where are the more affordable apartments in Portland? – Who is the most similar person to Alan? – How is a virus likely to spread in a population? Make an informed and effective decision

5 Why study data management? Data is everywhere: – Business: financial analytics, … – Social: social network, data sharing, … – Personal: map apps, … – Science: spread of diseases, …

6 Data management is valuable According to McKinsey & Company: – $300 billion potential annual value to US health care – €250 billion potential annual value to Europe's public sector – 60% potential increase in retailers' operating margins. Data science is transforming the way we make decisions, make scientific discoveries, … – Analyzing genetic data to find cures for diseases.

7 Data management is challenging According to McKinsey & Company: – 30 billion data items shared on Facebook every month – 235 TB collected by the Library of Congress – 40% growth in the global data each year. 90% of the world's data was generated in the last two years! Big data: huge, heterogeneous, evolving

8 We study these challenges How to get what we like from the data easily, effectively, and efficiently?

9 Why should we learn these subjects? Isn't it sufficient to know SQL? – Let the companies that make database management systems worry about these issues. No! You will end up with: – A query that takes hundreds of hours to finish! – A database that contains negative salaries!

10 Why should we learn these subjects? Managing conventional data requires more: – Tuning databases, developing efficient data exploration programs, … You may face unconventional data management scenarios – The data may be a big graph that is constantly evolving. You may use data management ideas in your own work.

11 Prerequisites – Good programming skills – CS 261 and CS 275 or equivalent – Contact the instructor if you are not sure.

12 Readings Required: – Database Systems: The Complete Book, Hector Garcia-Molina, Jeffrey Ullman, and Jennifer Widom – Notes on the course website for subjects not covered by the textbook.

13 Readings Recommended: – Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke – Foundations of Databases, Serge Abiteboul, Richard Hull, and Victor Vianu. Other useful readings on the course website.

14 Grading Scheme – Assignments: 40% – Project: 60%

15 Assignments Written assignments – To understand the main concepts and methods. – Should be done individually. Start soon!

16 Project A database-centric application – Data engineering effort. Advanced feature – Different from CS 275 – Easier search (keyword search) – Nice visualization – …

17 Project Groups of 2 to 4 – Practice how to work in groups. Project definition is due in the third week of the class! – The data, application, and scope – 5% of total grade

18 Basic Concepts Database management system (DBMS): – A piece of software that simplifies and facilitates data management and exploration. Database content – Data – Schema: information about data, meaning of the data [Figure: the value 10 is the data; the labels "Salary:" and "Age:" are the schema]

19 Physical Data Independence Independence from physical details – File system, operating system, hardware, … Data models – The way that we see real-world data. – Relational data model: everything is a relation. Declarative query language: SQL – Say what, not how
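To make "say what, not how" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The Employee table and its rows are hypothetical and appear nowhere in the slides. The same question is answered twice: once by stating the condition in SQL, and once by scanning the rows by hand.

```python
import sqlite3

# Hypothetical table and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("Ann", 50), ("Bob", 70), ("Cy", 90)],
)

# Declarative: state WHAT we want; the DBMS decides how to find it.
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM Employee WHERE salary > 60 ORDER BY name")]

# Procedural: spell out HOW, by scanning every row ourselves.
procedural = []
for name, salary in conn.execute("SELECT name, salary FROM Employee"):
    if salary > 60:
        procedural.append(name)
procedural.sort()

print(declarative)
print(procedural)
```

Both lists hold the same names. The SQL version stays unchanged if the DBMS later stores the table differently or adds an index, which is the point of physical data independence.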

20 Relational Database Management [Diagram: Conceptual Design: Entity Relationship (ER) Model; Schema: Relational Model; Physical Storage: Files and Indexes]

21 Topics [Diagram repeated from slide 20 on every Topics slide]

22 Topics

23 Topics – Modeling data and asking questions: Relational Model & Languages

24 Topics – Organizing the data: Database design

25 Topics – Keeping the data clean and meaningful: Integrity constraints

26 Topics – Doing more than asking queries: Stored procedures, ORM

27 Topics – Storing data in files: Storage Management

28 Topics – Finding data in a big file really fast: Data access methods

29 Topics – Translating complex queries to read & write: Query execution & optimization

30 Topics – Coping with failure: Transaction Management

31 Topics – Tuning

32 Relational Database Management [Diagram repeated from slide 20]

33 Conceptual Design High-level data model – Describe information in the database without worrying about implementation issues. ER model is the most popular tool for conceptual design – Invented by Peter Chen in 1976 – Provides an easy-to-use language: pictures. We review the basics.

34 ER Model / Diagram [Diagram: entity sets Person (name, ssn, address), Publisher (name, address), and Book (title, category, price); relationships buys, sells, and employs]

35 ER Model Entity Set – An entity is a distinct real-world object, e.g., the cs540 textbook – An entity set is a collection of entities. Attribute – Belongs to an entity – Does not contain any other attribute: atomic – Atomic data types: string, integer, real, … [Diagram: entity sets Publisher and Book; Book has attributes title, category, price]

36 Relationship – Describes a connection between entity sets – Does not exist without entities – May have attributes [Diagram: Person employs Publisher, shown without and with the relationship attribute startdate]

37 Relationship Multiplicity One-to-one: publisher and manager. Many-to-one: book and publisher. Many-to-many: publisher and person.
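The three multiplicities can be sketched as table designs. The sketch below uses Python's sqlite3 module. All table layouts, column names, and rows are assumptions made for illustration, and the slides do not prescribe this mapping. A UNIQUE column models one-to-one, a plain foreign-key column models many-to-one, and a separate table with a composite key models many-to-many.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- One-to-one: UNIQUE lets each manager run at most one publisher.
CREATE TABLE Publisher (name TEXT PRIMARY KEY, manager TEXT UNIQUE);
CREATE TABLE Person (ssn TEXT PRIMARY KEY, name TEXT);
-- Many-to-one: each book row stores exactly one publisher.
CREATE TABLE Book (
    title TEXT PRIMARY KEY,
    publisher TEXT NOT NULL REFERENCES Publisher(name)
);
-- Many-to-many: one row per (person, publisher) pair.
CREATE TABLE Employs (
    ssn TEXT REFERENCES Person(ssn),
    publisher TEXT REFERENCES Publisher(name),
    startdate TEXT,
    PRIMARY KEY (ssn, publisher)
);
-- Hypothetical rows.
INSERT INTO Publisher VALUES ('P1', 'M1'), ('P2', 'M2');
INSERT INTO Person VALUES ('111', 'Alan'), ('222', 'Bo');
INSERT INTO Book VALUES ('B1', 'P1'), ('B2', 'P1');
INSERT INTO Employs VALUES ('111', 'P1', '2020-01-01'),
                           ('111', 'P2', '2021-06-01'),
                           ('222', 'P1', '2019-03-15');
""")

# Many books may share one publisher.
books_per_publisher = dict(conn.execute(
    "SELECT publisher, COUNT(*) FROM Book GROUP BY publisher"))
# One person may work for many publishers, and vice versa.
publishers_per_person = dict(conn.execute(
    "SELECT ssn, COUNT(*) FROM Employs GROUP BY ssn"))

# One-to-one: assigning manager M1 to a second publisher is rejected.
try:
    conn.execute("INSERT INTO Publisher VALUES ('P3', 'M1')")
    manager_reused = True
except sqlite3.IntegrityError:
    manager_reused = False

print(books_per_publisher, publishers_per_person, manager_reused)
```

The startdate column on Employs also shows how a relationship can carry its own attribute.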

38 Multi-way Relationships – Relationships between more than two entity sets – Each entity set has a different role in the relationship [Diagram: Purchase relationship connecting Book, Person (role: buyer), and Store (role: seller)]

39 ER Model: Keys Attribute(s) that uniquely identify entities – No standard way to annotate: usually underlined. Each entity set must have a key – Why? Relationships may also have keys. [Diagram: Person with attributes name, ssn, address]
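One way to see what a key buys you is to let a DBMS enforce it. This is a minimal sketch with Python's sqlite3 module. It assumes ssn is the key of Person, which the transcript does not confirm because the underlining in the diagram is lost, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: ssn is the key of Person.
conn.execute(
    "CREATE TABLE Person (ssn TEXT PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("INSERT INTO Person VALUES ('111', 'Alan', 'Portland')")

# A second entity with the same key value is rejected.
try:
    conn.execute("INSERT INTO Person VALUES ('111', 'Alan', 'Corvallis')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Non-key attributes may repeat: same name and address, different ssn.
conn.execute("INSERT INTO Person VALUES ('222', 'Alan', 'Portland')")
person_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]

print(duplicate_rejected, person_count)
```

Without a key, the two people named Alan in Portland could not be told apart, which is one answer to the "Why?" on the slide.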

40 Topics – Modeling data and asking questions: Relational Model & Languages

41 Relational Model The relational model defines data organization and data retrieval/manipulation operations. It is easier to implement than the ER model. It captures more details about the data.

42 An Example
Relation name: Book
Attribute names (column headers): Title, Price, Category, Year
Tuples (rows):
Title          Price    Category  Year
MySQL          $102.1   computer  2001
Cell biology   $        biology   1954
French cinema  $53.99   art       2002
NBA History    $63.65   sport     2010

43 Relational Model Attributes – Atomic values – Atomic types: string, integer, real, date, … Each relation must have keys – Attributes without duplicate values – A relation does not contain duplicate tuples. Reordering tuples does not change the relation. Reordering attributes does not change the relation.
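These three properties (no duplicate tuples, tuple order is irrelevant, attribute order is irrelevant) follow from treating a relation as a set of tuples, where each tuple maps attribute names to values. The Python sketch below illustrates this. The helper name relation is hypothetical, and the titles and years come from the Book example.

```python
def relation(rows):
    """Model a relation as a set of tuples; each tuple is a set of
    (attribute, value) pairs, so neither kind of ordering matters."""
    return frozenset(frozenset(row.items()) for row in rows)

r1 = relation([
    {"title": "MySQL", "year": 2001},
    {"title": "NBA History", "year": 2010},
])

# Same content with tuples reordered, attributes reordered,
# and one tuple repeated.
r2 = relation([
    {"year": 2010, "title": "NBA History"},
    {"title": "MySQL", "year": 2001},
    {"title": "MySQL", "year": 2001},
])

same_relation = (r1 == r2)
tuple_count = len(r2)
print(same_relation, tuple_count)
```

Note that SQL tables are more permissive than this model: without a key or a DISTINCT clause, a table or query result may hold duplicate rows.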

44 Database Schema vs. Database Instance Schema of a relation – Name of the relation and names of its attributes. – E.g.: Person(Name, Address, SSN) – Types of the attributes – Constraints on the values of the attributes. Schema of the database – Set of relation schemata – E.g.: Person(Name, Address, SSN), Employment(Company, SSN)
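A relation schema with names, types, and constraints might be declared as follows. This is a sketch with Python's sqlite3 module. The slide gives only the relation and attribute names, so the column types, the nine-character SSN check, the NOT NULL rules, and the foreign key are all assumptions added for illustration, as are the helper try_insert and the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Person (
    Name    TEXT NOT NULL,
    Address TEXT,
    SSN     TEXT PRIMARY KEY CHECK (length(SSN) = 9)  -- assumed rule
);
CREATE TABLE Employment (
    Company TEXT NOT NULL,
    SSN     TEXT NOT NULL REFERENCES Person(SSN)      -- assumed rule
);
""")

def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

valid_person = try_insert(
    "INSERT INTO Person VALUES (?, ?, ?)", ("Alan", "Portland", "123456789"))
short_ssn = try_insert(
    "INSERT INTO Person VALUES (?, ?, ?)", ("Bo", "Salem", "12"))
unknown_employee = try_insert(
    "INSERT INTO Employment VALUES (?, ?)", ("Acme", "999999999"))

print(valid_person, short_ssn, unknown_employee)
```

The same CHECK mechanism is how a schema can rule out values such as negative salaries.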

45 Database Schema vs. Database Instance
Schema: Book(Title, Price, Category, Year)
Instance:
Title          Price    Category  Year
MySQL          $102.1   computer  2001
Cell biology   $        biology   1954
French cinema  $53.99   art       2002
NBA History    $63.65   sport     2010
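The distinction can be observed directly: the schema stays fixed while the instance changes as rows are inserted. The sketch below uses Python's sqlite3 module. The column types and the helper names schema and instance are assumptions, and the Cell biology row is left out because its price is missing from the transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the slide lists only attribute names.
conn.execute(
    "CREATE TABLE Book (Title TEXT, Price REAL, Category TEXT, Year INTEGER)")

def schema():
    # PRAGMA table_info returns one row per column; index 1 is its name.
    return [row[1] for row in conn.execute("PRAGMA table_info(Book)")]

def instance():
    return conn.execute("SELECT * FROM Book ORDER BY Title").fetchall()

schema_before = schema()
rows_before = instance()

conn.executemany("INSERT INTO Book VALUES (?, ?, ?, ?)", [
    ("MySQL", 102.1, "computer", 2001),
    ("French cinema", 53.99, "art", 2002),
    ("NBA History", 63.65, "sport", 2010),
])

schema_after = schema()
rows_after = instance()
print(schema_after)
print(len(rows_before), len(rows_after))
```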

46 Example Schema Beers(name, manf), Bars(name, addr, license), Drinkers(name, addr, phone), Likes(drinker, beer), Sells(bar, beer, price), Frequents(drinker, bar)
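As a rough illustration of how this schema is used, the sketch below declares the six relations with Python's sqlite3 module and asks one question across three of them. The column types, the sample rows, and the query itself are assumptions and are not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Relation and attribute names follow the slide; types are assumed.
CREATE TABLE Beers(name TEXT, manf TEXT);
CREATE TABLE Bars(name TEXT, addr TEXT, license TEXT);
CREATE TABLE Drinkers(name TEXT, addr TEXT, phone TEXT);
CREATE TABLE Likes(drinker TEXT, beer TEXT);
CREATE TABLE Sells(bar TEXT, beer TEXT, price REAL);
CREATE TABLE Frequents(drinker TEXT, bar TEXT);
-- Hypothetical rows.
INSERT INTO Likes VALUES ('Alan', 'Lager'), ('Alan', 'Stout');
INSERT INTO Sells VALUES ('Bar A', 'Lager', 4.0),
                         ('Bar B', 'Stout', 6.0),
                         ('Bar B', 'Pilsner', 5.0);
INSERT INTO Frequents VALUES ('Alan', 'Bar A');
""")

# Which beers that Alan likes are sold at a bar he frequents, and at what price?
rows = conn.execute("""
    SELECT Sells.bar, Sells.beer, Sells.price
    FROM Likes
    JOIN Frequents ON Frequents.drinker = Likes.drinker
    JOIN Sells ON Sells.bar = Frequents.bar AND Sells.beer = Likes.beer
    WHERE Likes.drinker = 'Alan'
""").fetchall()
print(rows)
```

Likes, Sells, and Frequents each connect two of the entity relations, so most interesting questions over this schema need joins like the one above.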

