Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 CENG 351 CENG 351 Introduction to Data Management and File Structures Department of Computer Engineering METU.

Similar presentations


Presentation on theme: "1 CENG 351 CENG 351 Introduction to Data Management and File Structures Department of Computer Engineering METU."— Presentation transcript:

1 1 CENG 351 CENG 351 Introduction to Data Management and File Structures Department of Computer Engineering METU

2 2 CENG 351 Data Management: What is it about ?  managing data as a valuable resource  development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle

3 3 CENG 351 Data Management: What is it about ?  Database  Database Management Systems

4 4 CENG 351 Hardware Operating System DBMS File system Application Where do File Structures fit in ?

5 5 CENG 351 File Structures: What is it about ?  Storage of data  Organization of data  Access to data  Processing of data

6 6 CENG 351 Data Structures vs. File Structures Both involve: –Representation of Data + –Operations for accessing data Difference: –Data structures: deal with data in main memory –File structures: deal with data in secondary storage

7 7 CENG 351 Computer Architecture Main Memory (RAM) Secondary Storage data transfer data is manipulated here data is stored here - Semiconductors - Fast, expensive, volatile, small - disks, tape - Slow,cheap, stable, large

8 8 CENG 351 Introduction to Database Systems Slides from: Ramakrishnan & Gehrke, Elmasri

9 9 CENG 351 What Is a DBMS?  Database: A very large, integrated collection of data.  Data: Known facts that can be recorded and have an implicit meaning  Models real-world enterprise (also called a mini- world). – Entities (e.g., products, sales) – Relationships (e.g., 512 IPAD3 has been sold CEPA office between August 2012-November 2012)  A Database Management System (DBMS) is a software package designed to store and manage databases.

10 10 CENG 351 Why Study Databases?  Shift from computation to information – at the “low end”: scramble to webspace (a mess!) – at the “high end”: scientific applications  Datasets increasing in diversity and volume. – Digital libraries, interactive video, Human Genome project, EOS-Earth Observation System project –... need for DBMS exploding  DBMS encompasses most of CS – OS, languages, theory, “AI”, multimedia, logic

11 11 CENG 351 Why Use a DBMS?  Data independence and efficient access.  Reduced application development time.  Data integrity and security.  Uniform data administration.  Concurrent access, recovery from crashes.

12 12 Slide 1-12 Types of Databases and Database Applications  Numeric and Textual Databases  Multimedia Databases  Geographic Information Systems (GIS)  Data Warehouses  Real-time and Active Databases …

13 13 CENG 351 Data Models  A data model is a collection of concepts for describing data.  A schema is a description of a particular collection of data, using the given data model.  The relational model of data is the most widely used model today. –Main concept: relation, basically a table with rows and columns. –Every relation has a schema, which describes the columns, or fields.

14 14 CENG 351 Levels of Abstraction  Many external schemata, single conceptual(logical) schema and single physical schema.  External schemata describe how users see the data.  Conceptual schema defines logical structure  Physical schema describes the files and indexes used.  Schemas are defined using DDL; data is modified/queried using DML. Physical Schema Conceptual Schema External Schema 1 External Schema 3 External Schema 2

15 15 CENG 351 Example: University Database  Conceptual schema: – Students(sid: string, name: string, login: string, age: integer, gpa:real) – Courses(cid: string, cname:string, credits:integer) – Enrolled(sid:string, cid:string, grade:string)  Physical schema: – Relations, stored as unordered files. – Index, on first column of Students.  External Schema (View): – Course_info(cid:string,enrollment:integer)

16 16 CENG 351 Instance of Students Relation Students( sid: string, name: string, login: string, age: integer, gpa: real ) sidnameloginagegpa 53666Jonesjones@cs183.4jones@cs 53688Smithsmith@ee183.2smith@ee 53650Smithsmith@math193.8smith@math

17 17 CENG 351 Data Independence Applications insulated from how data is structured and stored.  Logical data independence: Protection from changes in logical structure of data.  Physical data independence: Protection from changes in physical structure of data. *One of the most important benefits of using a DBMS is Concurrency Control

18 18 CENG 351 Structure of a DBMS  A typical DBMS has a layered architecture.  Each system has its own variations. Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB These layers must consider concurrency control and recovery

19 19 CENG 351 Concurrency Control  Concurrent execution of user programs is essential for good DBMS performance.  Because disk accesses are frequent, and relatively slow, it is important to keep the CPU busy by working on several user programs concurrently.  Interleaving actions of different user programs can lead to inconsistency: e. g., check is cleared while account balance is being computed is an inconsistency.  DBMS ensures such problems don’t arise: users can assume as if they are using a single- user system.

20 20 CENG 351 Transaction: An Execution of a DB Program  Key concept is transaction, which is an atomic sequence of database actions (reads/ writes).  Each transaction, executed completely, must leave the DB in a consistent state, if DB is consistent when the transaction begins.  Users can specify some simple integrity constraints on the data, and the DBMS will enforce these constraints.

21 21 CENG 351 Ensuring Atomicity  DBMS ensures atomicity (all- or- nothing property) even if system crashes in the middle of a Tx.  Idea: Keep a log (history) of all actions carried out by the DBMS while executing a set of Tx:  Before a change is made to the database, the corresponding log entry is forced to a safe location. (OS support for the log is often inadequate.)  After a crash, the effects of partially executed transactions are undone using the log.

22 22 CENG 351 Summary ❖ DBMS used to maintain and query large datasets. ❖ Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security. ❖ Levels of abstraction give data independence. ❖ A DBMS typically has a layered architecture.

23 23 When not to use a DBMS  Main inhibitors (costs) of using a DBMS: –High initial investment and possible need for additional hardware. –Overhead for providing generality, security, concurrency control, recovery, and integrity functions.  When a DBMS may be unnecessary: –If the database and applications are simple, well defined, and not expected to change. –If there are stringent real-time requirements that may not be met because of DBMS overhead. –If access to data by multiple users is not required.

24 24 When not to use a DBMS  When no DBMS may suffice: –If the database system is not able to handle the complexity of data because of modeling limitations –If the database users need special operations not supported by the DBMS.


Download ppt "1 CENG 351 CENG 351 Introduction to Data Management and File Structures Department of Computer Engineering METU."

Similar presentations


Ads by Google