Persistence & Database Management Systems

Lecture 15 Persistence & Database Management Systems

Data Storage and Persistence
Data in RAM:
- Data saved in virtual memory is gone when the process ends
- The memory space is then free to be used by other processes currently being executed
Data storage:
- Data in storage remains unchanged beyond the end of the process lifecycle
- The runtime environment (RTE) provides services to processes to support data storage
- Examples: HDD, SD cards, CD, DVD, Blu-ray, etc.

Persistence: Persistent Data Storage
- Data that is stored in some storage mechanism
- In addition, it can be shared amongst multiple executing processes
Examples:
- File system: found in every data storage location
- Databases: relational databases and document-oriented databases (in other words, SQL and NoSQL databases)
We will give an introduction to SQL databases

The File System
Maps names to storage locations:
- Uses a tree to support the hierarchical structure
- / is the root, the starting point
Types of file systems, and where each is used:
- FAT (12, 16, 32): Windows x86
- NTFS: Windows NT
- exFAT: flash drives
- ext (2, 3, 4): Linux
Types of entries in a Linux file system:
- File
- Directory (folder)
- Symbolic link (shortcut)

The File System: The Need Behind a File System
Without a file system:
- Stored data would be one large body of data, a sequence of bytes
- There would be no way to tell where one piece of information stops and the next begins
Benefits:
- Data is separated into pieces, and each piece is given a name
- Names are mapped to storage locations
- The information can be isolated and identified
- The file system controls how data is stored and manipulated
- The file system provides an interface to manipulate the data
File system interface:
- Accessing files using a stream object: FILE* in C, FileReader in Java
- Manipulating and locking files through system calls: open, close, delete, create, get_files
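The stream-style interface described above can be sketched in Python. This is only an illustration: the temporary directory and the file name notes.txt are hypothetical, and Python's built-in open and os.remove stand in for the open, close, create and delete calls named on the slide.

```python
import os
import tempfile


def write_then_read(text):
    """Write text to a named file, close it, then reopen it by name and read it back."""
    # A temporary directory stands in for any storage location (hypothetical path).
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "notes.txt")

    # create + open + write + close: once the stream is closed, the data lives in storage.
    with open(path, "w", encoding="utf-8") as stream:
        stream.write(text)

    # A later open by name finds the same bytes; another process could do the same.
    with open(path, "r", encoding="utf-8") as stream:
        recovered = stream.read()

    # delete: remove the file, then the now-empty directory.
    os.remove(path)
    os.rmdir(directory)
    return recovered, os.path.exists(path)


recovered, still_exists = write_then_read("persistent data")
print(recovered, still_exists)
```

The name is the only thing the second open needs, which is the "mapping names to storage locations" role of the file system.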

Relational Databases
- Relational databases define a richer model of data than the file system does
- These databases are built on top of the file system
Relational databases consist of:
- Data stored as tables
- Relations between the tables
Databases handle:
- Storing the database
- Providing an API for database manipulation
The programmer handles:
- Defining the tables and their relations

Relational Database Services
- Implement a data model defined by a programmer
- Storage is handled by the database application itself!
- Databases provide an interface for data storage and manipulation
- The programmer uses the interface to execute queries
- The API is accessed through the database management system (DBMS) object
DBMS tasks:
- Manages storage of data
- Optimizes executed queries
- Allows efficient concurrent access while keeping data integrity (concurrency safety!)
But wait! There's more!

Relational Database Services
Manages secure access:
- Using credentials (username/password access)
- Ensures legal access to the database, and to parts of the database, following a permission table
Session and transaction management:
- A session is the lifecycle of an active connection to the database opened by a programmer
- A transaction is a series of queries and commands executed on the database as an atomic unit: it either completely succeeds or completely fails
Back-end services, not visible to the user:
- File and access methods
- Buffer management
- Disk space management

Data Models
A data model decides which data values can be stored in storage
Object-oriented data model:
- Objects (complex types) contain primitive fields as well as other objects
- Classes contain methods manipulating the state of the object
Relational database model:
- Logical representation of information
- Data is organized in tables (called relations)
- Tables consist of labeled columns (attributes)
- A record is an entry in a table (one row, a tuple)
- Relations between the tables are expressed using primary and foreign keys

Relational Model: Advantages
- Simplicity: a relational data model is simpler than the hierarchical and network models
- Structural independence: the relational database is only concerned with data and not with a structure. This can improve the performance of the model
- Easy to use: tables consisting of rows and columns are quite natural and simple to understand
- Query capability: it makes it possible for a high-level query language like SQL to avoid complex database navigation
- Data independence: the structure of a database can be changed without having to change any application
- Scalable: scalable in terms of the number of records (rows) and the number of fields. The implementation is scalable, handled by the model manager

Example: University Data Model
Departments:
- Each specializes in one subject
- Each has one head of department (and other properties)
- Each has one geographical location
Students:
- Each student has a name (and other properties)
- May be enrolled in one department
- Registered to multiple courses
Courses:
- Each course has a name (and other properties)
- Each course has a list of registered students

Relational Model Concepts: Basic Concepts
Tables:
- Relations are saved in table format
- A table is stored along with its entities
- A table has two properties: rows and columns
- Rows represent records and columns represent attributes
Attribute:
- Each column name in a table
- Attributes are the properties which define a relation
- Attribute domain: the predefined values and type of a specific attribute
Tuple:
- A single row of a table
- Contains a single record of the table
Relation schema:
- The name of the relation together with its attributes

Step 1: define the table names
- students
- courses
- departments
Step 2: define the table attributes (column names)
- students: name, address, phone number, …
- courses: name, credit points, teacher, …
- departments: name, head of department name, location, …

Relational Model Concepts: Table Key
Primary key: an attribute that contains a unique value in a table
- Its value uniquely identifies one record in the table!
- Each table may have only one primary key
- Primary keys are used to implement relations between tables!
Foreign key: an attribute in one table that references the primary key of another table
- Foreign key values are taken from the referenced primary key
- By adding the primary key of one table as a foreign key to another table, we effectively implement a relationship between these two tables!
Composite key: consists of two or more attributes, and can act as a primary or foreign key
- Two composite key values are identical only if they agree on every one of their attributes
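The key rules above can be checked with a small experiment. This is a minimal sketch, assuming SQLite through Python's standard sqlite3 module rather than whichever DBMS the lecture uses; the two tables are cut-down versions of the university example, the rows for "Mathematics" and "Algebra" are made up for the test, and the helper rejected is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Departments(id INT PRIMARY KEY, name VARCHAR(50))")
conn.execute(
    "CREATE TABLE Courses(id INT PRIMARY KEY, name VARCHAR(50), department_id INT, "
    "FOREIGN KEY(department_id) REFERENCES Departments(id))"
)
conn.execute("INSERT INTO Departments VALUES (0, 'Computer Science')")
conn.execute("INSERT INTO Courses VALUES (0, 'Systems Programming', 0)")


def rejected(statement):
    """Return True if the database refuses the statement with an integrity error."""
    try:
        conn.execute(statement)
        return False
    except sqlite3.IntegrityError:
        return True


# A second department with id 0 would break primary key uniqueness.
duplicate_key_rejected = rejected("INSERT INTO Departments VALUES (0, 'Mathematics')")
# Department 99 does not exist, so the foreign key has nothing to reference.
dangling_reference_rejected = rejected("INSERT INTO Courses VALUES (1, 'Algebra', 99)")
print(duplicate_key_rejected, dangling_reference_rejected)
```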

Relational Model – 1:1 relationship
1:1, one-to-one relation (1↔1):
- Each person has one fingerprint (of their finger)
- Each fingerprint belongs to one person
1:1 implementation:
- Add id to the people table
- Add fingerprint to the people table
- Make both the id and the fingerprint attributes unique
The proposed solution adds fingerprint to the people table!
- Solution 2: add person_id to a fingerprints table instead
- Solution 3: create a new table containing both keys
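The first 1:1 solution can be sketched as follows, again assuming SQLite via Python's sqlite3 module. The table name People, the column sizes, and the fingerprint strings are hypothetical placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is unique because it is the primary key; fingerprint is declared UNIQUE explicitly.
conn.execute(
    "CREATE TABLE People(id INT PRIMARY KEY, name VARCHAR(50), "
    "fingerprint VARCHAR(64) NOT NULL UNIQUE)"
)
conn.execute("INSERT INTO People VALUES (1, 'Joe', 'fp-aaa')")
conn.execute("INSERT INTO People VALUES (2, 'Mark', 'fp-bbb')")

try:
    # A third person reusing Joe's fingerprint would make the relation 1:n.
    conn.execute("INSERT INTO People VALUES (3, 'Dana', 'fp-aaa')")
    shared_fingerprint_rejected = False
except sqlite3.IntegrityError:
    shared_fingerprint_rejected = True

people_count = conn.execute("SELECT COUNT(*) FROM People").fetchone()[0]
print(shared_fingerprint_rejected, people_count)
```

With both attributes unique, each person row carries exactly one fingerprint and no fingerprint appears in two rows.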

Relational Model – 1:n relationship
n:1, many-to-one relation (n↔1):
- Each student has multiple exam papers
- Each exam paper belongs to one student only
n:1 implementation:
- The student primary key is added to the exam papers table as a foreign key
Why does it work?
- It ensures that each exam paper belongs to one student
- It ensures that each student has zero or more exam papers
- The relation is recorded in the exam papers table

Relational Model – n:m relationship
n:m, many-to-many relation (n↔m):
- Each student is registered to multiple courses
- Each course has multiple students registered to it
n:m implementation:
- A new table is created: students_courses
- student_id and course_id are added as a new composite key
- The composite key is unique
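A minimal sketch of the n:m implementation, assuming SQLite via Python's sqlite3 module. Only the linking table is created here, so the ids are bare integers without foreign keys; the sample registrations are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The composite primary key makes each (student, course) pair unique.
conn.execute(
    "CREATE TABLE Students_Courses(student_id INT, course_id INT, "
    "PRIMARY KEY(student_id, course_id))"
)
conn.execute("INSERT INTO Students_Courses VALUES (0, 0), (1, 0), (1, 1)")

try:
    # Student 1 is already registered to course 0, so this pair is a duplicate.
    conn.execute("INSERT INTO Students_Courses VALUES (1, 0)")
    duplicate_registration_rejected = False
except sqlite3.IntegrityError:
    duplicate_registration_rejected = True

# One student with many courses, and one course with many students.
courses_of_student_1 = [row[0] for row in conn.execute(
    "SELECT course_id FROM Students_Courses WHERE student_id = 1 ORDER BY course_id")]
students_of_course_0 = [row[0] for row in conn.execute(
    "SELECT student_id FROM Students_Courses WHERE course_id = 0 ORDER BY student_id")]
print(duplicate_registration_rejected, courses_of_student_1, students_of_course_0)
```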

Step 3: identify unique and identifying attributes for each table
- students: each will have a unique number, student_id
- courses: each will have a unique number, course_id
- departments: each will have a unique number, department_id
A programmer may use a unique attribute as the primary key or create an additional primary key, as they see fit!

Step 4: detect relations between tables, Relation 1
- Each student can be registered to multiple courses
- Each course has a list of multiple students registered to it
Relationship type: n↔m
Implementation:
- A new table is created: students_courses
- student_id and course_id are added as a new composite key
- The composite key is unique

Step 4: detect relations between tables, Relation 2
- Each course belongs to one department
- Each department has a list of courses
Relationship type: 1↔n
Implementation:
- The department primary key is added to the courses table as a foreign key

University Data Model: Proposed Schema
We wish to do three things:
- Implement the schema, effectively creating the database
- Add data to the database (and later modify and delete data as needed)
- Execute queries on the database
These three tasks are done using SQL, the Structured Query Language

SQL – Structured Query Language
SQL is the language used to interact with relational databases!
SQL consists of two parts:
Data Definition Language: defines schemas, relations and data domains
- Schemas: CREATE TABLE, DROP TABLE, ALTER TABLE
- Relations: Primary Key, Foreign Key, Composite Key
- Domains: data types (such as integer, float, varchar, etc.), NULL, NOT NULL
Data Manipulation Language: queries, insertions, updates and deletions
- Queries: SELECT
- Insertions: INSERT INTO
- Updates: UPDATE
- Deletions: DELETE FROM
First, we will implement the schema using the Data Definition Language
Then, we will manipulate the data using the Data Manipulation Language

Schema: Implementation
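    -- Departments and Students are created first because other tables reference them.
    -- The id and name columns match those used by the INSERT statements;
    -- the VARCHAR(50) type of Departments.name is an assumption.
    CREATE TABLE Departments(
        id INT,
        name VARCHAR(50),
        head_name VARCHAR(50) NOT NULL,
        location VARCHAR(30),
        PRIMARY KEY(id)
    );

    CREATE TABLE Students(
        id INT,
        name VARCHAR(50),
        PRIMARY KEY(id)
    );

    CREATE TABLE Courses(
        id INT,
        name VARCHAR(50) NOT NULL,
        credit_points INT,
        department_id INT,
        PRIMARY KEY(id),
        FOREIGN KEY(department_id) REFERENCES Departments(id)
    );

    CREATE TABLE Students_Courses(
        student_id INT,
        course_id INT,
        PRIMARY KEY(student_id, course_id),
        FOREIGN KEY(student_id) REFERENCES Students(id),
        FOREIGN KEY(course_id) REFERENCES Courses(id)
    );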

Database Schema: Adding Data to Database
    INSERT INTO Departments(id, name, head_name, location)
    VALUES (0, 'Computer Science', 'Ohad', 'Beer-Sheva');

    INSERT INTO Students(id, name)
    VALUES (0, 'Joe'), (1, 'Mark');

    INSERT INTO Courses(id, name, credit_points, department_id)
    VALUES (1, 'Introduction to Computer Science', 5, 0),
           (0, 'Systems Programming', 5, 0);

    INSERT INTO Students_Courses(student_id, course_id)
    VALUES (0, 0), (1, 0), (1, 1);
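The schema and the data above can be loaded end to end as a quick check. This is a sketch under stated assumptions: it uses SQLite through Python's sqlite3 module instead of the lecture's DBMS, it assumes Departments and Students each have an id and a name column (as the INSERT statements imply), and the helpers build_university_db and row_count are hypothetical names.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Departments(
    id INT, name VARCHAR(50), head_name VARCHAR(50) NOT NULL, location VARCHAR(30),
    PRIMARY KEY(id));
CREATE TABLE Students(id INT, name VARCHAR(50), PRIMARY KEY(id));
CREATE TABLE Courses(
    id INT, name VARCHAR(50) NOT NULL, credit_points INT, department_id INT,
    PRIMARY KEY(id),
    FOREIGN KEY(department_id) REFERENCES Departments(id));
CREATE TABLE Students_Courses(
    student_id INT, course_id INT,
    PRIMARY KEY(student_id, course_id),
    FOREIGN KEY(student_id) REFERENCES Students(id),
    FOREIGN KEY(course_id) REFERENCES Courses(id));
"""

DATA = """
INSERT INTO Departments(id, name, head_name, location)
    VALUES (0, 'Computer Science', 'Ohad', 'Beer-Sheva');
INSERT INTO Students(id, name) VALUES (0, 'Joe'), (1, 'Mark');
INSERT INTO Courses(id, name, credit_points, department_id)
    VALUES (1, 'Introduction to Computer Science', 5, 0),
           (0, 'Systems Programming', 5, 0);
INSERT INTO Students_Courses(student_id, course_id) VALUES (0, 0), (1, 0), (1, 1);
"""

TABLES = ("Departments", "Students", "Courses", "Students_Courses")


def build_university_db():
    """Create an in-memory database holding the university schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only on request
    conn.executescript(SCHEMA)
    conn.executescript(DATA)
    return conn


def row_count(conn, table):
    """Count the rows of one of the known tables."""
    if table not in TABLES:  # table names cannot be bound as query parameters
        raise ValueError(f"unknown table: {table}")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = build_university_db()
counts = {table: row_count(conn, table) for table in TABLES}
print(counts)
```

The rows are inserted in dependency order (departments and students before the rows that reference them), which is why the foreign key checks pass.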

Database Data: Executing Queries
This database is rich with information, beyond the data stored in it explicitly.
Queries that can be executed:
- Return the number of students registered to the course whose id equals 0
- Return the number of students registered to the Systems Programming course
The following tools are needed to execute these queries:
- SELECT FROM: executes the query
- WHERE: conditions on the query
- COUNT: an aggregation function (others: SUM, AVG)
- INNER JOIN: combines tables together as one (others: LEFT, RIGHT, OUTER)
  - Enables us to retrieve information from several tables at once
  - Applies a cross product to two tables, keeping only the row pairs that satisfy some condition

Return the number of students registered to Course of id equals 0
We apply an aggregation function to a query on a single table, using one condition:
    SELECT COUNT(student_id)
    FROM Students_Courses
    WHERE course_id = 0;
Result? 2
What if we do not have the id, just the name of the course? See the next slide

Return the number of students registered to Systems Programming Course
In this case we don't know the id, but we have the course name
This requires us to work on two different tables
Proposition 1:
- Using the course name, we find its id by applying a query to the Courses table
- Using the returned result, we execute a query on the Students_Courses table
Proposition 2:
- Apply a COUNT query to the joined table Courses-Students_Courses, which is the result of a join operation on the Courses and Students_Courses tables where Courses.id = Students_Courses.course_id

Proposition 1
Query 1:
    SELECT id INTO @our_course_id
    FROM Courses
    WHERE Courses.name = 'Systems Programming';
Query 2:
    SELECT COUNT(student_id)
    FROM Students_Courses
    WHERE course_id = @our_course_id;
our_course_id is the result of the first query.
Result? 2
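Proposition 1 can be tried out as two separate queries. This is a minimal sketch assuming SQLite via Python's sqlite3 module: SQLite has no @variables, so an ordinary Python variable (named our_course_id to match the slide) carries the result of the first query into the second. Only the two tables the queries touch are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Courses(id INT PRIMARY KEY, name VARCHAR(50) NOT NULL);
    CREATE TABLE Students_Courses(student_id INT, course_id INT,
                                  PRIMARY KEY(student_id, course_id));
    INSERT INTO Courses VALUES (1, 'Introduction to Computer Science'),
                               (0, 'Systems Programming');
    INSERT INTO Students_Courses VALUES (0, 0), (1, 0), (1, 1);
""")

# Query 1: look up the course id from its name.
our_course_id = conn.execute(
    "SELECT id FROM Courses WHERE name = ?", ("Systems Programming",)
).fetchone()[0]

# Query 2: count the registrations that carry that course id.
registered = conn.execute(
    "SELECT COUNT(student_id) FROM Students_Courses WHERE course_id = ?",
    (our_course_id,),
).fetchone()[0]

print(our_course_id, registered)
```

The ? placeholders let the driver bind the values, so the course name never has to be pasted into the SQL text.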

Proposition 2
    SELECT COUNT(*)
    FROM Students_Courses INNER JOIN Courses
        ON Students_Courses.course_id = Courses.id
    WHERE Courses.name = 'Systems Programming'
    GROUP BY Courses.id;
This query consists of three parts:
- A JOIN operation between the Students_Courses and Courses tables
- A condition on the name of the course
- An aggregation function that counts the number of records in the result

Proposition 2: Step 1
Apply an inner join between the Courses and Students_Courses tables, where Students_Courses.course_id equals Courses.id
    SELECT *
    FROM Students_Courses INNER JOIN Courses
        ON Students_Courses.course_id = Courses.id;
This combines both tables into one, on which we will execute the query

Proposition 2: Step 2
Add a condition: name = 'Systems Programming'
    SELECT *
    FROM Students_Courses INNER JOIN Courses
        ON Students_Courses.course_id = Courses.id
    WHERE Courses.name = 'Systems Programming';
This filters out unneeded records

Proposition 2: Step 3
Add an aggregation function: COUNT(*)
    SELECT COUNT(*)
    FROM Students_Courses INNER JOIN Courses
        ON Students_Courses.course_id = Courses.id
    WHERE Courses.name = 'Systems Programming'
    GROUP BY Courses.id;
This returns the number of records instead of the rows themselves
Result? 2
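The three steps of Proposition 2 can be followed one at a time. This is a sketch assuming SQLite via Python's sqlite3 module, with only the two joined tables created; the ORDER BY clauses are added here purely to make the printed rows deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Courses(id INT PRIMARY KEY, name VARCHAR(50) NOT NULL);
    CREATE TABLE Students_Courses(student_id INT, course_id INT,
                                  PRIMARY KEY(student_id, course_id));
    INSERT INTO Courses VALUES (1, 'Introduction to Computer Science'),
                               (0, 'Systems Programming');
    INSERT INTO Students_Courses VALUES (0, 0), (1, 0), (1, 1);
""")

JOIN = ("FROM Students_Courses INNER JOIN Courses "
        "ON Students_Courses.course_id = Courses.id ")

# Step 1: the joined table, one row per registration with the course columns attached.
joined = conn.execute(
    "SELECT * " + JOIN + "ORDER BY student_id, course_id").fetchall()

# Step 2: keep only the rows of the course we care about.
filtered = conn.execute(
    "SELECT * " + JOIN + "WHERE Courses.name = ? ORDER BY student_id",
    ("Systems Programming",)).fetchall()

# Step 3: return the number of remaining rows instead of the rows themselves.
count = conn.execute(
    "SELECT COUNT(*) " + JOIN + "WHERE Courses.name = ? GROUP BY Courses.id",
    ("Systems Programming",)).fetchone()[0]

for row in joined:
    print(row)
print(filtered, count)
```

Each joined row has the columns student_id, course_id, id and name, in that order.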

Transactions
A transaction is a series of commands or queries applied to the database one after another
Transactions follow the ACID properties:
- Atomicity: treated as one unit
- Consistency: keeps the database invariants
- Isolation: transaction execution is invisible until it completes
- Durability: data manipulation is permanent
Transaction execution:
1. Begin the transaction
2. Execute several data manipulations and queries
3. If no errors occur, commit the transaction
4. If errors occur, roll back the transaction
5. End the transaction
Example of a transaction: moving money from one bank account to another
Steps:
1. Money is removed from one bank account
2. Money is added to another bank account
If there is a power failure after step 1 and before step 2, the client can lose money!

Transactions: Atomicity
- Transactions may be composed of multiple statements
- Atomicity guarantees that a transaction is treated as a single unit: either the whole unit succeeds, or the transaction fails completely
- If at least one of the statements fails, the transaction fails and the database is left unchanged!
- An atomic system must guarantee atomicity in every situation: power failures, errors and crashes
- Once a transaction completes successfully, it is committed to storage
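The bank transfer from the Transactions slide makes a compact atomicity demo. This is a minimal sketch assuming SQLite via Python's sqlite3 module; the Accounts table, the balances, and the fail_midway flag (which raises an exception to imitate a failure between step 1 and step 2) are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts(id INT PRIMARY KEY, balance INT NOT NULL)")
conn.execute("INSERT INTO Accounts VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(source, target, amount, fail_midway=False):
    """Move money as one unit; return True if committed, False if rolled back."""
    try:
        # "with conn" commits if the block finishes and rolls back if it raises.
        # sqlite3 opens the transaction implicitly before the first UPDATE.
        with conn:
            conn.execute("UPDATE Accounts SET balance = balance - ? WHERE id = ?",
                         (amount, source))
            if fail_midway:
                raise RuntimeError("simulated failure between step 1 and step 2")
            conn.execute("UPDATE Accounts SET balance = balance + ? WHERE id = ?",
                         (amount, target))
        return True
    except RuntimeError:
        return False


def balances():
    """Return the balances of all accounts, ordered by account id."""
    return [row[0] for row in conn.execute("SELECT balance FROM Accounts ORDER BY id")]


committed = transfer(1, 2, 30)
after_commit = balances()
failed = transfer(1, 2, 30, fail_midway=True)
after_failure = balances()
print(committed, after_commit, failed, after_failure)
```

After the failed attempt the balances are the same as before it started: the withdrawal made in step 1 was undone together with the rest of the transaction.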

Transactions: Consistency
- Consistency ensures that a transaction brings the database from one valid state to another
- A valid state is a state that keeps the invariants of the database: constraints, cascades, triggers
- This prevents database corruption by an illegal transaction
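One kind of invariant, a constraint, can be sketched as follows, assuming SQLite via Python's sqlite3 module. The rule "a balance is never negative" and the Accounts table are hypothetical examples, not part of the lecture's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the invariant: no row may ever hold a negative balance.
conn.execute(
    "CREATE TABLE Accounts(id INT PRIMARY KEY, "
    "balance INT NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO Accounts VALUES (1, 100)")
conn.commit()

try:
    with conn:  # rolls back when the statement below is refused
        conn.execute("UPDATE Accounts SET balance = balance - 500 WHERE id = 1")
    overdraft_rejected = False
except sqlite3.IntegrityError:
    overdraft_rejected = True

balance = conn.execute("SELECT balance FROM Accounts WHERE id = 1").fetchone()[0]
print(overdraft_rejected, balance)
```

The illegal update never becomes visible, so the database moves only between states that satisfy the constraint.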

Transactions: Isolation
- Isolation determines how transaction execution is visible to other users and other systems
- Isolation is typically defined at the database level, as a property that defines how and when the changes made by one operation become visible to others
- Most DBMSs offer a number of isolation levels, which control the degree of locking that occurs when selecting data
- The more locking, the less others see!

Transactions: Durability
- In database systems, durability guarantees that transactions that have committed will survive permanently
- Example: if a flight booking reports that a seat has been booked successfully, the seat remains booked even if the system crashes!
Durability can be achieved by:
- Storing the transaction's log records in non-volatile storage (not RAM!)
- Doing so before acknowledging the commit
- In distributed transactions, all participating servers must coordinate before the commit can be acknowledged
Durability is implemented by writing transactions into a transaction log:
- The log can be reprocessed to recreate the system state right before any failure
- A transaction is deemed committed only after it is entered in the log

Transaction: Code Example
    -- start a new transaction
    START TRANSACTION;

    -- Get the latest order number
    [...] FROM orders;

    -- insert a new order for customer 145
    INSERT INTO Orders(number, date, required_date, shipped_date, status, customer_number)
    [...] ' ', ' ', ' ', 'In Process', 145);

    -- Insert order line items
    INSERT INTO Order_Details(number, product_code, quantity_ordered, price_each, order_line_number)
    VALUES [...] 30, '136', 1),
           [...] 50, '55.09', 2);

    -- commit changes
    COMMIT;
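The shape of this transaction (begin, read the latest order number, insert the order, insert its lines, commit) can be sketched in runnable form. Everything specific here is an assumption: SQLite via Python's sqlite3 module with explicit BEGIN and COMMIT in place of START TRANSACTION, reduced Orders and Order_Details tables without the date columns, a pre-existing order number 100, and the product codes 'P-1' and 'P-2'. Only the customer number, quantities and prices are taken from the slide.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN and COMMIT below are issued by this code and nothing else.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE Orders(number INT PRIMARY KEY, status VARCHAR(20), customer_number INT);
    CREATE TABLE Order_Details(number INT, product_code VARCHAR(20),
                               quantity_ordered INT, price_each REAL,
                               order_line_number INT);
    INSERT INTO Orders VALUES (100, 'Shipped', 7);
""")


def place_order(customer_number, lines):
    """Insert one order and its line items as a single transaction."""
    conn.execute("BEGIN")
    try:
        # Get the latest order number and derive the next one.
        latest = conn.execute("SELECT MAX(number) FROM Orders").fetchone()[0]
        new_number = (latest or 0) + 1
        conn.execute("INSERT INTO Orders VALUES (?, 'In Process', ?)",
                     (new_number, customer_number))
        conn.executemany(
            "INSERT INTO Order_Details VALUES (?, ?, ?, ?, ?)",
            [(new_number, code, quantity, price, line_number)
             for line_number, (code, quantity, price) in enumerate(lines, start=1)])
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # undo the partial order, then report the error
        raise
    return new_number


order_number = place_order(145, [("P-1", 30, 136.0), ("P-2", 50, 55.09)])
detail_rows = conn.execute(
    "SELECT COUNT(*) FROM Order_Details WHERE number = ?", (order_number,)
).fetchone()[0]
print(order_number, detail_rows)
```

If any insert fails, the ROLLBACK leaves neither the order nor its line items behind.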
