1Introduction Bayu Adhi Tama, ST., MTI. Slides are adapted from Elmasri and Silberschatz with necessary modification
2Database Management System (DBMS) DBMS contains information about a particular enterpriseCollection of interrelated dataSet of programs to access the dataAn environment that is both convenient and efficient to useDatabase Applications:Banking: all transactionsAirlines: reservations, schedulesUniversities: registration, gradesSales: customers, products, purchasesOnline retailers: order tracking, customized recommendationsManufacturing: production, inventory, orders, supply chainHuman resources: employee records, salaries, tax deductionsDatabases touch all aspects of our lives
3Purpose of Database Systems In the early days, database applications were built directly on top of file systemsDrawbacks of using file systems to store data:Data redundancy and inconsistencyMultiple file formats, duplication of information in different filesDifficulty in accessing dataNeed to write a new program to carry out each new taskData isolation — multiple files and formatsIntegrity problemsIntegrity constraints (e.g. account balance > 0) become “buried” in program code rather than being stated explicitlyHard to add new constraints or change existing onesDrawback = disadvantage
4Levels of AbstractionPhysical level: describes how a record (e.g., customer) is stored.Logical level: describes data stored in database, and the relationships among the data.type customer = recordcustomer_id : string; customer_name : string; customer_street : string; customer_city : string;end;View level: application programs hide details of data types. Views can also hide information (such as an employee’s salary) for security purposes.
5An architecture for a database system View of DataAn architecture for a database system
6Instances and SchemasSimilar to types and variables in programming languagesSchema – the logical structure of the databaseExample: The database consists of information about a set of customers and accounts and the relationship between them)Analogous to type information of a variable in a programPhysical schema: database design at the physical levelLogical schema: database design at the logical levelInstance – the actual content of the database at a particular point in timeAnalogous to the value of a variablePhysical Data Independence – the ability to modify the physical schema without changing the logical schemaApplications depend on the logical schemaIn general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.
7Data Manipulation Language (DML) Language for accessing and manipulating the data organized by the appropriate data modelDML also known as query languageTwo classes of languagesProcedural – user specifies what data is required and how to get those dataDeclarative (nonprocedural) – user specifies what data is required without specifying how to get those dataSQL is the most widely used query language
8Data Definition Language (DDL) Specification notation for defining the database schemaExample: create table account ( account_number char(10),branch_name char(10),balance integer)DDL compiler generates a set of tables stored in a data dictionaryData dictionary contains metadata (i.e., data about data)Database schemaData storage and definition languageSpecifies the storage structure and access methods usedIntegrity constraintsDomain constraintsReferential integrity (e.g. branch_name must correspond to a valid branch in the branch table)Authorization
9Relational Model Example of tabular data in the relational model Attributes
11SQL SQL: widely used non-procedural language Example: Find the name of the customer with customer-id select customer.customer_name from customer where customer.customer_id = ‘ ’Example: Find the balances of all accounts held by the customer with customer-id select account.balance from depositor, account where depositor.customer_id = ‘ ’ and depositor.account_number = account.account_numberApplication programs generally access databases through one ofLanguage extensions to allow embedded SQLApplication program interface (e.g., ODBC/JDBC) which allow SQL queries to be sent to a database
12Database DesignThe process of designing the general structure of the database:Logical Design – Deciding on the database schema. Database design requires that we find a “good” collection of relation schemas.Business decision – What attributes should we record in the database?Computer Science decision – What relation schemas should we have and how should the attributes be distributed among the various relation schemas?Physical Design – Deciding on the physical layout of the database
13The Entity-Relationship Model Models an enterprise as a collection of entities and relationshipsEntity: a “thing” or “object” in the enterprise that is distinguishable from other objectsDescribed by a set of attributesRelationship: an association among several entitiesRepresented diagrammatically by an entity-relationship diagram:
14Other Data ModelsObject-oriented data modelObject-relational data model
16Database Management System Internals Storage managementQuery processingTransaction processing
17Storage ManagementStorage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.The storage manager is responsible to the following tasks:Interaction with the file managerEfficient storing, retrieving and updating of dataIssues:Storage accessFile organizationIndexing and hashing
18Query Processing1. Parsing and translation2. Optimization3. Evaluation
19Query Processing (Cont.) Alternative ways of evaluating a given queryEquivalent expressionsDifferent algorithms for each operationCost difference between a good and a bad way of evaluating a query can be enormousNeed to estimate the cost of operationsDepends critically on statistical information about relations which the database must maintainNeed to estimate statistics for intermediate results to compute cost of complex expressions
20Transaction Management A transaction is a collection of operations that performs a single logical function in a database applicationTransaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.
21History of Database Systems 1950s and early 1960s:Data processing using magnetic tapes for storageTapes provide only sequential accessPunched cards for inputLate 1960s and 1970s:Hard disks allow direct access to dataNetwork and hierarchical data models in widespread useTed Codd defines the relational data modelWould win the ACM Turing Award for this workIBM Research begins System R prototypeUC Berkeley begins Ingres prototypeHigh-performance (for the era) transaction processing
22History (cont.)1980s:Research relational prototypes evolve into commercial systemsSQL becomes industry standardParallel and distributed database systemsObject-oriented database systems1990s:Large decision support and data-mining applicationsLarge multi-terabyte data warehousesEmergence of Web commerce2000s:XML and XQuery standardsAutomated database administrationIncreasing use of highly parallel database systemsWeb-scale distributed data storage systems
23Extending Database Capabilities 4/6/2017New functionality is being added to DBMSs in the following areas:Scientific ApplicationsImage Storage and ManagementAudio and Video data managementData MiningSpatial data managementTime Series and Historical Data ManagementThe above gives rise to new research and development in incorporating new data types, complex data structures, new operations and storage and indexing schemes in database systems.
24Database UsersUsers are differentiated by the way they expect to interact withthe systemApplication programmers – interact with system through DML callsSophisticated users – form requests in a database query languageSpecialized users – write specialized database applications that do not fit into the traditional data processing frameworkNaïve users – invoke one of the permanent application programs that have been written previouslyExamples, people accessing database over the web, bank tellers, clerical staff
25Database Administrator Coordinates all the activities of the database systemhas a good understanding of the enterprise’s information resources and needs.Database administrator's duties include:Storage structure and access method definitionSchema and physical organization modificationGranting users authority to access the databaseBacking up dataMonitoring performance and responding to changesDatabase tuning
26Database Architecture The architecture of a database systems is greatly influenced bythe underlying computer system on which the database is running:CentralizedClient-serverParallel (multiple processors and disks)Distributed
27Main inhibitors (costs) of using a DBMS: 4/6/2017When not to use a DBMSMain inhibitors (costs) of using a DBMS:High initial investment and possible need for additional hardware.Overhead for providing generality, security, concurrency control, recovery, and integrity functions.When a DBMS may be unnecessary:If the database and applications are simple, well defined, and not expected to change.If access to data by multiple users is not required.
28When no DBMS may suffice: 4/6/2017When not to use a DBMSWhen no DBMS may suffice:If there are stringent real-time requirements that may not be met because of DBMS overhead.If the database system is not able to handle the complexity of data because of modeling limitationsIf the database users need special operations not supported by the DBMS.Suffice = enoughStringent = strength