CSE3180 Lecture 8 Extended Topics / 1 CSE3180 Extended Topics and Applications.

Lecture 8 Extended Topics / 2 Introduction  As a prelude, there is much development, change and ‘new products’ in the database environment  In this lecture we will cover some of this in broad detail only  It is an environment of much energy - and it is driven largely by the Business sector - and by anticipation  It is an environment of constant change

Lecture 8 Extended Topics / 3 Some Extended Topics  The materials in this lecture will cover a wide scope of databases and their purpose or contribution to the Business world  optimisation concepts  some of the additional functions in SQL  extended scope of access languages and hardware  relationship to changing and / or expanding Business conditions  relational and non-relational data  Open Source DBMSs

Lecture 8 Extended Topics / 4 Introduction • The Relational Data model was first proposed by E.F. Codd in 1970. • It is now the dominant model for commercial database implementations. • It is based on a sound theoretical foundation. • Dr. Codd also developed his 12 Rules, which we will now examine briefly • Notice that he did not ‘create’ SQL, but he did create its foundation

Lecture 8 Extended Topics / 5 Codd’s 12 Rules (plus 1) 1. The Information Rule 2. The Guaranteed Access Rule 3. Systematic Treatment of Nulls 4. Active On-Line Catalog Based on the Relational Model 5. The Comprehensive Data Sub-Language Rule 6. The View Updating Rule

Lecture 8 Extended Topics / 6 Codd’s 12 Rules 7. High-level Insert,Update and Delete 8. Physical Data Independence 9. Logical Data Independence 10. Integrity Independence 11. Distribution Independence 12. The ‘Non-Subversion’ Rule

Lecture 8 Extended Topics / 7 Codd’s Rule 0 “For any system that is advertised, or claimed to be, a RELATIONAL DATABASE MANAGEMENT SYSTEM, that system must be able to manage databases entirely through its relational capabilities” • Codd, E.F., An Evaluation Scheme for Database Management Systems that are claimed to be Relational

Lecture 8 Extended Topics / 8 Rule 1: Information Presentation  All information in a relational database is represented explicitly at the logical level and in only one method - by values in tables  ‘All information’ includes both data and metadata, such as table names, attribute names and domain names  The reference to ‘logical level’ means that physical constructs (pointers and indexes) are not represented and need not be explicitly referenced in query writing, even if they exist

Lecture 8 Extended Topics / 9 Rule 1 cont’d  For user productivity and also to make vendors’ efforts in defining software packages easier and with low, or no, error level  To make the DBA task of maintaining the database in a state of overall integrity both simpler and more effective  Sometimes referred to as the basic principle of the relational model

Lecture 8 Extended Topics / 10 Rule 2 - Guaranteed Access Each and every datum (atomic value) in a relational database is guaranteed to be logically accessible by using the combination of the table name, primary key value, and attribute name. • This specifies a minimal accessibility in terms of content - the names of data and the unique primary key value. No data are to be accessed by artificial paths (linked lists or physical sequential scanning). The relational model deals only with data at a functional or logical level, not physical constructs, and this rule is a consequence of Rule 1.

Lecture 8 Extended Topics / 11 Rule 2 - Guaranteed Access  The Primary Key concept is essential to the relational model. One of the structural features requires each base relation to have an explicitly declared primary key  Every individual scalar value in the database must be logically addressable by specifying the table name, the attribute name, and the primary key value of the containing row.
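The guaranteed access rule can be sketched in a few lines of Python. This is only an illustration: it uses the standard-library sqlite3 module as a stand-in for a relational DBMS (the lecture itself uses Oracle), and the `emp` table, its columns and the helper `fetch_datum` are hypothetical names introduced here.

```python
import sqlite3

def fetch_datum(conn, table, key_column, key_value, attribute):
    """Return one atomic value addressed by table name, primary key value and attribute name."""
    # Identifiers cannot be bound as parameters, so check them against the catalog first.
    columns = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
    if attribute not in columns or key_column not in columns:
        raise ValueError("unknown table or column")
    row = conn.execute(
        f'SELECT "{attribute}" FROM "{table}" WHERE "{key_column}" = ?', (key_value,)
    ).fetchone()
    return None if row is None else row[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, sal REAL)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(7369, "SMITH", 800.0), (7499, "ALLEN", 1600.0)],
)

# Table name + primary key value + attribute name is enough to reach the value.
print(fetch_datum(conn, "emp", "empno", 7499, "ename"))
```

No pointer, record number or scan order appears in the call; the three logical names are the whole address.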

Lecture 8 Extended Topics / 12 Rule 3: Systematic Treatment of Null Values • Null values (distinct from an empty character string or a string of blank characters (white space), and distinct from zero or any other number) are supported in a fully relational DBMS for representing missing information and inapplicable information in a systematic manner (independent of data type) • ‘Nulls not allowed’ can be specified to provide data integrity on primary keys or any other attribute for which nonexistent values are inappropriate (e.g. foreign key attributes)

Lecture 8 Extended Topics / 13 Rule 3 cont’d  The systematic, uniform representation means that only one technique needs to be used to deal with null values.  The treatment of null values must be persistent and be applied at any value change in order to maintain integrity
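A small sketch of what ‘systematic treatment’ means in practice, again using Python's sqlite3 module as a stand-in DBMS. The `student` table and its rows are made up for illustration. Note that this shows the rule as stated; Oracle's VARCHAR2 is a known exception, because it treats an empty string as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, phone TEXT, credits INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, None, None), (2, "", 0), (3, "555-0100", 12)],
)

# NULL is tested with IS NULL; it is distinct from '' and from 0, whatever the data type.
null_phones = conn.execute("SELECT id FROM student WHERE phone IS NULL").fetchall()
empty_phones = conn.execute("SELECT id FROM student WHERE phone = ''").fetchall()
zero_credits = conn.execute("SELECT id FROM student WHERE credits = 0").fetchall()

# A comparison with NULL evaluates to unknown, so no row qualifies.
equals_null = conn.execute("SELECT id FROM student WHERE phone = NULL").fetchall()

print(null_phones, empty_phones, zero_credits, equals_null)
```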

Lecture 8 Extended Topics / 14 Rule 4: Dynamic On-Line Catalog based on the Relational Model The database description is represented at the logical level just like normal data. Authorised users can apply the same relational language for queries of the catalog. • Only one data model is used for both data and metadata and only one manipulation sublanguage is used. • The catalog of definitions would become a data dictionary by including ‘data about data’ appropriate for an application

Lecture 8 Extended Topics / 15 Rule 4 cont’d • Data definitions are stored in only one place given this rule (the schema) • A consequence of this Rule, and Rule 1, is that the distinction between data and metadata is no longer clear. Both can serve as a basis for information to, and inquiry by, a user (e.g. how many times does a specific attribute value occur?)
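To make Rule 4 concrete, the sketch below queries a catalog with the same SELECT syntax used for ordinary data. It assumes SQLite (via Python's sqlite3 module), whose catalog table is called `sqlite_master`; the Oracle equivalents are the data dictionary views covered later in this lecture. The `dept`, `emp` and `emp_names` objects are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER)")
conn.execute("CREATE VIEW emp_names AS SELECT ename FROM emp")

# The catalog is itself a table, so metadata is retrieved with an ordinary query.
catalog = conn.execute(
    "SELECT name, type FROM sqlite_master WHERE type IN ('table', 'view') ORDER BY name"
).fetchall()

for name, kind in catalog:
    print(name, kind)
```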

Lecture 8 Extended Topics / 16 Rule 5: Comprehensive Data Sublanguage • A relational system may support several languages and various modes of terminal use. There must be at least one language • (a) whose statements can be expressed as well-defined character strings, and

Lecture 8 Extended Topics / 17 Rule 5  (b) which is comprehensive in supporting ALL of these items:  1. data definitions  2. view definitions  3. data manipulation (interactive and by program)  4. integrity constraints  5. authorisation  6. transaction boundaries (begin, commit, rollback)

Lecture 8 Extended Topics / 18 Rule 5 • The aim is to create a comprehensive environment in which one language completely envelops all of these tasks. It follows from Rule 4 (in part) since data definitions must be accessible from the manipulation sublanguage

Lecture 8 Extended Topics / 19 Rule 6: View Updating All views that are theoretically updatable are also updatable by the system • A ‘view’ is theoretically updatable if there is an update procedure which, when applied at any time to the base tables of a view, will have the same effect as the requested modification of the view, i.e. the update of the base tables necessary to effect the change in the view must be unambiguously derived by the system. e.g. increasing an extended price column value in a view (a value derived from two base table values) is a function of the view, not the DBMS.

Lecture 8 Extended Topics / 20 Rule 6: View Updating  Updating a product description in a view is reliant on a base table update, and therefore must be supported by this rule. This is an unambiguous interpretation.

Lecture 8 Extended Topics / 21 Rule 7: High Level Insert, Update, Delete The capability of handling a base relation or a derived relation as a single operand applies to the insertion, update and deletion of data as well as to retrieval. The effect is to • give the system scope in optimising the efficiency of its execution time actions. The system can determine • the best sequence in which to execute relational activities • which access paths to exploit to obtain the most efficient code

Lecture 8 Extended Topics / 22 Rule 7: High Level Insert, Update, Delete  - treat all operators as set operators, not row only operators. A set of rows can be deleted in one statement or a set of rows can be modified in a common way in one command The Rule is extremely important in obtaining efficient handling of transactions across a distributed database
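As a minimal sketch of set-level operations, the snippet below changes and removes several rows with one statement each, leaving the choice of access path to the DBMS. It uses Python's sqlite3 module as a stand-in for Oracle, and the `emp` data is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER, sal REAL)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 1000.0), (2, 10, 2000.0), (3, 20, 1500.0), (4, 30, 900.0)],
)

# One statement modifies a whole set of rows in a common way...
updated = conn.execute("UPDATE emp SET sal = sal * 1.5 WHERE deptno = 10").rowcount

# ...and one statement deletes a whole set of rows; no row-at-a-time loop is written.
deleted = conn.execute("DELETE FROM emp WHERE sal < 1000").rowcount

remaining = conn.execute("SELECT empno, sal FROM emp ORDER BY empno").fetchall()
print(updated, deleted, remaining)
```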

Lecture 8 Extended Topics / 23 Rule 8: Physical Data Independence Application programs and terminal activities remain logically unimpaired whenever changes are made in either storage representations or access methods • The DBMS must support a clear, sharp boundary between the logical and semantic aspects, and the physical and performance aspects of the database; application programs must deal with the logical aspects only

Lecture 8 Extended Topics / 24 Rule 8: Physical Data Independence • ‘Any Changes’ implies a pure view of physical data independence, i.e. a query would be written the same way irrespective of whether an index exists on a qualified attribute. Programs in a Network system would change depending on the existence of an index, hashing function, or similar • Rule 8 implies that constructing the optimum retrieval sequence is the function of the DBMS, not the user.
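Physical data independence can be demonstrated by running the same query text before and after an index is created. This is a sketch using Python's sqlite3 module rather than Oracle; the `parts` rows and the index name `parts_supplier_idx` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_id TEXT PRIMARY KEY, part_name TEXT, supplier_id TEXT)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [("CA4180", "Gasket", "Smith"), ("CB2000", "Valve", "Jones"), ("CC3100", "Seal", "Smith")],
)

# The query text is fixed once and never mentions any access path.
QUERY = "SELECT part_id FROM parts WHERE supplier_id = ? ORDER BY part_id"

before = conn.execute(QUERY, ("Smith",)).fetchall()

# A purely physical change: add an index on the qualified attribute.
conn.execute("CREATE INDEX parts_supplier_idx ON parts (supplier_id)")

after = conn.execute(QUERY, ("Smith",)).fetchall()
print(before == after)
```

Only the cost of the query may change; its wording and its result do not.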

Lecture 8 Extended Topics / 25 Rule 9: Logical Data Independence Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind, that theoretically permit unimpairment, are made to the base tables • Rules 8 and 9 permit a database designer to make alterations, to evolve or to correct database definition at any point without redefining or reloading. Provided no data is lost from the restructuring, no application or inquiry activity should require change.

Lecture 8 Extended Topics / 26 Rule 9:Logical Data Independence  It must be possible to preserve prior definitions through views, and these views must be able to be updated (Rule 6) as long as the database restructuring does not lose information.  The physical and logical independence rules permit database designers to avoid heavy penalties for any errors in initial design i.e. iterative improvement of the database structure is achievable
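The idea of preserving a prior definition through a view can be sketched as follows, using Python's sqlite3 module as a stand-in DBMS. The table split shown (`emp` into `emp_core` and `emp_addr`) is a hypothetical restructuring chosen because it loses no information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original design: one wide base table.
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, city TEXT)")
conn.execute("INSERT INTO emp VALUES (1, 'SMITH', 'Perth')")

QUERY = "SELECT ename, city FROM emp WHERE empno = 1"
before = conn.execute(QUERY).fetchall()

# Information-preserving restructuring: split the table in two.
conn.execute("CREATE TABLE emp_core (empno INTEGER PRIMARY KEY, ename TEXT)")
conn.execute("CREATE TABLE emp_addr (empno INTEGER PRIMARY KEY, city TEXT)")
conn.execute("INSERT INTO emp_core SELECT empno, ename FROM emp")
conn.execute("INSERT INTO emp_addr SELECT empno, city FROM emp")
conn.execute("DROP TABLE emp")

# The prior definition survives as a view with the old name.
conn.execute(
    "CREATE VIEW emp AS "
    "SELECT c.empno, c.ename, a.city FROM emp_core c JOIN emp_addr a ON a.empno = c.empno"
)

after = conn.execute(QUERY).fetchall()
print(before == after)
```

The unchanged query still works for retrieval. Updating through such a view is where Rule 6 comes in; SQLite, for instance, would need INSTEAD OF triggers for that, so this sketch covers reads only.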

Lecture 8 Extended Topics / 27 Rule 10: Integrity Independence Integrity constraints specific to a particular relational database must be definable in the relational data sub-language and stored in the catalog (not in the application programs)

Lecture 8 Extended Topics / 28 Rule 10: Integrity Independence • This rule covers the ability to define controls on the values which attributes may assume as part of the database definitions. Such rules may restrict values to be within, or outside, a certain range, to be part of a set of permitted values, to be not null, and to be a value from some other attribute of the database (a non-null foreign key must match some current value from a row of the table with that as a primary key). User defined constraints should be possible (e.g. no more than 5 lines per order).

Lecture 8 Extended Topics / 29 Rule 10: Integrity Independence  The integrity rules must be able to be changed when external conditions change. When changed, violations must be identified and existing programs/inquiries must still be able to process.
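The kinds of constraint listed under Rule 10 (range, not null, foreign key) can be declared in the data definition so that the DBMS, not the application, rejects bad rows. The sketch below assumes SQLite via Python's sqlite3 module; the tables, the salary range and the helper `try_insert` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE emp (
        empno  INTEGER PRIMARY KEY,
        sal    REAL NOT NULL CHECK (sal BETWEEN 0 AND 100000),
        deptno INTEGER REFERENCES dept (deptno)
    )
    """
)
conn.execute("INSERT INTO dept VALUES (10)")

def try_insert(row):
    """Attempt an insert; report whether the stored constraints accepted it."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((1, 500.0, 10)),  # satisfies every constraint
    try_insert((2, -5.0, 10)),   # violates the range check
    try_insert((3, None, 10)),   # violates NOT NULL
    try_insert((4, 500.0, 99)),  # foreign key matches no department
]
print(results)
```

The application code contains no validation logic of its own; every rejection comes from definitions held in the catalog.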

Lecture 8 Extended Topics / 30 Rule 11 Distribution Independence  The data manipulation sublanguage of a relational DBMS must enable application programs and inquiries to remain logically the same whether and whenever data are physically centralised or distributed  Logical unimpairment when data distribution is first introduced and also when (if) the data is redistributed

Lecture 8 Extended Topics / 31 Rule 11 Distribution Independence  Distinguish distributed processing from distributed data. The Rule does not say that a DBMS must support distributed database to be fully relational, but it does say that the data manipulation language would remain the same if and when this capability were introduced

Lecture 8 Extended Topics / 32 Rule 12: Non-Subversion If a relational system has a low level language (single record-at-a-time), that low level cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher level relational language. • All data manipulation languages supported by the relational DBMS must rely only on the stored database definition (including integrity rules and security constraints) for control of processing. This Rule and Rule 5 imply that it should be neither possible nor necessary to access a relational database using any language that bypasses the definition catalog

Lecture 8 Extended Topics / 33 Some More Oracle Aspects  In the next few overheads, we will be looking at  Some Inside Information on the Data Dictionary or the System Catalog (ue)  Query Optimisation

Lecture 8 Extended Topics / 34 Oracle Data Dictionary Oracle maintains information about all tables, views, indexes and many other objects that make up the database in special system maintained tables. (So do other RDBMSs.) These tables make up the Oracle Data Dictionary and are accessible through data dictionary views.

Lecture 8 Extended Topics / 35 Oracle Data Dictionary  Most data dictionary views begin with one of three prefixes:  USER - objects owned by the account performing the query.  ALL - USER objects plus information about objects to which public or account user access has been granted.  DBA - all database objects regardless of owner.

Lecture 8 Extended Topics / 36 Oracle Data Dictionary Some useful data dictionary views
VIEW              DESCRIPTION
DICT              Data dictionary objects
USER_TABLES       User’s own tables
USER_VIEWS        User’s own views
USER_INDEXES      User’s own indexes
USER_TABLESPACES  User accessible tablespaces
USER_CATALOG      Objects owned by user.
ALL_CATALOG       As above + other accessible objects.

Lecture 8 Extended Topics / 37 Oracle Data Dictionary (you should be familiar with this)
SQL> select table_name from user_tables;
TABLE_NAME
------------------------------
BONUS
DEPARTMENT
DEPT
EMP
SALGRADE
STUDENT
6 rows selected.

Lecture 8 Extended Topics / 38 Oracle Data Dictionary
SQL> select table_name,tablespace_name from USER_TABLES;
TABLE_NAME               TABLESPACE_NAME
------------------------ ------------------------------
BONUS                    USER_DATA
DEPARTMENT               USER_DATA
DEPT                     USER_DATA
EMP                      USER_DATA
SALGRADE                 USER_DATA
STUDENT                  USER_DATA
6 rows selected.

Lecture 8 Extended Topics / 39 Oracle Data Dictionary
SQL> select table_name,tablespace_name from ALL_TABLES;
TABLE_NAME                     TABLESPACE_NAME
------------------------------ ---------------------------
DUAL                           SYSTEM
SYSTEM_PRIVILEGE_MAP           SYSTEM
TABLE_PRIVILEGE_MAP            SYSTEM
STMT_AUDIT_OPTION_MAP          SYSTEM
AUDIT_ACTIONS                  SYSTEM
PSTUBTBL                       SYSTEM
DEPT                           USER_DATA
EMP                            USER_DATA
BONUS                          USER_DATA
... and many more !

Lecture 8 Extended Topics / 40 Oracle Data Dictionary
SQL> select index_name,table_name from USER_INDEXES;
INDEX_NAME           TABLE_NAME
-------------------- --------------------
DEPT_INDX            DEPARTMENT
EXP_INDEX            STUDENT
PK_DEPT              DEPT
PK_EMP               EMP
SURNAME_INDEX        STUDENT
SYS_C00589           STUDENT
6 rows selected.

Lecture 8 Extended Topics / 41 Oracle Data Dictionary
SQL> select * from USER_CATALOG;
TABLE_NAME                     TABLE_TYPE
------------------------------ -----------
BONUS                          TABLE
DEPARTMENT                     TABLE
DEPT                           TABLE
EMP                            TABLE
ENROLMENT                      TABLE
SALGRADE                       TABLE
STUDENT                        TABLE
STUDENT_DATA                   VIEW
8 rows selected.

Lecture 8 Extended Topics / 42 Query Costs The “costs” of executing a query are made up of • access cost to files • main memory processing cost • writing and storing any intermediate results • communication costs (distributed system) • writing final results to storage Some aspects : Hashing, Indexing, Clustered index, Secondary key indexing, Sorted file, B+ tree, No Index (Binary search), Linear Search (Unordered file, No Index), etcetera.

Lecture 8 Extended Topics / 43 Optimiser Operation Modes • The Oracle Optimiser has 3 modes of operation. 1. Rule 2. Cost 3. Choose (when a user cannot decide). • The choice of optimiser is normally set up at installation and resides in a parameter file known as init.ora • It can be superseded by a user at query and session level.

Lecture 8 Extended Topics / 44 Modes 1. Rule This software evaluates possible execution paths and ranks the alternative execution paths on syntactical rules (Rule Based Optimiser - RBO) 2. Cost Evaluates the ‘cost’ of available execution paths, and selects the least or lowest relative cost. It needs to be associated with another function called analyse (and this needs to be run frequently) (Cost based optimiser - CBO)

Lecture 8 Extended Topics / 45 Modes 3. Choose Invokes the CBO if the tables have been analysed Invokes the Rule Based optimiser if the tables have not been analysed. Generally regarded as not being a good method. And not available to the ‘normal’ user.

Lecture 8 Extended Topics / 46 Full Table Scan - Table Access Full  Row access is by one of 2 methods  A full table scan  A RowID based table access  So, what are these ?  A full table scan reads each row of a table sequentially. This operation is known as TABLE ACCESS FULL. Oracle reads multiple blocks during each database read. A block is normally 2048 bytes.  A full table scan is used if no WHERE clause occurs in the query

Lecture 8 Extended Topics / 47 Table Access - Full Table Scan  The ‘cost’ of this increases as the size of the table increases - performance drops off.  If there are multiple concurrent users using full table scans on the same table, then performance drops off alarmingly - and the result is unhappy users.  A typical example of a full table scan query would be select * from (tablename) with possibly an extension of order by (attribute).

Lecture 8 Extended Topics / 48 RowID Access  All rows in all tables, including the catalogue tables, have RowID values.  The RowID records the file and block reference and also a sequential number in the block.  Oracle uses indexes to relate data values with RowID values - and this leads to the physical location(s) of data in the database.

Lecture 8 Extended Topics / 49 RowID Access • A ‘typical RowID’ would look like 00001234.0001.0010, in the form block.row.file, meaning block 1234, row 1 in that block, and file 10. • ‘File’ is an Oracle term meaning a storage area to store database data • The RowID given above is in hexadecimal.

Lecture 8 Extended Topics / 50 RowID Access • This is the fastest way to access a Row in a table. • Not many users know the RowID values of data - and even if they did, the RowIDs would change as the table contents altered. • Indexing is used to reach the RowIDs and so improve performance

Lecture 8 Extended Topics / 51 Indexes and Indexing • Oracle has 2 major types of indexes: • Unique : each row contains a unique value for the indexed columns (remember the ‘constraint’ features in the create table command ?) • Non-Unique : the indexed value for rows can repeat (remember the 1:M primary key/foreign key ? And the distinct possibility of a repeating value in many rows - such as PostCode ?)

Lecture 8 Extended Topics / 52 Indexes and Indexing • Consider this schema:
create table parts (
  part_id      varchar2(6) primary key,
  part_name    varchar2(25),
  stocking_qty number(5,0),
  charge_price number(4,2),
  supplier_id  varchar2(12));
The ‘primary key’ constraint causes Oracle to create a unique index. Supplier_id could be non-unique - an index on this would be useful.

Lecture 8 Extended Topics / 53 Indexes and Indexing • This would be :
create index parts$supplier_id on parts(supplier_id) tablespace INDEXES;
There are now 2 indexes on this table, and either (or both) could be used in a query. To use an index, the query must be written to allow the optimiser to use the index - normally via the ‘where’ clause. E.g.
select * from parts where part_id = 'CA4180';

Lecture 8 Extended Topics / 54 Indexes and Indexing • 2 operations take place. • 1. The primary key index will be accessed via an INDEX UNIQUE SCAN. The RowID which matches the part number (CA4180) will be returned from the index • 2. This RowID will be used to locate the row via a Table Access by RowID operation. • If the query had asked only for values contained within the index (here, part_id), the Table Access by RowID would not have been necessary. The data would have been in the index. • By the way, how many rows would have been returned ?

Lecture 8 Extended Topics / 55 Indexes and Indexing • An INDEX RANGE SCAN. • This is used where • a query is over a range of values • or, a query uses a non-unique index • select part_id from parts where part_id like 'C%'; • the ‘where’ clause cannot specify a unique value • This means that the Primary Key index will be accessed by an Index Range Scan operation. • This requires more accesses BUT as the values for part_id are stored in the Primary Key index, the data table (parts) is not accessed.

Lecture 8 Extended Topics / 56 Indexes and Indexing • In this case, the primary key INDEX will be accessed by an Index Range Scan. • If the query was of the form select supplier_id from parts where supplier_id = 'Smith'; then the Index ‘parts$supplier_id’ would be scanned by an Index Range Scan, and Table Access by RowID performed for each occurrence of ‘Smith’. • If 2 columns or a string and a column are concatenated, then indexes on those columns will not be used.
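The difference between a full scan and index-driven access can be observed directly in a query plan. The sketch below is an analogy only: it uses SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module, not Oracle's optimiser, and SQLite's plan wording (SCAN, SEARCH ... USING INDEX) differs from Oracle's operation names. The index is named `parts_supplier_id` here and the helper `plan` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_id TEXT PRIMARY KEY, part_name TEXT, supplier_id TEXT)")
conn.execute("CREATE INDEX parts_supplier_id ON parts (supplier_id)")

def plan(sql):
    """Return the plan detail strings SQLite reports for a statement."""
    # The human-readable detail is the fourth column of each plan row.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# No WHERE clause: every row is read (the analogue of TABLE ACCESS FULL).
full_scan = plan("SELECT * FROM parts")

# Equality on the primary key: the automatically created unique index is used.
by_key = plan("SELECT * FROM parts WHERE part_id = 'CA4180'")

# Equality on the non-unique column: the secondary index is used.
by_supplier = plan("SELECT * FROM parts WHERE supplier_id = 'Smith'")

for label, detail in (("full", full_scan), ("key", by_key), ("supplier", by_supplier)):
    print(label, detail)
```

The exact plan text varies between SQLite versions, so the output is best read for which index name appears rather than compared literally.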

Lecture 8 Extended Topics / 57 Indexes and Indexing  The previous notes led to the decision of an index being used, or not.  The Cost Based Optimiser (CBO) calculates whether the use of an index will lower the ‘cost’ of a query, or not.

Lecture 8 Extended Topics / 58 Indexes and Indexing • If there are 800 distinct values in an attribute set (column) of 1000 rows, then the index selectivity for that set of values is 800/1000 or 0.8. A higher selectivity value means fewer rows are returned for each distinct value. • If the index selectivity is low, the inference is that the Index Range Scan operations and the Table Access by RowID may be more costly than a Table Access Full
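The selectivity arithmetic above is simple enough to write out. In this sketch the function names are made up for illustration, and it shows only the ratio described on the slide, not the cost model the CBO actually applies.

```python
def index_selectivity(distinct_values: int, total_rows: int) -> float:
    """Selectivity as defined above: distinct indexed values divided by row count."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return distinct_values / total_rows

def average_rows_per_value(distinct_values: int, total_rows: int) -> float:
    """Average number of rows an equality lookup on one value would return."""
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    return total_rows / distinct_values

# The slide's example: 800 distinct values in 1000 rows.
print(index_selectivity(800, 1000))       # 0.8
print(average_rows_per_value(800, 1000))  # 1.25

# A low-selectivity column: 4 distinct values in 1000 rows.
print(index_selectivity(4, 1000))         # 0.004
print(average_rows_per_value(4, 1000))    # 250.0
```

With 250 rows per value, an index lookup followed by 250 row fetches may well cost more than reading the table once, which is the inference drawn on the slide.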

Lecture 8 Extended Topics / 59 More Aspects of Optimising  These are some additional aspects :-  Nested loops  Hash join  Subqueries  Update  Outer join  Filter  Database links - remote data  Clusters (caution : cluster stored tables perform poorly compared with data manipulation of non-clustered tables)

Lecture 8 Extended Topics / 60 Some of the 770 DBA Tables DBA_VIEWS USER_CLU_COLUMNS ALL_ERRORS USER_AUDIT_STATEMENT ALL_TABLES USER_AUDIT_TRAIL ALL_OBJECTS USER_CATALOG USER_COLL_TYPES USER_TAB_PRIVS USER_COL_COMMENTS USER_ARGUMENTS USER_COL_PRIVS USER_ALL_TABLES USER_COL_PRIVS_MADE USER_TAB_PRIVS USER_ASSOCIATIONS V$SQL USER_AUDIT_OBJECT V$SQLAREA USER_AUDIT_SESSION V$SHARED_MEMORY USER_VIEWS GV$DISPATCHER

Lecture 8 Extended Topics / 61 Oracle SQL and DB2 (UDB V.7.1 ) The next few overheads are intended to convince you that DBMSs do alter: These are some of the recent changes to Oracle :- Set SQLBlanklines (On or Off. Off is the default) Show SQLBlanklines A modification to the return message from Create or Replace/Alter/Drop Snapshot

Lecture 8 Extended Topics / 62 Oracle SQL and DB2 (UDB V.7.1) New or Modified Commands: Describe (m) Level as set in Set Describe Sql> set describe depth 3 linenum on indent on Sql> describe Recover (m) - performs media recovery on one or more tablespaces, one or more datafiles or the entire database Set (m) This has 4 new clauses - Autorecovery, Describe, Instance, Logsource Show - 6 new clauses. Autorecovery, Describe, Instance, Logsource, Parameters, SGA (system global area)

Lecture 8 Extended Topics / 63 Oracle SQL and DB2 (UDB V.7.1) Shutdown - option of closing and dismounting a database Connect Connect [logon] as Sysoper| Sysdba| Internal logon is of the form username/password @database_specification These commands have been modified: Create type Describe Password Connect

Lecture 8 Extended Topics / 64 Oracle SQL and DB2 (UDB V.7.1) Set • maxdata • closecursor • compatibility • constraint • newpage • loboffset

Lecture 8 Extended Topics / 65 Oracle SQL and DB2 (UDB V.7.1) Variable (bind variables : nchar, nvarchar2, nclob) Show Errors Attribute Exit - allows numeric bind variables to be used. {Exit | Quit} [Success | Failure | Warning | n | variable | :BindVariable] [Commit | Rollback]

Lecture 8 Extended Topics / 66 Oracle SQL and DB2 (UDB V.7.1) These are some of the recent changes to DB2 :- A Net Search Extender is included. Joins between different versions of DB2 are now supported An XML Extender is included Net.Data (Connect Web applications to DB2) DB2 Warehouse Manager (includes Query Patroller, QMF for Windows) OLAP Server Starter kit Spatial Extender - introduces time and distance attributes into business intelligence queries

Lecture 8 Extended Topics / 67 Oracle SQL and DB2 (UDB V.7.1) And in DB2 SQL, these functions are included : Moving average Moving count Moving sum Rank Correlation Stddev Variance CoVariance

Lecture 8 Extended Topics / 68 New SQL Commands The OLAP functions proposed for SQL-99 are
ceiling      percentile_cont  regr_slope
corr         percent_rank     regr_sxx
covar_pop    power            regr_sxy
covar_samp   range            regr_syy
cume_dist    rank             row_number
dense_rank   regr_avg         sqrt
exp          regr_avgx        stddev_pop
floor        regr_avgy        stddev_samp
ln           regr_count       var_pop
moving_avg   regr_intercept   var_samp
moving_sum   regr_r2

Lecture 8 Extended Topics / 69 Oracle SQL and DB2 (UDB V.7.1)
Oracle Data Types      DB2 Data Types
Char(n)                Char(n)
Varchar2(n)            Varchar(n)
Long                   Clob (2 Gb)
Number(p,s)            Numeric(p,s)
Decimal(p,s)           Decimal(p,s)
Integer                Integer
Smallint               Smallint
Raw(n)                 Char(n) for Bit Data
Long Raw               Blob (2 Gb)
Date                   Time stamp
Date (only the date)   Date (MM/DD/YYYY)
Date (only the time)   Time (HH24:Mi:Ss)

Lecture 8 Extended Topics / 70 Some Interesting Aspects An interesting about-face: Work has been done on ‘unconventional’ concurrency models - and Oracle has implemented a non-locking based model (perhaps a cache based database ?). Could all decision support systems work on the same ‘truth’ data ?

Lecture 8 Extended Topics / 71 IBM’s Direction IBM have an ongoing learning optimisation research project (eLiza) which is aimed at  automating adjustments to the configuration parameters  memory space allotment  schemas (and more importantly changes to schemas as these do change over time)

Lecture 8 Extended Topics / 72 Emerging Standards SQL-X is an emerging standard for using SQL together with XML syntax to navigate XML documents and to express XML-related queries SQL offers a much simpler view of data The language is about value-based relationships Data (in many cases) is maintained without value-based relationships

Lecture 8 Extended Topics / 73 Emerging Standards • XML is widely used for web based database applications • It is a standard for ‘describing’ data in data exchange. • It ‘embeds’ information about the text in a text message • XML code can be reused • The World Wide Web Consortium (W3C) completed XML’s definition in 1998 • It is a ‘language about languages’ • It uses ‘embedded’ tags for its ‘intelligence’ • XQuery runs queries against XML-tagged documents

Lecture 8 Extended Topics / 74 A Brief History of I.T. trends • Move from centralised computing to distributed or decentralised computing • Business process re-engineering • Rapid advance, development and establishment of database technology • Advanced systems in Enterprise Resource Planning, Customer Relationship Management • Expansion and use of the World Wide Web • Internet capabilities

Lecture 8 Extended Topics / 75 Business - Analytical Applications  Growth and Expansion of Financial Analytic Applications - 21st Century Focus  Why :  Costs and Cost Management and Containment  Profits and Profit Management  Enterprise, Corporate, Business Management  Regulation Compliance - e.g. Sarbanes-Oxley Act

Lecture 8 Extended Topics / 76 Pressures ?  Client/Server computing  Distributed Computing  New generation of users + their requirements  Intelligent Data  Data Management  and more Data and more Data Management

Lecture 8 Extended Topics / 77 A solution ?  Virtualisation  Addresses the problems of the rapid development of databases.  Resulting in a heterogeneous array of systems  A barrier to Business from exploiting or gaining full values from their information sources

Lecture 8 Extended Topics / 78 Data Federation  To unite - On a common basis - For a common objective  Do these qualify ?  Law Enforcement Agencies  Airline Industry  Healthcare Providers  Retailers  Manufacturers  Suppliers  Insurance Agencies  Government Agencies

Lecture 8 Extended Topics / 79 Data Federation  The concept of Information as a ‘shared resource’.  Insurers can improve satisfaction levels and reduce costs  doctors, health agents, hospitals with Web access  Required data is held in ‘older’ systems - legacy systems ?  New IT systems - business intelligence, enterprise portals, e-commerce  Critical to competitive positioning, cost efficiencies, operational performance monitoring

Lecture 8 Extended Topics / 80 Data Federation • Can the gap be overcome ? • IBM’s product - Classic Federation • provides means of access to mainframe non-relational and relational databases and files • employs ODBC and JDBC client tools and applications • ‘Fits seamlessly into existing mainframe infrastructures, reporting tools and application environments’

Lecture 8 Extended Topics / 81 Data Federation  Standard SQL commands - Select, Insert, Update, Delete  Business ‘able to tap into’ multivendor legacy systems - DB2, IMS, VSAM, Adabas, CA-Datacom, CA-IDMS  How ?  DBll Classic Federation maps logical relational table and view structures over existing physical databases  Unix, Linux and mainframe tools access this data using the SQL commands

Lecture 8 Extended Topics / 82 Data Federation  ‘Classic Federation’ generates native data access commands for each database and file type  JDBC Client provides SQL developers with  WebSphere Studio (mainframe operational data to customer Web site)  WebSphere Portal (access to mainframe payroll, policy, accounting, claims data)  WebSphere Business Integrator

Lecture 8 Extended Topics / 83 Data Federation  Oracle : Real Application Clusters (9i and 10g)  Shared disk approach - ‘unified view’  Transaction processing applications  current trends in storage networks  ‘grid’ computing with ‘blade’ servers (attachable software)

Lecture 8 Extended Topics / 84 Data Federation  Oracle’s policy ?  “ virtualisation enables each component of the grid to react to changing circumstances more quickly and to adapt to component failures without compromising performance of the system as a whole”. (Brajesh Goyal)  Also interested in Linux and Intel based hardware

Lecture 8 Extended Topics / 85 Grid Computing  Grid computing is based on the concept of networked computing resources  And managed such that they can be quickly and efficiently re-allocated for use by different departments, applications, and users.  It embraces high speed networking technologies, advances in clustering and storage technologies

Lecture 8 Extended Topics / 86 Grid Computing  It also embraces automation of system administration  And the adoption of industry standard technologies storage  Allows ‘customers’ to provide cost efficient supply, access, management and sharing of computing and storage

Lecture 8 Extended Topics / 87 Associated Technologies  Data mining - the automated extraction of hidden predictive information from databases  It allows users to analyse large databases to solve business decision problems  It is not a business solution - it is a technology  Data Warehouse - A repository which stores integrated information for efficient querying and analysis.  This information may come from different sources.  It is translated into a common data model and integrated with existing data in the data warehouse.

Lecture 8 Extended Topics / 88 Future Directions  Nanotechnology - smaller, faster, mobile, more efficient  Mobile services will continue to become  smaller  faster  and embedded in many objects we touch

Lecture 8 Extended Topics / 89 Future Directions  They will enable  real-time interaction with customers  participation in collaborative projects  access to a global network of intelligence  And the distinction between communication and computing will become imprecise

Lecture 8 Extended Topics / 90 Future Directions  Molecular Memory  A means of ‘cramming’ more data into a memory cell  Molecular wires - nanotechnology  Molecular wires - parcels of charge around a molecule  A grid of wires, each about 2 nanometres in diameter  A nanometre is one millionth of a millimetre (roughly 10 carbon atoms long)

Lecture 8 Extended Topics / 91 Holographic Memory - What’s That ? • It could be the replacement for hard disks • Devices which use light (photo-optic) to store and read data • Compact disks (CD) - 783 megabytes (soon 1.3 GB) • DVD (Digital Versatile Disks) - 15.9 Gigabytes • Data is stored as bits (binary digits) - and on the surface of the recording media

Lecture 8 Extended Topics / 92 Holographic Memory • New optical storage research is focused on 3D storage - to use the volume of the storage media - not just the surface area. • Possibility of storing a terabyte (10^12 bytes) of data in a sugar-cube-size crystal - 1,000 gigabytes • The data on 1,000 CDs could fit onto a holographic memory system

Lecture 8 Extended Topics / 93 Holographic Memory  Current PC hard disk drives hold about 80/120 Gigabytes - which is considerably smaller capacity than 1,000 Gigabytes  Have you seen any advertising for an HDSS (desktop holographic storage system) ?  Data transfer rate at 40 Megabytes per second

Lecture 8 Extended Topics / 94 In the Future ? • Not to be outdone, Microsoft has signalled that it intends to remove the divide between ‘High Performance Computing’ and ‘Personal Computing’. • This probably means that Microsoft will focus on Windows clustering, and exploiting Web services for large-scale, federated and distributed processing

Lecture 8 Extended Topics / 95 In the Future ? Open Source Database Management Systems MySQL + SAP = MaxDB PostgreSQL 7.4 Relational Database. 64 bit processor

Lecture 8 Extended Topics / 96 In the Future ?  Database replaced traditional ‘file keeping and management’  Will Data Warehousing eventually replace existing ‘databases’ and database technology ?  Will analytical tools (ERP, CRM, SCM, BOM …) eventually be the ‘core’ processes of databases ?  Will ‘Grid’ computing be the next wave of user access capability  And then ? ? ? ?  How will the ‘communications’ load be met or supported ?
