Presentation on theme: "Agricultural to Industrial to Information Age"— Presentation transcript:
1. Agricultural to Industrial to Information Age
- Data: bits and bytes, e.g.
- Information: organized and presented in a form suitable for decision making, e.g. (518)
- Knowledge
Ref. Alvin Toffler (1980)
Data is raw, unsummarized, and unanalyzed facts. We are data rich and information poor.
2. Desirable Attributes of Information
- Shareable: readily accessed by more than one person at a time
- Transportable: easily moved to a decision maker
- Secure: protected from unauthorized use and destruction
- Accurate: reliable, precise records
- Timely: current and up to date
- Relevant: appropriate to the decision
information: shareable, transportable, NOT secure. Accurate, timely, and relevant? Depends on the situation.
3. Where do companies get information from?
- They buy it: consultants, publications, news services, etc.
- They generate it:
  - Computer systems (programs process data stored in databases)
  - Employees (apply experience and intelligence)
Who needs it?
- Higher-level managers need less detailed information, e.g. I need your names, my chair needs only my class size, and the president needs only growth trends.
How is it stored?
- Tape or disk
4. Where do we store Intangible Assets -- Information?
- In people’s heads
- On paper
- In card files
- In computers
Computers are like intricate card files with better indexing.
Information is the content, and technology is the delivery mechanism. E.g. imagine a water utility company that built the best pipes, taps, and dams, but when the customer turns on the tap, brown sludge comes out.
5. Entities, Attributes, and Relationships
- Entity: a person, place, thing, or event
- Attribute: a property of an entity. For the entity “Person,” attributes could include eye color and height.
- Relationship: an association between entities. Publishers are related to the books they publish, and a book is related to its publisher.
- A weak or dependent entity (child) is existence-dependent on some other entity type (strong or parent). E.g. next of kin is a weak entity (it requires an employee).
- Attributes can be simple or composite (e.g. address).
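The weak-entity idea above can be sketched with Python's built-in sqlite3 module. This is only an illustration, not part of the original slides: the table names, column names, and sample rows are assumptions. A next-of-kin row cannot exist without its employee, and it disappears when the employee is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE employee (              -- strong (parent) entity
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    street      TEXT,                -- composite attribute "address" split into parts
    city        TEXT
);
CREATE TABLE next_of_kin (           -- weak (child) entity
    employee_id INTEGER NOT NULL
        REFERENCES employee(employee_id) ON DELETE CASCADE,
    kin_name    TEXT NOT NULL,
    PRIMARY KEY (employee_id, kin_name)
);
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', '1 Main St', 'Albany')")
conn.execute("INSERT INTO next_of_kin VALUES (1, 'Bob')")

def add_orphan_kin():
    """Try to store a next of kin for an employee that does not exist."""
    try:
        conn.execute("INSERT INTO next_of_kin VALUES (99, 'Nobody')")
        return True
    except sqlite3.IntegrityError:
        return False

orphan_accepted = add_orphan_kin()

# Deleting the parent removes the dependent rows as well.
conn.execute("DELETE FROM employee WHERE employee_id = 1")
kin_left = conn.execute("SELECT COUNT(*) FROM next_of_kin").fetchone()[0]
```

The composite key (employee_id, kin_name) reflects that a weak entity is identified only in combination with its parent.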
6. Terminology
- Field: an attribute
- Domain: description of allowed values for an attribute
- Record: a logically connected set of one or more fields
- File: a collection of records
A field is some characteristic of the real-world object that is being modeled. A domain is the pool of potential values. The concept of a domain is actually too complex for most systems to support.
7. History of Data Processing
- Manual record-keeping: high labor costs and human errors
- Data file: stores information on a single entity and the attributes of that entity
- Database: a structure that can store information about multiple types of entities, the attributes of these entities, and the relationships among the entities
Manual record keeping works well if you only have to store and retrieve items, but breaks down if you want to cross-reference information.
File-based systems are more efficient but take a decentralized approach, so a lot of data is duplicated. Each application program defines the physical structure and storage of the data file and records.
8. Limitations of File-Based Systems
1) Separation and isolation of data: data is difficult to access (need to synchronize the processing of two or more files).
2) Duplication of data: duplication wastes time and money on data entry and takes up computer storage space. It can also lead to inconsistencies.
3) Data dependence: for example, to change the size of a field in a file, you need to write a new program to replace the original, and you also need to identify every other program that accesses that file and modify it, even if it does not use that particular field.
4) Incompatibility of files: the structure of a file generated by a COBOL program differs from that of a file generated by a C program, so the two files cannot be processed jointly.
5) Fixed queries / proliferation of application programs / pressure on DP staff: all queries and reports have to be written by an application programmer, so there are no ad hoc queries.
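Data dependence (point 3) can be made concrete with a small sketch. The record layout below is hypothetical: a fixed-width record with a name field and a quantity field, read by a program that only cares about the quantity. Widening the name field still breaks that program, because the layout is baked into the code.

```python
import struct

# Hypothetical fixed-width layouts: a name field followed by a 4-byte quantity.
OLD_LAYOUT = struct.Struct("<10sI")   # 10-byte name
NEW_LAYOUT = struct.Struct("<20sI")   # name widened to 20 bytes

def read_quantity_old(record: bytes) -> int:
    """A program written against the old layout; it never uses the name."""
    _name, quantity = OLD_LAYOUT.unpack(record)
    return quantity

old_record = OLD_LAYOUT.pack(b"widget", 7)
new_record = NEW_LAYOUT.pack(b"widget", 7)

quantity = read_quantity_old(old_record)   # works on the old file format

try:
    read_quantity_old(new_record)          # same program, resized field
    broke = False
except struct.error:
    broke = True
```

Every program that opens the file carries its own copy of the layout, which is why one field change forces changes across all of them.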
9. Database
A self-describing collection of integrated records.
Properties of a database:
- It represents some aspect of the real world
- It is a logically coherent collection of data with some inherent meaning
- It is designed, built, and populated with data for a specific purpose
- It has users and applications
A shared corporate resource that has minimum duplication.
The description of the data is called a system catalog, data dictionary, or metadata. This provides program-data independence: if we add a field or create a file, existing applications are unaffected, but if we delete a field from a file that an application program uses, then that program must be modified.
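As a rough illustration of "self-describing", SQLite keeps its own catalog in a table called sqlite_master and exposes column descriptions through PRAGMA table_info. The customer table and its columns here are made up for the example. The sketch also shows that adding a field leaves an existing query untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")

# The database describes itself: table names come from the catalog ...
tables = [row[0] for row in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]

# ... and so do column names and declared types (fields 1 and 2 of each row).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

# An "existing application" query, run before and after a field is added.
conn.execute("INSERT INTO customer (name) VALUES ('Ann')")
before = conn.execute("SELECT name FROM customer").fetchall()
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
after = conn.execute("SELECT name FROM customer").fetchall()
```

Because the query names the fields it needs, the new phone column does not affect it. Dropping the name column, by contrast, would force the query to change.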
10. Spreadsheet or Database?
- Data size: a spreadsheet (SS) can handle about 5000 records; a database (DB) can effectively handle up to a million.
- Data storage format: a DB can open, use, and store files that are pictures, graphs, audio, video, spreadsheets, and Word documents.
- Data structure (extent to which relationships among data items are fixed): if relationships are static (the data is always used in the same way) an SS may be OK, but dynamic queries need a DB.
- Data sharing: the number of users who have access to the data. If it is low (e.g. only the accounting department), use an SS, but if accounting, sales, and inventory all query the data, then it remains undisturbed in a DB.
- Data control (degree of data input editing and validating): high control in a DB (e.g. sales data as opposed to forecasting estimates).
12. DBMS
A software system that:
- Enables users to define, create, and maintain the database
- Provides controlled access to this database
File processing systems:
- Support a limited schema through the creation of directory structures for files
- Do not support a query language
- Do not guarantee against data loss if the data is not backed up
- Do not support efficient access to data items whose location in a particular file is not known
- When they allow concurrent access, do not prevent two users from modifying the same file at the same time
13. DBMS components
Machine (Hardware, Software), Data, Human (Procedures, People)
- Hardware: PC, mainframe, or network of computers (e.g. client-server architecture).
- Software: the DBMS software itself along with application programs, the operating system, and network software (if it is used over a network).
- Data: bridges the machine and human components.
- Procedures: instructions on how to log on to the DBMS, how to start and stop it, how to make backup copies, how to handle hardware and software problems, etc.
14. Data Life Cycle
- Data acquisition: data modeling and populating, with the ultimate goal of storing data
- Data use: combines data that has been previously stored and interprets output in a decision-making context (data warehousing)
In Access, Tables deal with the first stage of the life cycle (data acquisition), and Queries, Forms, and Reports deal with the second stage (data use).
15. Data acquisition
- Logical database design: E/R diagrams, normalization, database models
- Physical database design: integrity constraints, indexes, denormalization
- Populating the database: data entry, import, download
- Update records: data dictionary, metadata
Data modeling and design are the responsibility of the IT group, whereas populating the database is the responsibility of the functional manager(s) who use the data.
16. Data Use
- Define view: query design, DDL (SQL or QBE)
- Retrieve data: query performance and optimization, concurrency controls
- Manipulate data: sort, aggregate, classify, analyze
- Present results: reports, forms
Query design and processing is the province of the IT professionals, whereas manipulation and presentation is the purview of the end user.
In Access, Queries deal with the first three and Reports and Forms deal with the last. Forms are primarily used for data entry and Reports for printouts.
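The four data-use steps can be walked through in a few lines with Python's sqlite3 module. The sale table, its rows, and the view name are invented for the illustration. The view is the "define" step, the SELECT is "retrieve", the aggregation and sort are "manipulate", and the formatted string stands in for a report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale (region TEXT, amount REAL);
INSERT INTO sale VALUES ('East', 100), ('West', 250), ('East', 50);

-- Define view: the aggregation (manipulate) is stored as part of the view
CREATE VIEW sales_by_region AS
    SELECT region, SUM(amount) AS total FROM sale GROUP BY region;
""")

# Retrieve and sort
rows = conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY total DESC"
).fetchall()

# Present results: a plain-text stand-in for an Access report
report = "\n".join(f"{region:<6}{total:>8.2f}" for region, total in rows)
print(report)
```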
17. Access Database Objects
- Tables: store data as records
- Queries: answer questions about the database
- Forms: present data using a customized layout
- Reports: format the data (primarily for printouts)
- Macros: used to automate repetitive tasks
- Modules
- Pages
Tables contain the permanent data. Query datasheets are temporary. Modules require programming in Basic.
Note: Access automatically saves the active database periodically and when you close it, so do not remove the diskette from the drive while the database is open.
The SAVE button saves the design of the table, query, form, or report.
Although in Word you can undo the last 100 changes, in Access you can only undo the last one.
Access saves changes in the current record as soon as you move to the next record or when you close the table.
18. Users
- Administrators: Data Administrator (DA), Database Administrator (DBA)
- Database designers: conceptual and logical design (WHAT?), physical design (HOW?)
- Application programmers
- End users: naïve (e.g. checkout assistant), sophisticated
The DA manages data and consults with senior managers to ensure that the database supports corporate objectives.
The DBA is more technically oriented, is in charge of security and integrity control, and ensures satisfactory performance for the application and the users.
The importance of the corporate resources is reflected in the allocation of teams of staff to each role (in some organizations there is no distinction between the DA and the DBA).
The logical database designer is concerned with the data, the relations between the data, and the constraints on the data (must know the business rules).
19. Everyday Database Systems
- Supermarket
- Credit card
- Travel agent
- Insurance
- Library
- University
1) When you purchase goods, the barcode reader finds the price of each item in a products database and reduces the number of items in stock.
2) When you use your credit card to purchase goods, a card reader linked to a computer system checks the database to see whether the price of the goods you wish to buy, together with the sum of the purchases you have already made this month, is within your credit limit. The program also checks that the card is not on the list of lost or stolen cards before authorizing the purchase. When your purchase is confirmed, the details of the purchase are added to the database.
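The credit card check in point 2 can be sketched as a small function. This is a simplified, in-memory illustration rather than how a real card network works. The Account class, the authorize function, and the sample card numbers are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Account:
    credit_limit: float
    spent_this_month: float = 0.0

def authorize(accounts, blocked_cards, card_no, price):
    """Approve a purchase only if the card is valid and the limit is respected."""
    if card_no in blocked_cards or card_no not in accounts:
        return False                      # lost, stolen, or unknown card
    account = accounts[card_no]
    if account.spent_this_month + price > account.credit_limit:
        return False                      # would exceed the credit limit
    account.spent_this_month += price     # record the confirmed purchase
    return True

accounts = {"1111": Account(credit_limit=500.0), "2222": Account(credit_limit=500.0)}
blocked = {"2222"}

first = authorize(accounts, blocked, "1111", 300.0)    # within the limit
second = authorize(accounts, blocked, "1111", 300.0)   # 600 in total, over the limit
stolen = authorize(accounts, blocked, "2222", 10.0)    # card is on the blocked list
```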
20. Applications of DBMS
- Airline reservations systems
  - Reservations (customer name, assigned seat)
  - Flights (airports, arrivals and departures)
  - Tickets (prices, requirements, availability)
- Banking systems
  - Customers (names, addresses, accounts, loans)
- Corporate records
  - Accounts (payable, receivable)
  - Employees (names, addresses, salary, benefits)
Airline:
- Queries: which seats are available and at what prices
- Modifications: book a flight, assign a seat, indicate a meal preference
- Protect against data loss if the system fails; protect against two agents assigning the same seat
Banking:
- Queries: account balance
- Modifications: deposit to account
- If money has been ejected from an ATM, the bank must record the debit even if the power fails immediately afterwards
Corporate:
- Queries: printing of weekly paychecks
- Modifications: employees fired or hired
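One way a DBMS can protect against two agents assigning the same seat is a uniqueness constraint combined with a transaction. The sketch below uses Python's sqlite3 module; the seat_assignment table, the assign_seat helper, and the flight and passenger values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE seat_assignment (
    flight    TEXT,
    seat      TEXT,
    passenger TEXT NOT NULL,
    PRIMARY KEY (flight, seat)        -- one passenger per seat per flight
)""")

def assign_seat(flight, seat, passenger):
    """Return True if the seat was free and is now assigned, False otherwise."""
    try:
        with conn:  # commits on success, rolls back if an error is raised
            conn.execute(
                "INSERT INTO seat_assignment VALUES (?, ?, ?)",
                (flight, seat, passenger),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first = assign_seat("AA100", "12A", "Ann")   # first agent gets the seat
second = assign_seat("AA100", "12A", "Bob")  # second agent is refused
holder = conn.execute(
    "SELECT passenger FROM seat_assignment WHERE flight = 'AA100' AND seat = '12A'"
).fetchone()[0]
```

The rule lives in the database rather than in each agent's program, so every application that books seats is held to it.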
21. Creating a Table in Access
- Datasheet view: to add, delete, or edit records
- Design view: to define the table initially and specify its fields
In Word or Excel you can have several documents open at a time, but Access only allows you to open one database at a time.
Table names must start with a letter and cannot contain spaces. Field names can contain spaces (but not at the start).
Given the choice between one wide field and several narrow ones, the latter is more flexible. (It is easier to concatenate fields than to separate one large field into several sub-fields.)
Leave extra room for growth in numeric data items.
Do not store calculated data on a record (use dates rather than numbers to represent time duration).
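Two of the design tips above can be illustrated in a few lines of Python. The function names and sample values are hypothetical. Narrow fields are trivial to join, whereas one wide field has to be split by guesswork. A duration can always be derived from two stored dates, so it need not be stored itself.

```python
from datetime import date

def full_name(first: str, last: str) -> str:
    """Concatenating narrow fields is straightforward."""
    return f"{first} {last}"

def rental_days(start: date, end: date) -> int:
    """Derive the duration from stored dates instead of storing it."""
    return (end - start).days

name = full_name("Ann", "Lee")
days = rental_days(date(2000, 1, 1), date(2000, 1, 15))

# Splitting one wide field back into parts is guesswork: a simple split
# cannot tell which words belong to the first name and which to the surname.
parts = "Mary Ann Van Dyke".split()
```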
22. Custom Tables
- Validation rules
- Input masks
- Default values
- Lookup fields
- Format
To set a validation rule linking two fields in a table, use table properties.
Input masks ease data entry by providing formatting.
The Lookup wizard can be used to look up values from a field in another table.
Use > (<) in Format to convert all letters to uppercase (lowercase).
Can state criteria for advanced filters.
Tools->Analyze->Documenter builds a data dictionary for each object in the database.
23. Advantages of Database Processing
- Getting more information from the same amount of data: when all the data for various systems is stored in a single database, the information becomes available, and retrieving it can be quick and easy.
Combine budget and resources to create important applications for the whole organization.
Information is quickly and easily accessible, since it is integrated.
24. Advantages of Database Processing
- Sharing of data: several users can have access to the same piece of data (concurrency control allows shared access).
- Balancing conflicting requirements: a person or group, often called Database Administration/Administrator (DBA), can structure the database in such a way that it benefits the entire organization, not just a single group.
New applications can be written for the same data, with the DBA balancing users’ conflicting requirements.
25. Advantages of Database Processing
- Controlling redundancy: not only saves space, but also makes the updating process easier.
- Consistency: inconsistency is a direct result of redundancy, so by reducing redundancy there is much less potential for this sort of inconsistency with the database approach.
Reducing redundancy improves consistency.
E.g. in 1805, Austria, Britain, Russia, and Sweden were at war with Napoleon. The Austrian and Russian commanders collaborated on a combined attack on the French army. They agreed to meet on Oct 20th, but the Austrians were using the Gregorian calendar whereas the Russians were still using the Julian calendar, which was 12 days behind, and so the Austrians got wiped out. (However, Napoleon's fleet was defeated by Nelson at the Battle of Trafalgar on Oct 21st, and so he did not invade Britain.)
A Mars probe was lost in Sep 1999 due to a mix-up between imperial and metric units.
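The link between redundancy and inconsistency can be shown with plain Python dictionaries standing in for departmental files. The customer record and the department names are made up for the illustration.

```python
# Redundant storage: billing and shipping each keep their own copy.
billing = {"C1": {"name": "Ann", "city": "Albany"}}
shipping = {"C1": {"name": "Ann", "city": "Albany"}}

billing["C1"]["city"] = "Boston"          # only one copy gets updated
inconsistent = billing["C1"]["city"] != shipping["C1"]["city"]

# Shared storage: both departments read the same single record.
customers = {"C1": {"name": "Ann", "city": "Albany"}}
billing_view = customers
shipping_view = customers

customers["C1"]["city"] = "Boston"        # one update, seen by everyone
consistent = billing_view["C1"]["city"] == shipping_view["C1"]["city"]
```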
26. Advantages of Database Processing
- Integrity: an integrity constraint is a rule that must be followed by data in the database. Example: not allowing a person’s age to be lower than zero.
- Security: the prevention of access to the database by unauthorized users.
Recovery control restores the data to a previous consistent state after a hardware or software failure.
According to European legislation, it is a criminal offense if data is not accurate or up to date.
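The age example can be expressed as a CHECK constraint. A minimal sketch using Python's sqlite3 module follows. The person table and its rows are assumptions for the illustration, and other DBMSs (including Access validation rules) express the same idea with their own syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE person (
    name TEXT NOT NULL,
    age  INTEGER NOT NULL CHECK (age >= 0)   -- integrity constraint
)""")
conn.execute("INSERT INTO person VALUES ('Ann', 34)")

try:
    conn.execute("INSERT INTO person VALUES ('Bob', -5)")  # violates the rule
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```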
27. Advantages of Database Processing
- Increasing productivity: a good DBMS comes with many features that allow users to gain access to data without having to do any programming at all.
- Data independence: a property that allows the structure of a database to be changed without the programs that access the database having to change.
Information is quickly and easily accessible, since it is integrated.
Reduced program maintenance for programmers due to data independence; improved data accessibility for end users.
28. Disadvantages of Database Processing
- DBMS size: DBMSs are large programs that occupy a large amount of disk space as well as internal memory.
- DBMS complexity: the complexity and breadth of the functions provided by a DBMS make it a complex product to use.
Mainframe multi-user projects are complex, with recurrent annual maintenance costs. There are also the costs of converting legacy systems (hardware, training, etc.).
A file-based system is written for a particular application, so it performs well, but a DBMS is more general, so it is not as fast.
29. Disadvantages of Database Processing
- Greater impact of a failure: a failure on the part of any one user that damages the database in some way may affect all the other users on the system.
- More difficult recovery: if the database is being updated by a large number of users, all updates made since the backup used for the restoration must be redone.
Greater vulnerability to multiple users as well as to poor design.
30. When can an organization justify a database?
- Application needs are constantly changing
- Rapid access is required for ad hoc queries
- Need to reduce long lead times and high development costs for new systems
- Data elements are shared by users
- Need to communicate and relate data across functional and departmental boundaries
- Need to improve quality of data resources and control access to them
- Uncertainty as to important data elements and expected volume
- Substantial dedicated programming assistance is not available
- Need to improve quality and consistency of data resources
31. History of DBMS
- IBM developed the Generalized Update Access Method (GUAM) in 1964 for North American Rockwell, the prime contractor for the APOLLO project.
- GUAM was made available to the general public under the name Data Language/I (DL/I) in 1966.
32. History of DBMS
- DL/I became the data management component for the Information Management System (IMS), which was the dominant DBMS for many years.
- In the mid-1960s, General Electric developed Integrated Data Store (I-D-S).
Enterprises nowadays maintain two distinct databases, one containing operational or production (day-to-day) data and another containing decision support data (summary data that is extracted periodically from the operational database).
33. History of DBMS
- First generation: hierarchical and network models
- Second generation: relational models
- Third generation: object-oriented models
34. Data Models
- Record based: Hierarchical (60’s), Network (70’s), Relational (80’s)
- Object based: Entity-Relationship (70’s), Semantic data models (80’s), Object-oriented (90’s)
Relational systems specify what data is to be retrieved. Network and hierarchical systems specify how the data is to be retrieved.
Even the most complicated hierarchical and network models can be represented using a relational model (2-dimensional tables).
Forms the basis for future object-oriented databases.
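The "what" versus "how" distinction can be seen by answering one question twice. In the sketch below the sample data and function names are invented. The first function walks a nested structure step by step, roughly as a navigational (hierarchical or network) program would. The second states only the condition and leaves the access path to the DBMS.

```python
import sqlite3

# "How": the program navigates parent-to-child links itself.
departments = [
    {"name": "Sales", "employees": [{"name": "Ann", "salary": 50000},
                                    {"name": "Bob", "salary": 70000}]},
    {"name": "IT", "employees": [{"name": "Cy", "salary": 80000}]},
]

def high_earners_navigational(min_salary):
    found = []
    for dept in departments:              # visit each parent record
        for emp in dept["employees"]:     # then each of its children
            if emp["salary"] >= min_salary:
                found.append(emp["name"])
    return sorted(found)

# "What": the query describes the result; the DBMS decides how to fetch it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ann", "Sales", 50000), ("Bob", "Sales", 70000), ("Cy", "IT", 80000)],
)

def high_earners_relational(min_salary):
    rows = conn.execute(
        "SELECT name FROM employee WHERE salary >= ? ORDER BY name", (min_salary,)
    )
    return [row[0] for row in rows]
```

Both calls with a threshold of 60000 return the same names; only the relational version is independent of how the records happen to be linked.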
35. Record-Based Data Models
- Hierarchical: parent-child relationships with only one parent (many-to-many, M:N, relationships are not supported)
- Network: extends the hierarchical model by allowing multiple parents; associations are created via pointers
- Relational
Hierarchical models evolved from tape systems, are suited for sequential processing, and are generally not implemented on PCs. Their DML is more difficult than SQL or QBE.
Relational DBMS is the dominant data processing software in use today.
36. Hierarchical Model
- Perceived by the user as a collection of hierarchies, or trees
- More restrictive structure than a network model
- GUAM, DL/I, and IMS are examples of DBMSs that conform to the hierarchical model
Represents data as composed of a hierarchy of data records: tree structures (parent-child relationships, with only one parent record type allowed for any record type).
37. Network Model
- Perceived by the user as a collection of record types and relationships between these record types
- I-D-S is an example of a DBMS that conforms to the network data model
Represents data as records, with relationships represented by sets (records appear as nodes and sets as edges in the graph).
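The difference between the two models comes down to how many parents a record may have. The following is a toy sketch, not how IMS or I-D-S store data. The Record class, the link helper, and the supplier, project, and part records are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Record:
    name: str
    owners: list = field(default_factory=list)    # parent records
    members: list = field(default_factory=list)   # child records

def link(owner, member, single_parent=False):
    """Connect owner to member; optionally enforce the hierarchical rule."""
    if single_parent and member.owners:
        raise ValueError(f"{member.name} already has a parent")
    owner.members.append(member)   # pointer from parent to child
    member.owners.append(owner)    # pointer from child back to parent

supplier, project = Record("Supplier"), Record("Project")

# Network model: a part may be owned by both a supplier and a project.
network_part = Record("Part")
link(supplier, network_part)
link(project, network_part)

# Hierarchical model: a second parent is refused.
tree_part = Record("Part")
link(supplier, tree_part, single_parent=True)
try:
    link(project, tree_part, single_parent=True)
    second_parent_allowed = True
except ValueError:
    second_parent_allowed = False
```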
38. Assignment 1
MS Access 2000, pages AC 2.34-2.36, #1-16. Database should have at least 4 entities.