Presentation is loading. Please wait.

Presentation is loading. Please wait.

Slide 8-1 Copyright © 2003 Pearson Education, Inc. Overzicht Informatica College 9 – November 1 Computer Science an overview EDITION 7 J. Glenn Brookshear.

Similar presentations


Presentation on theme: "Slide 8-1 Copyright © 2003 Pearson Education, Inc. Overzicht Informatica College 9 – November 1 Computer Science an overview EDITION 7 J. Glenn Brookshear."— Presentation transcript:

1 Slide 8-1 Copyright © 2003 Pearson Education, Inc. Overzicht Informatica College 9 – November 1 Computer Science an overview EDITION 7 J. Glenn Brookshear

2 Slide 8-2 Copyright © 2003 Pearson Education, Inc. C H A P T E R 8 (now chap. 9, 2 nd part) File Structures Abstractions of the actual data organization on mass storage Again: differences between conceptual and actual data organization

3 Slide 8-3 Copyright © 2003 Pearson Education, Inc. 8.1: Files, Directories & the Operating System OS storage structure: –conceptual hierarchy of directories and files directory tree files

4 Slide 8-4 Copyright © 2003 Pearson Education, Inc. 8.1: Files: Conceptual vs. Actual View View at OS-level is conceptual –actual storage may differ significantly!

5 Slide 8-5 Copyright © 2003 Pearson Education, Inc. 8.2: Sequential Files To ‘remember’ where data resides on disk, the OS maintains a list of sectors for each file Result: sequential view of scattered set of data

6 Slide 8-6 Copyright © 2003 Pearson Education, Inc. 8.2: Text Files Sequential file consisting of long string of encoded characters ( e.g. ASCII-code ) –But: character-string still interpreted by word processor! Same file in “MS Word”File in “Notepad”

7 Slide 8-7 Copyright © 2003 Pearson Education, Inc. 8.2: Text files & Markup Languages ( e.g. HTML )

8 Slide 8-8 Copyright © 2003 Pearson Education, Inc. 8.2: From actual storage to conceptual view sequential view Interpretation by Application Program Assembly by Operating System actual storage conceptual view Sequential buffer

9 Slide 8-9 Copyright © 2003 Pearson Education, Inc. 8.2: Data Conversion When programming: note that data transfer to/from file may involve data conversion: –e.g., from two’s complement notation to ASCII: So: again it’s about the interpretation of data

10 Slide 8-10 Copyright © 2003 Pearson Education, Inc. 8.3: Quick File Access Disadvantage of sequential files: –no quick access to particular file data Two techniques to overcome this problem: –(1) Indexing or (2) Hashing keys Indexing: Indexed FileIndex loaded into main memory when opened

11 Slide 8-11 Copyright © 2003 Pearson Education, Inc. Opdracht: Chapter 8 - Problem 10 Why is a ‘patient identification number’ a better choice for a key field than the last name of each patient? If key unique: –additional sequential search never required Patient’s last name is not always unique

12 Slide 8-12 Copyright © 2003 Pearson Education, Inc. 8.3: Inverted Files Variation to (single) indexing: inverted file

13 Slide 8-13 Copyright © 2003 Pearson Education, Inc. 8.4: Hashing Disadvantage of indexing is… the index –requires extra space + includes 1 extra indirection Solution: ‘hashing’ –finds position in file using a key value (as in indexing)… –… simply by identifying location directly from the key How? –define set of ‘buckets’ & ‘hash function’ that converts keys to bucket numbers … key value bucket number … N hash function

14 Slide 8-14 Copyright © 2003 Pearson Education, Inc. 8.4: Hash Function: Example If storage space divided into 40 buckets and hash function is division: –key values 14, 54, & 94 all map onto same bucket (collision) Key values

15 Slide 8-15 Copyright © 2003 Pearson Education, Inc. 8.4: Key field value can be anything

16 Slide 8-16 Copyright © 2003 Pearson Education, Inc. 8.4: Handling Bucket Overflow When bucket-sizes are fixed: –buckets can fill up and overflow One solution: –designate special overflow storage area not fixed in size!

17 Slide 8-17 Copyright © 2003 Pearson Education, Inc. Opdracht: Chapter 8 - Problem 22 If we use division as a hash function and have 23 buckets, in which bucket should we search to find the record whose key is interpreted as the integer value 101? … 101 bucket number: … 9 … 23 Division: 101 / 23 = 4, remainder 9 …

18 Slide 8-18 Copyright © 2003 Pearson Education, Inc. Opdracht: Chapter 8 - Problem 16 a) What advantage does an indexed file have over a hash file? b) What advantage does a hash file have over an indexed file? a) When key unique: index directly points to required data, while hashing oftens require an additional (sequential) bucket search (incl. bucket overflow). b) No additional index file storage is required.

19 Slide 8-19 Copyright © 2003 Pearson Education, Inc. Chapter 8 - File Structures: Conclusions File Structures: –abstractions of actual data organization on mass storage Changes of ‘view’: –actual storage -> sequential view by OS -> conceptual view presented to user Quick access to particular file data by –(1) indexing ( many forms ) –(2) hashing ( requires no index, but requires bucket search! )

20 Slide 8-20 Copyright © 2003 Pearson Education, Inc. C H A P T E R 9 Database Structures (Large) integrated collections of data that can be accessed quickly Combination of data structures (chap. 7) and file structures (chap. 8)

21 Slide 8-21 Copyright © 2003 Pearson Education, Inc. 9.1: Historical Perspective Originally: departments of large organizations stored all data separately in flat files Problems: redundancy & inconsistencies

22 Slide 8-22 Copyright © 2003 Pearson Education, Inc. 9.1: Integrated Database System Better approach: integrate all data in a single system, to be accessed by all departments

23 Slide 8-23 Copyright © 2003 Pearson Education, Inc. 9.1: Disadvantages of Data Integration Disadvantages: –Control of access to sensitive data?! Bijvoorbeeld: personeelszaken heeft niets te maken met persoonlijke gegevens opgeslagen door de bedrijfsarts! –Misinterpretation of integrated data Supermarkt-database zegt dat een klant veel medicijnen koopt. Wat betekent dit? Wat als deze klant solliciteert op een baan bij de supermarkt-keten? –What about the right to hold/collect/interpret data? Heeft een credit card company het recht gegevens over koopgedrag van personen te gebruiken/verkopen?

24 Slide 8-24 Copyright © 2003 Pearson Education, Inc. 9.2: Conceptual Database Layers Operating System Actual data storage Data seen in terms of a sequential view Compare:

25 Slide 8-25 Copyright © 2003 Pearson Education, Inc. 9.3: The Relational Model Relational Model –shows data as being stored in rectangular tables, called relations, e.g.: –row in a relation is called ‘tuple’ –column in a relation is called ‘attribute’

26 Slide 8-26 Copyright © 2003 Pearson Education, Inc. 9.3: Issues of Relational Design So, relations make up a relational database… … but this is not so straightforward: Problem: more than one concept combined in single relation

27 Slide 8-27 Copyright © 2003 Pearson Education, Inc. 9.3: Redesign by extraction of 3 concepts Any information obtained by combining information from multiple relations

28 Slide 8-28 Copyright © 2003 Pearson Education, Inc. 9.3: Example: Finding all departments in which employee 23Y34 has worked:

29 Slide 8-29 Copyright © 2003 Pearson Education, Inc. 9.3: Relational Operations Extracting information from a relational database by way of relational operations –Most important ones: (1) extract tuples (rows) :SELECT (2) extract attributes (columns) :PROJECT (3) combine relations :JOIN Such operations on relations produce other relations –so: they can be used in combination, to create complex database requests (or ‘queries’)

30 Slide 8-30 Copyright © 2003 Pearson Education, Inc. 9.3: The SELECT operation

31 Slide 8-31 Copyright © 2003 Pearson Education, Inc. 9.3: The PROJECT operation

32 Slide 8-32 Copyright © 2003 Pearson Education, Inc. 9.3: The JOIN operation

33 Slide 8-33 Copyright © 2003 Pearson Education, Inc. Opdracht: Chapter 9 - Problem 10 RESULT := PROJECT W from X X relation UV W AZ 5 BD 3 CQ 5 Y relation R S 3 J 4 K RESULT X.U X.V X.W Y.R Y.S A Z 5 3 J A Z 5 4 K C Q 5 3 J C Q 5 4 K SELECT from X where W=5PROJECT S from YJOIN X and Y where X.W > Y.R

34 Slide 8-34 Copyright © 2003 Pearson Education, Inc. Opdracht: Chapter 9 - Problem 11 PART relation PartName Weight Bolt 2X1 Bolt 2Z1.5 Nut V50.5 a) Which companies make Bolt 2Z? –NEW := SELECT from MANUFACTURER where PartName = Bolt2Z –RESULT := PROJECT CompanyName from NEW MANUFACTURER relation CompanyNamePartName Cost Company XBolt 2Z.03 Company XNut V5.01 Company YBolt 2X.02 Company YNut V5.01 Company YBolt 2Z.04 Company ZNut V5.01

35 Slide 8-35 Copyright © 2003 Pearson Education, Inc. Opdracht: Chapter 9 - Problem 11 PART relation PartName Weight Bolt 2X1 Bolt 2Z1.5 Nut V50.5 b) Obtain a list of the parts (+cost) made by Company X? –NEW := SELECT from MANU’ER where CompanyName=CompanyX –RESULT := PROJECT PartName, Cost from NEW MANUFACTURER relation CompanyNamePartName Cost Company XBolt 2Z.03 Company XNut V5.01 Company YBolt 2X.02 Company YNut V5.01 Company YBolt 2Z.04 Company ZNut V5.01

36 Slide 8-36 Copyright © 2003 Pearson Education, Inc. Opdracht: Chapter 9 - Problem 11 PART relation PartName Weight Bolt 2X1 Bolt 2Z1.5 Nut V50.5 c) Which companies make a part with weight 1? –NEW1 := JOIN MANUCTURER and PART where MANUFACTURER.PartName = PART.PartName –NEW2 := SELECT from NEW1 where PART.Weight = 1 –RESULT := PROJECT MANU’ER.CompanyName from NEW2 MANUFACTURER relation CompanyNamePartName Cost Company XBolt 2Z.03 Company XNut V5.01 Company YBolt 2X.02 Company YNut V5.01 Company YBolt 2Z.04 Company ZNut V5.01

37 Slide 8-37 Copyright © 2003 Pearson Education, Inc. Opdracht: Chapter 9 - Problem 11 PART relation PartName Weight Bolt 2X1 Bolt 2Z1.5 Nut V50.5 MANUFACTURER relation CompanyNamePartName Cost Company XBolt 2Z.03 Company XNut V5.01 Company YBolt 2X.02 Company YNut V5.01 Company YBolt 2Z.04 Company ZNut V5.01 c) Which companies make a part with weight 1? –NEW1 := SELECT from PART where Weight = 1 –NEW2 := JOIN MANUCTURER and NEW1 where MANUFACTURER.PartName = NEW1.PartName –RESULT := PROJECT MANU’ER.CompanyName from NEW2

38 Slide 8-38 Copyright © 2003 Pearson Education, Inc. Chapter 9 - Database Structures: Conclusions Database Structures: –(large) integrated collections of data that can be accessed quickly Database Management System –provides high-level view of actual data storage (database model) Relational Model most often used –relational operations: SELECT, PROJECT, JOIN, … –high-level language for database access: SQL

39 Slide 8-39 Copyright © 2003 Pearson Education, Inc. Overzicht Informatica – Tentamen (1) Most important sections ( editie 7 ) & keywords: –Ch , 3, 4: abstractie / algoritme –Ch , 2, 3, 4, 5, 6, 7: bits / data opslag & representatie ( ASCII, etc) / Boolse operaties / flipflops / geheugen-vormen en -karakteristieken / getalstelsels (binair, hexadecimaal, etc…) / overflow & truncation errors –Ch , 2, 3, 4, 6: cpu architectuur / machine language & instructions / programma executie / machine cycle / alternatieve architecturen –Ch , 2, 3, 4: operating systems / batch processing / time-sharing / multitasking / OS componenten / process vs. programma / competition –Ch , 2, 3, 4, 5, 6: algoritme (formeel) / primitiven / pseudo-code / syntax / semantiek / iteratie / loop control / recursie / efficientie

40 Slide 8-40 Copyright © 2003 Pearson Education, Inc. Overzicht Informatica – Tentamen (2) Most important sections ( editie 7 ) & keywords: –Ch , 2, 3, 4, 5: generaties: 1 e, 2 e, 3 e / assembly language / compilers / machine independence / paradigma’s / imperatief / object-georienteerd / programming concepts / procedures / parameters / call by value/reference –Ch , 2, 3: software life cycle / ontwikkelings-fase / modulariteit / koppeling / cohesie / documentatie / complexiteits-maat voor software –Ch , (2-5): datastructuren / abstractie / statisch vs. dynamisch / pointers / (arrays, lists, stacks, queues, etc…) –Ch , 2, 3, 4: files / sequential / tekst / indexed / hashing –Ch , 2, 3: databases vs. ‘platte’ files / relaties / tuples / attributen / relationele operaties: SELECT, PROJECT, JOIN

41 Slide 8-41 Copyright © 2003 Pearson Education, Inc. Overzicht Informatica – Tentamen (3) Geen tentamenstof: –Ch (editie 7): Networks –Ch. 4 (editie 8): Networking and the Internet –Ch. 10 (editie 7 & 8): Artificial Intelligence –Ch. 11 (editie 7 & 8): Theory of Computation Veel succes!


Download ppt "Slide 8-1 Copyright © 2003 Pearson Education, Inc. Overzicht Informatica College 9 – November 1 Computer Science an overview EDITION 7 J. Glenn Brookshear."

Similar presentations


Ads by Google