Data Structures and Algorithms

Presentation transcript:

B-Trees
Data Structures and Algorithms
CSE 373 SP 18 - Kasey Champion

Warm Up
Suppose we have an AVL tree of height 50. What is the best-case number of disk accesses for a lookup? What is the worst case?

Memory Architecture

Level | What is it? | Typical size | Time
CPU register | The brain of the computer! | 32 bits | ≈ free
L1 cache | Extra memory to make accessing it faster | 128 KB | 0.5 ns
L2 cache | Extra memory to make accessing it faster | 2 MB | 7 ns
RAM | Working memory, what your programs need | 8 GB | 100 ns
Disk | Large, long-term storage | 1 TB | 8,000,000 ns

1 ns vs. 8,000,000 ns is like 1 second vs. 3 months.

Going by the times in the table, one disk access takes about as long as:
- about 5 million arithmetic ops
- over 1 million L2 cache accesses (8,000,000 ns / 7 ns)
- 80,000 RAM accesses (8,000,000 ns / 100 ns)

Locality
How does the OS minimize disk accesses?
- Spatial locality: computers try to keep memory you are likely to use together close together (for example, the elements of an array or the fields of an object).
- Temporal locality: computers assume that memory you have just accessed will likely be accessed again in the near future.

Thought Experiment
Suppose we have an AVL tree of height 50. What is the best-case number of disk accesses for a lookup? What is the worst case?
- Best case: the entire tree is in RAM, so a lookup needs no disk accesses.
- Worst case: only the top node is in RAM and everything else is on disk, so every further level on the search path costs a disk access (about 50 for a tree of height 50).

Maximizing Disk Access Effort
Instead of each node having 2 children, let it have M children.
- Each node contains a sorted array of children.
- Pick M so that a node fills an entire page of disk data.
Assuming the M-ary search tree is balanced, what is its height? What is the worst-case runtime of get() for this tree?
- Height: log_M(n)
- Work to pick a child within a node: log_2(M), using binary search over the sorted array
- Total work to find a node: log_M(n) * log_2(M)
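To make the height formula concrete, here is a small sketch that compares a binary tree with an M-ary tree. The function names, n = 1,000,000, and M = 128 are illustrative assumptions, not values from the slides. It also assumes one disk access per level and a perfectly balanced tree.

```python
def levels_needed(n, m):
    """Smallest h with m**h >= n, i.e. ceil(log_m(n)), using integers only."""
    h, capacity = 0, 1
    while capacity < n:
        capacity *= m
        h += 1
    return h

def comparisons_per_node(m):
    """ceil(log_2(m)): binary-search steps needed to pick one of m children."""
    return (m - 1).bit_length()

n = 1_000_000  # assumed number of items
for m in (2, 128):  # binary tree vs. an assumed page-sized fan-out
    height = levels_needed(n, m)
    total = height * comparisons_per_node(m)
    print(f"M={m}: height (disk accesses) = {height}, comparisons = {total}")
```

Under these assumptions the total number of comparisons barely changes (20 vs. 21), but the number of levels, and so the number of disk accesses, drops from 20 to 3.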

Maximizing Disk Access Effort
If each child is at a different location in disk memory, following it is expensive! What if we construct a tree that stores keys together in internal (branch) nodes and all the values in leaf nodes?
[Diagram: internal nodes hold keys only (K); leaf nodes hold key-value pairs (K V).]

B-Trees
A B-tree is defined by 3 invariants:
1. B-trees must have two different types of nodes: internal nodes and leaf nodes.
2. B-trees must have an organized set of keys and pointers at each internal node.
3. A B-tree starts as a single leaf node; as more items are added, its nodes must stay at least half full.

Node Invariant
- An internal node holds up to M pointers to children and up to M - 1 sorted keys.
- A leaf node holds up to L key-value pairs, sorted by key.
[Diagram: example internal and leaf nodes with M = 6 and L = 3.]

Order Invariant
For any given key k in an internal node, all subtrees to its left may only contain keys x that satisfy x < k, and all subtrees to its right may only contain keys x that satisfy x >= k.
Example: an internal node with keys 3, 7, 12, 21 has five subtrees, covering:
- x < 3
- 3 <= x < 7
- 7 <= x < 12
- 12 <= x < 21
- 21 <= x
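One way to apply the order invariant during a search is a binary search over the node's sorted keys. The sketch below uses Python's standard `bisect_right`. The helper name `child_index` is an assumption for illustration.

```python
from bisect import bisect_right

def child_index(keys, x):
    """Index of the subtree that may contain x, given the node's sorted keys.

    bisect_right counts the keys <= x, so a key equal to x sends the
    search to the subtree on that key's right (x >= k).
    """
    return bisect_right(keys, x)

keys = [3, 7, 12, 21]
for x in (2, 3, 6, 7, 21, 100):
    print(x, "-> subtree", child_index(keys, x))
```

With the keys 3, 7, 12, 21, the value 6 lands in subtree 1 (3 <= x < 7) and the value 21 lands in subtree 4 (21 <= x).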

Structure Invariant
- If n <= L, the root node is a leaf.
- When n > L, the root node must be an internal node containing 2 to M children.
- All other internal nodes must have M/2 (rounded up) to M children.
- All leaf nodes must have L/2 (rounded up) to L key-value pairs.
In short, all nodes must be at least half full. The root is the only exception: it can have as few as 2 children.
- Keeping nodes at least half full helps maintain balance.
- Requiring at least 2 children prevents degenerate, linked-list-like trees.
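The occupancy bounds can be written down as a small helper. This is a sketch only: the function name is hypothetical, and it assumes that "half full" rounds up when M or L is odd.

```python
def allowed_range(is_leaf, is_root, M, L):
    """Return (minimum, maximum) occupancy for a node.

    For a leaf the count is key-value pairs; for an internal node it is
    children. Half-full is taken as ceil(M/2) or ceil(L/2) (an assumption).
    """
    if is_leaf:
        # A root that is still a leaf may hold anywhere from 0 to L pairs.
        return (0 if is_root else (L + 1) // 2, L)
    # An internal root may have as few as 2 children.
    return (2 if is_root else (M + 1) // 2, M)

print(allowed_range(is_leaf=False, is_root=False, M=6, L=3))  # internal node
print(allowed_range(is_leaf=True, is_root=False, M=6, L=3))   # leaf node
print(allowed_range(is_leaf=False, is_root=True, M=6, L=3))   # internal root
```

For M = 6 and L = 3 this gives 3 to 6 children for an ordinary internal node, 2 to 3 pairs for a leaf, and 2 to 6 children for an internal root.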

B-Trees
A B-tree is defined by 3 invariants:
1. Node invariant: B-trees must have two different types of nodes, internal nodes and leaf nodes.
- An internal node holds up to M pointers to children and up to M - 1 sorted keys. M must be greater than 2.
- A leaf node holds up to L key-value pairs, sorted by key.
2. Order invariant: for any given key k, all subtrees to the left may only contain keys x that satisfy x < k, and all subtrees to the right may only contain keys x that satisfy x >= k.
3. Structure invariant:
- If n <= L, the root is a leaf.
- If n > L, the root node must be an internal node containing 2 to M children.
- All other nodes must be at least half full.

get() in B-Trees
Example lookups: get(6) and get(39).
[Diagram: example B-tree with root keys 12 and 44, lower internal-node keys 6, 20, 27, 34, and 50, and leaves holding key-value pairs.]
- Worst-case runtime = log_M(n) * log_2(M)
- Disk accesses = log_M(n) = height of the tree
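A lookup can be sketched as a walk from the root to a single leaf, picking one child per internal node. The class names and the small tree below are hypothetical. The tree is a simplified stand-in, not the slide's diagram.

```python
from bisect import bisect_right
from dataclasses import dataclass

@dataclass
class Leaf:
    items: list      # sorted (key, value) pairs

@dataclass
class Internal:
    keys: list       # sorted separator keys
    children: list   # len(children) == len(keys) + 1

def get(node, key):
    """Return the value stored under key, or None if the key is absent."""
    # Each step down is one node visit (one disk access in the slides' model).
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    for k, v in node.items:
        if k == key:
            return v
    return None

# Hypothetical two-level tree with M = 3 and L = 3.
root = Internal([12, 44], [
    Leaf([(1, "a"), (6, "b")]),
    Leaf([(12, "c"), (39, "d")]),
    Leaf([(44, "e"), (50, "f")]),
])
print(get(root, 6), get(root, 39), get(root, 7))
```

The loop visits one node per level, which is why the number of disk accesses equals the height of the tree.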

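put() in B-Trees
Suppose we have an empty B-tree where M = 3 and L = 3. Try inserting 3, 18, 14, 30, 32, 36 (the values are the insertion order, 1 to 6).
- After inserting 3, 18, and 14, the root is a single leaf: [3, 14, 18].
- Inserting 30 overflows the leaf. It splits into [3, 14] and [18, 30], and a new internal root with key 18 points to both.
- Inserting 32 fits in the right leaf: [18, 30, 32].
- Inserting 36 overflows that leaf. It splits into [18, 30] and [32, 36], and the root's keys become 18, 32.
Final tree: root keys 18, 32; leaves [3, 14], [18, 30], [32, 36].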
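The insertion steps can be sketched in code as follows. This is a minimal illustration, not the course's reference implementation. The class and function names are hypothetical, and it assumes an overflowing node splits into two halves with the larger half on the right.

```python
from bisect import bisect_right

M, L = 3, 3  # max children per internal node, max pairs per leaf (slide's values)

class Leaf:
    def __init__(self, items):
        self.items = items            # sorted list of (key, value) pairs

class Internal:
    def __init__(self, keys, children):
        self.keys = keys              # sorted separator keys
        self.children = children      # len(children) == len(keys) + 1

def insert(node, key, value):
    """Insert into the subtree at node.

    Returns None, or (separator_key, new_right_sibling) if node split.
    """
    if isinstance(node, Leaf):
        keys = [k for k, _ in node.items]
        i = bisect_right(keys, key)
        if i > 0 and keys[i - 1] == key:
            node.items[i - 1] = (key, value)   # key already present: overwrite
            return None
        node.items.insert(i, (key, value))
        if len(node.items) <= L:
            return None
        mid = len(node.items) // 2             # overflow: split the leaf
        right = Leaf(node.items[mid:])
        node.items = node.items[:mid]
        return right.items[0][0], right

    i = bisect_right(node.keys, key)
    split = insert(node.children[i], key, value)
    if split is None:
        return None
    sep, right_child = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right_child)
    if len(node.children) <= M:
        return None
    mid = len(node.children) // 2              # overflow: split the internal node
    sep_up = node.keys[mid - 1]                # middle key moves up to the parent
    right = Internal(node.keys[mid:], node.children[mid:])
    node.keys, node.children = node.keys[:mid - 1], node.children[:mid]
    return sep_up, right

def put(root, key, value):
    """Insert a pair and return the (possibly new) root."""
    split = insert(root, key, value)
    if split is None:
        return root
    sep, right = split
    return Internal([sep], [root, right])      # the tree grows taller at the root

root = Leaf([])
for value, key in enumerate([3, 18, 14, 30, 32, 36], start=1):
    root = put(root, key, value)
print(root.keys, [leaf.items for leaf in root.children])
```

Splits propagate upward, so the tree only grows taller when the root itself splits. That keeps every leaf at the same depth.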