DGrid: A Library of Large-Scale Distributed Spatial Data Structures Pieter Hooimeijer, 2006-05-02.

Presentation transcript:

2 Motivation
DGrid was designed to:
– Support very large sets of dynamic point data (i.e., points that move unpredictably).
– Offer flexible trade-offs between the cost of search operations and the cost of updates.
– Run on parallel and distributed systems.

3 Spatial Data Structures
Definition: any data structure that holds:
- points
- rectangles
- lines
- polygons
- curves
- etc.
They are typically optimized for a particular type of search operation. We’ll focus on point data.

4 Commonly used example: the quadtree.
– Works like a binary search tree, but splits space along both x and y at every level.
– Each node in the tree has four children: NE, SE, SW, NW.
– This is called ‘recursive decomposition.’
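The recursive decomposition described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the names `Quadrant`, `Box`, `quadrant_of`, `child_box`, and `steps_to_cell` are hypothetical and not taken from DGrid, coordinates are integers with y growing northward, and points on a midline are assigned to the east/north side.

```cpp
#include <cassert>

// All names here are illustrative; none are taken from DGrid.
enum class Quadrant { NE, SE, SW, NW };

// Axis-aligned region with inclusive integer bounds.
struct Box {
    int x0, y0, x1, y1;
};

// Splits [lo, hi] into a lower part [lo, mid - 1] and an upper part [mid, hi].
inline int split_point(int lo, int hi) { return lo + (hi - lo + 1) / 2; }

// Pick the child quadrant of `box` that contains the point (x, y).
Quadrant quadrant_of(const Box& box, int x, int y) {
    assert(x >= box.x0 && x <= box.x1 && y >= box.y0 && y <= box.y1);
    const bool east = x >= split_point(box.x0, box.x1);
    const bool north = y >= split_point(box.y0, box.y1);
    if (north) return east ? Quadrant::NE : Quadrant::NW;
    return east ? Quadrant::SE : Quadrant::SW;
}

// Shrink `box` to the given quadrant.
Box child_box(const Box& box, Quadrant q) {
    const int mid_x = split_point(box.x0, box.x1);
    const int mid_y = split_point(box.y0, box.y1);
    const bool east = (q == Quadrant::NE || q == Quadrant::SE);
    const bool north = (q == Quadrant::NE || q == Quadrant::NW);
    return Box{east ? mid_x : box.x0, north ? mid_y : box.y0,
               east ? box.x1 : mid_x - 1, north ? box.y1 : mid_y - 1};
}

// Number of decomposition steps until the box has shrunk to the cell (x, y).
int steps_to_cell(Box box, int x, int y) {
    int steps = 0;
    while (box.x0 != box.x1 || box.y0 != box.y1) {
        box = child_box(box, quadrant_of(box, x, y));
        ++steps;
    }
    return steps;
}
```

On a 4-by-4 region, for example, the point (1, 3) falls in the NW quadrant and is isolated after two steps. A search works the same way: at each level it descends only into quadrants that can contain the target.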

5 The quadtree implementation in DGrid works like this:
[Figure: example quadtree holding the points A (1,3), B (1,2), C (2,0)]
This is a ‘bottom-up Matrix (MX) Quadtree’ (section 3.1.2).

6 Let’s do a search on that quadtree: We ruled out the ‘entire’ NE quadrant at the root level of the tree.

7 Trade-offs for the bottom-up MX Quadtree, compared to other tree data structures:
– The shape of the tree does not depend on the insertion order.
– No need to balance the tree (which would be expensive).
– Insertion and deletion are cheaper for clustered data.
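One way to see why the shape of an MX quadtree is independent of insertion order: the root-to-leaf path of a point is a pure function of its coordinates. The sketch below assumes a square region of 2^depth by 2^depth cells and a child numbering chosen for illustration; `mx_path` is a hypothetical helper, not a DGrid function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Root-to-leaf path of cell (x, y) in a 2^depth by 2^depth MX quadtree.
// Each step is a child index 0..3 (illustrative numbering):
// bit 1 is set for the upper half in y, bit 0 for the upper half in x.
std::vector<int> mx_path(std::uint32_t x, std::uint32_t y, int depth) {
    assert(depth >= 0 && depth < 32);
    assert(x < (1u << depth) && y < (1u << depth));
    std::vector<int> path;
    for (int level = depth - 1; level >= 0; --level) {
        const int x_bit = static_cast<int>((x >> level) & 1u);
        const int y_bit = static_cast<int>((y >> level) & 1u);
        path.push_back((y_bit << 1) | x_bit);
    }
    return path;
}
```

Because the path never consults the tree's current contents, inserting the same set of points in any order creates the same set of nodes, so no rebalancing step is needed.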

8 C++ Templates
They look like this:
    template <typename T> class vector { //... };
In this case, a separate vector class is generated for each item type.
    // Strong type-checking using templates:
    MyType * a = someVector.get(5);
    // instead of:
    MyType * a = (MyType *)someVector.get(5);
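The contrast between the two `get` calls can be made concrete with a small sketch. Both classes below are hypothetical illustrations (they are not DGrid classes and not `std::vector`): one stores typed pointers through a template, the other stores `void*` and pushes the cast onto the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Typed container: the compiler generates one class per item type T,
// and get() returns a T* with no cast at the call site.
template <typename T>
class typed_list {
public:
    void add(T* item) { items_.push_back(item); }
    T* get(std::size_t i) const {
        assert(i < items_.size());
        return items_[i];
    }
private:
    std::vector<T*> items_;
};

// Untyped container: one class for all item types, but every caller
// must cast the result, and a wrong cast is not caught by the compiler.
class untyped_list {
public:
    void add(void* item) { items_.push_back(item); }
    void* get(std::size_t i) const {
        assert(i < items_.size());
        return items_[i];
    }
private:
    std::vector<void*> items_;
};
```

With `typed_list<MyType>`, assigning the result of `get` to a pointer of another type is a compile-time error. With `untyped_list`, the same mistake compiles and fails only at run time, if at all.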

9 Turns out, C++ templates are a crude functional programming language. Why ‘crude?’
    {- Haskell -}
    fact 1 = 1
    fact n = n * fact (n - 1)

    // C++ Templates
    template <int N> struct fact { static const int value = N * fact<N - 1>::value; };
    template <> struct fact<1> { static const int value = 1; };
This is ‘executed’ by the compiler!
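To check that the compiler really does the work, the factorial metaprogram can be paired with `static_assert` (C++11 or later). The sketch below restates the slide's `fact` template with its template arguments written out, and adds two things that are not on the slide: a hypothetical `fact_table` struct whose array size comes from the metaprogram, and a run-time function `fact_runtime` for comparison.

```cpp
#include <cassert>

// Primary template: the recursive case, mirroring `fact n = n * fact (n - 1)`.
template <int N>
struct fact {
    static const int value = N * fact<N - 1>::value;
};

// Full specialization: the base case, mirroring `fact 1 = 1`.
template <>
struct fact<1> {
    static const int value = 1;
};

// Evaluated during compilation; a wrong value would be a compile error.
static_assert(fact<5>::value == 120, "5! should be 120");

// The result is a compile-time constant, so it can size an array.
struct fact_table {
    int entries[fact<4>::value];
};
static_assert(sizeof(fact_table) == 24 * sizeof(int), "4! entries");

// Run-time counterpart, for comparison only.
inline int fact_runtime(int n) {
    assert(n >= 1);
    return n == 1 ? 1 : n * fact_runtime(n - 1);
}
```

Pattern matching in Haskell corresponds to template specialization here, and recursion corresponds to recursive instantiation. This style has no mutable variables or loops, and the syntax is verbose.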

10 This is called template metaprogramming. It’s used extensively in DGrid, to make it:
– easier to use;
– faster;
– type safe.

11 Distributed Data
DGrid uses the Message Passing Interface (MPI) to run on distributed systems.
– MPI is a library of basic ‘send’ and ‘receive’ operations.
– Each process gets a unique ID (‘rank’).
– Use if-statements on the rank to run different code on different processes.

13 DGrid
DGrid has these data structures:
– Two types of 2D arrays.
– A quadtree.
– A distributed data structure.
– A location class.
DGrid allows nesting of these data structures.

14 Let’s see some examples of nested data structures:
– A 2D array of quadtrees (implied: the quadtree contains locations).
– A quadtree of small 2D arrays.
– A 2D array of 2D arrays. This is called ‘tiling,’ and it is a lot like a ‘shallow’ quadtree.

15 DGrid uses the Composite Design Pattern:
[Diagram: composite structure with the labels DataStructure and location]

16 DGrid uses templates instead of a ‘Component’ interface. The result is that the user can do this:
    using namespace dgrid::tags;
    typedef dgrid::dgrid<MyItem, partial_grid_tag< quadtree_tag > > bucket;
    bucket a(0, 0, 639, 639, tiles(64, 64) << tiles(1, 1));
This is the ‘2D array of quadtrees’ example.

17 Important: The definition of the data structure is a type!
Consequences:
– Can check parameters at compile time. (Must be tiles( ) << tiles( ) for this example, or it won’t compile.)
– Compiler can optimize extensively (it knows which functions are going to call each other).
– Can’t define a type at runtime, so ‘composition’ must be known at compile time.
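How a check like "must be tiles( ) << tiles( )" could be enforced by the type system is sketched below. This is an assumption about the general technique, not DGrid's actual implementation: the names `location_tag`, `grid_tag`, `depth`, `tile_chain`, and `structure` are hypothetical, and the tags here take an explicit inner tag (`quadtree_tag<>`), unlike the bare `quadtree_tag` in the slide.

```cpp
#include <cassert>

// Hypothetical tags; DGrid's real tags may be declared differently.
struct location_tag {};
template <typename Inner = location_tag> struct quadtree_tag {};
template <typename Inner = location_tag> struct grid_tag {};

// Number of nested container levels in a composition, computed by the compiler.
template <typename Tag> struct depth;
template <> struct depth<location_tag> { static const int value = 0; };
template <typename Inner> struct depth<quadtree_tag<Inner> > {
    static const int value = 1 + depth<Inner>::value;
};
template <typename Inner> struct depth<grid_tag<Inner> > {
    static const int value = 1 + depth<Inner>::value;
};

// A chain of N tile specifications, one per level.
template <int N> struct tile_chain {
    int w[N];
    int h[N];
};

// A single specification is a chain of length 1.
inline tile_chain<1> tiles(int w, int h) {
    tile_chain<1> c;
    c.w[0] = w;
    c.h[0] = h;
    return c;
}

// Appending one specification yields a chain that is one level longer.
template <int N>
tile_chain<N + 1> operator<<(const tile_chain<N>& head, const tile_chain<1>& tail) {
    tile_chain<N + 1> c;
    for (int i = 0; i < N; ++i) {
        c.w[i] = head.w[i];
        c.h[i] = head.h[i];
    }
    c.w[N] = tail.w[0];
    c.h[N] = tail.h[0];
    return c;
}

template <typename Item, typename Tag>
class structure {
public:
    static const int levels = depth<Tag>::value;

    // Only a chain with exactly `levels` entries has the right parameter type.
    structure(int x0, int y0, int x1, int y1, const tile_chain<levels>& spec)
        : width_(x1 - x0 + 1), height_(y1 - y0 + 1), spec_(spec) {
        assert(x0 <= x1 && y0 <= y1);
    }
    int width() const { return width_; }
    int height() const { return height_; }
    int tile_w(int level) const {
        assert(level >= 0 && level < levels);
        return spec_.w[level];
    }
private:
    int width_, height_;
    tile_chain<levels> spec_;
};

typedef structure<int, grid_tag<quadtree_tag<> > > bucket;
static_assert(bucket::levels == 2, "a grid of quadtrees has two levels");
```

With these definitions, `bucket a(0, 0, 639, 639, tiles(64, 64) << tiles(1, 1));` compiles, while passing a single `tiles(64, 64)` does not, because `tile_chain<1>` does not convert to `tile_chain<2>`. The cost matches the slide's last point: the nesting is fixed in the type and cannot be chosen at run time.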

18 Data structure operations:
– insert(x, y, item): add item at (x, y)
– delete(x, y, item): remove item from (x, y)
– get(x, y, some_list): get all items at (x, y)
– get_range(x0, y0, x1, y1): get all items in the range [ (x0, y0) : (x1, y1) ]
Note: even the location class must support these operations.
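As a rough illustration of what it means for the innermost level to support all four operations, here is a minimal sketch of a leaf class that holds the items at a single point. The class is written for this example and is not DGrid's location class. Two deviations from the slide's signatures are assumptions: the removal operation is named `remove` because `delete` is a reserved word in C++, and `get_range` takes an explicit output list.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical leaf level: all items stored at one fixed point (x, y).
template <typename Item>
class location {
public:
    location(int x, int y) : x_(x), y_(y) {}

    // Add item at (x, y); the coordinates must match this location.
    void insert(int x, int y, const Item& item) {
        assert(x == x_ && y == y_);
        items_.push_back(item);
    }

    // Remove one occurrence of item; returns false if it was not present.
    bool remove(int x, int y, const Item& item) {
        assert(x == x_ && y == y_);
        auto it = std::find(items_.begin(), items_.end(), item);
        if (it == items_.end()) return false;
        items_.erase(it);
        return true;
    }

    // Append all items at (x, y) to `out`.
    void get(int x, int y, std::vector<Item>& out) const {
        if (x == x_ && y == y_) out.insert(out.end(), items_.begin(), items_.end());
    }

    // Append all items to `out` if this point lies in [(x0, y0) : (x1, y1)].
    void get_range(int x0, int y0, int x1, int y1, std::vector<Item>& out) const {
        if (x0 <= x_ && x_ <= x1 && y0 <= y_ && y_ <= y1)
            out.insert(out.end(), items_.begin(), items_.end());
    }

private:
    int x_, y_;
    std::vector<Item> items_;
};
```

Giving the leaf the same interface as the containers means an outer level can forward a call without knowing whether its children are locations or further containers.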

19 In a nested data structure, operations are passed on from level to level. Because the types are known at compile time, these calls can be inlined.
– Pro: eliminates the overhead of the function call.
– Con: code size increases (function body is repeated).
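The level-to-level forwarding can be sketched with two small templates. The outer level is a fixed square grid of tiles whose element type is a template parameter, so the call into the inner level is resolved at compile time and is a candidate for inlining. The names `cell` and `tiled`, and the choice to pass tile sizes as template parameters, are assumptions made for this example rather than DGrid's design.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical leaf level: holds the items of a single cell.
template <typename Item>
class cell {
public:
    void insert(int, int, const Item& item) { items_.push_back(item); }
    void get(int, int, std::vector<Item>& out) const {
        out.insert(out.end(), items_.begin(), items_.end());
    }
private:
    std::vector<Item> items_;
};

// Hypothetical outer level: TilesPerSide x TilesPerSide tiles, each covering
// TileSize x TileSize coordinates and holding one Inner structure.
template <typename Item, typename Inner, int TileSize, int TilesPerSide>
class tiled {
public:
    tiled() : tiles_(static_cast<std::size_t>(TilesPerSide) * TilesPerSide) {}

    // Forward to the tile that owns (x, y), using tile-local coordinates.
    void insert(int x, int y, const Item& item) {
        tiles_[index_of(x, y)].insert(x % TileSize, y % TileSize, item);
    }
    void get(int x, int y, std::vector<Item>& out) const {
        tiles_[index_of(x, y)].get(x % TileSize, y % TileSize, out);
    }

private:
    static std::size_t index_of(int x, int y) {
        assert(x >= 0 && x < TileSize * TilesPerSide);
        assert(y >= 0 && y < TileSize * TilesPerSide);
        return static_cast<std::size_t>(y / TileSize) * TilesPerSide + (x / TileSize);
    }
    std::vector<Inner> tiles_;
};
```

For example, `tiled<int, tiled<int, cell<int>, 1, 4>, 4, 4>` describes a 16-by-16 region as a 4-by-4 grid of 4-by-4 grids. An `insert` on the outer object passes through two forwarding calls, and both targets are known statically, so no virtual dispatch is involved.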

21 Future Work
– Add more data structures and more search operations.
– Separate the interface further from the implementation.
– (Dynamic) load balancing.