Project 2 Presentation & Demo
Course: Distributed Systems
By Pooja Singhal, 11/22/2011

Outline
- Requirement
- Requirement Analysis
- Challenges
- Design
  - Data Structures
  - 3-Tiered Model
  - Algorithms
- Implementation
- Learning and Experience
- Summary
- Acknowledgement
- Demo

Requirement
- Design and implement a 3-tiered client-server model.
- Given a city and state, find the 5 nearest airports.
- Use RPC.
- System: Linux machine, Sun RPC. Language: C++/C.

Requirement Analysis
1) Parsing of places2k.txt: must be fast and efficient. Which data structure?
2) Parsing of airport-locations.txt: needs spatial partitioning. Which data structure?
3) 3-tiered model requirement: Client, Places Server, Airport Server
4) Requirement of algorithm: search for nearest neighbors

Challenges
- 3-tiered client-server model.
- Spatial partitioning: did not know much about it!
- How to search for the 5 nearest airports? Again, no idea!

1) Parsing of places2k.txt
2) Design and implement the Client
3) Design and implement the Places Server
   Test the code so far
4) Design and implement the 3-tiered model
5) Design and implement spatial partitioning of the data in airport-locations.txt
6) Design and implement the search for the N nearest airports

Design

Design Tactics
Weekly submissions made the job easy!
- First week design: parse both files, data structures
- Second week design:
  - IDL design: structure design, function design
  - Client design and logic
  - Places Server design: server for the Client
- Final week design:
  - 3-tiered model design
  - Change of data structure for airport-locations.txt records
  - Nearest neighbor search design

3-Tiered Model
CLIENT <-> PLACES SERVER <-> AIRPORT SERVER

Client Design
- Gets the location from the user in the form of CITY and STATE.
- Passes it to the Places Server.
- Displays the results.

Places Server Design
Phase 1: as a server to the Client
- On startup, parse places2k.txt into a hash table.
- The hash table key is the combination of "CITY" and "STATE".
- Latitude and longitude are stored as data along with the key.
- Get the inputs (CITY and STATE) from the Client.
- Make the key: apply the hash function.
- Search the hash table.
- If found: get the latitude and longitude.
Phase 2: as a client to the Airport Server
- Pass the latitude and longitude on to the Airport Server.
- Get the result back from the Airport Server.
- Return the result to the Client.
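The Phase 1 lookup can be sketched in C++ with a standard hash map. This is a minimal sketch, not the project's code: the "CITY|STATE" key format and the names `LatLon`, `PlaceTable`, `make_key`, `add_place`, and `find_place` are assumptions for illustration, since the slides only say that the key combines both fields.

```cpp
#include <string>
#include <unordered_map>

// Latitude/longitude pair stored as the data for each key.
struct LatLon {
    double lat;
    double lon;
};

// Hypothetical table type: combined city/state key -> coordinates.
typedef std::unordered_map<std::string, LatLon> PlaceTable;

// Hypothetical key format "CITY|STATE" (an assumption, see above).
std::string make_key(const std::string& city, const std::string& state) {
    return city + "|" + state;
}

// Store one parsed places2k.txt record in the table.
void add_place(PlaceTable& table, const std::string& city,
               const std::string& state, double lat, double lon) {
    LatLon value = {lat, lon};
    table[make_key(city, state)] = value;
}

// Look up a city/state pair. Returns true and fills `out` if found,
// returns false and leaves `out` untouched otherwise.
bool find_place(const PlaceTable& table, const std::string& city,
                const std::string& state, LatLon& out) {
    PlaceTable::const_iterator it = table.find(make_key(city, state));
    if (it == table.end()) return false;
    out = it->second;
    return true;
}
```

`std::unordered_map` applies its own hash function to the string key, which corresponds to the "apply the hash function" step; a hand-written hash table would expose that step explicitly.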

Airport Server Design
- Acts as a server to the Places Server.
- On startup, parses airport-locations.txt.
- Creates a K-D tree in memory.
- Gets the latitude and longitude from the Places Server.
- Searches for the nearest neighbors.
- Calculates the distance.
- Sorts the results by distance.
- Returns the 5 nearest neighbors to the Places Server.
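The distance, sort, and top-5 steps above might look like the following C++ sketch. The slides do not say which distance formula was used, so the haversine great-circle formula here is an assumption, as are the names `Airport`, `haversine_km`, and `nearest_airports`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

struct Airport {
    std::string code;
    double lat;   // degrees
    double lon;   // degrees
    double dist;  // kilometres from the query, set by nearest_airports
};

// Great-circle distance in kilometres (haversine formula, assumed here).
double haversine_km(double lat1, double lon1, double lat2, double lon2) {
    const double kPi = 3.14159265358979323846;
    const double kEarthRadiusKm = 6371.0;
    const double to_rad = kPi / 180.0;
    double dlat = (lat2 - lat1) * to_rad;
    double dlon = (lon2 - lon1) * to_rad;
    double a = std::sin(dlat / 2) * std::sin(dlat / 2) +
               std::cos(lat1 * to_rad) * std::cos(lat2 * to_rad) *
               std::sin(dlon / 2) * std::sin(dlon / 2);
    return 2.0 * kEarthRadiusKm * std::asin(std::sqrt(a));
}

// Orders airports by ascending distance.
bool closer(const Airport& a, const Airport& b) { return a.dist < b.dist; }

// Compute each candidate's distance to (lat, lon), then keep the n
// closest, sorted nearest first.
std::vector<Airport> nearest_airports(std::vector<Airport> candidates,
                                      double lat, double lon,
                                      std::size_t n) {
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        candidates[i].dist = haversine_km(lat, lon, candidates[i].lat,
                                          candidates[i].lon);
    }
    std::size_t keep = std::min(n, candidates.size());
    std::partial_sort(candidates.begin(), candidates.begin() + keep,
                      candidates.end(), closer);
    candidates.erase(candidates.begin() + keep, candidates.end());
    return candidates;
}
```

`std::partial_sort` only orders the first `n` elements, which is enough when just the 5 nearest airports are returned.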

Design - Data Structures

Data Structures (1)
Hash Table
- Stores places2k.txt records
- Key is the combination of CITY and STATE
- Data: latitude and longitude
- Advantage: fast lookup
K-D Tree
- Spatial partitioning of 2-dimensional points consisting of latitude and longitude
- Each node holds a 2-dimensional point (latitude, longitude)
- Node data: airport code, airport name, state
Linked List
- Stores the results consisting of the 5 nearest airports
- Since pointers do not get passed over RPC, each record needed to store the address of the next record
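The result list can be pictured with a short C++ sketch. The record layout and the names `AirportNode`, `build_list`, `list_length`, and `free_list` are hypothetical; the project's actual types would come from its IDL file and are not shown in the slides.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One result record; `next` links to the following record in the list.
struct AirportNode {
    std::string code;
    double distance;
    AirportNode* next;
};

// Build a singly linked list from results that are already sorted by
// distance, keeping at most `limit` records.
AirportNode* build_list(const std::vector<std::string>& codes,
                        const std::vector<double>& dists,
                        std::size_t limit) {
    AirportNode* head = nullptr;
    AirportNode** tail = &head;  // points at the link to fill next
    for (std::size_t i = 0;
         i < codes.size() && i < dists.size() && i < limit; ++i) {
        *tail = new AirportNode{codes[i], dists[i], nullptr};
        tail = &(*tail)->next;
    }
    return head;
}

// Count the records in the list.
std::size_t list_length(const AirportNode* head) {
    std::size_t n = 0;
    for (; head != nullptr; head = head->next) ++n;
    return n;
}

// Release every record in the list.
void free_list(AirportNode* head) {
    while (head != nullptr) {
        AirportNode* next = head->next;
        delete head;
        head = next;
    }
}
```

Walking the list from `head` visits the airports in nearest-first order, provided the input was sorted that way.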

Data Structures (2)
K-D Tree
- Space-partitioning data structure for storing a finite set of points in a k-dimensional space
- Invented by Jon Louis Bentley in 1975
- Is a binary tree: a special case of binary space partitioning trees
- Applications in wide areas: neural networks, searching multidimensional data
Example points: (2,3), (5,4), (9,6), (4,7), (8,1), (7,2)
1. X plane division
2. Y plane division
3. X plane division
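The alternating X and Y divisions can be illustrated with a small C++ sketch that builds a 2-d tree by splitting at the median on each axis in turn. Median splitting is an assumption here, since the slides do not state how the tree was constructed, and the names `Point`, `Node`, `AxisLess`, and `build` are illustrative only.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

struct Point {
    double c[2];  // c[0] = x, c[1] = y
};

struct Node {
    Point pt;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Compares two points on a single axis.
struct AxisLess {
    int axis;
    bool operator()(const Point& a, const Point& b) const {
        return a.c[axis] < b.c[axis];
    }
};

// Build a subtree from pts[lo, hi). Depth 0 splits on x, depth 1 on y,
// depth 2 on x again, and so on. Reorders pts in place.
std::unique_ptr<Node> build(std::vector<Point>& pts, std::size_t lo,
                            std::size_t hi, int depth) {
    if (lo >= hi) return nullptr;
    std::size_t mid = lo + (hi - lo) / 2;
    AxisLess cmp = {depth % 2};
    // Place the median at `mid`, smaller values before it, larger after.
    std::nth_element(pts.begin() + lo, pts.begin() + mid,
                     pts.begin() + hi, cmp);
    std::unique_ptr<Node> node(new Node());
    node->pt = pts[mid];
    node->left = build(pts, lo, mid, depth + 1);
    node->right = build(pts, mid + 1, hi, depth + 1);
    return node;
}
```

Calling `build(pts, 0, pts.size(), 0)` on the six example points puts (7,2) at the root, with (5,4) and (9,6) as its children after the Y division.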

Design - Algorithms

Algorithms
Nearest Neighbor Search
1. Starting with the root node, the algorithm moves down the tree recursively, going left or right depending on whether the search point is less than or greater than the current node in the split dimension.
2. Once the algorithm reaches a leaf node, it saves that node point as the "current best".
3. The algorithm unwinds the recursion of the tree, performing the following steps at each node:
   - If the current node is closer than the current best, then it becomes the current best.
   - The algorithm checks whether there could be any points on the other side of the splitting plane that are closer to the search point than the current best. This is done by intersecting the splitting hyperplane with a hypersphere around the search point that has a radius equal to the current nearest distance. Since the hyperplanes are all axis-aligned, this is implemented as a simple comparison to see whether the difference between the splitting coordinate of the search point and the current node is less than the distance (over all coordinates) from the search point to the current best.
     - If the hypersphere crosses the plane, there could be nearer points on the other side of the plane, so the algorithm must move down the other branch of the tree from the current node looking for closer points, following the same recursive process as the entire search.
     - If the hypersphere doesn't intersect the splitting plane, then the algorithm continues walking up the tree, and the entire branch on the other side of that node is eliminated.
4. When the algorithm finishes this process for the root node, the search is complete.
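The steps above can be sketched in C++ for the 2-dimensional case. This is an illustrative sketch rather than the project's implementation: the tree is filled by plain insertion, squared distances are compared to avoid square roots, and the names `KdNode`, `insert_point`, `sq_dist`, `search`, and `find_nearest` are assumptions.

```cpp
#include <memory>

struct KdNode {
    double pt[2];
    std::unique_ptr<KdNode> left;
    std::unique_ptr<KdNode> right;
};

// Insert a point by walking down the tree, comparing on x at even
// depths and on y at odd depths (simple, unbalanced insertion).
void insert_point(std::unique_ptr<KdNode>& node, const double pt[2],
                  int depth = 0) {
    if (!node) {
        node.reset(new KdNode());
        node->pt[0] = pt[0];
        node->pt[1] = pt[1];
        return;
    }
    int axis = depth % 2;
    insert_point(pt[axis] < node->pt[axis] ? node->left : node->right,
                 pt, depth + 1);
}

// Squared Euclidean distance between two 2-d points.
double sq_dist(const double a[2], const double b[2]) {
    double dx = a[0] - b[0];
    double dy = a[1] - b[1];
    return dx * dx + dy * dy;
}

// Recursive search: descend on the query's side first, then test the
// current node while unwinding, then visit the other side only if the
// splitting plane is closer than the current best.
void search(const KdNode* node, const double q[2], int depth,
            const KdNode*& best, double& best_d2) {
    if (!node) return;
    int axis = depth % 2;
    double diff = q[axis] - node->pt[axis];
    const KdNode* near_side = diff < 0 ? node->left.get() : node->right.get();
    const KdNode* far_side = diff < 0 ? node->right.get() : node->left.get();
    search(near_side, q, depth + 1, best, best_d2);
    double d2 = sq_dist(q, node->pt);
    if (!best || d2 < best_d2) {
        best = node;
        best_d2 = d2;
    }
    // Hypersphere test: squared distance to the plane vs. best so far.
    if (diff * diff < best_d2) {
        search(far_side, q, depth + 1, best, best_d2);
    }
}

// Returns the node nearest to q, or nullptr for an empty tree.
const KdNode* find_nearest(const KdNode* root, const double q[2]) {
    const KdNode* best = nullptr;
    double best_d2 = 0.0;
    search(root, q, 0, best, best_d2);
    return best;
}
```

The comparison `diff * diff < best_d2` is the axis-aligned form of the hypersphere and hyperplane intersection test: when it fails, the whole far branch is skipped.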

Implementation

Implementation
Weekly submissions made the job easy!
- First week implementation: parsing of places2k.txt and airport-locations.txt, storage in hash tables
- Second week implementation: make the Client and Places Server work properly
  - IDL implementation: client.x
  - Implementation of Client logic: client.c
  - Places Server implementation: places_server.c, client_svc.c
- Final implementation:
  - Phase 1: 3-tiered model implementation
  - Phase 2: implementation of the K-D tree
  - Phase 3: implementation of nearest neighbor search

Implementation
Phase 1: 3-Tiered Model Implementation
- 2nd IDL: placesclient.x created
- places_server.c modified to call the Airport Server
- 1st IDL client.x modified to include the "host" input
Phase 2: K-D Tree Creation
- The Airport Server creates the K-D tree and stores the airport records.
- Libkdtree++ is used: kd_create(), kd_insert3()
Phase 3: Nearest Neighbor Search
- kd_nearest_range(tree, point, radius)
- Calculation of the distance from the point to every point inside the circle
- The top 5 nearest airports were selected and returned
- Change of result structure: made into a linked list

Learning and Experience
GREAT LEARNING EXPERIENCE!!
- 3-tiered client-server model
- Data distribution across different servers
- Space partitioning of multidimensional data
- Search in multidimensional data: a practical approach
- Working with hash tables, K-D trees, linked lists, sorting

Summary
- Implemented a 3-tiered client-server model
- Used a hash table to store places2k.txt
- Used a K-D tree to store airport-locations.txt
- Used the nearest neighbor search algorithm
- Used a linked list to return the result containing the nearest airports

Acknowledgements
- Martin Krafft, Paul Harris, Sylvain Bougerel
- Library: libkdtree++, an open-source STL-like implementation of K-D trees

Demo