Optimizing search through distributed space partitioning RUMBLE Gal Yaroslavsky.

What is Rumble? A content aggregator. Users “vote” and influence content. Submissions are tied to a geographic location. Content ranking is based on locality.

What are the requirements? Fast search results; fast insertions; space efficiency; dynamic balancing and quick updates.

Initial thoughts: peer-to-peer. Servers are geographically distributed, and the client polls the servers in its surroundings. Drawbacks: latency (slow results), data replication (redundancy), update cascades (slow updates).

Initial thoughts: client-server, with a single server and a discretized search space. Drawbacks: certain areas are more popular than others, so the discretization is not space efficient; search is still O(n); there is no fault tolerance.
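
The space-efficiency concern on this slide can be illustrated with a minimal sketch. It assumes the discretized search space is a uniform grid of fixed-size cells, which the slide does not state. The cell size, the function names and the sample coordinates are all hypothetical. With skewed data, most cells stay empty while one cell holds nearly every point, so scanning that cell still costs close to O(n).

```python
from collections import defaultdict

CELL_SIZE = 1.0  # hypothetical cell width, in degrees


def cell_of(lat, lon, cell_size=CELL_SIZE):
    """Map a coordinate to the integer index of its fixed-size grid cell."""
    return (int(lat // cell_size), int(lon // cell_size))


def build_grid(points, cell_size=CELL_SIZE):
    """Bucket every (lat, lon) point into the cell that contains it."""
    grid = defaultdict(list)
    for lat, lon in points:
        grid[cell_of(lat, lon, cell_size)].append((lat, lon))
    return grid


# Made-up, skewed sample: 98 submissions in one popular area, 2 elsewhere.
points = [(40.1 + i * 0.001, -76.5 - i * 0.001) for i in range(98)]
points += [(10.5, 20.5), (-33.2, 151.1)]

grid = build_grid(points)
largest = max(len(bucket) for bucket in grid.values())
print(len(grid), "occupied cells; the largest holds", largest, "of", len(points))
```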

Initial thoughts: client-server. Options considered: single server, in-memory data grid, discretized search space, spatial database. Drawbacks: using a spatial database incurs large overhead and is not optimized for certain types of queries.

Distributed Search Tree ©. Client-server: REST with optional authentication. In-memory data grid: asynchronous operation, concurrency, dynamic addition and removal of cluster members. Recursive space partitioning: subdivide dense regions into quadrants.
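
The recursive space partitioning named on this slide can be sketched as a point quadtree. This is a single-process illustration and not the presenter's implementation. The class name, the `capacity` threshold and the `max_depth` guard are assumptions, and distributing nodes across a data grid is not shown. A leaf that grows past `capacity` is subdivided into four quadrants, so only dense regions get deeper subtrees.

```python
class QuadTree:
    """Point quadtree over the half-open rectangle [x0, x1) x [y0, y1)."""

    def __init__(self, x0, y0, x1, y1, capacity=4, max_depth=16, level=0):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity    # hypothetical leaf size before a split
        self.max_depth = max_depth  # guard against endless splits on duplicates
        self.level = level
        self.points = []            # points stored while this node is a leaf
        self.children = None        # four quadrants once subdivided

    def contains(self, x, y):
        x0, y0, x1, y1 = self.bounds
        return x0 <= x < x1 and y0 <= y < y1

    def insert(self, x, y):
        """Insert a point; return False if it lies outside this node."""
        if not self.contains(x, y):
            return False
        if self.children is None:
            self.points.append((x, y))
            if len(self.points) > self.capacity and self.level < self.max_depth:
                self._subdivide()
            return True
        # Quadrants are disjoint, so exactly one child accepts the point.
        return any(child.insert(x, y) for child in self.children)

    def _subdivide(self):
        """Split a dense leaf into four quadrants and push its points down."""
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        args = (self.capacity, self.max_depth, self.level + 1)
        self.children = [
            QuadTree(x0, y0, mx, my, *args),
            QuadTree(mx, y0, x1, my, *args),
            QuadTree(x0, my, mx, y1, *args),
            QuadTree(mx, my, x1, y1, *args),
        ]
        pending, self.points = self.points, []
        for px, py in pending:
            self.insert(px, py)

    def query(self, qx0, qy0, qx1, qy1):
        """Return all points inside [qx0, qx1) x [qy0, qy1)."""
        x0, y0, x1, y1 = self.bounds
        if qx1 <= x0 or qx0 >= x1 or qy1 <= y0 or qy0 >= y1:
            return []  # no overlap with this node: prune the whole subtree
        if self.children is None:
            return [(x, y) for x, y in self.points
                    if qx0 <= x < qx1 and qy0 <= y < qy1]
        found = []
        for child in self.children:
            found.extend(child.query(qx0, qy0, qx1, qy1))
        return found

    def depth(self):
        """Number of levels below and including this node."""
        if self.children is None:
            return 1
        return 1 + max(child.depth() for child in self.children)


# Made-up data: a dense cluster near (10, 10) plus two isolated points.
tree = QuadTree(0, 0, 100, 100, capacity=4)
for i in range(10):
    tree.insert(10 + i, 10 + i)
tree.insert(80, 80)
tree.insert(60, 20)

print(len(tree.query(0, 0, 50, 50)), "points in the dense quadrant")
print(sorted(tree.query(50, 0, 100, 100)), "in the sparse half")
```

A region query only descends into quadrants that overlap the query rectangle, which is what keeps searches in sparse areas cheap.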

Cluster congregation: connect to the database; join the cluster (Hazelcast); join the web proxy.

Buckets, Bottles, Spills and Fills ®. All operations on the Distributed Search Tree can be summarized as buckets, bottles, spills and fills.

Technology. Client: Android; Volley (asynchronous networking); Google Maps (static map images, Google Maps API v2). Server: Jetty (session management, RESTful requests); Oracle MySQL Server; Hazelcast (distributed, open-source in-memory data grid; cluster).

[Diagram: request processing. Labels: Client, Volley, Webserver (Jetty), Failover Webserver (Jetty), Hazelcast Cluster.]

Future thoughts and implementation: social media integration; partitioning cluster members to perform unique actions; query by region (draw on your screen); name-identified tags, e.g. !PSUHarrisburg.