Query Processing in Mobile P2P Databases. IGERT Seminar Presentation. Bo Xu, joint work with Ouri Wolfson.
June 12, 2014, IGERT seminar. Talk outline: Introduction; System Model; The MARKET Algorithm; Evaluation; Extension to CTS; Conclusion and Future Work.
Query Processing Environments. Motivation: a general-purpose query processing strategy for mobile, disconnected, wireless ad-hoc networks. Example: a Vehicular Sensor Network (VSN) with a GPS receiver, chemical spill detector, still/video camera, vibration sensor, and acoustic detector.
Store-and-forward to deal with sparseness. [Figure: node diagram with query (q, Q) and answer (A) labels.]
Issues with Store-and-forward: How to manage limited memory, power, and bandwidth? Which reports to save and which to transmit?
Difficulty of Store-and-forward: Assume that the trajectories of all nodes are known a priori at a central server. If memory, energy, and bandwidth are bounded at the mobile nodes, then the problem of determining whether a set of data items can be disseminated to all the mobile nodes is NP-complete (case: each mobile node is interested in every data item). In mobile P2P, trajectories are unknown a priori, so heuristics are needed.
Mobile P2P Database: Peers are PDAs, cell phones, sensors, hotspots, and vehicles with short-range wireless capabilities. Each peer holds a local query and a local database of reports. Applications coexist. Report sizes are variable. A peer can be a producer, consumer, and broker. [Figure: peers A, B, and C with queries A, B, C and reports 1 to 5 and 8.]
Queries: A query Q maps each report R to a match degree, e.g. $\mathrm{match}(R,Q) = e^{-\alpha t - \beta d}$. Examples: top parking slots given my current location; profile with expertise in children's periodontics; similarity between two images.
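A minimal sketch of such a match degree, assuming the slide's formula is an exponential decay of the form e^(-αt - βd). Reading t as the report's age and d as its distance from the querying peer is an assumption, as are the coefficient names and default values, which are purely illustrative.

```python
import math


def match_degree(age_s: float, distance_m: float,
                 alpha: float = 0.01, beta: float = 0.001) -> float:
    """Return exp(-alpha * t - beta * d), a match degree in (0, 1].

    alpha and beta are hypothetical decay coefficients; the slide does
    not give their values.
    """
    if age_s < 0 or distance_m < 0:
        raise ValueError("age and distance must be non-negative")
    return math.exp(-alpha * age_s - beta * distance_m)
```

Under this reading, a fresh report at the query location scores 1.0, and the score decays as the report ages or lies farther away.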
Query/report Dissemination: Two peers within transmission range exchange queries and reports. The least relevant reports that do not fit in the local broker database are purged. The exchange is not necessarily synchronous (periodic broadcast).
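The purge step can be sketched as follows. This is an illustration only: the `Report` type, its precomputed `relevance` field, and the byte-based `capacity` are assumptions, not part of the presentation.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Report:
    report_id: str
    size: int         # bytes
    relevance: float  # hypothetical precomputed relevance score


def merge_and_purge(local: Iterable[Report], received: Iterable[Report],
                    capacity: int) -> List[Report]:
    """Merge received reports into the local database, then drop the
    least relevant reports until the total size fits the capacity."""
    merged = {r.report_id: r for r in local}
    for r in received:
        merged.setdefault(r.report_id, r)  # ignore reports already held
    # Most relevant first, so the least relevant sits at the end.
    kept = sorted(merged.values(), key=lambda r: r.relevance, reverse=True)
    total = sum(r.size for r in kept)
    while kept and total > capacity:
        total -= kept.pop().size
    return kept
```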
Ranking Factors: The rank of a report R is determined by (1) demand: what fraction of peers are querying R, i.e. the probability that a peer is interested in R; (2) supply: what fraction of peers already have R, i.e. the probability that a peer has R; (3) the size of R.
Rank of a report: expected benefit $= \mathrm{demand}(R) \cdot (1 - \mathrm{supply}(R))$; $\mathrm{Rank}(R) = \dfrac{\mathrm{demand}(R) \cdot (1 - \mathrm{supply}(R))}{\mathrm{size}(R)}$. [Figure labels: reports database, reports, benefit.]
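As a small worked sketch, reading the rank as expected benefit per unit of size, i.e. demand(R) · (1 − supply(R)) / size(R). The function name and the input validation are illustrative choices.

```python
def rank(demand: float, supply: float, size: float) -> float:
    """Rank of a report: expected benefit per unit of size.

    demand and supply are probabilities in [0, 1]; size is positive
    (e.g. bytes).
    """
    if not (0.0 <= demand <= 1.0 and 0.0 <= supply <= 1.0):
        raise ValueError("demand and supply must lie in [0, 1]")
    if size <= 0:
        raise ValueError("size must be positive")
    return demand * (1.0 - supply) / size
```

A report that every peer already holds (supply = 1) ranks zero regardless of demand, and of two equally beneficial reports the smaller one ranks higher.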
Report Ranking: sample demand. The queries relation is maintained in FIFO order.
Rank of Reports: Demand for R. The $Q_i$ are the members of the queries relation. The size of the queries relation is determined based on Hoeffding's inequality. E.g., if n = 108, then with 95% chance the demand estimation error is smaller than 0.08.
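One way to picture the sampled demand is below. The demand formula itself did not survive in this transcript, so averaging the match degree over the sampled queries is an assumption, as are the class name and the representation of a query as a callable that maps a report to a match degree.

```python
from collections import deque
from typing import Callable, Deque


class QueriesRelation:
    """FIFO-maintained sample of queries received from other peers."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) evicts the oldest query when full (FIFO).
        self._queries: Deque[Callable[[object], float]] = deque(maxlen=capacity)

    def add(self, query: Callable[[object], float]) -> None:
        self._queries.append(query)

    def demand(self, report: object) -> float:
        """Estimate demand as the mean match degree over sampled queries
        (an assumed estimator, not confirmed by the slides)."""
        if not self._queries:
            return 0.0
        return sum(q(report) for q in self._queries) / len(self._queries)
```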
How does peer O determine supply(R)? A parametric formula giving the supply is beyond the state of the art. Instead, O machine-learns supply(R) based on metadata of R: the age of R, the number of times O has sighted R from other peers, etc.
Computing Supply by Machine Learning: MAchine LEarning based Novelty rAnking (MALENA). aro: the age rank order of the report within O's reports database. fin: the number of times O has sighted the report from other peers. Reports database of O:
report-id   Report description   aro   fin
R1          …                    1
R4          …                    2     4
R2          …                    3     2
R7          …                    4     2
MALENA. [Figure: an exchange between peers in which R2 is requested; positive and negative examples are created.]
MALENA Implementation Considerations: Minimize overhead. There is no need to actually store examples; the model is built incrementally. Bayesian learning: a simple but effective method.
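A rough sketch of how a Bayesian model can be built incrementally without storing examples: only per-class and per-feature counts are kept. This is a generic naive Bayes classifier with Laplace smoothing, not necessarily the learner used in MALENA. Using (aro, fin) as the feature tuple follows the earlier slide, while the meaning attached to a positive label is left to the caller.

```python
from typing import Dict, List, Sequence, Set


class IncrementalNaiveBayes:
    """Naive Bayes over discrete features, updated one example at a time.

    Only counts are stored, never the examples themselves.
    """

    def __init__(self, n_features: int, smoothing: float = 1.0) -> None:
        self.smoothing = smoothing
        self.class_counts: Dict[bool, int] = {True: 0, False: 0}
        self.feature_counts: Dict[bool, List[Dict[object, int]]] = {
            label: [dict() for _ in range(n_features)]
            for label in (True, False)
        }
        self.values_seen: List[Set[object]] = [set() for _ in range(n_features)]

    def update(self, features: Sequence[object], label: bool) -> None:
        """Fold one labeled example into the counts."""
        self.class_counts[label] += 1
        for i, value in enumerate(features):
            counts = self.feature_counts[label][i]
            counts[value] = counts.get(value, 0) + 1
            self.values_seen[i].add(value)

    def prob_positive(self, features: Sequence[object]) -> float:
        """Posterior probability of the positive class for these features."""
        total = sum(self.class_counts.values())
        if total == 0:
            return 0.5  # no evidence yet
        scores = {}
        for label in (True, False):
            n_label = self.class_counts[label]
            score = (n_label + self.smoothing) / (total + 2 * self.smoothing)
            for i, value in enumerate(features):
                n_values = len(self.values_seen[i] | {value})
                count = self.feature_counts[label][i].get(value, 0)
                score *= (count + self.smoothing) / (n_label + self.smoothing * n_values)
            scores[label] = score
        return scores[True] / (scores[True] + scores[False])
```

For example, a peer could call `update((aro, fin), label)` after each exchange and `prob_positive((aro, fin))` when ranking a report.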
Comparison with RANDI (MDM07), where RANDI = MARKET − supply. Setup: mobility model = random waypoint; average motion speed = 1 mile/hour; transmission range = 100 meters; mean reports database size = 100 Kbytes; queries database size = 100 queries; report size uniformly distributed between 1K and 2K bytes; 0.1 report produced per second. Panels: 20 peers within transmission range; 1 peer within transmission range. Results: MARKET is half as good as the ideal benchmark and twice as good as RANDI.
Comparison with LRU and LFU (results obtained by Fatemeh Vafaee). Setup: mobility model = iMotes traces; mean reports database size = 150 Kbytes; queries database size = 10 queries; report size uniformly distributed between 2K and 20K bytes; 0.1 report produced per second; transmission size = 100 Kbytes. [Plot: throughput (matches/peer) versus response-time bound (seconds).]
Evaluation of MALENA (TAAS09). Turn-over: peers enter/exit the system. Injection: the number of peers that have a report initially. Setup: mobility model = iMotes traces; reports database size = 100 reports; 2 reports produced per second; transmission size = 10 reports. Panels: low turn-over/low injection; high turn-over/high injection. Result: MALENA always follows the best indicator.
Application: K-nearest-neighbors. Query: the K nearest neighbors of a fixed location (the query point). Reports: current locations of mobile sensors. match(Q,R): inversely proportional to the distance from the query point. [Figure labels: sink, query point.]
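A small sketch of this match function and of selecting the K best reports. The exact form 1 / (1 + distance) is an assumption chosen to avoid division by zero at the query point; the slide only says that the match degree decreases with distance. Reports are modeled as hypothetical (sensor_id, (x, y)) pairs.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
LocationReport = Tuple[str, Point]  # (sensor_id, current location)


def knn_match(query_point: Point, location: Point) -> float:
    """Match degree that decreases with distance from the query point."""
    return 1.0 / (1.0 + math.dist(query_point, location))


def top_k(query_point: Point, reports: Sequence[LocationReport],
          k: int) -> List[LocationReport]:
    """Return the k reports with the highest match degree, best first."""
    return heapq.nlargest(
        k, reports, key=lambda report: knn_match(query_point, report[1]))
```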
Itinerary-based KNN processing: Phase I: the query is delivered to the sensor closest to the query point. Phase II: the query traverses an itinerary to collect answers. Phase III: the answers are returned to the sink.
Simulation Results. Setup: mobility model = random waypoint; average motion speed = 1 mile/hour; transmission range = 100 meters; report size = 24 bytes; query size = 16 bytes; mean reports database size = 100 reports; one location report produced at each sensor per second. Result: MARKET is especially suitable for sparse environments.
TrafficInfo: Disseminating Traffic Information in VANETs
What does relevance mean in TrafficInfo? A report is relevant if it changes the route. [Figure: routes between A and B.]
Which factors indicate the relevance of a report? Distance to the reported road segment; type of road segment; speed variance; …
Conceptual Learning Procedure: An example is created for each received report. The example is labeled positive if the report changes the route and negative otherwise. Open issues: individual vs. group; how to deal with aggregation?
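The labeling rule can be sketched as below. This is a toy illustration: the road network is a hypothetical dictionary of travel times, a report is assumed to carry a new travel time for one road segment, and Dijkstra's algorithm stands in for whatever route planner the vehicle actually uses.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: travel time}


def shortest_route(graph: Graph, source: str, target: str) -> Optional[List[str]]:
    """Dijkstra's algorithm; returns the node sequence, or None if unreachable."""
    best = {source: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, travel_time in graph.get(node, {}).items():
            new_cost = cost + travel_time
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if target not in best:
        return None
    route = [target]
    while route[-1] != source:
        route.append(prev[route[-1]])
    return route[::-1]


def label_report(graph: Graph, source: str, target: str,
                 report: Tuple[str, str, float]) -> bool:
    """Return True (positive example) if applying the report changes the route.

    report = (from_node, to_node, new_travel_time), a hypothetical format.
    """
    before = shortest_route(graph, source, target)
    from_node, to_node, new_time = report
    updated = {node: dict(edges) for node, edges in graph.items()}
    updated.setdefault(from_node, {})[to_node] = new_time
    after = shortest_route(updated, source, target)
    return before != after
```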
Conclusion: Store-and-forward enables in-network processing in mobile disconnected networks. Ranking is important for dealing with memory, bandwidth, and energy constraints. [Figure labels: query processing, sensor-rich environment, short-range wireless, Mobile P2P.]
Future Work: Multimedia reports; utilization of metadata; integration of stateless and stateful approaches; starvation/fairness.
Thanks! Questions?
Basics: 3 modes: transmitting, receiving, listening (in order of power consumption). When listening: on detecting a message destined to the host, switch to receive mode. Time is divided into slots of 20 microseconds each. Transmission: listen for 1 time slot; if the channel is free, start the broadcast (note that a collision is still possible). A broadcast may last for many time slots.
Energy Efficiency of a Broadcast: Throughput (Th) = (expected number of neighbors that successfully receive the broadcast) × (broadcast size). Power efficiency (PE) = … [Figure: node X broadcasting; some neighbors successfully receive the broadcast from X; collisions occur at other neighbors.]
Computation of Throughput: Conditions for successful reception at an arbitrary node Y: (1) no green node starts to broadcast in the same time slot as X; (2) no transmission from any purple node overlaps with that from X. [Figure: nodes X and Y with green and purple nodes.]
Energy Constraints: Energy consumed by a network interface for transmitting a message of size M bytes: $En = f \cdot M + g$. For broadcast, g = … Joule and f = … Joule/byte.
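The linear energy model can be sketched as follows. The numeric values of f and g did not survive in this transcript, so both are left as required parameters; the helper that inverts the model to find the largest affordable message is an illustrative addition.

```python
def transmission_energy(message_bytes: float, f_joule_per_byte: float,
                        g_joule: float) -> float:
    """Energy (Joule) to transmit a message of the given size: En = f * M + g."""
    if message_bytes < 0:
        raise ValueError("message size must be non-negative")
    return f_joule_per_byte * message_bytes + g_joule


def max_message_bytes(energy_budget_joule: float, f_joule_per_byte: float,
                      g_joule: float) -> int:
    """Largest whole number of bytes M with f * M + g <= budget.

    Returns 0 when the budget does not even cover the fixed cost g.
    """
    if f_joule_per_byte <= 0:
        raise ValueError("f must be positive")
    if energy_budget_joule < g_joule:
        return 0
    return int((energy_budget_joule - g_joule) // f_joule_per_byte)
```

A peer with a per-encounter energy allowance could use the second function to cap the size of its broadcast.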
Experimental MP2P Projects (Pedestrians): 7DS, Columbia University (web pages); iClouds, Darmstadt Univ. (incentives); MoGATU, UMBC (specialized query processing, e.g., collaborative joins); PeopleNet, NUS, IIS-Bangalore (mobile commerce, information-type location bazaar); MoB, Wisconsin, Cambridge (incentives, information resources, e.g. bandwidth); Mobi-Dik, Univ. of Illinois, Chicago (brokering, physical resources, bandwidth/memory/power management).
Vehicular Projects. Inter-vehicle communication and intelligent transportation: CarTALK 2000, a European project; VICS (the Vehicle Information and Control System), a government-sponsored system in Japan with an 11-year track record; FleetNet, an inter-vehicle communications system being developed by a consortium of private companies and universities in Germany; IVI (Intelligent Vehicle Initiative) and VII (Vehicle Infrastructure Integration), from the US DOT. MP2P provides data management capabilities on top of these communication systems. Grassroots, TrafficView, SOTIS, V3: P2P dissemination of traffic information to reduce travel times.
RANk-based DIssemination (RANDI): Ranking of reports; bandwidth/energy aware. An exchange enhances both consumer functionality and broker functionality. Consumer: answer the local query (pull). Broker: transmit the reports most likely to be requested by future-encountered peers (push). Transmission triggers: an encounter; new reports.
RANDI: When two peers meet, they conduct a two-phase exchange. Phase 1: exchange queries and receive answers (pull); the local query is answered, so the peer is satisfied as a consumer. Phase 2: exchange more reports using the available energy/bandwidth (push); the peer is enhanced as a broker. The exchange combines unicast (thin line in the figure) and broadcast (thick lines) to enable overhearing.
RANDI (Contd): To solve the problem with static peers, two interaction modes combine pull and push. Query-response: triggered by the discovery of new neighbors. Relay: triggered by the receipt of new reports; the new reports are disseminated to existing neighbors.
7DS: In P2P mode, each node periodically broadcasts its query and receives reports from neighboring peers. There is no strategy to determine query frequency and transmission size. Cache management is based on web-page expiration time. [Figure labels: query, reports.]
PeopleNet: Reports are randomly selected for exchanging and saving upon an encounter. [Figure: Peer A and Peer B before and after an exchange, for random-spread and for random-swap.]
Mobile Local Search: Applications. Transportation: announce a sudden stop, malfunctioning brake light, or patch of ice; floating car data; dissemination of multimedia traffic information (picture, video, voice); search for a close-by taxi customer, parking slot, or ride-share. Social networking (wearable website): personal profile of interest at a convention; singles matchmaking; floating BBS. Mobile electronic commerce: sale on an item of interest at a mall; music-file exchange. Emergency response: search for victims in rubble. Asset management and tracking: sensors on containers exchange security information => remote checkpoints. Tourist and location-based services: closest ATM.
Applications, common features: Mobile/stationary peers; resources of interest lie in a limited geographic area; short time duration. These problems can be solved by fixed servers, but that is an unlikely solution; the proposed MP2P paradigm can enhance a fixed solution (reliability, performance, coverage).
MARKET: When two peers meet, they conduct a two-phase exchange. Phase 1: exchange subscriptions and receive answers (pull); the local query is answered, so the peer is satisfied as a consumer. Phase 2: exchange more publications using the available energy/bandwidth (push); the peer is enhanced as a broker. The exchange combines unicast (thin line in the figure) and broadcast (thick lines) to enable overhearing.
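The two-phase exchange can be sketched from the sender's side as follows. This is an illustration under several assumptions: reports carry a precomputed rank, the other peer's query is a predicate, the set of reports the other peer already holds is known, and the byte budget applies only to the push phase. None of these details are given on the slide.

```python
from typing import Callable, Iterable, List, NamedTuple, Set, Tuple


class Report(NamedTuple):
    report_id: str
    size: int    # bytes
    rank: float  # hypothetical precomputed rank


def two_phase_exchange(
    reports: Iterable[Report],
    matches_query: Callable[[Report], bool],
    already_held: Set[str],
    push_budget: int,
) -> Tuple[List[Report], List[Report]]:
    """Return (pull, push): answers to the other peer's query, then
    additional reports in descending rank order that fit the budget."""
    candidates = [r for r in reports if r.report_id not in already_held]
    # Phase 1 (pull): answer the other peer's query.
    pull = [r for r in candidates if matches_query(r)]
    # Phase 2 (push): spend the remaining budget on the highest-ranked rest.
    rest = sorted((r for r in candidates if not matches_query(r)),
                  key=lambda r: r.rank, reverse=True)
    push: List[Report] = []
    remaining = push_budget
    for r in rest:
        if r.size <= remaining:
            push.append(r)
            remaining -= r.size
    return pull, push
```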
MARKET (Contd): To solve the problem with static peers, two interaction modes combine pull and push. Query-response: triggered by the discovery of new neighbors. Relay: triggered by the receipt of new publications; the new publications are disseminated to existing neighbors.
Query in static disconnected network: In-network query processing may not be possible. [Figure: node diagram with query (q, Q) and answer (A) labels.]
Query in static connected sensor network: The data transmission delay is 0, so the answer can be obtained instantaneously. [Figure: node diagram with query (q, Q) and answer (A) labels.]
Query in mobile disconnected network, one-hop case: Query processing is enabled by mobility and store-and-forward. [Figure: node diagram with query (q, Q) and answer (A) labels.]
Query in mobile disconnected network, multi-hop case: The query can be processed in-network, but it is delayed. The query processing algorithm doesn't control motion. First stage: the query is disseminated during encounters. The answer is disseminated only after an answer node receives the query. [Figure: node diagram with query (q, Q) and answer (A) labels.]