UNIVERSITY OF SOUTHERN CALIFORNIA Understanding and Utilizing Multi-Dimensional Correlations in Sensor Networks: A Protocol Design Perspective Ahmed Helmy.


1 UNIVERSITY OF SOUTHERN CALIFORNIA Understanding and Utilizing Multi-Dimensional Correlations in Sensor Networks: A Protocol Design Perspective Ahmed Helmy Department of Electrical Engineering USC Viterbi School of Engineering University of Southern California helmy@usc.edu Web: ceng.usc.edu/~helmy, Lab: nile.usc.edu

2 UNIVERSITY OF SOUTHERN CALIFORNIA Outline Classifying Correlations How to Utilize Correlations? Insights for Protocol Design –Gradient-based Routing (RUGGED) –Active Query Routing (ACQUIRE) –Abnormality Detection and Filtering Inserted Data WLANs as Sensor Networks (IMPACT) –Sensing access and usage patterns –Analyzing correlations in wireless users behavior Issues

3 UNIVERSITY OF SOUTHERN CALIFORNIA Correlation Classification Dimensions of Correlation: –Spatial Between neighboring nodes –Temporal Across time (different samples) for the same node –Spatio-temporal Moving target (e.g., vehicle), moving phenomenon (e.g., fire) What is correlated? –Sensor readings (e.g., temperature, light, gradients) –Communication channel (e.g., loss, fading) –Localization information, …

4 UNIVERSITY OF SOUTHERN CALIFORNIA How Can We Utilize Correlations? In-network processing –Aggregation –Abstraction/ adaptive fidelity/ zoom-in Prediction (model-based), enables Caching Routing (gradients in time and space, etc.) Abnormality detection (attacks, failures, mis-calibration) Equivalence –Sampling smaller set of nodes (sleep/wake-up) –Topology control

5 UNIVERSITY OF SOUTHERN CALIFORNIA RUGGED: RoUting on finGerprint Gradients in sEnsor Networks Jabed Faruque, Ahmed Helmy Department of Electrical Engineering University of Southern California faruque@usc.edu, helmy@usc.edu URL: http://nile.usc.edu, http://ceng.usc.edu/~helmy - Faruque, Psounis, Helmy, IEEE/ACM DCOSS 2005. - Faruque, Helmy, IEEE ICPS 2004.

6 UNIVERSITY OF SOUTHERN CALIFORNIA Introduction
– Sensor networks are envisioned to be widely used for habitat and environmental monitoring, among others
– Every physical event produces a fingerprint in the environment
– Diffusion laws are usually an inherent property of many physical phenomena
– $f(d) \propto 1/d^{\alpha}$, where d = distance from the source and $\alpha$ = diffusion parameter, which depends on the type of effect (e.g., for temperature $\alpha = 1$, for light $\alpha = 2$)
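As a rough illustration of the diffusion law on this slide, the sketch below evaluates an inverse-power falloff at a given distance. The function name and the `source_strength` scaling constant are assumptions made for the example; the slide only gives the proportionality and the two example exponents.

```python
def diffusion_reading(distance, alpha, source_strength=1.0):
    """Idealized reading at `distance` from the source: strength / distance**alpha.

    `source_strength` is a hypothetical proportionality constant; the slide
    only states that the reading is proportional to 1 / d**alpha.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return source_strength / distance ** alpha


# With the exponents quoted on the slide, light (alpha = 2) falls off
# faster with distance than temperature (alpha = 1).
temperature_at_4 = diffusion_reading(4.0, alpha=1)
light_at_4 = diffusion_reading(4.0, alpha=2)
```

A steeper exponent means the fingerprint fades sooner, so the flat (zero-information) region around the event starts closer to the source.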

7 UNIVERSITY OF SOUTHERN CALIFORNIA Example (of diffusion): Isoseismal (intensity) maps (North Palm Springs earthquake of July 8, 1986) Ref.: Southern California Earthquake Center (http://www.scec.org)

8 UNIVERSITY OF SOUTHERN CALIFORNIA Why Is the Natural Information Gradient Important?
– This natural information gradient is FREE
– Routing protocols can use it to forward query packets (greedily)
  - Locate event(s); e.g., fire, nuclear leakage
– The diffusion property is not limited to natural phenomena
  - Time gradient
– Existing approaches (flooding, expanding ring search, random walk, etc.) do not utilize this information gradient

9 UNIVERSITY OF SOUTHERN CALIFORNIA Challenges
– Erroneous readings from malfunctioning sensors
  - Calibration errors and obstacles cause local maxima/minima
– Environmental noise
– In real life, sensors are unable to measure below a certain threshold, so the diffusion curve has a finite tail
– Non-uniform sensor distribution (gaps)
[Figure labels: local maximum, dip, gap]

10 UNIVERSITY OF SOUTHERN CALIFORNIA Objective
Design an efficient algorithm to locate source(s) in sensor networks, utilizing the natural information gradient, i.e., the diffusion pattern of the event’s effect
  - Gradient based
  - Fully distributed
  - Robust to node or sensor failure or malfunction
  - Capable of finding multiple sources
Environment Model
– Event’s effect follows the diffusion law
– Discontinuity exists in the diffusion curve with finite tail
– Environmental noise

11 UNIVERSITY OF SOUTHERN CALIFORNIA Basic Protocol
– A node can be in one of two modes
  - flat region mode
  - gradient region mode
– A node forwards the query to its neighbors together with its information level
– To forward the query, each node uses the following algorithm:
1. In the information gradient region, follow a greedy approach
  - Forward the query to the neighbors if the information level about the event improves
2. In an unsmooth gradient region, use probabilistic forwarding based on the Simulated Annealing concept
  - The probabilistic function is $f_p(x) = 1/x^{a}$, where x = hop count in the information gradient region and a depends on the diffusion parameter ($\alpha$)
3. Use flooding in the flat (i.e., zero) information region
  - Decreases latency to reach the gradient information region
  - Handles queries in the absence of an event
– Query ID prevents looping
– Once the query is resolved, the node uses the reverse path to reply
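The three forwarding rules on this slide could be sketched roughly as below. This is an illustrative reading of the slide, not the authors' implementation: the function names, the convention that an information level of 0 means "flat region", and the injectable `rng` argument are all assumptions made for the example.

```python
import random


def forward_probability(hops_in_gradient, a):
    """Probabilistic function from the slide: 1 / x**a, with x the hop count
    in the gradient region. Clamping x to at least 1 is an assumption."""
    x = max(1, hops_in_gradient)
    return 1.0 / x ** a


def should_forward(own_level, sender_level, hops_in_gradient, a, rng=random.random):
    """Decide whether a node that received the query re-forwards it.

    own_level / sender_level: information levels of this node and of the
    node the query came from (hypothetical representation, 0 = no reading).
    """
    if own_level == 0 and sender_level == 0:
        return True  # rule 3: flat region, flood
    if own_level > sender_level:
        return True  # rule 1: information improves, forward greedily
    # rule 2: no improvement (unsmooth gradient), forward probabilistically
    return rng() < forward_probability(hops_in_gradient, a)
```

Passing a deterministic `rng` (for example `lambda: 0.1`) makes the probabilistic branch reproducible when experimenting; duplicate suppression by query ID and the reverse-path reply are left out of this sketch.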

12 UNIVERSITY OF SOUTHERN CALIFORNIA
[Figure: query Q (forwarded copies Q’) and event E; node M_n with neighbors n_g, node M_x with neighbors n_p]
– All neighbors (n_g) of M_n have more information, so they forward the query to their neighbors
– All neighbors (n_p) of M_x have less information, so they forward the query to their neighbors probabilistically

13 UNIVERSITY OF SOUTHERN CALIFORNIA Query Types
I. Single-value query
  - Search for a specific value and have a single response
II. Global maximum search
  - Search for the maximum value of information in the system
  - Intermediate nodes suppress non-promising replies
III. Multiple-event detection (still presents a challenge)
  - Search for multiple events of the same type
Performance Metrics
– Reachability, i.e., success probability
  - Probability that the query will reach the source
– Overhead in terms of average energy dissipation
  - Number of transmissions to forward the query and to get the reply
– For the probabilistic function $f_p(x) = 1/x^{a}$, $a < \alpha$ is recommended, but a close to $\alpha$ gives the optimal trade-off between reachability and overhead
  - Reachability of ~98% is achievable in the presence of noise, gaps, and flat regions

14 UNIVERSITY OF SOUTHERN CALIFORNIA Comparisons
– Existing gradient-based routing protocols can be categorized into two major approaches
– Single-path approach
  - CADR [Chu2002], Min-hop [Liu2003], …
– Multiple-path approach
  - GRAB [Ye2003], RUGGED [Faruque2004]
– Which approach to choose?

15 UNIVERSITY OF SOUTHERN CALIFORNIA Objective
– Analyze the performance of these general approaches to route a query
  - Model query success rate and overhead
– Using probability tools
  - For ideal and lossy wireless link conditions
– Simulate the protocols based on these approaches in more realistic scenarios
  - Also investigate path quality metric
– Compare both approaches using analytical and simulation results

16 UNIVERSITY OF SOUTHERN CALIFORNIA Brief Description of Routing Approaches
– Single-path query forwarding with look-ahead = 1
– Multiple-path query forwarding
[Figure: two grids of per-node information levels, one per approach, with the query Q routed toward the source S; legend: active node(s), candidate node, look-ahead = 1]

17 UNIVERSITY OF SOUTHERN CALIFORNIA Variations of Single-path Approach
Depends on the next active node selection policy
1. Basic single-path approach
  - Selects the candidate node having maximum information and higher information than the current active node
  - Sensitive to local maxima
2. Improved single-path approach
  - Selects the candidate node having maximum information
  - Information of the selected node can be less than that of the current active node
[Figure: example node information levels for each variation; legend: active node, candidate node]
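The difference between the two selection policies can be shown with a small sketch. The function names and the dictionary representation of candidate nodes are hypothetical; only the two selection rules come from the slide.

```python
def next_active_basic(current_level, candidate_levels):
    """Basic policy: take the best candidate only if it beats the current
    active node; otherwise return None (the query is stuck at a local maximum).

    candidate_levels: dict mapping candidate node id -> information level
    (an assumed representation for this example).
    """
    if not candidate_levels:
        return None
    best = max(candidate_levels, key=candidate_levels.get)
    return best if candidate_levels[best] > current_level else None


def next_active_improved(candidate_levels):
    """Improved policy: always take the best candidate, even when its level
    is lower than that of the current active node."""
    if not candidate_levels:
        return None
    return max(candidate_levels, key=candidate_levels.get)


# An active node with level 14 whose candidates all read lower values:
candidates = {"n1": 9, "n2": 12, "n3": 10}
stuck = next_active_basic(14, candidates)      # None: local maximum
escaped = next_active_improved(candidates)     # "n2"
```

The improved policy trades the risk of getting stuck for possibly longer paths, which is consistent with the later slides on overhead and path quality.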

18 UNIVERSITY OF SOUTHERN CALIFORNIA Comparisons - Query Success Rate (ideal and lossy link case, p_c = 0.05)
[Figure panels: ideal link case, analytical result; lossy link case, analytical result]
– Query success rate of the improved single-path approach drops drastically for lossy links, while the multiple-path approach is quite resilient
– ARQ may improve the success rate of the improved single-path approach

19 UNIVERSITY OF SOUTHERN CALIFORNIA Comparisons - Overhead
[Figure panels: overhead of both approaches; energy saving of the multiple-path approach over the improved single-path approach]
– Multiple-path approach creates extra paths due to probabilistic forwarding, so overhead increases
– Single-path approach uses 1-hop look-ahead at every step to decide on the forwarder
– With the increase of malfunctioning nodes, the overhead of the single-path approach increases
  - The length of the path increases

20 UNIVERSITY OF SOUTHERN CALIFORNIA Results – Path Quality (ideal link case)
– Path quality: ratio of the average path length due to a routing approach over the shortest path length between a source and a sink
– Multiple-path approach yields shorter paths, which are close to the shortest path
– With the increase of malfunctioning nodes, the path length of the single-path approach increases

21 UNIVERSITY OF SOUTHERN CALIFORNIA Conclusions
– Multiple-path approach causes less overhead when a source is < 20 hops from the sink
– Multiple-path approach yields shorter paths
– With increase of malfunctioning nodes, the query success rate of the multiple-path approach degrades gracefully
– With lossy links
  - Query success rate of the single-path approach drops drastically
  - Multiple-path approach is quite resilient

22 UNIVERSITY OF SOUTHERN CALIFORNIA Future work
– Combine the benefits of both routing approaches in a hybrid routing approach
– Develop a more adaptive multiple-path approach to reduce the number of extra paths due to probabilistic forwarding
– Implementation & evaluation in a test-bed
  - on-going: new 150-sensor-node test-bed at USC
  - continued work under the NSF-funded ACQUIRE project

23 UNIVERSITY OF SOUTHERN CALIFORNIA ACQUIRE: ACtive QUery Forwarding In Sensor Networks Original team: Narayanan Sadagopan, Bhaskar Krishnamachari, Ahmed Helmy Current: Sundeep Pattem, Jabed Faruque, Rahul Orgaonkar, Yongjin Kim, Jung-Hyun Jun, Sapon Tanachaiwiwat, Shao-Cheng Wang Department of Electrical Engineering USC Viterbi School of Engineering University of Southern California URL: http://ceng.usc.edu/~acquire Funding: NSF NETS NOSS, Intel (equipment)

24 UNIVERSITY OF SOUTHERN CALIFORNIA Develop a model of variation over time (or space) using measurements Use the model to predict data/readings. Only trigger updates or queries when data/readings deviate from predicted value. Depending on the data dynamics, we may be able to cache information collected earlier and answer queries without having to trigger new data collection.
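The predict-and-suppress idea on this slide can be sketched as follows. This is a minimal illustration under strong assumptions: the "model" is simply the last reported value, and `tolerance` is a hypothetical parameter; the slide does not specify a model or a deviation threshold.

```python
def updates_to_send(readings, tolerance):
    """Return the indices of readings that would trigger an update.

    Assumed model: the receiver predicts that the value stays at the last
    reported reading. A new update is sent only when the actual reading
    deviates from that prediction by more than `tolerance`.
    """
    sent = []
    predicted = None
    for index, value in enumerate(readings):
        if predicted is None or abs(value - predicted) > tolerance:
            sent.append(index)
            predicted = value  # the receiver's model is refreshed
    return sent


# Five temperature samples, only two of which need to be transmitted.
sent_indices = updates_to_send([20.0, 20.2, 20.4, 22.0, 22.1], tolerance=0.5)
```

Between updates, a query could be answered from the cached prediction without triggering new data collection, which is the caching benefit the slide describes.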

25 UNIVERSITY OF SOUTHERN CALIFORNIA ACtive QUery forwarding In sensoR nEtworks (ACQUIRE)* A mechanism for answering one-shot, complex queries for replicated data in sensor nets: –One-shot (vs. continuous): answers are given based on explicit queries about current readings. –Complex (vs. simple): the query can contain several sub-queries. E.g.: (x OR y) AND z. –Replicated data: several sensors might have the answer to a sub-query. Example: Micro Climate Data Collection –Different sensor modalities –Give a location where (Temp > 80 degrees OR Humidity > 40%) AND Wind speed > 20 mph * N. Sadagopan, B. Krishnamachari, A. Helmy, “Active Query Forwarding In Sensor Networks (ACQUIRE)”, AdHoc Networks Journal - Elsevier, Jan 2005 [Earlier version in SNPA ‘03]
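The micro-climate example is a complex query of the form (x OR y) AND z. A minimal sketch of evaluating it over a set of readings is shown below; the field names (`temp_f`, `humidity_pct`, `wind_mph`) and the location keys are hypothetical, and the sketch ignores how ACQUIRE actually gathers the readings in the network.

```python
def matches_query(reading):
    """(Temp > 80 degrees OR Humidity > 40%) AND Wind speed > 20 mph."""
    hot_or_humid = reading["temp_f"] > 80 or reading["humidity_pct"] > 40
    windy = reading["wind_mph"] > 20
    return hot_or_humid and windy


def first_matching_location(readings_by_location):
    """Return one location that satisfies the query, or None.

    Because the data is replicated, any matching location is an acceptable
    answer to this one-shot query.
    """
    for location, reading in readings_by_location.items():
        if matches_query(reading):
            return location
    return None


readings = {
    "north": {"temp_f": 85, "humidity_pct": 30, "wind_mph": 10},  # hot, calm
    "south": {"temp_f": 70, "humidity_pct": 55, "wind_mph": 25},  # humid, windy
}
answer = first_matching_location(readings)
```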

26 UNIVERSITY OF SOUTHERN CALIFORNIA

27 UNIVERSITY OF SOUTHERN CALIFORNIA Flooding Based Queries (Directed Diffusion) Flooding: Useful for long standing (continuous) queries Replicated responses might make it very inefficient.

28 UNIVERSITY OF SOUTHERN CALIFORNIA ACQUIRE An active node “refreshes” data from its “neighborhood”. The query is then forwarded to a node on the edge of the neighborhood

29 UNIVERSITY OF SOUTHERN CALIFORNIA ACQUIRE Key Features –In-network processing –Does not rely on geographic information or unicast routing protocol Existence of these may considerably improve performance –d helps us span the space from random walk (d = 0) to flooding (d = D, the network diameter)

30 UNIVERSITY OF SOUTHERN CALIFORNIA ACQUIRE Look-ahead parameter, d –Determines the size of the “neighborhood” in hops. –Effects a tradeoff between the number of steps taken to resolve the query and the energy consumed. –Optimal look-ahead, d* Depends on the query rate, refresh rate and the data dynamics (captured by the amortization factor, c) May be achieved by localized schemes. The higher the query rates & lower the data dynamics, the higher the optimal look ahead.

31 UNIVERSITY OF SOUTHERN CALIFORNIA Performance of ACQUIRE C is the refresh/query ratio (e.g., 0.01 means refresh once every 100 queries) [the refresh overhead is amortized over the saving in queries]

32 UNIVERSITY OF SOUTHERN CALIFORNIA ACQUIRE Efficiency –60-75% energy savings over Expanding Ring Search (analytical results) –Order of magnitude savings over flooding. Future Work –Develop ACQUIRE into a full-fledged protocol that actively adapts the ‘d’ parameter for optimal performance –Evaluation over an experimental sensor network test bed. –ceng.usc.edu/~acquire

33 UNIVERSITY OF SOUTHERN CALIFORNIA Correlations and Inserted Data Main purpose of sensor networks: Collect Data Sybil attacks may insert false data that affect operation of sensor networks: –Impersonating multiple IDs (at same/different times) –Outlier detection alone will not work Approach: –Understand normal correlations between data –Detect outliers based on reference to normal behavior –Design protocol robust to massive amount of forged data

34 UNIVERSITY OF SOUTHERN CALIFORNIA Single Attacker Scenario I Data: X from location (x,y) -- Interesting events (MobiQuitous 2005)

35 UNIVERSITY OF SOUTHERN CALIFORNIA Single Attacker Scenario II Data: X’ from location (x,y) -- Normal events

36 UNIVERSITY OF SOUTHERN CALIFORNIA Sybil Attack Scenario I Data: W_i from location (x_i, y_i) -- Interesting events [Figure legend: source, source/forwarder, attackers (sybil nodes), inactive node, aggregator, sink]

37 UNIVERSITY OF SOUTHERN CALIFORNIA Sybil Attack Scenario II Data: W_i’ from location (x_i, y_i) -- Normal events [Figure legend: source, forwarder, inactive node, aggregator, sink, attackers (sybil nodes)]

38 UNIVERSITY OF SOUTHERN CALIFORNIA Data Correlation (Great Duck Island)
T: Temperature, P: Pressure, H: Humidity; ID: Sensor ID (only 4 neighboring sensors are shown)
ID     111 (T P H)     116 (T P H)     122 (T P H)     126 (T P H)
111    1   1   1
116    .74 .64 .74     1   1   1
122    .83 .42 .91     .84 .67 .80     1   1   1
126    .67 .41 .56     .55 .50 .64     .70 .55 .77     1   1   1

39 UNIVERSITY OF SOUTHERN CALIFORNIA Anomaly Relationship Test (ART) Architecture
– Statistical Analysis Module
  - T*-test (outlier threshold)
  - Correlation-coefficient analysis
– Authentication Module
  - Distributed Interactive Proof
S. Tanachaiwiwat, A. Helmy, MobiQuitous 2005

40 UNIVERSITY OF SOUTHERN CALIFORNIA Anomaly Relationship Test (ART) Protocol
(1) Correlation/T*-test
(2) Request valid credential
(3) Response with valid/invalid/no response
(4) Send report to sink
(5) Cross verify
Perform at verifiers only!
[Figure labels: compromised/failed source, verifier (aggregator), verifier (forwarder), prover (attacker), sybil, sink]

41 UNIVERSITY OF SOUTHERN CALIFORNIA Summary
– Dynamic sliding window correlation analysis and T*-test can alleviate the attack effectively, even under full-scale attack from sybil nodes.
– Remarks
  –Recognition of normal/abnormal/malicious events based on statistical analysis
  –Malicious data insertion can cause problems for critical missions in WSNs
  –Error is reduced by using a dynamic sliding window and careful choice of correlation threshold
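To make the correlation-threshold idea concrete, here is a simplified sketch that flags windows in which two neighboring sensors stop agreeing. It is not the ART protocol: it uses a fixed-size window rather than the dynamic sliding window mentioned above, it omits the T*-test and the authentication step, and the function names and the `threshold` parameter are assumptions for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.
    Returns 0.0 when either sequence is constant (an assumed convention)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def suspicious_windows(xs, ys, window, threshold):
    """Start indices of fixed-size windows whose correlation between the two
    sensors falls below `threshold`."""
    flagged = []
    for start in range(len(xs) - window + 1):
        r = pearson(xs[start:start + window], ys[start:start + window])
        if r < threshold:
            flagged.append(start)
    return flagged


# Sensor B tracks sensor A at first, then diverges (e.g., forged readings).
sensor_a = [1, 2, 3, 4, 5, 6]
sensor_b = [2, 4, 6, 8, 1, 0]
flagged = suspicious_windows(sensor_a, sensor_b, window=3, threshold=0.5)
```

In this toy data the first two windows are perfectly correlated and the last two are negatively correlated, so only the later windows are flagged.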

42 UNIVERSITY OF SOUTHERN CALIFORNIA WLANs as Sensor Networks Total Population: ~25,000 students; Wireless Users: ~6,000 students; Access Points: ~400

43 UNIVERSITY OF SOUTHERN CALIFORNIA IMPACT: Investigation of Mobile-user Patterns Across University Campuses using WLAN Trace Analysis* Classes of future sensor networks will be attached to humans What kinds of correlations exist between users? Analyze measurements of wireless networks –Understand Wireless Users Behavior (individual and group) –Develop models to understand associations and friendship Study of relationships and user behavior based on measurements of various University WLANs * W. Hsu, A. Helmy, “IMPACT: Investigation of Mobile-user Patterns Across University Campuses using WLAN Trace Analysis”, USC TR, July ‘05 (Under Submission)

44 UNIVERSITY OF SOUTHERN CALIFORNIA Statistics of Studied Traces - Four major campuses - Month long traces studied - Total users in the study: over 12,000 users - Total Access Points in the study: over 1,300

45 UNIVERSITY OF SOUTHERN CALIFORNIA Observations: On-line Time On-off behavior is very common for wireless users. This seems especially true for small handheld devices. There are clear categories of heavy and light users, the distribution of which is skewed and heavily depends on the campus.

46 UNIVERSITY OF SOUTHERN CALIFORNIA Observations: Visited Access Points (APs) Individual users access only a very small portion of APs in the network, less than 35% in all campuses. The long-term mobility of users is highly skewed in terms of time associated with each AP. On average a user spends more than 95% of time at its top five most visited APs. [percentage of visited APs]

47 UNIVERSITY OF SOUTHERN CALIFORNIA Observations: Visited APs The majority of users experience low mobility while using the network. This is even true for portable devices such as PDAs. The actual handoff statistics depend heavily on the environment.

48 UNIVERSITY OF SOUTHERN CALIFORNIA Observations: Similarity Index We observe clear repetitive patterns of association in wireless network users. Typically, user association patterns show the strongest repetitive pattern at time gaps of one day/one week.

49 UNIVERSITY OF SOUTHERN CALIFORNIA Observations: Encounters In all the traces, the MNs encounter a relatively small fraction of the user population; below 40% in most cases and never reaching above 60% in any case. Except for the UCSD trace, on average an MN only encounters 1.88% - 5.94% of the whole population. The number of total encounters for the users follows a BiPareto distribution, the parameters of which depend on the campus.

50 UNIVERSITY OF SOUTHERN CALIFORNIA Encounter-graphs Definition –When 2 nodes access the same AP at the same time we call this an ‘encounter’ –The encounter graph has all the mobile nodes as vertices and its edges link all those vertices that encounter each other
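The definition above can be turned into a short sketch that builds an encounter graph from association records. The record format `(node, ap, start, end)` is an assumption about how a WLAN trace might be represented, and "same time" is interpreted here as overlapping association intervals.

```python
from collections import defaultdict
from itertools import combinations


def build_encounter_graph(sessions):
    """Build an encounter graph as a dict: node -> set of encountered nodes.

    sessions: iterable of (node, ap, start, end) tuples (assumed format).
    Two nodes encounter each other when their sessions at the same AP
    overlap in time.
    """
    graph = {}
    by_ap = defaultdict(list)
    for node, ap, start, end in sessions:
        graph.setdefault(node, set())  # every mobile node is a vertex
        by_ap[ap].append((node, start, end))
    for ap_sessions in by_ap.values():
        for (n1, s1, e1), (n2, s2, e2) in combinations(ap_sessions, 2):
            if n1 != n2 and s1 < e2 and s2 < e1:  # intervals overlap
                graph[n1].add(n2)
                graph[n2].add(n1)
    return graph


trace = [
    ("a", "ap1", 0, 10),
    ("b", "ap1", 5, 15),   # overlaps with a at ap1
    ("c", "ap1", 20, 30),  # same AP, but later
    ("d", "ap2", 0, 10),   # same time, different AP
]
encounter_graph = build_encounter_graph(trace)
```

The pairwise comparison per AP is quadratic in the number of sessions at that AP; a real trace study would likely sort sessions by start time first, which this sketch omits for brevity.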

51 UNIVERSITY OF SOUTHERN CALIFORNIA Regular Graph - High path length - High clustering Random Graph - Low path length, - Low clustering Small World Graph: Low path length, High clustering - In Small Worlds, a few short cuts contract the diameter (i.e., path length) of a regular graph to resemble diameter of a random graph without affecting the graph structure (i.e., clustering)
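The two quantities used to classify these graphs, clustering and path length, can be computed as in the sketch below. It assumes an undirected graph stored as a dict mapping each node to the set of its neighbors; the convention that nodes with fewer than two neighbors contribute zero clustering is one common choice, not something stated on the slide.

```python
from collections import deque


def clustering_coefficient(graph):
    """Average local clustering coefficient over all nodes."""
    if not graph:
        return 0.0
    total = 0.0
    for neighbors in graph.values():
        k = len(neighbors)
        if k < 2:
            continue  # contributes 0 by convention
        nbrs = list(neighbors)
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs[j] in graph[nbrs[i]]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean hop distance over all connected ordered node pairs (BFS)."""
    total = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

A small-world graph would show a clustering coefficient close to that of a regular graph together with an average path length close to that of a random graph of the same size.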

52 UNIVERSITY OF SOUTHERN CALIFORNIA Encounter-graphs and Friendship
Encounters link most of the MNs together in a connected graph:
–This holds although each MN encounters only a small portion of the population.
–The encounter graph is a Small World graph.
–Even for a short time period (1 day), its clustering coefficient, average path length, and connectivity are all close to those for longer traces.
Friendship between MNs is highly asymmetric.
–The distribution for the friendship index is exponential for all the traces, regardless of the friendship definition (based on time, encounter, or location).
–Among all node pairs there are less than 5% with friendship index larger than 0.01, and less than 1% with friendship index larger than 0.4.

53 UNIVERSITY OF SOUTHERN CALIFORNIA

54 UNIVERSITY OF SOUTHERN CALIFORNIA Encounter-graphs using Friends Top-ranked friends tend to form cliques, and low-ranked friends are the key to providing random links and reducing the degree of separation in the encounter graph.

55 UNIVERSITY OF SOUTHERN CALIFORNIA Encounter-based Information Diffusion Encounter patterns are rich enough to support information diffusion. Specifically, information can be delivered to more than 94% of users within two days. The reachability and average delay are not significantly affected until at least ~40% of nodes are selfish.
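One way to experiment with encounter-based diffusion is an epidemic-style replay of a time-ordered encounter list. The sketch below is an assumed model, not the one used in the study: a message is copied at every encounter, and "selfish" is taken to mean a node that accepts the message but never passes it on. The `(time, node_a, node_b)` format and the convention that the source holds the message at time 0 are also assumptions.

```python
def spread(encounters, source, selfish=frozenset()):
    """Replay encounters in time order and return {node: delivery_time}.

    encounters: iterable of (time, node_a, node_b) tuples (assumed format).
    selfish: nodes that receive the message but do not forward it
    (the source always forwards).
    """
    received = {source: 0}  # assumption: the source holds the message at time 0
    for time, a, b in sorted(encounters):
        for giver, taker in ((a, b), (b, a)):
            cooperative = giver == source or giver not in selfish
            if giver in received and taker not in received and cooperative:
                received[taker] = time
    return received


encounters = [(1, "a", "b"), (2, "b", "c"), (3, "c", "d")]
everyone = spread(encounters, source="a")
blocked = spread(encounters, source="a", selfish={"b"})
```

Reachability can then be estimated as `len(received)` divided by the number of nodes in the trace, and delay as the mean of the delivery times; sweeping the fraction of selfish nodes would give a curve comparable in spirit to the result quoted on the slide.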

56 UNIVERSITY OF SOUTHERN CALIFORNIA Vision: Building Community-wide Wireless/Mobility Library Library of measurements from WLANs, mobility and associations from potential wireless societies (e.g., universities, vehicular nets) Library of realistic models of user behavior (e.g., mobility, traffic, friendship, encounter models, … ) Library of benchmarks and guidelines for simulation and evaluation How much insight can we get by analyzing the traces? Can we use the insight to ‘design’ protocols of the future (not only for evaluation)? Currently 20 major universities willing to share their traces …. more to come: http://nile.usc.edu/MobiLib (under heavy update) If you have traces: helmy@usc.edu !

57 UNIVERSITY OF SOUTHERN CALIFORNIA Issues How can we model correlations accurately? How can we further utilize correlations? Context-aware protocols: –Phenomenon-aware protocols –Socially-aware protocols Other kinds of correlations: –Sensor Networks Test-beds: correlation between radio connectivity and phenomenon (e.g., rain) –…

58 UNIVERSITY OF SOUTHERN CALIFORNIA Thank You ! Related Links –ACQUIRE: ceng.usc.edu/~acquire –Mobility Library: nile.usc.edu/MobiLib –Lab: nile.usc.edu –Homepage: ceng.usc.edu/~helmy

