Troubleshooting Wireless Mesh Networks Victor Bahl joint work with Lili Qiu, Ananth Rao (UCB) & Lidong Zhou Microsoft Research April.

Presentation transcript:

Troubleshooting Wireless Mesh Networks Victor Bahl joint work with Lili Qiu, Ananth Rao (UCB) & Lidong Zhou Microsoft Research April 1, 2004

Mesh Network Management
ISO’s definition of network management:
– Fault Management
– Configuration Management
– Security Management
– Performance Management
– Accounting
“Network management is a process of controlling a complex data network so as to maximize its efficiency and productivity”

Goals
Assist with mesh router configuration
Reactive and proactive troubleshooting
– Investigate reported performance problems
  Time-series analysis to detect deviation from normal behavior
– Localize and isolate trouble spots
  Collect and analyze traffic reports from mesh nodes
– Determine possible causes for the trouble spots
  Interference, hardware problems, network congestion, malicious nodes, ...
Respond to trouble spots
– Re-route traffic
– Rate limit
– Change topology via power control & directional antenna control
– Flag environmental changes & problems
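The slide does not say which time-series method is used to detect deviation from normal behavior. As a minimal sketch, assuming a sliding-window mean and standard deviation (the function name, window size, and threshold `k` are illustrative choices, not taken from the talk), a deviation check on throughput samples could look like this:

```python
from statistics import mean, stdev

def find_deviations(samples, window=5, k=3.0):
    """Return indices of samples that lie more than k standard deviations
    from the mean of the preceding `window` samples (assumed detector)."""
    flagged = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Guard against a perfectly flat history, where sigma is zero.
        tolerance = k * sigma if sigma > 0 else 1e-9
        if abs(samples[i] - mu) > tolerance:
            flagged.append(i)
    return flagged

# Hypothetical throughput readings in Mbps; index 6 is a sudden drop.
throughput_mbps = [2.0, 2.1, 1.9, 2.0, 2.1, 2.0, 0.4, 2.0, 2.1]
print(find_deviations(throughput_mbps))
```

Note that an outlier inflates the window's standard deviation for the next few samples, so a production detector would likely exclude flagged samples from the history.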

Nomenclature
Mesh Management Module (M3)
– Runs on every node
Mesh Management Server (MMS)
– Runs on gateway or designated nodes
Mesh Network Management Protocol (MNMP)
– Protocol (similar to SNMPv3) between M3 and MMS

Focus of this talk
– Data Gathering & Distribution
– Data Cleaning
– Fault Isolation & Diagnosis

Challenges in Fault Diagnosis
Characteristics of multi-hop wireless networks
– Unpredictable physical medium, prone to link errors
– Network topology is dynamic
– Resource limitation calls for a diagnosis approach with low overhead
– Vulnerable to link attacks
Identifying root causes
– Just knowing link statistics is insufficient
– Signature-based techniques don’t work well
– Determining normal behavior is hard
Handling multiple faults
– Complicated interactions between faults and traffic, and among faults themselves

Previous Approaches to Fault Diagnosis
Protocols for network management
– ANMP [singh99]
– Guerrilla [shen02]
Detecting routing and MAC misbehavior
– Watchdog & pathrater [Baker00]
– MACMis [Vaidya03]
Fault management in infrastructure mode
– AirWave, AirDefense, UniCenter, Symbol’s WNMS, IBM’s WSA, Wibhu’s SpectraMon, …

Our Approach Use a network simulator as a real-time diagnostic tool

Fault Detection, Isolation & Diagnosis Process
[Diagram] Pipeline stages: Collect Data, Clean Data, Diagnose Faults, Simulate
Data flowing between stages: Raw Data, Measured Performance, Routes, Link Loads, Signal Strength, Inject Candidate Faults, Performance Estimate, Root Causes
Components: Agent Module, Manager Module
Data sources: SNMP MIBs, Performance Counters, WRAPI, MCL, NativeWiFi

Root Cause Analysis Module

Our Fault Diagnosis Framework
Advantages
– Flexible & customizable for a large class of networks
– Captures complicated interactions within the network, between the network & environment, and among multiple faults
– Extensible to detect new faults
– Facilitates what-if analysis
Challenges
– To accurately reproduce the behavior of the network inside a simulator
– To build a fault diagnosis technique using the simulator as a diagnosis tool

Handling the Challenges
Reproducing network behavior
– Identify the set of traces to collect
– Rule out erroneous data from the trace
– Drive the simulator with the cleaned traces
Building fault diagnosis
– Use performance results from trace-driven simulation to establish the normal behavior
– Deviation from the normal behavior indicates a potential fault
– Identify root causes by efficiently searching the fault space to reproduce the faulty symptoms

Why Simulator?
Flow      Throughput
Flow 1    [missing] Mbps
Flow 2    0.23 Mbps
Flow 3    2.09 Mbps
Flow 4    0.17 Mbps
Flow 5    2.55 Mbps

Simulator Accuracy: RF Propagation
RF propagation model versus measured signal strengths for IEEE 802.11a cards from different vendors

Simulator Accuracy
Experiments
– A single one-hop UDP flow
– 2 UDP flows within communication range
– 2 UDP flows within interference range
– 1 UDP flow with 2 hops where the source and destination are within communication range
– 1 UDP flow with 2 hops where the source and destination are within interference range but not communication range

Simulator Accuracy: Throughput
Estimated versus actual throughput when channel conditions are good (IEEE 802.11a)

Simulator Accuracy: Throughput (2)
Estimated throughput matches measured throughput until the channel conditions become poor

Simulator Accuracy: Throughput
No. of Walls   Loss Rate     Measured Throughput   Simulated Throughput
[missing]      [missing] %   15.52 Mbps            15.94 Mbps
[missing]      [missing] %   12.56 Mbps            14.01 Mbps
[missing]      [missing] %   12.97 Mbps            11.55 Mbps
Estimated throughput matches measured throughput under poor channel conditions when the loss rate is incorporated
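To make "matches" concrete, the relative error of the simulator can be computed from the three measured/simulated throughput pairs on this slide. The helper name `relative_error` is an illustrative choice; the numbers are the ones shown above.

```python
def relative_error(measured, simulated):
    """Signed relative error of the simulated value against the measurement."""
    return (simulated - measured) / measured

# (measured Mbps, simulated Mbps) pairs from the slide's table.
rows = [(15.52, 15.94), (12.56, 14.01), (12.97, 11.55)]

for measured, simulated in rows:
    err = relative_error(measured, simulated)
    print(f"measured {measured:.2f}, simulated {simulated:.2f}: {err:+.1%}")
```

On these three rows the simulator is within roughly 12% of the measurement in every case, overestimating in the first two and underestimating in the third.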

How Stable is the Channel?
Under good environmental conditions, received signal strength remains stable

Data Collection
What should we collect?
– Network topology/connectivity info (neighbor table)
– Noise level & signal strength
– Traffic load to direct neighbor
– Loss rate to direct neighbor (retransmission count)

Data Distribution
Design goal
– Minimize bandwidth consumption
Techniques
– Dynamic scoping
  Each node takes a local view of the network
  The coverage of the local view adapts to traffic patterns
– Adaptive monitoring
  Minimize measurement overhead in normal case
  Change update period
  Push and pull
– Delta compression
– Multicast
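The talk does not describe MNMP's wire format, so the following is only a sketch of the delta-compression idea: a node sends the counters that changed since its last report rather than the full report. The dictionary layout and the names `make_delta` and `apply_delta` are assumptions for illustration.

```python
def make_delta(previous, current):
    """Build an update holding only entries that changed or disappeared."""
    changed = {k: v for k, v in current.items() if previous.get(k) != v}
    removed = [k for k in previous if k not in current]
    return {"changed": changed, "removed": removed}

def apply_delta(previous, delta):
    """Reconstruct the sender's current report on the receiving side."""
    merged = dict(previous)
    merged.update(delta["changed"])
    for key in delta["removed"]:
        merged.pop(key, None)
    return merged

# Hypothetical per-neighbor report from one mesh node.
last_report = {"pkts_sent:n2": 120, "pkts_rcvd:n2": 118, "signal_dbm:n2": -61}
new_report = {"pkts_sent:n2": 150, "pkts_rcvd:n2": 118, "signal_dbm:n2": -61}

delta = make_delta(last_report, new_report)
print(delta)
print(apply_delta(last_report, delta) == new_report)
```

Only the one counter that moved is transmitted here, which is where the bandwidth saving comes from when most values are stable between reports.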

Management Overhead
[Figure] Bandwidth levels shown: 15 Kb/sec, 25 Kb/sec, 40 Kb/sec; Avg: 1 to 5 hops
BW requirement does not go up much with network size
Info distributed:
– Routing changes
– Traffic counters (e.g. pkts. sent & rcv.)
– Signal strength

Measurement Overhead on Throughput

Data Cleaning
Data may not be pristine. Why?
– Liars, malicious users
– Missing data
– Measurement errors
Clean the data
– Detect liars
  Assumption: most nodes are honest
  Approach:
  – Neighborhood Watch
  – Find the smallest number of lying nodes to explain inconsistency in traffic reports
– Smoothing & interpolation

Example: Resiliency against Liars/Lossy Links
Problem
– Identify nodes that report incorrect information (liars)
– Detect lossy links
Assume
– Nodes monitor neighboring traffic, build traffic reports and periodically share info
– Most nodes provide reliable information
Challenge
– Wireless links are error prone and unstable
Approach
– Find the smallest number of lying nodes to explain inconsistency in traffic reports
– Use the consistent information to estimate link loss rates
Results
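The slide states the goal (the smallest set of lying nodes that explains the inconsistencies) but not the search procedure. A minimal sketch, assuming each link has a sender-side and a receiver-side packet count and that a fixed tolerance absorbs ordinary link loss, is to treat every disagreeing link as an edge and brute-force the smallest node set touching all such edges. The report layout, the tolerance, and both function names are hypothetical, and exhaustive search is only practical for small networks.

```python
from itertools import combinations

def inconsistent_pairs(sent, received, tolerance=0.1):
    """sent[(a, b)]: packets a claims it sent to b.
    received[(a, b)]: packets b claims it received from a.
    Returns the links whose two reports disagree beyond the tolerance."""
    pairs = []
    for (a, b), n_sent in sent.items():
        n_recv = received.get((a, b))
        if n_recv is None:
            continue  # missing report: nothing to compare
        if abs(n_sent - n_recv) > tolerance * max(n_sent, n_recv, 1):
            pairs.append((a, b))
    return pairs

def smallest_liar_set(nodes, pairs):
    """Smallest node set such that every inconsistent link has an endpoint in it."""
    for size in range(len(nodes) + 1):
        for candidate in combinations(sorted(nodes), size):
            chosen = set(candidate)
            if all(a in chosen or b in chosen for a, b in pairs):
                return chosen
    return set(nodes)

# Hypothetical reports: every link that involves C disagrees.
sent = {("A", "B"): 100, ("B", "C"): 100, ("C", "B"): 50, ("C", "D"): 80}
received = {("A", "B"): 98, ("B", "C"): 40, ("C", "B"): 90, ("C", "D"): 20}

pairs = inconsistent_pairs(sent, received)
print(smallest_liar_set({"A", "B", "C", "D"}, pairs))
```

Blaming C alone explains all three disagreements, so under the "most nodes are honest" assumption C is the single suspect; the remaining consistent reports could then feed the link loss estimates.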

Fault Diagnosis Algorithm
1. Initialization: diagnosed fault set F = { }
2. Forward addition
   while (diff(MeasuredPerf, SimulatedPerf(F)) > threshold) {
     Find a candidate fault that explains the mismatch between current and predicted performance the most, and add it to F
   }
3. Backward deletion
   while (diff(MeasuredPerf, SimulatedPerf(F)) > threshold) {
     Find a fault in F that explains the mismatch the least. Delete it from F if excluding it results in little change
   }
4. Report F

Fault Diagnosis Algorithm (Cont’d)
What does “fault A explains the mismatch between current and predicted performance the most” mean?
– Diff(MeasuredPerf, PredictedPerf(with fault A)) is smallest
– Probability(MeasuredPerf | Fault A) is largest
These two criteria give us two effective search algorithms
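The forward-addition and backward-deletion steps can be sketched as follows, using the first criterion (smallest remaining difference). Everything concrete here is an assumption for illustration: `simulate` is a toy stand-in for the trace-driven simulator, the fault names and their effects are invented, `diff` is a sum of absolute differences, and backward deletion is interpreted as dropping any fault whose removal still keeps the mismatch within the threshold.

```python
def diff(measured, simulated):
    """Assumed mismatch metric: sum of absolute per-flow differences."""
    return sum(abs(m - s) for m, s in zip(measured, simulated))

def diagnose(measured, simulate, candidates, threshold):
    faults = set()
    # Forward addition: greedily add the fault that shrinks the mismatch most.
    while diff(measured, simulate(faults)) > threshold:
        remaining = sorted(candidates - faults)
        if not remaining:
            break
        best = min(remaining, key=lambda f: diff(measured, simulate(faults | {f})))
        if diff(measured, simulate(faults | {best})) >= diff(measured, simulate(faults)):
            break  # no remaining candidate improves the match
        faults.add(best)
    # Backward deletion: drop faults that are not needed to explain the data.
    for fault in sorted(faults):
        if diff(measured, simulate(faults - {fault})) <= threshold:
            faults.discard(fault)
    return faults

# Toy simulator: each hypothetical fault lowers one flow's throughput (Mbps).
BASELINE = [2.0, 2.0, 2.0]
EFFECTS = {"drop@n1": (0, 1.0), "noise@n2": (1, 0.5), "mac@n3": (2, 1.5)}

def simulate(faults):
    perf = list(BASELINE)
    for fault in faults:
        flow, loss = EFFECTS[fault]
        perf[flow] -= loss
    return perf

measured = [1.0, 2.0, 0.5]
print(diagnose(measured, simulate, set(EFFECTS), threshold=0.1))
```

In this toy case the search adds `mac@n3` first because it removes the largest share of the mismatch, then `drop@n1`, and backward deletion keeps both since removing either pushes the mismatch back above the threshold.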

Performance
[Figure] Coverage and false positives versus number of faults, 25-node random topology
Faults detected:
– Random packet dropping
– MAC misbehavior
– External noise

Performance Evaluation
Measurements show that performance from trace-driven simulation matches reality
We are able to diagnose random packet dropping, external noise sources, and MAC misbehavior
– Diagnose over 10 simultaneous faults of multiple types in a simulated 49-node network with 80% coverage and close to 0 false positives
– Implemented our approach in a small multihop IEEE 802.11a testbed, and showed it can diagnose random packet dropping

What-if Analysis
Improvement on removing flows
Action                                    Total Throughput (Mbps)
None                                      1.064
Reduce Flow 8 by ½                        1.148
Re-route Flow 8 around grid boundary      1.217
Increase power from 15 dBm to 20 dBm      0.99
Increase power from 15 dBm to 25 dBm      1.661
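The table's totals can be turned into gains relative to taking no action, which is how a what-if comparison would typically rank candidate responses. The figures below are the slide's; the helper name `gain` and the ranking step are illustrative additions.

```python
BASELINE_MBPS = 1.064  # total throughput with no action, from the table

# Total throughput (Mbps) after each candidate action, from the table.
actions = {
    "Reduce Flow 8 by 1/2": 1.148,
    "Re-route Flow 8 around grid boundary": 1.217,
    "Increase power from 15 dBm to 20 dBm": 0.99,
    "Increase power from 15 dBm to 25 dBm": 1.661,
}

def gain(total_mbps, baseline_mbps=BASELINE_MBPS):
    """Relative change in total throughput versus the no-action baseline."""
    return (total_mbps - baseline_mbps) / baseline_mbps

for name, total in actions.items():
    print(f"{name}: {gain(total):+.1%}")

best_action = max(actions, key=actions.get)
print("Best:", best_action)
```

The numbers show why simulation helps here: raising power to 20 dBm lowers total throughput by about 7%, while raising it to 25 dBm improves it by about 56%, so the effect of power control is not monotonic.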

Mesh Visualization Module

Thanks!

Backup

Detection of Intentional Packet Drops
Scenario
– 49-node network
– Randomly pick nodes that drop packets