Why Data Fusion in Sensor Networks needs a new Champion? Kalyan Veeramachaneni Evo-Design Group CSAIL, Yumm Eye Tee Work done at Development and Research.

1 Why Data Fusion in Sensor Networks needs a new Champion? Kalyan Veeramachaneni, Evo-Design Group, CSAIL, Yumm Eye Tee. Work done at Development and Research in Evolutionary Algorithms for Multisensor Smart Networks (DreamsNet), Syracuse University. Evo-Design Group, CSAIL, MIT, September 3, 2009

2 Acknowledgements: Lisa Osadciw, Syracuse University; Kai Goebel, NASA Ames Research; Arun Ross, West Virginia University; Weizhong Yan, GE Global Research Center; Vishwanath Avasarala, GE Global Research Center; Nisha Srinivas, Syracuse University.

3 Sensor Network Projects: Biometric Security System; Wind Turbine Diagnostics and Prognostics; First Responders Sensor Network; Pipeline Crack Detection System; Airport Ground Surveillance System.

4 Sensors: How big are they?

5 What are we detecting? Modern-day society relies on detection, that is, determining the meaning of the presence or absence of a signal: digital communications; pipeline and bridge crack detection; genuine-user detection using biometrics; presence of aircraft, ships, or motor vehicles; locating emergency personnel; weather phenomena; building security. Sensors are located in remote areas and make decisions using a variety of criteria: Maximum A-Posteriori Criterion; Maximum Likelihood Criterion; Minimum Error Criterion.

6 Systems Level View (1): Signal Processing. Hardware drives the design. Ideally we would want a simple threshold on the incoming data.

7 Systems Level View (2): Machine Learning. We do not have control over the collection of data. Data drives the entire design.

8 Applications. Signal Processing: digital communications; wireless communications; radars; surveillance systems; locationing and GPS. Machine Learning: online diagnostic tools (aircraft, turbines, etc.); medical diagnostics (cancer, neurological disorders, seizures, etc.); fraud detection on online systems. Inferencing in Sensor Networks: a mix of both problems; seamless interaction of hardware and software; applications are a mix as well; seamless interaction of system entities as well. Biometrics is a classic example!

9 Likelihood Ratio Test (1). Traditional “digital communications” example: decide whether a bit ‘0’ or ‘1’ has been sent. Additive white Gaussian noise (AWGN). Likelihood Ratio Test (maximizes posterior probability). Optimal for Bayesian cost function. Notice this is a linear cost function. [Figure: likelihood densities for noise only and signal + noise]

10 Likelihood Ratio Test (2). The ratio for the digital communications example gives us a neat threshold detector, as long as the standard deviation under both hypotheses is the same. This makes the LRT linear and very simple to implement: after taking the logarithm on both sides and solving, the test reduces to a single threshold on the observation.
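A minimal Python sketch of the equal-variance case, under stated assumptions: the observation is Gaussian with mean `mu0` under H0 and mean `mu1 > mu0` under H1, both with standard deviation `sigma`, and `eta` is the LRT threshold set by priors and costs. All names are illustrative and do not come from the slides. Taking the logarithm of the likelihood ratio and solving for x gives x* = (mu0 + mu1)/2 + sigma^2 ln(eta) / (mu1 - mu0).

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def lrt_threshold_equal_sigma(mu0, mu1, sigma, eta=1.0):
    """Single observation threshold equivalent to the LRT when both
    hypotheses share the same standard deviation (assumes mu1 > mu0)."""
    if mu1 <= mu0:
        raise ValueError("this sketch assumes mu1 > mu0")
    return 0.5 * (mu0 + mu1) + sigma ** 2 * math.log(eta) / (mu1 - mu0)


def decide(x, mu0, mu1, sigma, eta=1.0):
    """Return 1 (signal present) if x exceeds the threshold, else 0."""
    return 1 if x > lrt_threshold_equal_sigma(mu0, mu1, sigma, eta) else 0
```

With `eta = 1` (equal priors and costs) the threshold sits midway between the two means; larger `eta` pushes it toward `mu1`.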

11 Likelihood Ratio Test (3). What happens when it is not digital communications? For example, an unknown signal buried in noise. The standard deviation under the two hypotheses is different. The LRT becomes quadratic and requires two thresholds. Note: still Gaussian under both hypotheses. Solving for the roots of the quadratic gives the decision regions for deciding H0 and for deciding H1.
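A hedged Python sketch of the unequal-variance case, assuming Gaussian densities N(mu0, s0^2) under H0 and N(mu1, s1^2) under H1 and an LRT threshold `eta` (all names are illustrative assumptions). The log-likelihood ratio minus ln(eta) is a quadratic a x^2 + b x + c; H1 is decided where it is positive, and its two roots are the two thresholds.

```python
import math


def quadratic_lrt_coeffs(mu0, s0, mu1, s1, eta=1.0):
    """Coefficients (a, b, c) of ln(p1(x)/p0(x)) - ln(eta) = a x^2 + b x + c."""
    a = 0.5 / s0 ** 2 - 0.5 / s1 ** 2
    b = mu1 / s1 ** 2 - mu0 / s0 ** 2
    c = (0.5 * mu0 ** 2 / s0 ** 2 - 0.5 * mu1 ** 2 / s1 ** 2
         + math.log(s0 / s1) - math.log(eta))
    return a, b, c


def two_thresholds(mu0, s0, mu1, s1, eta=1.0):
    """Return the two thresholds (low, high), or None if the quadratic has no
    real roots (the same hypothesis is then decided for every x)."""
    a, b, c = quadratic_lrt_coeffs(mu0, s0, mu1, s1, eta)
    if a == 0:
        raise ValueError("equal variances: the test has a single threshold")
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    root = math.sqrt(disc)
    low, high = sorted(((-b - root) / (2 * a), (-b + root) / (2 * a)))
    return low, high


def decide(x, mu0, s0, mu1, s1, eta=1.0):
    """Return 1 (decide H1) where the quadratic is positive, else 0."""
    a, b, c = quadratic_lrt_coeffs(mu0, s0, mu1, s1, eta)
    return 1 if a * x * x + b * x + c > 0 else 0
```

When `s1 > s0` the coefficient `a` is positive, so H1 is decided outside the interval between the two thresholds and H0 inside it; when `s1 < s0` the regions swap.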

12 Likelihood Ratio Test (4). What about when the distribution under H0 is Gaussian and under H1 is exponential: will the ratio still result in a simple detector? Multimodal distributions with multiple peaks? Summary: When you have a linear cost function and linear system operations (additive noise), you will have neat linear operations in the detector. Can we design detectors for more complicated models? What happens when we have multiple detectors helping us make a decision?

13 Data Fusion (1). Binary hypothesis testing problem: H0 indicates the absence, and H1 the presence, of the phenomenon. Decisions rendered by multiple classifiers (matchers) are fused to generate a global decision. In bandwidth-constrained remote processing, decisions are made locally by each classifier before being sent to the central node.

14 Data Fusion (2). Let x_i be the match score generated by the i-th classifier. Each classifier applies its own threshold to determine whether x_i is a genuine or an impostor score. The variable u_i records the decision made by the local classifier. Let [u] = (u_1, u_2, ..., u_n) be the set of decisions rendered by the classifiers. The variable u_f denotes the global decision obtained by fusing the local decisions (u_f is 0 or u_f is 1).

15 Bandwidth Constrained Detection Networks. [Figure: Sensor 1 and Sensor 2 observe X1 and X2 and send local decisions u1 and u2 to the fusion rule. Candidate fusion rules: First Sensor Only, Second Classifier Only, OR, AND. Inset: likelihood density model for a sensor (noise only vs. event).]
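The two-sensor network can be sketched in Python as follows. This is an illustration under assumptions: each sensor compares its observation with a single threshold, and the rule names in `FUSION_RULES` are labels chosen here for the four candidate rules listed on the slide.

```python
def local_decisions(x, thresholds):
    """Each sensor sends one bit: 1 if its observation exceeds its threshold."""
    return tuple(1 if xi > ti else 0 for xi, ti in zip(x, thresholds))


# Candidate fusion rules for two local decisions u = (u1, u2).
FUSION_RULES = {
    "AND": lambda u: int(all(u)),
    "OR": lambda u: int(any(u)),
    "FIRST_ONLY": lambda u: u[0],
    "SECOND_ONLY": lambda u: u[1],
}


def fuse(x, thresholds, rule):
    """Global decision u_f obtained by fusing the local one-bit decisions."""
    return FUSION_RULES[rule](local_decisions(x, thresholds))
```

For example, `fuse((1.2, 0.3), (1.0, 1.0), "OR")` declares an event because the first sensor alone crosses its threshold, while the AND rule does not.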

16 Errors to be minimized. Goal: two errors need to be minimized. The Bayesian risk function is minimized.

17 Independent Decisions. The errors can be estimated from the individual sensors' error probabilities.
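A minimal Python sketch of how the two global errors could be computed for independent local decisions. The assumptions are labeled here: `pf[i]` and `pd[i]` are sensor i's false-alarm and detection probabilities, `rule` maps a decision tuple to 0 or 1, and the risk is taken as a cost-weighted sum of the two errors with hypothetical parameters `prior_h0`, `c_fa`, and `c_miss`.

```python
from itertools import product


def fused_error_rates(pf, pd, rule):
    """Return (P_FA, P_miss) of the fused decision for independent sensors."""
    p_fa = 0.0
    p_d = 0.0
    for u in product((0, 1), repeat=len(pf)):
        if rule(u):
            prob_h0 = 1.0  # probability of this decision vector under H0
            prob_h1 = 1.0  # probability of this decision vector under H1
            for ui, f, d in zip(u, pf, pd):
                prob_h0 *= f if ui else 1 - f
                prob_h1 *= d if ui else 1 - d
            p_fa += prob_h0
            p_d += prob_h1
    return p_fa, 1.0 - p_d


def bayes_risk(p_fa, p_miss, prior_h0=0.5, c_fa=1.0, c_miss=1.0):
    """Assumed risk form: cost-weighted sum of false alarms and misses."""
    return c_fa * prior_h0 * p_fa + c_miss * (1 - prior_h0) * p_miss
```

With two sensors at `pf = [0.1, 0.2]` and `pd = [0.9, 0.8]`, the AND rule trades a low false-alarm rate for more misses, and the OR rule does the opposite.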

18 Correlated Decisions. Estimation of 2^n - 1 joint probabilities for n classifiers. Numerical integration is done to estimate the joint probability integrals. The Bahadur-Lazarsfeld expansion reduces the computational burden by expressing the joint probabilities through correlations between normalized decisions.
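A hedged Python sketch of the Bahadur-Lazarsfeld expansion truncated at second order. Assumptions: `p[i]` is the probability that sensor i decides 1 under a given hypothesis, and `rho` is a hypothetical dictionary of pairwise correlation coefficients between normalized decisions z_i = (u_i - p_i) / sqrt(p_i (1 - p_i)). For two sensors the second-order form is exact; for more sensors it is an approximation and can go negative for strong correlations.

```python
import math
from itertools import combinations


def normalized_decision(u_i, p_i):
    """Zero-mean, unit-variance version of a binary decision."""
    return (u_i - p_i) / math.sqrt(p_i * (1 - p_i))


def bahadur_joint(u, p, rho):
    """Joint probability of decision vector u: independent product times a
    second-order correction built from pairwise correlations rho[(i, j)]."""
    independent = 1.0
    for ui, pi in zip(u, p):
        independent *= pi if ui else 1 - pi
    z = [normalized_decision(ui, pi) for ui, pi in zip(u, p)]
    correction = 1.0 + sum(
        rho.get((i, j), 0.0) * z[i] * z[j]
        for i, j in combinations(range(len(u)), 2)
    )
    return independent * correction
```

Setting every correlation to zero recovers the independent product, which is a quick sanity check on any estimated `rho`.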

19 What is the Problem? Joint optimization of thresholds and fusion rule (decision level). The objective function is the Bayesian risk function. When we incorporate the thresholds as search variables, the search is an NP-complete problem [1]. [Figure panels: effect of fusion rule design; effect of threshold design] [1] John N. Tsitsiklis, Michael Athans, “On Complexity of Decentralized Decision making and detection problems”, 23rd IEEE Conference on Decision and Control, 1984.

20 Bandwidth Constrained Detection Networks. Two types of errors need to be reduced. Misses: failing to detect an event. False alarms: detecting an event that did not occur. If the entire observation value is transmitted to a central processing node, an efficient machine learning technique can be designed to achieve better accuracy. Shown below are 20000 samples of observations: 10000 belong to events, 10000 to noise. 9 to 32 bits are required per sample if all bits are transmitted. This reduces to a 1-bit decision if the decision is transmitted instead. [Figure: scatter of noise and event (*) samples with a threshold on Sensor 1 and a threshold on Sensor 2; an event is declared only in one quadrant, i.e. the AND rule.]

21 What has been happening in this area? The amount of research and publications on the topic indicates its complexity. Quick check of research publications: 120 journal articles, with approximately 45 discussing similar design issues; at least 48 textbooks currently on sale in this area; 5 dissertations deal with the same problem and provide human-developed designs. A published paper addresses the difficulty: John N. Tsitsiklis, Michael Athans, “On Complexity of Decentralized Decision making and detection problems”, 23rd IEEE Conference on Decision and Control, 1984. Optimizing distributed detection for 2 sensors: independent sensors, intractable; correlated sensors, NP-complete. Researchers are reluctant to use EAs. A simple architectural or parameter change can give you literally 10 pages' worth of equations, fancy! Failure modes of gradient descent and other approaches are not identified.

22 Likelihood Ratio Test Based Design. Decouple the two problems: optimize thresholds and fusion rule separately. Identify the optimal individual threshold that minimizes the Bayesian error. Optimal fusion rule for independent decisions. Optimal fusion rule for correlated decisions.

23 Gradient Descent Approach. Use gradient information to simultaneously optimize the fusion rule and the thresholds. The threshold for a sensor is the solution of the likelihood ratio test.

24 Particle Swarm Optimization. Each particle is a solution. Particles are randomly initialized in the search space. Particles are moved in the search space using velocity and position updates. Demonstration on a test problem.
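A minimal Python sketch of a standard global-best PSO, shown on a simple test function. The update form (inertia weight `w`, acceleration constants `c1` and `c2`) is the common textbook variant and is an assumption here, as are all parameter values and the `sphere` test problem; the slides do not specify them.

```python
import random


def pso_minimize(cost, dim, bounds, n_particles=30, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO: returns (best_position, best_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initialization of particle positions; velocities start at zero.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]
    pbest_cost = [cost(xi) for xi in x]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to own best + pull to swarm best.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]  # position update
            c = cost(x[i])
            if c < pbest_cost[i]:  # update the particle's memory
                pbest[i], pbest_cost[i] = x[i][:], c
                if c < gbest_cost:  # save the best solution so far
                    gbest, gbest_cost = x[i][:], c
    return gbest, gbest_cost


def sphere(x):
    """Hypothetical test problem: minimum 0 at the origin."""
    return sum(xi * xi for xi in x)
```

Calling `pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))` should return a point close to the origin; for the fusion problem the cost would instead be the Bayesian risk evaluated on training data.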

25 PSO Based Design. [Flowchart: random initialization of particles; velocity and position updates; cost evaluation; save the best solution so far; update particles' memory; repeat while i < n; on convergence, output the best solution. Inputs: PSO parameters, C_FA, training data.]

26 PSO: Binary Search Spaces. Using a sigmoid transformation on the velocity, the probability of a binary variable can be determined (Kennedy et al.). The position update is changed accordingly. The velocity update equation is not changed, and the learning behavior of the swarm is preserved.

27 PSO: Binary Search Spaces. The transition is now probabilistic. Particles try to position themselves in the velocity space such that they have maximum probability of taking the value ‘1’ when they have evidence from multiple neighbors or iterations about the goodness of being at value ‘1’ for a variable.
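A short Python sketch of the binary position update in the style attributed to Kennedy et al.: the sigmoid of each velocity component is read as the probability that the corresponding bit is 1. The function names and the numerically stable form of the sigmoid are choices made for this illustration.

```python
import math
import random


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def binary_position(velocity, rng):
    """Sample each bit: 1 with probability sigmoid(v), else 0."""
    return [1 if rng.random() < sigmoid(v) else 0 for v in velocity]
```

A large positive velocity makes a bit almost surely 1, a large negative one almost surely 0, and a velocity of zero leaves the bit a fair coin flip, which is why the velocity update itself can stay unchanged.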

28 PSO: Discrete Search Spaces. Many real-world optimization problems are binary or discrete. For example, in sensor management and sensor selection, the sensor number is a discrete variable. Transforming a discrete variable to binary increases complexity. The Hamming distance between two discrete values undergoes a non-linear transformation when an equivalent binary representation is used instead. The range of the discrete variable often does not match the upper limit of the equivalent binary representation. For example, a discrete variable with values {0, 1, 2, 3, 4, 5} requires a three-bit binary representation, which ranges over [0, 7].

29 PSO: Discrete Search Spaces. Modify the sigmoid transformation for an M-ary system. The sigmoid gives the parameters of the distribution from which the discrete value is generated. Particles try to position themselves in the velocity space such that the probability of one or the other discrete value is high. A normal distribution is used here; other distributions can be used. Boundary conditions are needed due to the infinite support of the normal distribution.

30 PSO: Discrete Search Spaces. Boundary conditions, due to the infinite support of the normal distribution.
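A hedged Python sketch of one way to realise the M-ary update. The assumptions are explicit: the sigmoid scaled by `m` sets the mean of a normal distribution, the spread is `sigma * (m - 1)`, the sample is rounded to the nearest integer, and samples outside the valid range are clipped to [0, m - 1] as the boundary condition. The names and the exact scaling are illustrative and are not taken from the slides.

```python
import math
import random


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def discrete_position(velocity, m, sigma, rng):
    """Sample one value in {0, ..., m-1} per velocity component."""
    values = []
    for v in velocity:
        mean = m * sigmoid(v)  # sigmoid sets the centre of the distribution
        sample = round(rng.gauss(mean, sigma * (m - 1)))
        # Boundary conditions: the normal has infinite support, so clip.
        values.append(min(max(sample, 0), m - 1))
    return values
```

With a small `sigma` a particle can hold a chosen discrete value almost deterministically, while a larger `sigma` keeps exploring neighbouring values.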

31 [Figure: Sensor 1 and Sensor 2 observe X1 and X2 and send local decisions u1 and u2 to the fusion rule.] Human Design Solution: Likelihood Ratio Test (LRT) Design. Optimize thresholds individually by keeping the other thresholds and the fusion rule constant. Use the LRT to derive the fusion rule for independent or correlated decisions. Human Design Solution: Person-by-Person Optimal (PBPO) for independent sensors. Human Competitive Result: Particle Swarm Optimization (PSO) Based Design. Joint optimization of thresholds and fusion rule. No closed-form solution exists.

32 Sensor Suites: Homogeneous Network. All sensors are identical in performance.

33 Sensor Suites: Heterogeneous Network Type 1. Different sensors have different separation of means between the two hypotheses.

34 Sensor Suites: Heterogeneous Network Type 2. Different standard deviations under the two hypotheses and different separation of means; the solution to the LRT is quadratic.

35 Results: Independent Observations, Homogeneous Network. Probability of error achieved for different algorithms, averaged over 100 trials.
Number of Sensors    PBPO        PSO          % Improvement
3                    0.22684     0.226843     0
6                    0.15613     -            0
9                    0.11254     0.109758     2.4720
12                   0.083829    0.079862     4.7322
15                   0.060243    0.0586917    2.575
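The improvement column is consistent with a relative reduction in error probability with PBPO as the baseline. A small Python sketch of that arithmetic follows; the function name is illustrative, and reading the column this way is an inference from the reported numbers rather than something the slide states.

```python
def percent_improvement(p_error_baseline, p_error_new):
    """Relative reduction in error probability, in percent."""
    return 100.0 * (p_error_baseline - p_error_new) / p_error_baseline
```

For the 9-sensor row, `percent_improvement(0.11254, 0.109758)` gives about 2.472, matching the reported value.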

36 Results: Independent Observations, Homogeneous Network. Counting the evaluations to measure “time”.

37 Results: Independent Observations, Heterogeneous Type 1. Probability of error achieved for different algorithms, averaged over 100 trials.
Number of Sensors    PBPO             PSO             % Improvement
3                    0.0564834        0.055501        1.7384
5                    0.0023426        0.0014518       38.0250
7                    7.022446e-006    1.97107e-006    71.9317
9                    3.539055e-009    5.2906e-011     98.5050

38 Preliminary Results: Independent Observations, Heterogeneous Type 2. Probability of error achieved for different algorithms, averaged over 100 trials.
Number of Sensors    PBPO           PSO (Single Threshold)    % Benefit
3                    1.4350e-004    2.7207e-005               81.04
4                    8.9807e-006    7.9398e-006               11.59

39 Result: Independent Sensors. PBPO: Person-By-Person Optimal (human design accuracy). PSO: Particle Swarm Optimization (resulting accuracy).
Number of Sensors    PBPO             PSO             % Improvement in accuracy
3                    0.0564834        0.055501        1.7384
5                    0.0023426        0.0014518       38.0250
7                    7.022446e-006    1.97107e-006    71.9317
9                    3.539055e-009    5.2906e-011     98.5050

40 Result: Correlated Sensors. [Figure labels: Human Design; 54%, 13%, 2.5%]

41 Data Driven Design. [Flowchart with yes/no decision branches]

42 Correlated Sensors: Designs for 0.1 Correlation. For one specific cost structure. LRT (Human) Based Design: 2 thresholds on each sensor; 2 Sensor only fusion rule. PSO Based Design: simple, 1 threshold for each sensor; AND fusion rule; very few errors. [Figures: region where an event is declared, for each design]

43 Correlated Sensors: Designs for 0.9 Correlation. LRT (Human) Based Design: 2 thresholds on each sensor; 2 Sensor only fusion rule. PSO Based Design: simple, 1 threshold for each sensor; AND fusion rule; higher number of errors, but still better. [Figures: region where an event is declared, for each design]

44 Comparison of Data Driven PSO Design with Other Approaches and Single Sensor Performance. Varying the costs in the Bayesian risk function and generating the designs gives the entire receiver operating characteristic (ROC) curve.

45 Discrete Version of the Problem. Vendors only allow you to have access to multiple points on the ROC. The problem then becomes a combinatorial optimization problem. The design problem is then: the operating point for each sensor, and the fusion rule (which can still be solved by LRT). Suppose we have three classifiers and each classifier can operate at any of N operating points: there are N^3 choices for this problem. A discrete version of PSO or GA is used to identify the operating point sets. No alternative approaches exist.
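To make the size of the search space concrete, here is a Python sketch that enumerates every combination of operating points by brute force. It is an illustration under assumptions: each operating point is a hypothetical `(p_fa, p_d)` pair, the sensors are independent, the fusion rule is fixed to AND, and the risk uses equal priors and costs by default. Exhaustive enumeration is only practical for small problems; it is not the discrete PSO or GA the slide refers to.

```python
from itertools import product


def and_rule_risk(points, prior_h0=0.5, c_fa=1.0, c_miss=1.0):
    """Bayesian risk of the AND rule for independent sensors operating at
    the given (p_fa, p_d) points."""
    p_fa = 1.0
    p_d = 1.0
    for pf, pd in points:
        p_fa *= pf
        p_d *= pd
    return c_fa * prior_h0 * p_fa + c_miss * (1 - prior_h0) * (1 - p_d)


def best_operating_points(roc_points_per_sensor, risk=and_rule_risk):
    """Try every combination of operating points; return (best, count)."""
    candidates = list(product(*roc_points_per_sensor))
    return min(candidates, key=risk), len(candidates)
```

With three sensors and four points each, `best_operating_points` evaluates 4^3 = 64 combinations; the count grows as N^n for n sensors, which is what motivates a stochastic search.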

46 Multi-Objective Design. Allows the system designer to make trade-offs. Makes the fused system ROC available to the system designer. Adding a sensor: how much does it help? Since the fused system ROC is available, the area under the curve gives a metric to evaluate the system. Allows system designers to make choices when acquiring sensors from multiple vendors. If I have to use sensors incrementally, which ones should I focus on? If I want to add sensors to my detection system, which sensors should I add to improve performance?

47 Multi-Objective Design: Homogeneous Sensor Suite

48 Multi-Objective Design: Heterogeneous Sensor Suite

49 Multi-Objective Design: Results for Sensor Suites with 4, 5 Sensors. Algorithm design for generating non-dominated solutions (close to the Pareto set): Non-Dominated Sorting PSO instead of a cost function; continuous PSO for thresholds; binary PSO for the fusion rule (cannot use LRT for the fusion rule).
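The notion of a non-dominated solution can be illustrated with a small Python filter. This sketch covers only the dominance test, assuming each candidate design is summarised by a hypothetical `(p_fa, p_miss)` pair with both objectives minimised; it is not the full Non-Dominated Sorting PSO.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective and strictly
    better in at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

The surviving points trace the trade-off between false alarms and misses, that is, the fused system's ROC expressed in error terms.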

50 Multi-Objective Design

51 Distributed Detection Networks: Parallel and Serial

52 2 Sensor Serial Network Example. [Figure: Sensor 1 observes X1 and passes its decision b1 to Sensor 2, which observes X2.]

53 Organization of Serial Networks: Who reports to Whom? For a homogeneous network, with all the sensors having the same statistics, this is not a problem. For a heterogeneous network, the sequence affects the performance. As the number of sensors increases, the number of possible sequences grows factorially (n! orderings for n sensors).
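A short Python illustration of how quickly the number of reporting sequences grows, assuming every sensor appears exactly once in the serial chain. The function names are illustrative.

```python
import math
from itertools import permutations


def count_sequences(n_sensors):
    """Number of distinct orderings of n sensors in a serial chain."""
    return math.factorial(n_sensors)


def all_sequences(sensor_ids):
    """Enumerate every ordering; only feasible for a handful of sensors."""
    return list(permutations(sensor_ids))
```

Three sensors give 6 orderings, but a 10-sensor test bed already gives 3,628,800, so exhaustive evaluation of every sequence is impractical.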

54 Serial Networks: Coupled Problem. The algorithm design for optimization and control of a distributed serial detection network involves two steps: 1: Identify the optimal sequence of sensors, ‘who reports to whom?’ 2: Identify the optimal local decision rules for sensors. A hybrid of PSO-ABC is used to control the sequence and identify the thresholds for a given sequence.

55 Receiver Operating Characteristic Curve: 10 Sensor Test Bed. [Figure label: Best Performing Sensor]

56 Results: Serial Networks. Probability of error achieved for different algorithms, averaged over 30 trials.

57 Sensor Management of a Building Access Control System: Adaptation in Real Time

58 Thank you!

59 [Figure: Sensor 1 and Sensor 2 observe X1 and X2 and send local decisions u1 and u2 to the fusion rule.] Human Design Solution: Likelihood Ratio Test (LRT) Design. LRT based fusion rule for independent sensors. LRT based fusion rule for correlated sensors.

