(with Thiago Teixeira and Andreas Savvides)

Towards Cooperative Localization of Wearable Sensors using Accelerometers and Cameras
Deokwoo Jung (with Thiago Teixeira and Andreas Savvides)
Embedded Networks & Applications Lab, Yale University
March 18, 2010

Good morning, everyone, and thanks for coming to my presentation. My name is Deokwoo Jung, and I am a fifth-year PhD student in electrical engineering at Yale University. I have been working in the Embedded Networks and Applications Lab since 2005 with Prof. Andreas Savvides. In this presentation, I will introduce a cooperative indoor localization system using accelerometers and a camera.

Indoor Localization
Indoor localization is an essential technology for many applications: security, assisted living, life-logging systems, etc.

Indoor localization comparison:
Cricket [Priyantha et al., 00, 04]. Precision: physical location (cm). Mobile device: customized sensor node. Infrastructure: ultrasound beacon nodes on the ceiling. Sensing modality: ultrasound.
RADAR [Bahl et al., 00]. Precision: physical location (<5 meters). Mobile device: WLAN card, laptop. Infrastructure: WLAN APs, RF fingerprint database. Sensing modality: RF signal.
SurroundSense [Azizyan et al., 09]. Precision: logical location (>5 meters). Mobile device: mobile phone. Infrastructure: GSM network + ambient signal database. Sensing modality: ambient signals (light, sound, color).
Our system. Mobile device: mobile phone / wearable sensor. Infrastructure: networked cameras. Sensing modality: human walking motions.

Indoor localization is an essential technology for higher-level applications such as building security, assisted living, and life logging. As many of you already know, GPS does not work inside buildings, so there has been a lot of research on localizing people or objects indoors. The table shows some well-known representative systems and compares them to ours. We can characterize these systems by precision, type of mobile device, infrastructure, and sensing modality. Cricket uses ultrasound signals to localize mobile nodes and has centimeter precision. RADAR uses an RF fingerprinting method and has 1 to 3 meters of localization error. SurroundSense, from last year's MobiCom, tracks the logical location of mobile phones using ambient signals detected by their built-in sensors. Our system also uses the built-in sensors of a mobile phone for localization, but adds a camera network for high-precision localization. The major sensing modality in our system is human motion. Let me briefly describe the overall system in the next slides.

Cooperative Localization Approach
Our Approach: localizing mobile phones by combining their built-in accelerometers (human motion) with infrastructure cameras (human centroids).
Why human centroids and human motion? They are complementary to each other.

ID tracking. Wearable inertial sensors (human motion): accurate (node address). Camera (human centroid): difficult (feature extraction).
Location positioning. Wearable inertial sensors (human motion): difficult (walking orientation and distance estimation). Camera (human centroid): accurate (background subtraction).

The idea of our localization is to combine two sensing signals, one from accelerometers and one from a camera. We expect these two sensors to become more and more accessible and prevalent because of the widespread use of smartphones and CCTV. You can see the system overview in the figure at the bottom of the slide. A camera sensor extracts human centroids from image frames. A wearable sensor or mobile phone is carried on the person's body, and its built-in accelerometer measures body acceleration. A fusion center then combines the two sensing modalities to track people's locations together with their IDs.

The two sensing modalities are highly complementary. To localize people we need two pieces of information: their physical locations and their corresponding IDs. For wearable sensors, the ID is easy to obtain, since it is simply the node ID, but it is very difficult to find the nodes' relative physical locations using inertial sensors. For a camera, on the other hand, it is very easy to compute the physical locations, which are the centroids of people, using a simple background subtraction algorithm, but assigning IDs to those centroids is computationally intensive and often fails. In our system, we attempt to combine each sensor's strength to obtain complete location information.

Sensing Modeling and Approach
Human Walking Model
∫∫ Accel. ≠ walking distance. The movement of the human body is governed by complex kinetics.
Inverted pendulum model of human gait: the body center of mass (BCOM) oscillates in the z direction as the person moves forward (y direction).

Before we go into the algorithm details, let me briefly introduce the human walking model. As I said before, we want to combine two kinds of sensing information: human centroids from the camera and body acceleration from the accelerometer. So how do we compare the two? At first glance, it seems reasonable to compute distance by double integration of the acceleration and compare it to the distance moved by the centroids. In reality this does not work. When people walk, body movement is governed by complex kinetics rather than simple one-dimensional motion. The body center of mass oscillates back and forth and up and down, following a sinusoidal pattern. When we decompose the acceleration of the human body, the horizontal acceleration, a_y, can be modeled as inverted pendulum movement, and the vertical acceleration, a_z, looks like spring movement.
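
One further practical reason double integration is unreliable is sensor bias, which can be illustrated with a minimal Python sketch. The bias value below is a hypothetical number chosen for illustration, not a measurement from the talk; the 15 Hz rate and one-minute duration match the experiment described later.

```python
def double_integrate(accel, dt):
    """Integrate acceleration twice (rectangle rule); return final displacement in meters."""
    velocity = 0.0
    position = 0.0
    for a in accel:
        velocity += a * dt
        position += velocity * dt
    return position

# Assumption: the person stands still and the sensor reports a small constant bias.
dt = 1.0 / 15.0                 # 15 Hz sampling
bias = 0.05                     # hypothetical bias in m/s^2
samples = [bias] * (60 * 15)    # one minute of samples
drift = double_integrate(samples, dt)
print(f"apparent displacement after 60 s: {drift:.1f} m")
```

The apparent displacement grows quadratically with time (about 0.5 * bias * t^2), so even a person who never moves appears to travel roughly 90 m in a minute under these assumed numbers.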

Statistical Analysis of Sensor Data
Intuition: the BCOM follows a sinusoidal pattern.
Velocity of body ∝ standard deviation of vertical acceleration.
A correlation coefficient serves as the similarity measure between accelerometer and camera data.
Experiment

From this observation, intuition tells us that the standard deviation of the vertical acceleration of the human body increases when people walk faster. In fact, we found from extensive experiments that the two have a strong linear relationship. The plots at the bottom show part of the experimental results. The left plot shows the distribution of the vertical acceleration measured by an accelerometer as walking speed increases. Walking speed increases clockwise in the figure, and you can see the distribution flatten more and more. When we plot the standard deviation of vertical acceleration against walking velocity, it shows a strong linear relationship. In our work, we compute walking velocity from the human centroids from the camera and the standard deviation of vertical acceleration from the wearable sensors, and then use the correlation coefficient as the similarity metric between the two sensing modalities.
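
As a rough illustration of this similarity metric, the following Python sketch computes a windowed standard deviation of vertical acceleration and correlates it with a speed trace. The one-second window, the synthetic 2 Hz oscillation, and the function names are assumptions made for the example; the talk does not specify them.

```python
import math

def windowed_std(samples, window):
    """Standard deviation of each non-overlapping window of samples."""
    out = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        mean = sum(chunk) / window
        out.append(math.sqrt(sum((x - mean) ** 2 for x in chunk) / window))
    return out

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant trace carries no correlation information
    return cov / (sx * sy)

# Synthetic data (assumption): vertical acceleration oscillates with an
# amplitude proportional to walking speed.
speeds = [0.5, 1.0, 1.5, 1.0, 0.5]   # camera-derived speed per window, m/s
window = 15                          # samples per one-second window at 15 Hz
accel_z = [v * math.sin(2 * math.pi * 2 * k / window)
           for v in speeds for k in range(window)]

sigma_z = windowed_std(accel_z, window)
rho = pearson(speeds, sigma_z)
print(f"correlation between speed and std of a_z: {rho:.3f}")
```

With this idealized signal the correlation is essentially 1; real traces would be noisier, which is why the metric is compared across candidate pairs rather than against a fixed threshold.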

Tracking Algorithm
A camera extracts only centroid information: privacy preserving and cheap.
A simple tracking algorithm computes the speed of anonymous centroids by associating human centroids in consecutive frames based on their distances.
Problem: many possible ambiguous associations.

Since the camera extracts centroid information only, it does not require a complex and expensive signal processing engine, and it also has an advantage in preserving privacy. We can easily compute the velocity of the body center of mass by tracking the centroids in consecutive frames. We use a simple tracking algorithm that associates centroids in consecutive frames by Euclidean distance. But the simplicity comes at a cost: the tracker often produces ambiguous associations when two or more centroids are too close to each other. Let me explain in more detail on the next slide.
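
A minimal Python sketch of this kind of tracker is shown below. It is a greedy nearest-neighbor association, which is one simple way to realize distance-based matching; the ambiguity radius is a hypothetical parameter, and only the 15 frames per second figure comes from the experiment setup.

```python
import math

FRAME_RATE = 15.0  # frames per second, as in the experiment setup

def dist(p, q):
    """Euclidean distance between two (x, y) centroids in meters."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def track_step(prev, curr, ambiguity_radius=0.5):
    """Associate each centroid in prev with its nearest centroid in curr.

    Returns (matches, ambiguous). matches maps a prev index to
    (curr index, speed in m/s). ambiguous is True when any two current
    centroids are closer than ambiguity_radius (a hypothetical threshold),
    i.e. when the association cannot be trusted.
    """
    matches = {}
    for i, p in enumerate(prev):
        j = min(range(len(curr)), key=lambda k: dist(p, curr[k]))
        matches[i] = (j, dist(p, curr[j]) * FRAME_RATE)
    ambiguous = any(
        dist(curr[a], curr[b]) < ambiguity_radius
        for a in range(len(curr))
        for b in range(a + 1, len(curr))
    )
    return matches, ambiguous

prev = [(0.0, 0.0), (2.0, 0.0)]
curr = [(0.1, 0.0), (1.9, 0.0)]
matches, ambiguous = track_step(prev, curr)
print(matches, ambiguous)
```

In this example the two people are far apart, so each centroid keeps its label and no ambiguity is flagged. When the centroids come within the radius, the tracker would instead hand the decision to the disambiguation stage.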

Path Disambiguation Problem
Path Ambiguity Problem in Human Centroid Tracking
A tracker associates one object with two or more objects in two consecutive image frames when two or more objects come close to each other.

Let me take a simple example in which two people cross paths at time t_a, as in the bottom-left plot. The tracking algorithm has an association ambiguity at time t_a and ends up with four possible paths, A, B, C, and D, which results in two possible sets of velocities, {A,D} or {B,C}. Meanwhile, each accelerometer generates its own pattern of standard deviation of vertical acceleration, which is roughly proportional to walking speed. To localize people correctly under path ambiguity, we have to match those trace sets correctly. So we need an algorithm to disambiguate paths, and we call this the path disambiguation problem.

Disambiguation Algorithm
Path Disambiguation as a Non-linear Optimization Problem
Find a set of association hypotheses that maximizes the matching rate, the number of correct ID matchings between accelerometers and centroids.
Develop a search algorithm on a tree structure. A leaf node is a hypothesis of path segmentations.
Three-stage pruning algorithm: sub-tree evaluation, classification and pruning, reconstruction.

Let me formally describe the problem. We can describe it as a non-linear optimization problem: finding a set of association hypotheses that maximizes the expected matching rate between the relative IDs of human centroids and the actual IDs of wearable sensors, which can be stated in the optimization form on the slide. In the problem, T is the observation time, N is the total number of people, and I is an indicator function. There are k_T possible path ambiguities, or path hypotheses, in total, denoted θ_1 to θ_{k_T}. And ρ_{i,j} given the θs is the correlation coefficient between the sensing modalities of centroid i and accelerometer j under a set of θ hypotheses.

There are many ways to solve the optimization problem, but they fall roughly into two categories. One is finding a deterministic solution. The other is finding the most probable solution using a probability function, that is, Bayesian estimation such as Kalman-filter-like methods. But the problem has a non-linear function, the correlation coefficient, and it also has non-Gaussian noise, which makes Bayesian estimation very difficult to use. So we decided it is better to use a deterministic solution, which searches for the best hypothesis set in the hypothesis space. We develop the disambiguation algorithm on a decision tree structure. When we have a new hypothesis, the algorithm constructs a spanning tree, and it searches for the best solution among the possible candidates, which are the leaf nodes of the tree.

We have to address two challenges to make the algorithm practical. First, we have to make a good guess as to which hypothesis, that is, which leaf node, is better. Second, we have to prune sub-trees to prevent state explosion. Our algorithm resolves these challenges through a three-stage pruning process: sub-tree evaluation, classification and pruning, and reconstruction. Let me briefly explain it on the next slides.
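
To see why pruning is needed, here is a small Python sketch that enumerates the leaves of such a hypothesis tree. It assumes each ambiguity is a binary choice between two crossing tracks (labels kept or exchanged), which is a simplification for illustration; the labels "keep" and "swap" are not from the talk.

```python
from itertools import product

def enumerate_hypotheses(num_ambiguities):
    """List every leaf of the hypothesis tree.

    Each ambiguity is modeled as a binary choice: the two crossing tracks
    either keep their labels ("keep") or exchange them ("swap").
    """
    return list(product(("keep", "swap"), repeat=num_ambiguities))

for n in (1, 3, 10):
    print(n, "ambiguities ->", len(enumerate_hypotheses(n)), "leaf hypotheses")
```

Under this assumption the number of leaves doubles with every new ambiguity, so exhaustive search quickly becomes impractical without pruning.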

Clustering and Pruning in Hypothesis Tree
Hypothesis Quality Metric: how credible is a given path hypothesis compared to others?
Correlation coefficient distance metric: D(ρ|H) = |E(ρ, e0|H) − E(ρ, e1|H)|
Figure labels: Accelerometers; Centroid traces 1; Centroid traces 2; D(ρ|H1), D(ρ|H2); Correct Hypothesis, Wrong Hypothesis.

To evaluate path hypotheses, we introduce a hypothesis quality metric, which quantifies how credible a given path hypothesis is compared to the others. The idea is simple. Different hypotheses give different correlation matrices, as shown in this example. The column index is the actual ID of the wearable sensor, the row index is the relative ID of the human centroid, and each matrix element is the correlation coefficient between the standard deviation of vertical body acceleration from the accelerometer and the velocity of the body center of mass from the camera. We pick the highest matrix element to find the ID matching between a human centroid and a wearable sensor. If the hypothesis is correct, the difference between the element values of matched IDs and non-matched IDs will be large. If the hypothesis is wrong, it will be smaller. So we compute the absolute difference between the average values of the matched and non-matched correlation coefficients. The actual measurement profile looks like this, and you can see the difference in the correlation coefficient distance metric between the correct hypothesis and the wrong hypothesis.
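
The metric can be sketched in a few lines of Python. The sketch follows the spoken description (mean of matched coefficients minus mean of non-matched coefficients, in absolute value); matching each row independently to its highest column is a simplifying assumption, and the two example matrices are invented for illustration.

```python
def quality_metric(rho):
    """Correlation coefficient distance for one hypothesis.

    rho[i][j] is the correlation between centroid trace i and accelerometer j
    (at least two columns are required). Each row is matched to its
    highest-scoring column; the result is the absolute difference between the
    mean matched and the mean non-matched coefficients.
    """
    matched, unmatched = [], []
    for row in rho:
        best = max(range(len(row)), key=lambda j: row[j])
        for j, value in enumerate(row):
            (matched if j == best else unmatched).append(value)
    return abs(sum(matched) / len(matched) - sum(unmatched) / len(unmatched))

# Hypothetical correlation matrices for two competing hypotheses.
rho_h1 = [[0.90, 0.10], [0.20, 0.80]]   # clear diagonal: matches stand out
rho_h2 = [[0.50, 0.40], [0.45, 0.50]]   # flat matrix: matches barely stand out
d1 = quality_metric(rho_h1)
d2 = quality_metric(rho_h2)
print(f"D(rho|H1) = {d1:.3f}, D(rho|H2) = {d2:.3f}")
```

Here the first matrix scores 0.700 and the second 0.075, so the first hypothesis would be preferred.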

Clustering and Pruning in Hypothesis Tree
Leaf Clustering, Pruning, and Path Reconstruction
The algorithm clusters the leaf nodes into groups and prunes the groups with lower metric values. When only one leaf is left, it reconstructs the matching sequence.
Tree Pruning Algorithm

Our algorithm constructs a spanning tree and prunes sub-trees based on the hypothesis quality metric. As you can see in this figure, the tree spawns children when a new path ambiguity occurs. At each time step, the leaf nodes compute the correlation coefficient distance metric, the algorithm clusters those values, and then it prunes the sub-trees in the cluster with the lower average metric. In a real experiment, the process looks like this. After 20 seconds a path ambiguity occurs, and each leaf node then has a trace of the correlation coefficient distance metric. Our algorithm clusters the leaf nodes and prunes sub-trees, which is shown as a yellow bar.
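
The talk does not say which clustering method is used, so the following Python sketch substitutes a very simple one: split the leaves into two groups at the largest gap in their sorted metric values and keep the higher group. The leaf names and metric values are hypothetical.

```python
def prune_leaves(metrics):
    """Keep the higher of two clusters of leaf hypotheses.

    metrics maps a leaf id to its correlation coefficient distance metric.
    The leaves are sorted by metric and split at the largest gap between
    neighbors (a stand-in for the unspecified clustering step); the lower
    group is pruned. Returns the set of surviving leaf ids.
    """
    if len(metrics) < 2:
        return set(metrics)
    ordered = sorted(metrics, key=metrics.get)
    gaps = [metrics[ordered[k + 1]] - metrics[ordered[k]]
            for k in range(len(ordered) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__)
    return set(ordered[cut + 1:])

leaves = {"H1": 0.70, "H2": 0.08, "H3": 0.65, "H4": 0.10}
survivors = prune_leaves(leaves)
print(sorted(survivors))
```

In this example H2 and H4 are pruned and H1 and H3 survive. Repeating the step as new measurements arrive would eventually leave a single leaf, at which point the matching sequence can be reconstructed.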

Performance Evaluation via Experiment & Simulation
Experiment Setup
A ceiling-mounted camera (12 ft) with an Intel iMote2 node computes the centroid position of a person 15 times per second.
A wearable sensor node with an Analog Devices ADXL330 accelerometer on the person's waist collects body acceleration data at a 15 Hz sampling rate and transmits its measurements to a computer (fusion center) via a ZigBee wireless link.
People walk for 1 minute in a 5.4 m² space.

We evaluate our algorithm in a controlled experiment setting in our lab. We put an iMote2 node with a camera on the ceiling, at a height of 12 feet. It computes the human centroid position 15 times per second and sends it to the base station. A wearable sensor node is attached to the person's waist, samples at 15 Hz, and sends the acceleration data to the base station. People walk around a space of about 5.6 square meters for 1 minute, as in the picture on the right.

Experiment Dataset
Walking trajectories of 12 people collected from the camera.

We collected accelerometer and human centroid data from 12 people's independent walking traces. As the figure shows, the movement is quite random.

Similarity Metric Performance
100% matching rate without path ambiguity.
Figures: bar graph of the correlation coefficient matrix; standard deviation of z-acceleration and velocity of BCOM over time for the traces.

Then we evaluate the performance of the similarity metric, which is the correlation coefficient between the velocity of human centroids from the camera and the standard deviation of vertical body acceleration from the wearable sensors. We compute the correlation coefficient for all possible pairs, assuming in this case that there is no path ambiguity. The figure shows the correlation coefficient of each human centroid trace against all 12 accelerometer traces. We then choose the pair with the highest correlation coefficient. As you can see, choosing the highest one, shown as a red dot, always gives a correct matching between the centroid trace ID and the actual wearable sensor ID.

Disambiguation Algorithm Performance
The performance depends on the level of crowding in the camera's field of view.
We evaluate the performance using a crowd density metric, the number of pedestrians per unit area (m²) [Abishai et al., Pedestrian flow and level of service].

Crowd density scenarios:
Scenario A: normal flow. <0.5 people/m². Office building during business hours.
Scenario B: restricted flow. 0.5~0.8 people/m². Crowded shopping mall on a weekend.
Scenario C: dense flow. 0.81~1.26 people/m². Crowded weekend party.
Scenario D: very dense flow. 1.27~2 people/m². Subway station in Manhattan during rush hour.

If the crowd density is greater than 2, the pedestrian flow is jammed, i.e. people's movement is practically static. Our system mainly targets scenario A (free flow), where people can walk around without much interaction.

Then we evaluate the disambiguation algorithm. Obviously, the performance depends strongly on the level of crowding in the camera's field of view. For a precise evaluation, we use a crowd density metric, the number of people per square meter, which lets us categorize the level of crowding into four groups. In group A, the crowd density is less than 0.5 people/m², which is typical of an office building during business hours. In group B, it is between 0.5 and 0.8, which usually occurs in a crowded shopping mall on a weekend. Group C is a crowded weekend party, and group D is a very crowded subway station in Manhattan during rush hour, at 2 people per square meter. Our localization system usually targets group A, and perhaps B if we allow some moderate error. C and D are not intended use cases for our system; we evaluate our algorithm under C and D only as a stress test. We simulate these scenarios using the actual data from the 12 traces on the previous slide, and run our algorithm 100 times per scenario.
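
The scenario table can be expressed as a small Python helper. The function name is hypothetical, and the example uses the 5.4 m² area from the experiment slide purely as an illustration; the talk does not state the area used in the simulations.

```python
def crowd_scenario(people, area_m2):
    """Map a pedestrian count and floor area to the scenario labels in the table.

    Thresholds follow the table: A below 0.5 people/m^2, B up to 0.8,
    C up to 1.26, D up to 2, and "jammed" above 2. Values that fall between
    the table's printed ranges (e.g. 0.805) are assigned to the higher scenario.
    """
    density = people / area_m2
    if density < 0.5:
        return "A"
    if density <= 0.8:
        return "B"
    if density <= 1.26:
        return "C"
    if density <= 2.0:
        return "D"
    return "jammed"

for people in (2, 4, 6, 10, 12):
    print(people, "people in 5.4 m^2 -> scenario", crowd_scenario(people, 5.4))
```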

Performance over complexity of scenario
The number of tracking errors grows polynomially with crowd density (left).
The matching performance of the disambiguation algorithm for different crowd densities (right).
The performance gap widens as crowd density increases. The performance doubles in scenario D.

Here is a brief summary of our performance evaluation. In the left figure, the number of tracking errors grows polynomially with crowd density. The number of possible hypotheses grows roughly as 2^(number of tracking errors), which is exponential growth. Then we evaluate the matching rate, in percent, of our disambiguation algorithm and compare it to the tracker-only case. In the tracker-only case, the algorithm randomly picks one of the multiple path hypotheses and associates it with the accelerometer IDs using the similarity metric. The results show that our disambiguation algorithm improves the matching performance by roughly 30% in scenarios A and B, and by 50% in C and D.

Performance over disambiguation stages
Matching rate improvement by the disambiguation algorithm.

Let me show the performance improvement over the stages for a scenario with 6 people walking. First, the tracker-only case has many ID mismatches between centroids and wearable sensors. Real-time disambiguation then improves performance by around 20%. If we relax the real-time constraint, we can improve performance further, because the algorithm can trace back to the previous correct hypothesis once a single leaf node is left. You can see that performance improves by more than 30% after reconstruction.

Localization System Demonstration
Controlled experiment with a scenario of 10 people walking. The performance of the disambiguation algorithm (right) is compared to tracker-only localization (left).

Now let me show a demonstration of our system. We ran a small experiment in which 10 people walk around the lab independently. Each person carries a wearable sensor, and an iMote2 camera sensor is on the ceiling. We run our algorithm in real time and compare the performance of the disambiguation algorithm with the tracker-only algorithm. The blue dots represent a correct ID, and the red dots represent a wrong ID. ID ambiguity among nodes is shown as a red line. In this movie, the tracker-only algorithm constantly gives people wrong IDs. But as you can see, with the disambiguation algorithm we can label people with the correct IDs after 30 seconds, and therefore achieve complete localization.

Conclusion
We presented a hybrid localization system using accelerometers and cameras.
The proposed disambiguation algorithm operates reliably, degrading gracefully even in crowded scenarios.
The constraint on accelerometer position (waist) can be relaxed using additional inertial measurement sensors.
Future work is a complete system implementation running on a mobile phone, plus more sensors.

To sum up, in this work we presented a new hybrid localization system using accelerometers and a camera. With the disambiguation algorithm, it gives decent localization accuracy across many crowded scenarios. By adopting more inertial sensors and a Kalman filter, we can relax the constraint on where the wearable sensor sits on the body, and that work has already been submitted to another conference. In future work, we plan to build a complete implementation on a mobile phone with a more realistic deployment.

Questions?
Thanks for your interest! For more information, please visit our website.

OK, thank you so much for your attention. We have more details on the algorithm complexity and the system implementation on our website, so please visit it if you want to find out more. And if you have any questions, please feel free to ask now.
