Compressing Historical Information in Sensor Networks From ACM SIGMOD 2004 and VLDB journal 2006.

Outline
Introduction
Optimization Problem
Motivational Example
Sensor Model
SBR Algorithm
Get Interval
Best Matching
Get Base Signal
…
Experiments

Motivational Example: The multiple measurements collected by a sensor are correlated with one another. An SBR (Self-Based Regression) approach is proposed.

Sensor Model: Each sensor collects multiple measurements (temperature, humidity, dew point, etc.). A base signal is extracted from the sensor readings and used as a base reference (dictionary) for approximating newly collected data.

Optimization Problem: The following problems are addressed:
1. How large does the base signal need to be at each transmission?
2. Base signal update: what new features should be included in it, and which older features are no longer relevant?
3. How can the data measurements best be approximated using the base signal?

SBR Framework: This paper presents seven algorithms:
1. Regression
2. BestMap
3. GetIntervals
4. GetBase
5. SBR (Self-Based Regression)
……
Diagram components: input data (N × M), Approximation, Base Signal Candidate Construction, SBR (Self-Based Regression), data transmission.

Approximation -- Get Intervals(): The input is a set of N series with M values each: x_{i,1}, x_{i,2}, x_{i,3}, …, x_{i,M} for i = 1, …, N. At this point, suppose the base signal x_1, x_2, x_3, …, x_M, x_{M+1}, x_{M+2}, x_{M+3}, … is already available.

Approximation -- Get Intervals(): Concatenation. The N series are converted to a single 1-D vector of length N × M: x_{1,1}, …, x_{1,M} (Block 1), x_{2,1}, …, x_{2,M} (Block 2), …, x_{N,1}, …, x_{N,M} (Block N).

Approximation -- Get Intervals(): Block 1 (x_{1,1}, x_{1,2}, x_{1,3}, …, x_{1,M}) is matched with the base signal, which yields its matching error.

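Approximation -- Best Matching(): The block x_{1,1}, x_{1,2}, x_{1,3}, …, x_{1,M} is shifted along the base signal x_1, x_2, x_3, …, x_M, x_{M+1}, x_{M+2}, x_{M+3}, …. At each shift i, a regression of the block against the aligned part of the base signal gives a regression error E_i (E_1, E_2, E_3, …). Matching error: E = min{E_i}. Record the interval [start, length] that produces E.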
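The matching step can be sketched in Python as follows. This is a minimal illustration, not the paper's BestMap routine: the function names `regression_error` and `best_match` are hypothetical, the regression is assumed to be an ordinary least-squares fit of the form block ≈ a · base + b, and the error is assumed to be the sum of squared residuals.

```python
import numpy as np

def regression_error(x, y):
    """Least-squares fit y ~ a*x + b; returns (sse, a, b)."""
    A = np.column_stack([x, np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid), float(coef[0]), float(coef[1])

def best_match(base, block):
    """Try every shift of block along base and keep the smallest error.

    Returns (error, start, length, a, b), or None if base is shorter
    than block. The tuple layout is an assumption made for this sketch.
    """
    n = len(block)
    best = None
    for start in range(len(base) - n + 1):
        err, a, b = regression_error(base[start:start + n], block)
        if best is None or err < best[0]:
            best = (err, start, n, a, b)
    return best

# Usage: a block that is an exact linear image of base[2:5].
base = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0])
block = 2.0 * base[2:5] + 3.0
err, start, length, a, b = best_match(base, block)
```

Under these assumptions, only the interval [start, length] and the two regression coefficients would need to be kept for the block, rather than its raw values.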

Approximation -- Get Intervals(): Block 2 (x_{2,1}, x_{2,2}, x_{2,3}, …, x_{2,M}) through Block N (x_{N,1}, x_{N,2}, x_{N,3}, …, x_{N,M}) are matched with the base signal in the same way, each yielding its own matching error.

Approximation -- Get Intervals(): Each block now has a matching error: Error(1), Error(2), …, Error(N). Suppose Error(2) is the largest error. Block 2 (x_{2,1}, …, x_{2,M}) is split into two equal-size blocks: x_{2,1}, …, x_{2,(1+M)/2} and x_{2,(1+M)/2+1}, …, x_{2,M}.

Approximation -- Get Intervals(): The blocks are now Block 1, Block 2.1, Block 2.2, …, Block N, with errors Error(1), Error(2.1), Error(2.2), …, Error(N). Recursively: find the block with the largest error and split it, until the maximum number of blocks is reached.
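The split-the-worst-block loop can be sketched as below. This is an illustrative version under stated assumptions: `match_error` and `get_intervals` are hypothetical names, the matching error is assumed to be the smallest least-squares SSE over all shifts of the block along the base signal, and a max-heap is one possible way to find the block with the largest error.

```python
import heapq
import numpy as np

def match_error(base, block):
    """Smallest regression SSE of block against any equal-length window of base."""
    n = len(block)
    best = np.inf
    for s in range(len(base) - n + 1):
        A = np.column_stack([base[s:s + n], np.ones(n)])
        coef, *_ = np.linalg.lstsq(A, block, rcond=None)
        r = block - A @ coef
        best = min(best, float(r @ r))
    return best

def get_intervals(y, base, n_signals, max_blocks):
    """Split the concatenated series y into at most max_blocks intervals.

    Returns a list of (start, length, error) tuples sorted by start.
    """
    m = len(y) // n_signals
    heap = []  # max-heap on error, implemented by negating the error
    for i in range(n_signals):
        start = i * m
        heapq.heappush(heap, (-match_error(base, y[start:start + m]), start, m))
    while len(heap) < max_blocks:
        neg_err, start, length = heapq.heappop(heap)
        if length < 2:  # the worst block cannot be split any further
            heapq.heappush(heap, (neg_err, start, length))
            break
        half = length // 2
        for s, l in ((start, half), (start + half, length - half)):
            heapq.heappush(heap, (-match_error(base, y[s:s + l]), s, l))
    return sorted((s, l, -e) for e, s, l in heap)
```

Each returned tuple corresponds to one interval of the concatenated vector; shorter intervals end up where the base signal fits the data poorly.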

Base Signal Candidate Construction -- Get Base Signal: The base signal is updated from the N × M input data x_{i,1}, x_{i,2}, x_{i,3}, …, x_{i,M} (i = 1, …, N): new features are inserted and older features are dropped.

Base Signal Candidate Construction -- Get Base Signal: The 1-D vector is divided into candidate base intervals (CBIs) of width W: CBI 1, CBI 2, CBI 3, …, CBI k. In this paper, the base signal is a collection of base intervals. Some of the CBIs are selected as the base intervals of the base signal.

Base Signal Candidate Construction -- Get Base Signal: Two kinds of error are computed for the candidates CBI 1, CBI 2, CBI 3, …, CBI k. Intra linear regression error: Linear_err(1), Linear_err(2), Linear_err(3), …, Linear_err(k), one per CBI. Inter linear regression error: error(1, 2), error(1, 3), …, error(1, k), one per pair of CBIs.

Base Signal Candidate Construction -- Get Base Signal: The CBI with the largest benefit is selected.
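The slides do not spell out how the benefit is computed, so the sketch below rests on an assumption: the benefit of CBI i is taken to be the total error reduction obtained when every other CBI j is approximated by a regression on CBI i instead of by its best option so far, starting from the intra (regression on time) error. The names `sse_fit` and `select_base_intervals` are hypothetical.

```python
import numpy as np

def sse_fit(x, y):
    """SSE of the least-squares fit y ~ a*x + b."""
    A = np.column_stack([x, np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ coef
    return float(r @ r)

def select_base_intervals(y, w, k):
    """Greedily pick up to k candidate base intervals of width w from y.

    Returns the indices of the chosen CBIs in selection order.
    """
    cbis = [y[i:i + w] for i in range(0, len(y) - w + 1, w)]
    t = np.arange(w, dtype=float)
    # Intra error: each CBI regressed against time.
    current = np.array([sse_fit(t, c) for c in cbis])
    # Inter error: inter[i, j] is the error of CBI j regressed against CBI i.
    inter = np.array([[sse_fit(ci, cj) for cj in cbis] for ci in cbis])
    chosen = []
    for _ in range(min(k, len(cbis))):
        # Assumed benefit: summed improvement over the best error so far.
        benefit = np.maximum(current[None, :] - inter, 0.0).sum(axis=1)
        for c in chosen:
            benefit[c] = -1.0  # never pick the same CBI twice
        i = int(np.argmax(benefit))
        chosen.append(i)
        current = np.minimum(current, inter[i])
    return chosen
```

After each pick the per-CBI error is lowered to reflect the newly chosen interval, so later picks are rewarded only for features the earlier ones do not already cover.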

SBR (Self-Based Regression) -- New Base Interval Update: The old base intervals and the candidate base intervals (CBIs) are combined to form the new base intervals. A binary search determines how many of the candidates are inserted.

SBR (Self-Based Regression) -- Binary Search: Goal: find the number ins such that inserting the first ins CBIs gives a lower approximation error than inserting either the first ins - 1 or the first ins + 1 CBIs.
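A minimal sketch of such a search is shown below. It assumes that the approximation error, viewed as a function of the number of inserted intervals, first decreases and then increases (a single valley); that assumption is made here for illustration and is not stated on the slide. The name `find_insert_count` and the callable `err` are hypothetical.

```python
def find_insert_count(err, max_ins):
    """Return a count ins in [0, max_ins] that is a local minimum of err.

    err(k) is assumed to return the approximation error obtained when
    the first k candidate base intervals are inserted.
    """
    lo, hi = 0, max_ins
    while lo < hi:
        mid = (lo + hi) // 2
        if err(mid) <= err(mid + 1):
            hi = mid       # the minimum is at mid or to its left
        else:
            lo = mid + 1   # the error is still decreasing; move right
    return lo

# Usage with a toy error curve whose minimum is at 3.
print(find_insert_count(lambda k: (k - 3) ** 2, 10))  # prints 3
```

Each step evaluates the error at two neighbouring counts, so the number of evaluations grows with the logarithm of `max_ins` rather than linearly. Since each evaluation may involve re-approximating the data, caching `err` would be a natural refinement.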

SBR (Self-Based Regression) -- Base Interval Update: If the new base signal is too large, older base intervals are dropped using the Least Frequently Used (LFU) replacement policy.
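An LFU-style eviction can be sketched as follows. The bookkeeping is an assumption: `use_counts` is a hypothetical mapping from base interval id to the number of times the interval was used to approximate data, and ties are broken in favour of the lower id purely to make the sketch deterministic.

```python
def evict_lfu(use_counts, capacity):
    """Keep the `capacity` most frequently used base intervals.

    use_counts maps a base interval id to its usage count.
    Returns the ids that are kept, in ascending order.
    """
    ranked = sorted(use_counts, key=lambda i: (-use_counts[i], i))
    return sorted(ranked[:capacity])

# Usage: four base intervals, room for two.
kept = evict_lfu({0: 5, 1: 0, 2: 3, 3: 1}, 2)
print(kept)  # prints [0, 2]
```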

Experiments: Datasets
Phone Call Data: Includes the number of long distance calls originating from 15 states (AZ, CA, CO, CT, FL, GA, IL, IN, MD, MN, MO, NJ, NY, TX, WA). For each state, the number of calls per minute is provided for a period of 19 days (data provided by AT&T Labs).
Weather Data: Includes the air temperature, dew-point temperature, wind speed, wind peak, solar irradiance and relative humidity measurements for the station at the University of Washington, for the year 2002.
Stock Data: Includes information on all trades performed, on a per-minute basis, over April 3 and April 4 of the year 2000. Ten stocks were selected: Microsoft, Oracle, Intel, Dell, Yahoo, Nokia, Cisco, WorldCom, Ariba and Legato Systems.

Data Sets

SSE (Sum-Squared Error)

Errors Varying the Compression Ratio for Phone Call Data Set

Number of Inserted Base Intervals

Errors for the mixed dataset

Average Running Time

SSE Error vs. Base Signal size

Conclusions: A new data compression technique called SBR was presented, designed for historical data collected in sensor networks. The key idea is to split the recorded series into intervals of variable length and encode each of them using an artificially constructed base signal. A key to SBR is the use of the base signal for encoding piece-wise linear correlations among the data values.
