Compressing Historical Information in Sensor Networks From ACM SIGMOD 2004 and VLDB journal 2006.

1 Compressing Historical Information in Sensor Networks From ACM SIGMOD 2004 and VLDB journal 2006

2 Outline: Introduction; Optimization Problem; Motivational Example; Sensor Model; SBR Algorithm (Get Interval, Best Matching, Get Base Signal, …); Experiments

3 Motivational Example: Data are correlated across multiple measurements. An SBR (Self-Based Regression) approach is proposed.

4 Sensor Model: Multiple measurements (temperature, humidity, dew-point, etc.). Base signal: extracted from the sensor readings and used as a base reference (dictionary) for approximating newly collected data.

5 Optimization Problem: The following problems are addressed: 1. How large does the base signal need to be at each transmission? 2. Base signal update: which new features should be included in it, and which older features are no longer relevant? 3. How can the data measurements best be approximated using the base signal?

6 SBR Framework: This paper presents seven algorithms: 1. Regression 2. BestMap 3. GetIntervals 4. GetBase 5. SBR (Self-Based Regression) …… [Figure: input data, approximation, base signal candidate construction, SBR (Self-Based Regression).] Inputs: N × M data transmission

7 Approximation -- Get Intervals() [Figure: the input is an N × M matrix of readings with rows (x_11, x_12, x_13, …, x_1M) through (x_N1, x_N2, x_N3, …, x_NM).] At this point, suppose we already have the base signal (x_1, x_2, x_3, …, x_M, x_(M+1), x_(M+2), x_(M+3), …).

8 Approximation -- Get Intervals() Concatenation: convert the N × M matrix to a 1-D vector of length N × M by laying the rows end to end. Block 1 = (x_11, x_12, x_13, …, x_1M), Block 2 = (x_21, x_22, x_23, …, x_2M), …, Block N = (x_N1, x_N2, x_N3, …, x_NM).
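The concatenation step can be sketched in a few lines of Python. This is only an illustration: the function name `concatenate_rows` and the choice to return each block as a `(start, length)` pair are assumptions, not something the slides specify.

```python
def concatenate_rows(matrix):
    """Flatten an N x M list of rows into one list, recording each row's block.

    Returns (flat, blocks), where blocks[i] is the (start, length) of row i
    inside the flattened vector.
    """
    flat, blocks = [], []
    for row in matrix:
        blocks.append((len(flat), len(row)))  # block starts where flat ends
        flat.extend(row)
    return flat, blocks


# Two measurements (N = 2) with three samples each (M = 3).
flat, blocks = concatenate_rows([[1, 2, 3], [4, 5, 6]])
```

Keeping the block boundaries alongside the flat vector makes the later splitting step a matter of editing `(start, length)` pairs rather than copying data.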

9 Approximation -- Get Intervals() Block 1 (x_11, x_12, x_13, …, x_1M) is matched with the base signal, giving its matching error.

10 Approximation -- Best Matching() The data interval (x_11, x_12, x_13, …, x_1M) is regressed against the base signal (x_1, x_2, x_3, …, x_M, x_(M+1), x_(M+2), x_(M+3), …) at each shift, giving regression errors E_1, E_2, E_3, …, E_(M+1). Matching error: E = min{E_i}. Record the interval [start, length] that produces E.
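A minimal sketch of the matching step, assuming the regression is an ordinary least-squares fit of the form data ≈ a · base + b and that the error is the sum of squared residuals. The names `regress` and `best_map` and the returned tuple layout are illustrative choices, not the paper's interface.

```python
def regress(base, data):
    """Least-squares fit data ~ a * base + b; returns (a, b, sse)."""
    n = len(data)
    mean_b = sum(base) / n
    mean_d = sum(data) / n
    var_b = sum((u - mean_b) ** 2 for u in base)
    cov = sum((u - mean_b) * (v - mean_d) for u, v in zip(base, data))
    a = cov / var_b if var_b > 0 else 0.0  # flat base segment: fit a constant
    b = mean_d - a * mean_b
    sse = sum((v - (a * u + b)) ** 2 for u, v in zip(base, data))
    return a, b, sse


def best_map(base_signal, data):
    """Try every shift of data along base_signal; keep the lowest-error fit.

    Returns (sse, start, length, a, b), or None if data is longer than
    the base signal.
    """
    length = len(data)
    best = None
    for start in range(len(base_signal) - length + 1):
        a, b, sse = regress(base_signal[start:start + length], data)
        if best is None or sse < best[0]:
            best = (sse, start, length, a, b)
    return best


# The data equals 2 * base + 1 on the base segment starting at index 2.
match = best_map([0, 1, 4, 9, 16, 25], [9, 19, 33])
```

Only `start`, `length`, and the two regression coefficients would need to be stored to reconstruct the interval from the base signal, which is where the compression comes from.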

11 Approximation -- Get Intervals() Block 2 (x_21, x_22, x_23, …, x_2M) is matched with the base signal, giving its matching error.

12 Approximation -- Get Intervals() Block N (x_N1, x_N2, x_N3, …, x_NM) is matched with the base signal, giving its matching error.

13 Approximation -- Get Intervals() Each block now has a matching error: Error(1), Error(2), …, Error(N). Suppose Error(2) is the largest error. Block 2 (x_21, x_22, x_23, …, x_2M) is split into two equal-size blocks: (x_21, …, x_2,(1+M)/2) and (x_2,(1+M)/2+1, …, x_2M).

14 Approximation -- Get Intervals() The blocks are now Block 1, Block 2.1, Block 2.2, …, Block N, with errors Error(1), Error(2.1), Error(2.2), …, Error(N). Recursively: find the block with the largest error and split it, until the maximum number of blocks is reached.
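The split-the-worst-block loop might look like the sketch below. It assumes a caller-supplied `error_fn` that scores a segment; here a stand-in, `spread` (squared deviation from the mean), takes the place of the real matching error against the base signal so that the example is self-contained. Both names are hypothetical.

```python
import heapq


def get_intervals(series, n_blocks, max_blocks, error_fn):
    """Split a concatenated series into intervals, refining the worst one first.

    series     : the N*M values laid out as N consecutive blocks
    n_blocks   : N, the number of initial equal-size blocks
    max_blocks : stop once this many intervals exist
    error_fn   : hypothetical callback returning the error of a segment
    """
    size = len(series) // n_blocks
    heap = []
    for i in range(n_blocks):
        start = i * size
        # heapq is a min-heap, so negate errors to pop the largest first.
        heapq.heappush(heap, (-error_fn(series[start:start + size]), start, size))
    while len(heap) < max_blocks:
        neg_err, start, length = heapq.heappop(heap)
        if length < 2:  # a single sample cannot be split; stop refining
            heapq.heappush(heap, (neg_err, start, length))
            break
        left = length // 2
        for s, l in ((start, left), (start + left, length - left)):
            heapq.heappush(heap, (-error_fn(series[s:s + l]), s, l))
    return sorted((start, length) for _, start, length in heap)


def spread(segment):
    """Stand-in error: squared deviation from the segment mean."""
    mean = sum(segment) / len(segment)
    return sum((v - mean) ** 2 for v in segment)


# Only the middle block is noisy, so it is the one that gets split.
intervals = get_intervals([0, 0, 0, 0, 0, 10, 0, 10, 5, 5, 5, 5], 3, 4, spread)
```

A heap keeps each "find the largest error" step logarithmic in the number of intervals, although the cost in practice would be dominated by the error computation itself.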

15 Base Signal Candidate Construction -- Get Base Signal [Figure: the N × M data matrix, rows (x_11, …, x_1M) through (x_N1, …, x_NM), feeds the base signal.] Insert new features; drop older features.

16 Base Signal Candidate Construction -- Get Base Signal Divide the 1-D vector (x_11, …, x_NM) into candidate base intervals (CBIs), each of width W: CBI 1, CBI 2, CBI 3, …, CBI k. In this paper, the base signal is a collection of base intervals. A subset of the CBIs is selected as the base intervals of the base signal.

17 Base Signal Candidate Construction -- Get Base Signal Two kinds of error are computed for CBI 1, CBI 2, CBI 3, …, CBI k. Intra linear regression error: Linear_err(1), Linear_err(2), Linear_err(3), …, Linear_err(k). Inter linear regression error: error(1, 2), error(1, 3), …, error(1, k).

18 Base Signal Candidate Construction -- Get Base Signal For CBI 1, CBI 2, CBI 3, …, CBI k: Linear_err(1), Linear_err(2), Linear_err(3), …, Linear_err(k); error(1, 2), error(1, 3), …, error(1, k).

19 Base Signal Candidate Construction -- Get Base Signal The largest benefit
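The slides do not spell out how the benefit is computed, so the sketch below rests on an assumption: the benefit of a candidate is the total error reduction it would bring, summed over all CBIs, compared with the best error each CBI has achieved so far. The function name `select_base_intervals` and its arguments are hypothetical.

```python
def select_base_intervals(linear_err, cross_err, max_selected):
    """Greedily pick candidate base intervals (CBIs) by largest benefit.

    linear_err[j]   : error of approximating CBI j on its own (intra error)
    cross_err[i][j] : error of approximating CBI j by regressing on CBI i
    The benefit formula used here is an assumption, not taken from the slides.
    """
    k = len(linear_err)
    current = list(linear_err)  # best error achieved so far for each CBI
    chosen = []
    while len(chosen) < max_selected:
        best_i, best_gain = None, 0.0
        for i in range(k):
            if i in chosen:
                continue
            gain = sum(max(0.0, current[j] - cross_err[i][j]) for j in range(k))
            if gain > best_gain:
                best_i, best_gain = i, gain
        if best_i is None:  # no remaining candidate reduces the error
            break
        chosen.append(best_i)
        current = [min(current[j], cross_err[best_i][j]) for j in range(k)]
    return chosen


linear_err = [10, 10, 10]
cross_err = [[0, 2, 9],
             [3, 0, 9],
             [9, 9, 0]]
picked = select_base_intervals(linear_err, cross_err, 2)
```

In this toy input CBI 0 is picked first because it also approximates CBI 1 well; after that, CBI 2 adds more than CBI 1 because CBI 1 is already covered.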

20 SBR (Self-Based Regression) -- New Base Interval Update [Figure: the old base intervals and the candidate base intervals (CBIs) are combined into the old base intervals plus new base intervals.] The new base intervals are chosen by binary search.

21 SBR (Self-Based Regression) -- Binary Search Goal: find the number ins such that the approximation error when inserting the first ins CBIs is lower than when inserting either ins − 1 or ins + 1 intervals.
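A sketch of that search, assuming the error first falls and then rises as more intervals are inserted (a single local minimum); without that shape, binary search is not guaranteed to find the best count. `error_at` is a hypothetical callback standing in for a full approximation run.

```python
def find_insert_count(error_at, max_ins):
    """Binary search for the count ins in [0, max_ins] that minimises error_at.

    error_at(ins) is a hypothetical callback returning the approximation
    error when the first ins candidate base intervals are inserted.
    Correct only if the error has a single local minimum.
    """
    lo, hi = 0, max_ins
    while lo < hi:
        mid = (lo + hi) // 2
        if error_at(mid) <= error_at(mid + 1):
            hi = mid        # minimum is at mid or to its left
        else:
            lo = mid + 1    # error is still decreasing, move right
    return lo


errors = [50, 30, 22, 25, 40, 60]
best = find_insert_count(lambda ins: errors[ins], len(errors) - 1)
```

Each probe compares two neighbouring counts, so the number of approximation runs grows with the logarithm of the number of candidates rather than linearly.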

22 SBR (Self-Based Regression) -- Base Interval Update If the new base signal is too large, drop older base intervals using a Least Frequently Used (LFU) replacement policy.
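An LFU eviction step could be sketched as follows. The bookkeeping is assumed: `use_counts` is a hypothetical map from interval id to the number of times that interval was used in a match, and ties are broken by dropping the older interval, which the slides do not specify.

```python
def evict_least_frequently_used(base_intervals, use_counts, capacity):
    """Drop the least-used base intervals until at most capacity remain.

    base_intervals : list of interval ids, oldest first
    use_counts     : dict mapping interval id -> number of times it was matched
    Ties are broken by dropping the older interval (an assumption).
    """
    kept = list(base_intervals)
    while len(kept) > capacity:
        # min() returns the first minimal element, i.e. the oldest among ties.
        victim = min(kept, key=lambda iv: use_counts.get(iv, 0))
        kept.remove(victim)
    return kept


kept = evict_least_frequently_used(
    ["a", "b", "c", "d"], {"a": 5, "b": 1, "c": 1, "d": 3}, 2
)
```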

23 Experiments: Datasets. Phone Call Data: the number of long distance calls originating from 15 states (AZ, CA, CO, CT, FL, GA, IL, IN, MD, MN, MO, NJ, NY, TX, WA). For each state, the number of calls per minute is given for a period of 19 days (data provided by AT&T Labs). Weather Data: air temperature, dew-point temperature, wind speed, wind peak, solar irradiance and relative humidity measurements for the station at the University of Washington, for the year 2002. Stock Data: information on all trades, on a per-minute basis, over April 3 and April 4 of the year 2000. Ten stocks were selected: Microsoft, Oracle, Intel, Dell, Yahoo, Nokia, Cisco, WorldCom, Ariba and Legato Systems.

24 Data Sets

25 SSE (Sum-Squared Error)

26 Errors Varying the Compression Ratio for Phone Call Data Set

27 Number of Inserted Base Intervals

28 Errors for the mixed dataset

29 Average Running Time

30 SSE Error vs. Base Signal size

31 Conclusions: A new data compression technique called SBR was presented, designed for historical data collected in sensor networks. The key idea is to split the recorded series into intervals of variable length and encode each of them using an artificially constructed base signal. A key to SBR is the use of the base signal for encoding piece-wise linear correlations among the data values.

