Searching for Extremes Among Distributed Data Sources with Optimal Probing Zhenyu (Victor) Liu Computer Science Department, UCLA.

Searching for Extremes Among Distributed Data Sources with Optimal Probing Zhenyu (Victor) Liu vicliu@cs.ucla.edu Computer Science Department, UCLA

Why Extremes?
- Identifying severe weather conditions (flood / drought). The central server queries Sensor 1, Sensor 2, …, Sensor n for the highest raindrop and returns Sensor i (the highest one), plus its value.
- Identifying the network bottleneck. The central server queries link 1, link 2, …, link n of a network path from L.A. to N.Y. for the slowest link and returns link i (the slowest one), plus its transfer speed.
- Identifying the best Web database for a user's query. The central server queries Amazon, Barnes & Noble, and CampusI.com for the best Web site for “Computer Algorithms” and returns Website i (the best one), plus the matching Web pages.

What Is the Challenge?
- Constant communication between sensors and the central server is too expensive.
- Can the central server contact only a few sensors (i.e., use probing) to find the maximum?
[Figure: the central server sends the query "highest raindrop" to Sensor 1, Sensor 2, …, Sensor n and returns Sensor i (the highest one), plus its value.]

A Motivating Example
[Figure, panel a) The central server without the latest sensor updates: communication with Sensor 1, Sensor 2, …, Sensor n is expensive, so the server knows only the possible value range of each sensor; the actual value (e.g., of Sensor 1) is unknown.]
[Figure, panel b) Probing sensors' readings to reduce uncertainty: a probed sensor reports its actual reading (1000 in the figure), which replaces its value range.]

Data Model
- The reading of each source is a random variable: X_1, …, X_n
- [l_i, u_i] is X_i's value range
  - Bounded model: l_i, u_i are real numbers
  - Unbounded model: [-∞, u_i], [l_i, +∞], [-∞, +∞]
- X_i's probability distribution in [l_i, u_i] is given: f_i(x), F_i(x)
- X_1, …, X_n are independent
- Probing X_i results in x_i and costs c_i
  - Uniform-cost model: c_1 = c_2 = … = 1
  - Non-uniform-cost model
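The bounded, uniform-cost variant of this data model can be sketched in a few lines of Python. This is an illustrative sketch only: the class name `UniformSource`, its fields, and the choice of a uniform distribution are assumptions made for the example, and the two ranges are taken from the two-variable figures later in the deck as they appear to read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UniformSource:
    """One data source X_i, assumed uniform on the bounded range [l, u]."""
    name: str
    l: float
    u: float
    cost: float = 1.0  # uniform-cost model: every probe costs 1

    def pdf(self, x: float) -> float:
        """Density f_i(x)."""
        return 1.0 / (self.u - self.l) if self.l <= x <= self.u else 0.0

    def cdf(self, x: float) -> float:
        """Distribution function F_i(x) = P(X_i <= x)."""
        if x <= self.l:
            return 0.0
        if x >= self.u:
            return 1.0
        return (x - self.l) / (self.u - self.l)


# Hypothetical instances: ranges assumed from the two-variable example.
x1 = UniformSource("X1", 0.0, 1000.0)
x2 = UniformSource("X2", 800.0, 900.0)
print(x1.cdf(850.0), x2.cdf(850.0))
```

A non-uniform-cost model would only change the `cost` field per source; the independence assumption means each source can be described on its own, as here.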

Uncertainty in the Answer
Two variables X_1 and X_2, uniform distribution.
[Figure: densities f_1(x) and f_2(x) of X_1 and X_2. Left panel: cost of probing 1 variable, U = 0.12 (axis marks 0, 880, 1000). Right panel: cost of probing 2 variables, U = 0 (axis marks 600, 800, 900).]
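The value 0.12 can be reproduced with a short calculation, under assumptions that are not stated on the slide: uncertainty is taken to be the probability that some unprobed variable exceeds the best probed reading, X_1 is taken to be uniform on [0, 1000], and 880 is taken to be the reading of the probed variable. The helper names are hypothetical.

```python
def uniform_cdf(l, u):
    """Return F(x) for a variable assumed uniform on [l, u]."""
    def cdf(x):
        return min(1.0, max(0.0, (x - l) / (u - l)))
    return cdf


def uncertainty(best_reading, unprobed_cdfs):
    """Probability that at least one unprobed variable exceeds best_reading.

    Relies on independence: P(no unprobed variable is larger) is the
    product of the individual F_j(best_reading).
    """
    p_none_larger = 1.0
    for cdf in unprobed_cdfs:
        p_none_larger *= cdf(best_reading)
    return 1.0 - p_none_larger


# One probe: reading 880, X_1 (assumed uniform on [0, 1000]) still unprobed.
u_one_probe = uncertainty(880.0, [uniform_cdf(0.0, 1000.0)])
# Two probes: nothing is left unprobed, so the answer is certain.
u_two_probes = uncertainty(880.0, [])
print(round(u_one_probe, 2), u_two_probes)
```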

Uncertainty / Probing Cost Tradeoff
[Figure: uncertainty in the answer plotted against probing cost. Less probing gives high uncertainty; more probing gives low uncertainty. The tradeoff point is where the uncertainty reaches the user-specified uncertainty threshold ε.]

The Problem
Given the uncertainty data model, design a probing policy P: X_1^P → X_2^P → … → X_n^P that
- incurs the least probing cost
- finds the maximum variable with an uncertainty lower than ε
Brute-force search has to consider all n! possible orderings.
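To make the n! search concrete, the following sketch enumerates every ordering and estimates its expected number of probes by simulation. It is not the authors' algorithm: it assumes uniform distributions, unit probing cost, a stopping rule of "stop once the probability that an unprobed source exceeds the best reading is at most ε", and Monte Carlo estimation with a fixed seed. All function names are hypothetical.

```python
import itertools
import random


def expected_cost(policy, ranges, eps, trials=20000, seed=0):
    """Monte Carlo estimate of the expected number of probes for one ordering."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        best = float("-inf")
        for k, i in enumerate(policy):
            l, u = ranges[i]
            best = max(best, rng.uniform(l, u))  # probe source i
            total += 1
            # Probability that no unprobed source exceeds the current best.
            p_none = 1.0
            for j in policy[k + 1:]:
                lj, uj = ranges[j]
                p_none *= min(1.0, max(0.0, (best - lj) / (uj - lj)))
            if 1.0 - p_none <= eps:  # uncertainty is low enough: stop probing
                break
    return total / trials


def brute_force(ranges, eps):
    """Try all n! orderings and keep the one with the lowest estimated cost."""
    policies = itertools.permutations(range(len(ranges)))
    return min(policies, key=lambda p: expected_cost(p, ranges, eps))


# Assumed two-variable example: X_1 on [0, 1000], X_2 on [800, 900].
ranges = [(0.0, 1000.0), (800.0, 900.0)]
print(brute_force(ranges, 0.0), brute_force(ranges, 0.15))
```

The factorial growth is why the later slides look for ordering rules that prune the candidate set instead of enumerating it.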

Optimal Probing under Zero-Uncertainty
ε = 0, i.e., return an absolutely correct answer
Two policies:
- P_1: X_1 → X_2
- P_2: X_2 → X_1
[Figure: densities f_1(x) and f_2(x) of X_1 and X_2 on an axis marked 0, 800, 900, 1000, drawn once for each policy.]

Optimal Probing under Zero-Uncertainty
Theorem 1: If X_1, …, X_n are ranked in descending order of their upper bounds, i.e., u_1 > … > u_n, then P: X_1 → X_2 → … → X_n is optimal in the zero-uncertainty case.
The upper bound u_i serves as a “representative point” for X_i.
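Theorem 1 translates into a sort followed by an early-stopping loop. The sketch below assumes bounded ranges and that probing may stop as soon as no unprobed source's upper bound exceeds the best reading seen so far; the function names and the sample upper bounds and readings are hypothetical.

```python
def zero_uncertainty_order(upper_bounds):
    """Indices of the sources, sorted by descending upper bound u_i."""
    return sorted(range(len(upper_bounds)),
                  key=lambda i: upper_bounds[i], reverse=True)


def probe_until_certain(order, upper_bounds, readings):
    """Probe in the given order; stop once no unprobed source can win.

    Returns (index of the maximum source, number of probes used).
    """
    best_i, probes = None, 0
    for k, i in enumerate(order):
        probes += 1
        if best_i is None or readings[i] > readings[best_i]:
            best_i = i
        remaining = order[k + 1:]
        if all(upper_bounds[j] <= readings[best_i] for j in remaining):
            break
    return best_i, probes


# Hypothetical upper bounds and actual readings for three sources.
upper_bounds = [900.0, 1000.0, 850.0]
order = zero_uncertainty_order(upper_bounds)
print(order)
print(probe_until_certain(order, upper_bounds, [820.0, 910.0, 700.0]))
print(probe_until_certain(order, upper_bounds, [880.0, 600.0, 840.0]))
```

In the first run a single probe suffices because the reading 910 already exceeds every other upper bound; in the second run the third source is never contacted.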

Optimal Probing under Non-Zero Uncertainty
ε = 0.15
Two policies:
- P_1: X_1 → X_2, saves the 2nd probing if X_1 > 885
- P_2: X_2 → X_1, saves the 2nd probing if X_2 > 850
[Figure: densities f_1(x) and f_2(x) of X_1 and X_2 on an axis marked 0, 800, 850, 885, 900, 1000.]
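The two policies can be compared by their expected number of probes. The calculation below is a sketch under assumptions read off the figure rather than stated in the text: X_1 uniform on [0, 1000], X_2 uniform on [800, 900], and unit probing cost. The helper name `uniform_tail` is hypothetical.

```python
def uniform_tail(l, u, x):
    """P(X > x) for X assumed uniform on [l, u]."""
    return min(1.0, max(0.0, (u - x) / (u - l)))


X1 = (0.0, 1000.0)   # assumed range of X_1
X2 = (800.0, 900.0)  # assumed range of X_2

# P_1 = X_1 -> X_2: the 2nd probe is saved only when X_1 > 885.
p1_saves = uniform_tail(*X1, 885.0)
cost_p1 = 1.0 + (1.0 - p1_saves)

# P_2 = X_2 -> X_1: the 2nd probe is saved only when X_2 > 850.
p2_saves = uniform_tail(*X2, 850.0)
cost_p2 = 1.0 + (1.0 - p2_saves)

print(round(cost_p1, 3), round(cost_p2, 3))
```

Under these assumptions P_2 saves its second probe half of the time, while P_1 saves it in only about 11.5% of cases, so P_2 has the lower expected cost even though X_1 has the larger upper bound.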

Critical Point
Critical point: γ_i ∈ [l_i, u_i] s.t. P(X_i > γ_i) = ε
Lemma 1: With two variables X_1 and X_2, the optimal policy always probes first the one with the larger critical point.
[Figure: densities f_1(x) and f_2(x) on an axis marked 0, 800, 850, 885, 900, 1000, and the distribution functions F_1(x) and F_2(x), which reach the level 1 − ε = 0.85 at the critical points 850 and 885.]
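Since P(X_i > x) = 1 − F_i(x), the critical point is the point where F_i reaches 1 − ε. The sketch below computes it in closed form for a uniform variable and by bisection for any continuous, increasing CDF. The ranges [0, 1000] and [800, 900] are assumed from the figure, and the function names are hypothetical.

```python
def critical_point_uniform(l, u, eps):
    """Closed form for X uniform on [l, u]: P(X > c) = eps."""
    return u - eps * (u - l)


def critical_point(cdf, l, u, eps, tol=1e-9):
    """Bisection for a continuous, increasing CDF on [l, u]."""
    lo, hi = l, u
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < 1.0 - eps:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


eps = 0.15
ranges = {"X1": (0.0, 1000.0), "X2": (800.0, 900.0)}  # assumed ranges
points = {name: critical_point_uniform(l, u, eps)
          for name, (l, u) in ranges.items()}
# Lemma 1: with two variables, probe the larger critical point first.
first = max(points, key=points.get)
print(points, first)
```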

Deriving the Optimal Policy from the Critical Points?
Theorem 2: The optimal policy should always place X_i before X_j if:
- Cond 1: γ_i > γ_j
- Cond 2: ∀x > γ_j, F_i(x) < F_j(x)
[Figure: distribution functions F_i(x) and F_j(x) against x, with the level 1 − ε and the critical points γ_j and γ_i marked.]

Applying Theorem 2 to Derive the Optimal Policy
Case 1:
[Figure: distribution functions F_1(x), F_2(x), …, F_n(x) against x, with the level 1 − ε and the critical points γ_1, γ_2, …, γ_n marked.]
Optimal policy: P: X_1 → X_2 → … → X_n

Applying Theorem 2 to Derive the Optimal Policy
Case 2:
[Figure: distribution functions F_1(x), F_2(x), F_3(x), F_4(x), F_5(x) against x, with the level 1 − ε marked.]
Possible candidate policies: {X_1, X_2, X_3} → {X_4, X_5}, and X_1 must be before X_2
- X_1 → X_2 → X_3 → X_4 → X_5
- X_1 → X_2 → X_3 → X_5 → X_4
- X_1 → X_3 → X_2 → X_4 → X_5
- X_1 → X_3 → X_2 → X_5 → X_4
- X_3 → X_1 → X_2 → X_4 → X_5
- X_3 → X_1 → X_2 → X_5 → X_4
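The six candidates of Case 2 are exactly the orderings that respect the derived precedence constraints. A minimal sketch, assuming the constraints are supplied as (earlier, later) pairs; the function name `candidate_policies` is hypothetical, and the filter still enumerates all orderings, so it only illustrates how much the constraints shrink the search space.

```python
from itertools import permutations


def candidate_policies(names, before):
    """All orderings of names in which a precedes b for every (a, b) in before."""
    result = []
    for perm in permutations(names):
        pos = {name: k for k, name in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in before):
            result.append(perm)
    return result


names = ["X1", "X2", "X3", "X4", "X5"]
# {X1, X2, X3} must all come before {X4, X5} ...
before = [(a, b) for a in ("X1", "X2", "X3") for b in ("X4", "X5")]
# ... and X1 must come before X2.
before.append(("X1", "X2"))

candidates = candidate_policies(names, before)
print(len(candidates))  # 6 of the 120 orderings remain
for policy in candidates:
    print(" -> ".join(policy))
```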

Experimental Set-up
- 166 rainfall sensors across Washington State
- Recording the rainfall at each sensor location, on every day over the past 46 years

Probability Distribution
- From the historical data, generate one distribution per sensor per day
- Distinguish two kinds of historical data:
  - Yesterday was dry
  - Yesterday was rainy
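One plausible way to build such distributions is an empirical CDF per condition. The sketch below is only an illustration: the record layout (yesterday's rainfall, today's rainfall), the rule that "dry" means zero rainfall, and the toy numbers are all assumptions, not the experimental data.

```python
from bisect import bisect_right


def empirical_cdf(samples):
    """Return F(x) = fraction of samples that are <= x."""
    ordered = sorted(samples)

    def cdf(x):
        return bisect_right(ordered, x) / len(ordered)
    return cdf


def split_by_yesterday(records):
    """Split today's rainfall by whether yesterday was dry (assumed: 0 rainfall)."""
    dry, rainy = [], []
    for yesterday, today in records:
        (rainy if yesterday > 0 else dry).append(today)
    return dry, rainy


# Toy records for one sensor and one calendar day across several years:
# (rainfall yesterday, rainfall today). Values are made up for illustration.
records = [(0.0, 0.0), (0.0, 0.2), (1.1, 0.5), (0.4, 0.0), (2.0, 1.5), (0.0, 0.0)]
dry, rainy = split_by_yesterday(records)
cdf_dry, cdf_rainy = empirical_cdf(dry), empirical_cdf(rainy)
print(cdf_dry(0.1), cdf_rainy(0.1))
```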

Preliminary Results Complexity of optimal-policy searching

Future Experimental Study
- The behavior of the optimal policy on the rainfall sensor data
  - Uncertainty threshold ε vs. number of sensor probes
- The behavior of the optimal policy on synthetic datasets
  - Reduction in the search space
  - ε vs. number of sensor probes

Summary
- Under the proposed data model, find the maximum variable with uncertainty less than ε
- Optimal probing policy
  - ε = 0: sort variables according to their upper bounds
  - ε > 0: derive probing preferences (X_i before X_j) and reduce the search space
