Jointly Optimal Transmission and Probing Strategies for Multichannel Systems Saswati Sarkar University of Pennsylvania Joint work with Sudipto Guha (Upenn)

Jointly Optimal Transmission and Probing Strategies for Multichannel Systems Saswati Sarkar University of Pennsylvania Joint work with Sudipto Guha (Upenn) and Kamesh Munagala (Duke)

Multichannel Systems Current wireless networks have access to multiple channels –Frequencies in FDMA, codes in CDMA, antennas in MIMO, polarization states of antennas, access points in systems with multiple access points –IEEE 802.11a, IEEE802.11b, IEEE802.11h Future wireless networks are likely to have access to larger number of channels –Potential deregulation of spectrum –Advances in device technology

Multichannel systems Transmission quality in different channels stochastically vary with time. A user is likely to find a channel with acceptable transmission quality when the number of channels is large, –if he samples all the channels Sampling consumes energy and generates interference for other users Amount of information a user acquires about its channels becomes an important decision variable.

Policy Decisions Probing policy –How many channels to probe? –Which channels to probe? –Sequence of probing Selection policy –Which channel to transmit (probed channel or unprobed channel)?

System Goal Gain is the difference between the expected rate of successful transmission and the cost of probing Maximize the system gain –Need to jointly optimize the probing and selection policies –Policies will be adaptive –Policies may depend on higher order statistics and cross statistics of channels.

Sense of déjà vu? Stopping Time problem ( Bertsekas, Dynamic Programming, Vol. 1 ) –Maximize reward by optimally selecting between two control actions at any given time: (a) continue or (b) stop –IID channels ( Kanodia et al, 2004, Ji et al, 2004 ) Generalized stopping time problems –Select between multiple control actions –Only limited results are known

Sense of déjà vu? Stochastic multi-armed bandit problem ( Bertsekas, Dynamic Programming, Vol. 2 ) –Observe the state of only one arm at each slot –Select the arm to observe and get reward from it –State of an arm changes only after you observe it –State changes are stochastic –Gittins index based policies maximize the reward (Gittins-Jones, 1974)

Sense of déjà vu? Adversarial multi-armed bandit problems ( Auer et al, 2002) –State of an arm may change even when you don’t observe it –State changes are adversarial –Some versions allow you to get reward from only one selected arm and subsequently observe the states of all arms in the slot –Given a class of N policies and a time horizon of T slots, bound the performance loss as compared to the best of the policies in terms of O(sqrt(T * log N)).

Our Contribution Polynomial complexity optimal solution when channels are mutually and temporally independent and each channels has two states. –Sigmetrics, 2006, poster paper Polynomial complexity approximate solution when channels have multiple states –Approximation ratio ½ for arbitrary number of states –Approximation ration 2/3 for 3 states

System Model Single user with access to n channels Each channel can be in one of K states, 0,…K-1. Channel i is in state j with probability p ji Probability of successful transmission in state j is r j User knows the state of channel i in a slot only if it probes it. Cost of probing channel i is c i Channel states are temporally and mutually important

Optimal policy for K = 2 (Sigmetrics, 2006) Exhaustive (S, i) policy: –probes all channels in set S, –if no probed channel is in state 1, transmit in channel i (backup). Optimal policy is exhaustive (S i, i) for some i –S i = {j: p 1j (1-p 1i ) > c j } –Probes all channels with high cross-uncertainty with the backup channel –Probes channels in decreasing order of p 1j /c j Search space for optimal policy restricted (n policies) Determine the optimal by explicitly evaluating the gain in the above class.

Additional challenges for K > 2 For K = 2, the optimal policy is static –Actions can be determined before observation For K > 3, the optimal policy may be adaptive –While probed channels are in state 0, focus on channels with high expected reward –After observing a channel in intermediate state, focus on channels which have higher probabilities of being in the highest state –After observing a channel in intermediate state, may select either the observed channel or the backup. Optimal policy is a decision tree –Exponential number of decision trees –Storage of decision tree consumes exponential complexity

Constant Factor Approximation Algorithm for K > 2 Let policy A be the best among those that always transmits in a probed channel –Gain G A Let policy B be the best among those that always transmits in an unprobed channel –Policy B is clearly the one that never probes any channel and transmits in the channel with the highest expected transmission rate. –Gain G B Theorem 1: Max(G A, G B ) >= ½ Maximum Gain –The best among A and B attains half the maximum gain

Proof of Theorem 1 Q: Probability that the optimal policy transmit in an unprobed channel OPT: Expected gain of the optimal policy G: Expected gain of the optimal policy given that it transmits in an unprobed channel

Proof of Theorem 1 Modify the optimal so that it transmits in the best probed channel whenever it was using the backup –Expected gain OPT’ which is less than or equal to G A –Expected Gain given that the optimal uses a backup x –OPT – OPT’ <= Q (G – x) <= QG <= G B OPT <= OPT’ + G B <= G A + G B <= 2 max(G A, G B )

Missing link Need to compute the policy that maximizes gains among those that transmit only in probed channels. Once a channel is observed to be in state u, probe only those channels for which the expected improvement above u exceeds cost.

Special case: K = 3 Theorem 2: There exists an optimum policy which uses a unique backup channel. –There exists a channel l which satisfies the following characteristic: if the optimum policy uses a backup it uses l.

Approximation algorithm for attaining 2/3 approximation ratio Let P(l) be the class of policies which – never probe channel l and –never uses any channel as a backup other than l. Let C be the best policy in P(l) for the best choice of l. Theorem 3: Max(G A, G B, G C ) >= 2/3 Maximum Gain –Best among A, B, C attains 2/3 of the optimum gain.

Open Problems Is the optimization problem NP-hard when K exceeds 2? Can we get better approximation ratios? What happens when channels are not mutually independent? What happens when channels are not temporally independent? What happens when you can simultaneously transmit in multiple channels? –Power control What happens when you need not transmit in every slot? –Lazy scheduling

Jointly Optimal Transmission and Probing Strategies for Multichannel Systems Saswati Sarkar University of Pennsylvania Joint work with Sudipto Guha (Upenn)

Similar presentations

Presentation on theme: "Jointly Optimal Transmission and Probing Strategies for Multichannel Systems Saswati Sarkar University of Pennsylvania Joint work with Sudipto Guha (Upenn)"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Jointly Optimal Transmission and Probing Strategies for Multichannel Systems Saswati Sarkar University of Pennsylvania Joint work with Sudipto Guha (Upenn)

Similar presentations

Presentation on theme: "Jointly Optimal Transmission and Probing Strategies for Multichannel Systems Saswati Sarkar University of Pennsylvania Joint work with Sudipto Guha (Upenn)"— Presentation transcript:

Similar presentations

About project

Feedback