Capacity Upper Bounds for Deletion Channels
Suhas Diggavi, Michael Mitzenmacher, Henry Pfister

The Most Basic Channels
Binary erasure channel.
– Each bit is replaced by a ? with probability p.
– Very well understood.
Binary symmetric channel.
– Each bit is flipped with probability p.
– Very well understood.
Binary deletion channel.
– Each bit is deleted with probability p.
– We don't even know the capacity!
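
To make the three channel models concrete, here is a minimal Python sketch (our own illustration; the function names are ours, not from the talk):

```python
import random

def erasure_channel(bits, p):
    # Each bit is independently replaced by '?' with probability p.
    return ['?' if random.random() < p else b for b in bits]

def symmetric_channel(bits, p):
    # Each bit is independently flipped with probability p.
    return [b ^ 1 if random.random() < p else b for b in bits]

def deletion_channel(bits, p):
    # Each bit is independently deleted with probability p. The receiver
    # sees a shorter string with no indication of where deletions fell;
    # this loss of position information gives the channel memory.
    return [b for b in bits if random.random() >= p]

if __name__ == "__main__":
    random.seed(0)
    x = [0, 0, 1, 1, 1, 0, 1, 0]
    print(deletion_channel(x, 0.3))
```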

The Challenge
We would like a single-letter characterization of capacity.
– Or tight upper/lower bounds.
– Or an effective means of calculating capacity.
Such characterizations are difficult.
– Deletion channels are channels with memory.

Recent Progress
A chain of results gives better lower bounds.
– Shannon-style arguments: Diggavi/Grossglauser, Drinea/Mitzenmacher, Drinea/Kirsch.
– Global result: capacity is at least (1 – p)/9.
But there is essentially no work on upper bounds.
– Ullman's bound: zero-error decoding for worst-case synchronization errors. (Does not apply.)
– Trivial bound of (1 – p).
– Progress on lower bounds motivates the need for progress in the other direction.
– How close are we getting to capacity?

Our Results
– An upper bound approach using side information that gives numerical bounds.
– An upper bound approach using side information that gives the asymptotic behavior as the fraction of deletions p goes to 1, yielding a bound of the form c(1 – p).

Upper Bound via Run Lengths
The input can be thought of as runs of maximal contiguous 0s/1s.
Side information: suppose an undeletable marker is inserted every time a complete run is deleted.
– This can only increase capacity.
The result is equivalent to a memoryless channel whose symbols are run lengths.
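
With this side information, a run of x input bits is received as a run of k surviving bits, so the transition probabilities are binomial; k = 0 is observable thanks to the marker. A small sketch of this kernel (our own illustration):

```python
from math import comb

def run_length_kernel(x, k, p):
    """P(received run length = k | sent run length = x) for the
    run-length channel with deletion probability p. Each of the x
    bits survives independently with probability 1 - p; k = 0
    corresponds to a completely deleted (marked) run."""
    if k < 0 or k > x:
        return 0.0
    return comb(x, k) * (1 - p) ** k * p ** (x - k)

# Example: a run of 3 bits under deletion probability 0.5.
print([run_length_kernel(3, k, 0.5) for k in range(4)])
# [0.125, 0.375, 0.375, 0.125]
```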

Example
Received: 0011*11000*00…
Each * is an undeletable marker left where an entire run of the transmitted string was deleted.

Capacity Per Unit Cost
Associate a cost of x with a run of length x at the input.
Capacity of the binary channel with side info = capacity per unit cost of the run-length channel.
The latter can be upper bounded numerically using the Kuhn-Tucker conditions.
– Challenging because of the infinite alphabet.
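
For reference, the standard notion being invoked is capacity per unit cost (this display is our addition, not on the slide):

$$C \;=\; \sup_{p_X} \frac{I(X;Y)}{\mathbb{E}\,[c(X)]}, \qquad c(x) = x,$$

so that one unit of cost corresponds to one bit of the original binary transmission.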

Upper Bound Statement
For a channel defined by $p_{Y|X}$ and any output distribution $q_Y$, let
$$d(x) \;=\; D\big(\,p_{Y|X}(\cdot \mid x)\;\big\|\;q_Y\,\big).$$
Then for any non-negative cost function $c(x)$, the capacity per unit cost $C$ satisfies
$$C \;\le\; \sup_x \frac{d(x)}{c(x)}. \qquad \text{[Abdel-Ghaffar 1993]}$$

Upper Bound Calculation
Problem: optimization over an infinite alphabet.
Solution: finitize the problem.
– Find the optimal input/output distribution for a truncated alphabet.
– Replace the tail of the finite output distribution with a geometric distribution, allowing an analytic bound on d(x) for large x.
– Bound the performance of the resulting distribution.
– Optimize over the truncated alphabet.
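
A minimal numerical sketch of this recipe, assuming a geometric test distribution q with a fixed parameter and an arbitrary truncation point (illustrative only; the actual computation optimizes q and bounds the tail analytically):

```python
from math import comb, log

def kernel(x, k, p):
    # P(output run length = k | input run length = x):
    # each of the x bits survives independently with probability 1 - p.
    return comb(x, k) * (1 - p) ** k * p ** (x - k)

def divergence(x, p, q):
    # d(x) = D( P(. | x) || q ) in bits, over output run lengths k >= 0.
    return sum(pk * log(pk / q(k), 2)
               for k in range(x + 1)
               if (pk := kernel(x, k, p)) > 0)

def capacity_upper_bound(p, r=0.5, x_max=200):
    # Abdel-Ghaffar-style bound: C <= sup_x d(x) / c(x) with c(x) = x,
    # using the geometric output distribution q(k) = (1 - r) * r**k.
    q = lambda k: (1 - r) * r ** k
    return max(divergence(x, p, q) / x for x in range(1, x_max + 1))

print(capacity_upper_bound(p=0.5))  # an upper bound, in bits per input bit
```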

Bounds Achieved
[Table: for each deletion probability p, the best known lower bound and the new upper bound on capacity.]

Asymptotic Result
Motivation:
– The previous upper bound approach breaks down for large p. Not surprising: large p means more completely deleted runs, so more side information is released.
– We want to find the limits of possible global results. The (1 – p)/9 lower bound argument seems tightest as p approaches 1. Can we obtain an asymptotic c(1 – p) upper bound?
– Build upon insights from the global lower bound.

Upper Bound via Markers
Suppose an undeletable marker is inserted every 1/(1 – p) bits of the transmission. The channel is now memoryless.
– Input symbols = blocks of 1/(1 – p) bits.
– Output symbols = random subsequences, with expected length 1.
– Capacity should scale with (1 – p).
This yields a capacity bound, but how can we optimize over such a high-dimensional space? The symbols are big.
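
A quick sanity check of the "expected length 1" claim (illustrative simulation; the parameters are ours):

```python
import random

def marker_block_output(bits, p):
    # The output symbol between two markers: the random subsequence
    # of the 1/(1-p)-bit input block that survives deletion.
    return [b for b in bits if random.random() >= p]

p = 0.9
n = round(1 / (1 - p))          # block length 1/(1-p) = 10
random.seed(1)
lengths = [len(marker_block_output([0] * n, p)) for _ in range(100_000)]
print(sum(lengths) / len(lengths))   # close to 1, as claimed
```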

Upper Bound Calculation: Step 1, the Output
Problem: optimization over all output symbols; there are potentially infinitely many bit strings.
Solution: finitize the problem.
– At the receiver, the number of bits between markers converges to a Poisson distribution.
– For upper bounds, assume that if k > 6 bits are received, the receiver obtains k bits of information. This affects the bounds by only a few percent.
– So we only need to consider outputs of 6 or fewer bits.
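
The truncation is cheap because the tail of the Poisson distribution beyond 6 is tiny (a check of the claim, taking the mean-1 Poisson limit as given):

```python
from math import exp, factorial

# Number of received bits between markers is approximately Poisson with
# mean 1 (one expected survivor per 1/(1-p)-bit block).
tail = 1 - sum(exp(-1) / factorial(k) for k in range(7))  # P(K > 6)
print(tail)  # ~ 8.3e-5
```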

Upper Bound Calculation: Step 2, the Input
Problem: optimization over input strings, i.e., sequences of 1/(1 – p) bits: a potentially infinite alphabet.
Solution: finitize the problem.
– Key Lemma: if we consider only up to 6 bits at the output, we need only consider sequences of up to 6 runs at the input; the same output distribution is achieved by a convex combination.
– The upper bound is obtained by optimizing over a large number of finite-dimensional vectors representing up to 6 runs. A heuristic/computational approach.
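
For a sense of scale, the truncated output alphabet is small (our own count, not from the slides):

```python
from itertools import product

# All binary output strings of length 0 through 6.
outputs = [s for k in range(7) for s in product("01", repeat=k)]
print(len(outputs))  # 127 output symbols
```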

Bounds Achieved
As p goes to 1, we cannot obtain capacity better than (1 – p).
The gap between the (asymptotic) upper and lower bounds is now roughly (1 – p)/9 versus 4(1 – p)/5.
– Room for improvement, probably on both sides.

Conclusions and Open Questions
– What are the limitations of such side-information arguments? Are novel upper bound techniques required for these channels?
– Is there a more purely information-theoretic approach?
– Can we characterize optimal input/output distributions? Can heuristic or other approaches guide the theory?
– How tight can we make the upper/lower bounds? What is the right answer?
– Can we extend the upper bounds to insertion channels?
– Many, many others…