Lecture 6. Prefix Complexity K

The plain Kolmogorov complexity C(x) has a lot of “minor” but bothersome problems:
- Not subadditive: C(x,y) ≤ C(x) + C(y) holds only up to an additive log n term. There exist x, y such that C(x,y) > C(x) + C(y) + log n − c. (This is because there are (n+1)2^n pairs (x,y) with |x| + |y| = n, so some pair in this set has complexity at least n + log n; see the counting argument spelled out below.)
- Not monotonic over prefixes.
- Problems when defining random infinite sequences in connection with Martin-Löf theory, where we wish to identify infinite random sequences with those whose finite initial segments are all incompressible (Lecture 2).
- Problems with Solomonoff’s initial universal distribution P(x) = 2^{−C(x)}: the sum ∑_x 2^{−C(x)} = ∞, so it is not a probability distribution.
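To spell out the counting argument behind the failure of subadditivity (a standard calculation; constants are absorbed into the O(1) terms):

```latex
% Number of pairs (x,y) with |x|+|y| = n: for each split k there are 2^k * 2^{n-k} pairs.
\[
  \#\{(x,y) : |x|+|y| = n\} \;=\; \sum_{k=0}^{n} 2^{k}\,2^{n-k} \;=\; (n+1)\,2^{n}.
\]
% There are fewer than (n+1)2^n programs of length below n + log(n+1),
% so some pair in this set satisfies
\[
  C(x,y) \;\ge\; n + \log n - O(1),
\]
% while every string is describable by itself plus O(1) bits:
\[
  C(x) + C(y) \;\le\; |x| + |y| + O(1) \;=\; n + O(1).
\]
% Combining the two displays gives C(x,y) > C(x) + C(y) + log n - c.
```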

In order to fix the problems …

Let x = x_0 x_1 … x_n. Define
x̄ = x_0 0 x_1 0 x_2 0 … x_n 1 (each bit followed by a 0, the last bit followed by a 1), and
x' = bar(|x|) x (the x̄-code of the length of x, followed by x itself).
Then x' is a prefix code with |x'| ≤ |x| + 2 log|x| + O(1): x' is a self-delimiting version of x.
Let the reference TMs have only the binary alphabet {0,1}, with no blank symbol B. The programs p must form an effective prefix code: for all p ≠ p', p is not a prefix of p'.
The result is the self-delimiting Kolmogorov complexity (Levin 1974, Chaitin 1975). We use K for prefix Kolmogorov complexity, to distinguish it from C, the plain Kolmogorov complexity.
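A minimal Python sketch of these two codes (the function names are mine, not from the slides); it checks the length bound and, anticipating the next slide, a partial sum of Kraft's inequality for the x' code:

```python
import math

def bar(x: str) -> str:
    """x-bar code: each bit followed by 0, the last bit followed by 1."""
    return "".join(b + "0" for b in x[:-1]) + x[-1] + "1"

def selfdelim(x: str) -> str:
    """x' = bar(|x|) x : a self-delimiting (prefix-free) code for x."""
    return bar(format(len(x), "b")) + x

def decode(s: str) -> str:
    """Inverse of selfdelim: read the bar-coded length, then |x| raw bits."""
    bits, i = "", 0
    while True:                  # parse the bar-coded length
        bits += s[i]
        if s[i + 1] == "1":      # flag bit 1 marks the last length bit
            break
        i += 2
    i += 2
    n = int(bits, 2)
    return s[i:i + n]

assert decode(selfdelim("10110")) == "10110"
# |x'| <= |x| + 2 log|x| + O(1):
x = "1" * 1000
assert len(selfdelim(x)) <= len(x) + 2 * math.log2(len(x)) + 2
# Kraft: the x'-code is prefix-free, so sum over x of 2^-|x'| stays <= 1.
total = sum(2.0 ** -len(selfdelim(format(v, "0" + str(n) + "b")))
            for n in range(1, 12) for v in range(2 ** n))
assert total <= 1.0
```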

Properties

By Kraft’s inequality (proof: look at the binary tree, or the interval argument below): ∑_{x∈{0,1}*} 2^{−K(x)} ≤ 1.
- Naturally subadditive: K(x,y) ≤ K(x) + K(y) + O(1).
- Monotonic over prefixes.
- C(x) ≤ K(x) + O(1), and K(x) ≤ C(x) + 2 log C(x) + O(1).
- K(x|n) ≤ K(x) + O(1) ≤ C(x|n) + K(n) + O(1) ≤ C(x|n) + log* n + log n + log log n + … + O(1).
- An infinite binary sequence ω is (Martin-Löf) random iff there is a constant c such that K(ω_{1:n}) ≥ n − c for all n. (Please compare with the C-complexity measure of Lecture 2.)
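The binary-tree proof mentioned above, written out as the equivalent interval argument (standard, included for completeness):

```latex
% Map each halting program p of the prefix machine U to the interval
%   I_p = [0.p, 0.p + 2^{-|p|}) contained in [0,1).
% Because the programs form a prefix code, no I_p contains another,
% so the intervals are pairwise disjoint and their lengths sum to at most 1:
\[
  \sum_{x \in \{0,1\}^{*}} 2^{-K(x)}
  \;\le\; \sum_{p \,:\, U(p)\ \text{halts}} 2^{-|p|}
  \;=\; \sum_{p} |I_p|
  \;\le\; 1 .
\]
```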

Alice’s revenge

Remember Bob at the cheating casino, who flipped 100 heads in a row. Now Alice can have a winning strategy. She proposes the following:
- She pays $1 to Bob.
- She receives 2^{100−K(x)} dollars in return, for the flip sequence x of length 100.
Note that this is a fair proposal, as ∑_{|x|=100} 2^{−100} · 2^{100−K(x)} = ∑_{|x|=100} 2^{−K(x)} ≤ 1. But if Bob cheats with 1^100, then Alice gets 2^{100−K(1^100)} ≈ 2^{100−log 100}.
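Worked numbers for the cheating case (the bound on K(1^100) comes from the obvious “print 100 ones” program; c is its constant overhead):

```latex
\[
  K(1^{100}) \;\le\; \log 100 + c \;\approx\; 6.64 + c ,
\]
\[
  \text{Alice's payoff} \;=\; 2^{\,100 - K(1^{100})}
  \;\ge\; 2^{\,100 - \log 100 - c}
  \;\approx\; 2^{\,93 - c}\ \text{dollars},
\]
% astronomically more than her 1-dollar stake, while against fair flips her
% expected payoff stays at most 1 dollar by Kraft's inequality.
```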

Chaitin’s mystery number Ω

Define Ω = ∑_{P halts on ε} 2^{−|P|}; then Ω < 1 by Kraft’s inequality.

Theorem 1. Let X_i = 1 iff the i-th program halts. Then Ω_{1:n} encodes X_{1:2^n}; i.e., from Ω_{1:n} we can compute X_{1:2^n}.
Proof. (1) Ω_{1:n} ≤ Ω < Ω_{1:n} + 2^{−n}. (2) Dovetail the computations of all programs, and let Ω' be the total weight ∑ 2^{−|P|} of the programs that have halted so far; run until Ω' ≥ Ω_{1:n}. Then if some P with |P| ≤ n has not halted yet, it never will (since otherwise Ω ≥ Ω' + 2^{−n} ≥ Ω_{1:n} + 2^{−n} > Ω). QED
Bennett: Ω_{1:10,000} yields all interesting mathematics.

Theorem 2. K(Ω_{1:n}) ≥ n − c.
Remark. Ω is a particular random sequence!
Proof. By Theorem 1, given Ω_{1:n} we can obtain all halting programs of length ≤ n. For any x that is not an output of one of these programs, we have K(x) > n. Since from Ω_{1:n} we can compute such an x, it must be the case that K(Ω_{1:n}) ≥ n − c. QED
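A toy Python sketch of the dovetailing in the proof of Theorem 1. The “universal machine” here is just an invented finite table of prefix-free programs with fixed halting times (pure illustration, not a real TM):

```python
from fractions import Fraction

# Invented toy "machine": a prefix-free set of programs, each mapped to the
# dovetailing step at which it halts, or None if it runs forever.
PROGRAMS = {"0": 3, "10": None, "110": 7, "111": 12}

def omega() -> Fraction:
    """Omega = sum of 2^-|P| over the halting programs P."""
    return sum(Fraction(1, 2 ** len(p)) for p, t in PROGRAMS.items()
               if t is not None)

def omega_prefix(n: int) -> Fraction:
    """Omega truncated to its first n binary digits."""
    return Fraction(int(omega() * 2 ** n), 2 ** n)

def halting_from_omega(n: int) -> dict:
    """Decide halting for every program of length <= n, given Omega_{1:n}."""
    target = omega_prefix(n)
    halted, mass, step = set(), Fraction(0), 0
    while mass < target:      # dovetail until observed mass reaches Omega_{1:n}
        step += 1
        for p, t in PROGRAMS.items():
            if t == step:     # program p halts at this dovetailing step
                halted.add(p)
                mass += Fraction(1, 2 ** len(p))
    # Any program of length <= n not halted by now never halts: one more
    # halt would push Omega past Omega_{1:n} + 2^-n.
    return {p: p in halted for p in PROGRAMS if len(p) <= n}

print(halting_from_omega(3))  # {'0': True, '10': False, '110': True, '111': True}
```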

Universal distribution

A (discrete) semi-measure is a function P that satisfies ∑_{x∈N} P(x) ≤ 1. An enumerable semi-measure P_0 is universal (maximal) if for every enumerable semi-measure P there is a constant c_P such that c_P P_0(x) ≥ P(x) for all x∈N. We say that P_0 dominates each P.
Theorem. There is a universal enumerable semi-measure, m. (Proof sketched below.)
Theorem. −log m(x) = K(x) + O(1). (Proof omitted.)
Remark. This universal distribution m is one of the foremost notions in KC theory. As the prior probability in Bayes’ rule, it maximizes ignorance by assigning maximal probability to all objects (since it dominates every other enumerable distribution up to a multiplicative constant).
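The existence proof is Levin’s standard mixture construction, sketched here since the slide omits it:

```latex
% Let P_1, P_2, ... be an effective enumeration of all enumerable
% semi-measures, and define the mixture
\[
  \mathbf{m}(x) \;=\; \sum_{j \ge 1} 2^{-j}\, P_j(x).
\]
% m is itself an enumerable semi-measure:
\[
  \sum_{x} \mathbf{m}(x)
  \;=\; \sum_{j \ge 1} 2^{-j} \sum_{x} P_j(x)
  \;\le\; \sum_{j \ge 1} 2^{-j} \;\le\; 1,
\]
% and it dominates every P_j with constant c_{P_j} = 2^j,
% since 2^j m(x) >= P_j(x) for all x.
```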

Average-case complexity under m

Theorem (Li-Vitányi). If the input to an algorithm A is distributed according to m, then the average-case time complexity of A equals A’s worst-case time complexity (up to a constant factor).
Proof. Let T(n) be the worst-case time complexity and t(x) the running time on input x. Set a_n = ∑_{|x|=n} m(x), and define P as follows: if |x| = n and x is the first string such that t(x) = T(n), then P(x) := a_n; otherwise P(x) := 0. Thus P is an enumerable semi-measure, hence c_P m(x) ≥ P(x). The average time complexity of A under m is then
T(n|m) = ∑_{|x|=n} m(x) t(x) / ∑_{|x|=n} m(x)
≥ (1/c_P) ∑_{|x|=n} P(x) T(n) / ∑_{|x|=n} m(x)
= (1/c_P) T(n), since ∑_{|x|=n} P(x) = a_n = ∑_{|x|=n} m(x). QED
Intuition: the x with the worst running time has low Kolmogorov complexity (it is the first such string), hence large m(x).
Example: Quicksort. Under m, easy (compressible) inputs are the likely ones, and they are exactly the inputs that incur the worst case; see the sketch below.
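An empirical illustration of the Quicksort remark (a sketch: naive first-element-pivot Quicksort, with the already-sorted list standing in for a low-complexity input):

```python
import random
import sys
import time

sys.setrecursionlimit(5000)  # the worst case recurses to depth ~n

def quicksort(a):
    """Naive Quicksort, always picking the first element as pivot."""
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]
    return (quicksort([v for v in rest if v < pivot]) + [pivot] +
            quicksort([v for v in rest if v >= pivot]))

n = 2000
low_k = list(range(n))               # describable in O(log n) bits: worst case
high_k = random.sample(range(n), n)  # typical permutation: incompressible

for name, data in [("sorted, low K(x)", low_k), ("random, high K(x)", high_k)]:
    start = time.perf_counter()
    quicksort(data)
    print(f"{name}: {time.perf_counter() - start:.3f}s")
```

The sorted input takes quadratic time while the random one runs in n log n time, matching the intuition that large m(x) (simple x) goes with worst-case behavior.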

Inductive Inference

Solomonoff (1960, 1964): given a sequence of observations S, predict the next bit of S.
Using Bayes’ rule:
P(S1|S) = P(S1) P(S|S1) / P(S) = P(S1) / [P(S0) + P(S1)],
where P(S1) is the prior probability, and we know P(S|S0) = P(S|S1) = 1.
Choose the universal prior probability: P(S) = 2^{−K(S)}, which is (up to a constant factor) the universal distribution m(S).
This topic continues in Lecture 7.
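A toy sketch of this prediction rule in Python. Since m and K are incomputable, a zlib-compressed length is used as a crude, computable stand-in for K (purely illustrative, and coarse on short strings because zlib works at byte granularity):

```python
import zlib

def approx_K(s: str) -> int:
    """Crude computable proxy for K(s): bits in the zlib-compressed string."""
    return 8 * len(zlib.compress(s.encode(), 9))

def p_next_one(S: str) -> float:
    """P(next bit = 1 | S) = m(S1) / (m(S0) + m(S1)), with m(x) ~ 2^-approx_K(x)."""
    m0 = 2.0 ** -approx_K(S + "0")
    m1 = 2.0 ** -approx_K(S + "1")
    return m1 / (m0 + m1)

# A long run of zeros: the proxy strongly predicts another zero,
# as the genuine universal prior would.
print(p_next_one("0" * 1000))
```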