CSE 599 Lecture 7: Information Theory, Thermodynamics, and Reversible Computing (R. Rao, Week 3) — presentation transcript


Slide 1: What have we done so far?
- Theoretical computer science: abstract models of computing (Turing machines, computability, time and space complexity)
- Physical instantiations:
  1. Digital computing: silicon switches manipulate binary variables with near-zero error
  2. DNA computing: massive parallelism and the biochemical properties of organic molecules allow fast solutions to hard search problems
  3. Neural computing: distributed networks of neurons compute fast, parallel, adaptive, and fault-tolerant solutions to hard pattern recognition and motor control problems

Slide 2: Overview of Today's Lecture
- Information theory and Kolmogorov complexity: What is information? A definition based on probability theory; error-correcting codes and compression; an algorithmic definition of information (Kolmogorov complexity)
- Thermodynamics: the physics of computation, its relation to information theory, and the energy requirements for computing
- Reversible computing: computing without energy consumption? A biological example; reversible logic gates
- Quantum computing (next week!)

Slide 3: Information and Algorithmic Complexity
Three principal results:
- Shannon's source-coding theorem: the main theorem of information content; a measure of the number of bits needed to specify the expected outcome of an experiment
- Shannon's noisy-channel coding theorem: describes how much information we can transmit over a channel; a strict bound on information transfer
- Kolmogorov complexity: measures the algorithmic information content of a string; an uncomputable function

Slide 4: What is information?
A first try at a definition: suppose you have stored n different bookmarks in your web browser. What is the minimum number of bits you need to store these as binary numbers? Let I be the minimum number of bits needed. Then 2^I >= n, so I >= log2 n. So the "information" contained in your collection of n bookmarks is I_0 = log2 n.

Slide 5: Deterministic information I_0
- Consider a set of alternatives X = {a_1, a_2, a_3, ..., a_K}. When the outcome is a_3, we say x = a_3.
- I_0(X) is the amount of information needed to specify the outcome of X: I_0(X) = log2 |X|. We assume base 2 from now on (unless stated otherwise); the units are bits (binary digits).
- Relationship between bits and binary digits: let B = {0, 1} and X = B^M, the set of all binary strings of length M. Then I_0(X) = log |B^M| = log 2^M = M bits.
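
As a quick illustration of the counting argument above (my addition, not part of the original slides), here is a minimal Python sketch that computes the deterministic information I_0 = log2 n and the whole number of bits actually needed to address n items; the function names are my own.

```python
import math

def deterministic_information(n: int) -> float:
    """I_0 = log2(n): bits needed to specify one of n equally likely alternatives."""
    return math.log2(n)

def address_bits(n: int) -> int:
    """Whole binary digits needed to give each of n items a distinct address."""
    return math.ceil(math.log2(n))

# Example: 100 bookmarks
print(deterministic_information(100))  # ~6.64 bits of information
print(address_bits(100))               # 7 binary digits to store an index
```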

Slide 6: Is this definition satisfactory?
An appeal to your intuition: which of these two messages contains more "information"?
"Dog bites man" or "Man bites dog"

Slide 7: Is this definition satisfactory?
- Which of these two messages contains more "information"? "Dog bites man" or "Man bites dog"
- It takes the same number of bits to represent each message!
- But the second message seems to contain a lot more information than the first. Why?

Slide 8: Enter probability theory...
- Surprising events (unexpected messages) contain more information than ordinary or expected events: "Dog bites man" occurs much more frequently than "Man bites dog".
- Messages about less frequent events carry more information, so the information about an event varies inversely with the probability of that event.
- But we also want information to be additive: if message xy contains sub-parts x and y, we want I(xy) = I(x) + I(y).
- The logarithm gives us this: log(xy) = log(x) + log(y).

Slide 9: New Definition of Information
- Define the information contained in a message x as the log of the inverse probability of that message: I(x) = log(1/P(x)) = -log P(x).
- First defined rigorously and studied by Shannon (1948), "A Mathematical Theory of Communication" (electronic handout, PDF file, on the class website).
- Our previous definition is a special case: suppose you had n equally likely items (e.g., bookmarks). For any item x, P(x) = 1/n, so I(x) = log(1/P(x)) = log n, the same as before (the minimum number of bits needed to store n items).

Slide 10: Review: Axioms of probability theory (Kolmogorov, 1933)
- P(a) >= 0, where a is an event
- P(Ω) = 1, where Ω is the certain event
- P(a + b) = P(a) + P(b), where a and b are mutually exclusive events
The Kolmogorov (axiomatic) definition is computable, and probability theory forms the basis for information theory. The classical definition based on event frequencies (Bernoulli), P(a) as the limit of the relative frequency of a over infinitely many trials, is uncomputable.

Slide 11: Review: Results from probability theory
- Joint probability of two events a and b: P(ab)
- Independence: events a and b are independent if P(ab) = P(a)P(b)
- Conditional probability: P(a|b) is the probability that event a happens given that b has happened: P(a|b) = P(ab)/P(b), and P(b|a) = P(ba)/P(a) = P(ab)/P(a)
- Combining these, we just proved Bayes' Theorem: P(a|b) = P(b|a)P(a)/P(b). P(a) is called the a priori probability of a; P(a|b) is called the a posteriori probability of a.

Slide 12: Summary: Postulates of information theory
1. Information is defined in the context of a set of alternatives. The amount of information quantifies the number of bits needed to specify an outcome from the alternatives.
2. The amount of information is independent of the semantics (it depends only on probability).
3. Information is always positive.
4. Information is measured on a logarithmic scale: probabilities are multiplicative, but information is additive.

Slide 13: In-Class Example
- Message y contains duplicates: y = xx
- Message x has probability P(x)
- What is the information content of y? Is I(y) = 2 I(x)?

Slide 14: In-Class Example
- Message y contains duplicates: y = xx, where message x has probability P(x). Is I(y) = 2 I(x)?
- I(y) = log(1/P(xx)) = log[1/(P(x|x)P(x))] = log(1/P(x|x)) + log(1/P(x)) = 0 + log(1/P(x)) = I(x)
- Duplicates convey no additional information!

Slide 15: Definition: Entropy
The average self-information, or entropy, of an ensemble X = {a_1, a_2, a_3, ..., a_K} is the expected value of the information: H(X) = E[I(x)] = sum over k of P(a_k) log(1/P(a_k)), where E denotes the expected (or average) value.

Slide 16: Properties of Entropy
- 0 <= H(X) <= I_0(X)
- H(X) equals I_0(X) = log |X| if all the a_k are equally probable
- H(X) equals 0 if only one a_k is possible
- Consider the case K = 2: X = {a_1, a_2} with P(a_1) = p and P(a_2) = 1 - p. Then H(X) = p log(1/p) + (1 - p) log(1/(1 - p)), the binary entropy function, which peaks at 1 bit when p = 1/2.

Slide 17: Examples
- Entropy is a measure of the randomness of the source producing the events.
- Example 1: a coin toss with heads or tails equally probable: H = -(1/2 log 1/2 + 1/2 log 1/2) = -(1/2 (-1) + 1/2 (-1)) = 1 bit per coin toss.
- Example 2: P(heads) = 3/4 and P(tails) = 1/4: H = -(3/4 log 3/4 + 1/4 log 1/4) = 0.811 bits per coin toss.
- As the source gets less random, entropy decreases, and redundancy and regularity increase.
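
To make the two coin examples concrete, here is a small Python sketch (my own addition, not from the slides) that computes the entropy of a discrete distribution and reproduces the 1-bit and 0.811-bit figures above.

```python
import math

def entropy(probs):
    """Shannon entropy H = sum p * log2(1/p), in bits; zero-probability terms contribute 0."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))    # 1.0 bit per toss for a fair coin
print(entropy([0.75, 0.25]))  # ~0.811 bits per toss for a biased coin
```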

Slide 18: Question
- If we have N different symbols, we can encode them in log(N) bits. Example: English has 26 letters, so about 5 bits per letter.
- So, over many, many messages, the average cost per symbol is still 5 bits.
- But letters occur with very different probabilities: "A" and "E" are much more common than "X" and "Q". The log(N) estimate assumes equal probabilities.
- Question: can we encode symbols based on their probabilities so that the average cost per symbol is minimized?

Slide 19: Shannon's noiseless source-coding theorem
- Also called the fundamental theorem. In words: you can compress N independent, identically distributed (i.i.d.) random variables, each with entropy H, down to NH bits with negligible loss of information (as N goes to infinity); if you compress them into fewer than NH bits, you will dramatically lose information.
- The theorem: let X be an ensemble with H(X) = H bits, and let H_δ(X) be the entropy of an encoding of X with allowable probability of error δ. Given any ε > 0 and 0 < δ < 1, there exists N_0 such that for all N > N_0, |(1/N) H_δ(X^N) - H| < ε.

Slide 20: Comments on the theorem
What do the two inequalities tell us?
- The number of bits per outcome needed to specify x with a vanishingly small error probability δ does not exceed H + ε: if we accept a vanishingly small error, the number of bits we need to specify x^N drops to N(H + ε).
- The number of bits per outcome needed to specify x with a large allowable error probability δ is still at least H - ε: accepting more error does not let us compress much below NH bits.

Slide 21: Source coding (data compression)
- Question: how do we compress the outcomes X^N with a vanishingly small probability of error? How do we encode the elements of X so that the number of bits needed for X^N drops to N(H + ε)?
- Symbol coding: given x = a3 a2 a7 ... a5, generate a codeword c(x) = 01 1010 00 ..., and aim for I_0(c(x)) ~ H(X).
- Well-known coding examples: zip, gzip, compress, etc. The performance of these algorithms is, in general, poor compared to the Shannon limit.

Slide 22: Source-coding definitions
- A code is a function c: X -> B^+, where B = {0, 1} and B^+ is the set of finite strings over B: B^+ = {0, 1, 00, 01, 10, 11, 000, 001, ...}. A string is encoded symbol by symbol: c(x) = c(x_1) c(x_2) c(x_3) ... c(x_N).
- A code is uniquely decodable (UD) iff its extension c: X^+ -> B^+ is one-to-one.
- A code is instantaneous iff no codeword is the prefix of another: c(x_1) is never a prefix of c(x_2).

Slide 23: Huffman coding
- Given X = {a_1, a_2, ..., a_K} with associated probabilities P(a_k), and a code with codeword lengths n_1, n_2, ..., n_K, the expected code length is <n> = sum over k of P(a_k) n_k.
- No instantaneous, UD code can achieve a smaller expected code length than a Huffman code, and the Huffman code satisfies H(X) <= <n> < H(X) + 1.

Slide 24: Constructing a Huffman code
- Feynman's example: encoding an alphabet (the code table itself appears as a figure on the slide).
- The code is instantaneous and UD: 00100001101010 decodes uniquely as ANOTHER.
- The code achieves close to the Shannon limit: H(X) = 2.06 bits, while the expected code length is 2.13 bits per symbol.
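
Since the transcript does not reproduce Feynman's code table, here is a minimal Python sketch of Huffman-code construction (my addition) using a made-up 5-symbol distribution; the symbols and probabilities are hypothetical, not Feynman's actual alphabet.

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code for a dict {symbol: probability}; returns {symbol: bitstring}."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        # Prepend 0 to one subtree's codewords and 1 to the other's, then merge.
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

# Hypothetical 5-symbol source (not Feynman's actual table)
probs = {"A": 0.4, "N": 0.2, "O": 0.2, "T": 0.1, "H": 0.1}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(code[s]) for s in probs)
print(code)
print(avg_len)   # expected code length, close to the source entropy
```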

Slide 25: Information channels
- A channel takes an input x from ensemble X and produces an output y from ensemble Y; the channel may add noise, corrupting our symbols.
- H(X) is the entropy of the input ensemble X; the average mutual information I(X;Y) measures what we know about X given Y.
- Definition: the information capacity of a channel is C = max I(X;Y), the maximum taken over input distributions.

Slide 26: Example: Channel capacity
Problem: a binary source sends equiprobable messages in a time T, using the alphabet {0, 1} at a symbol rate R. As a result of noise, a "0" may be mistaken for a "1", and a "1" for a "0", both with probability q. What is the channel capacity C? (The channel is discrete and memoryless; the slide shows the X -> channel -> Y diagram.)

Slide 27: Example: Channel capacity (cont.)
- First assume no noise (no errors). T is the time to send the string and R is the rate, so the number of possible message strings is 2^(RT). The maximum entropy of the source is H_0 = log(2^(RT)) = RT bits, and the source rate is (1/T) H_0 = R bits per second.
- The entropy of the noise (per transmitted bit) is H_n = q log(1/q) + (1 - q) log(1/(1 - q)).
- The channel capacity is C (bits/sec) = R - R H_n = R(1 - H_n).
- C is always less than R (a fixed fraction of R)! We must add code bits to correct the received message.
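
A small Python sketch (my addition) of the capacity formula above for this binary symmetric channel, showing how quickly capacity falls as the flip probability q grows:

```python
import math

def noise_entropy(q: float) -> float:
    """H_n = q*log2(1/q) + (1-q)*log2(1/(1-q)): entropy added per transmitted bit."""
    if q in (0.0, 1.0):
        return 0.0
    return q * math.log2(1.0 / q) + (1 - q) * math.log2(1.0 / (1 - q))

def capacity(R: float, q: float) -> float:
    """Channel capacity C = R * (1 - H_n) in bits per second, for symbol rate R."""
    return R * (1 - noise_entropy(q))

print(capacity(1000, 0.0))   # 1000.0 bits/s: a noiseless channel carries the full rate
print(capacity(1000, 0.01))  # ~919 bits/s: 1% bit flips cost ~8% of the rate
print(capacity(1000, 0.5))   # 0.0 bits/s: completely random flips carry nothing
```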

Slide 28: How many code bits must we add?
- We want to send a message string of length M. We add code bits to M, thereby increasing its length to M_c. How are M, M_c, and q related?
- M = M_c (1 - H_n). This follows intuitively from our example (see also pp. 106-110 of Feynman).
- Note: this is an asymptotic limit; it may require a huge M_c.

Slide 29: Shannon's Channel-Coding Theorem
- The theorem: there is a nonnegative channel capacity C associated with each discrete memoryless channel, with the following property: for any rate R < C and any ε > 0, there is a protocol that achieves a rate >= R with a probability of error <= ε.
- In words: if the entropy of our symbol stream is equal to or less than the channel capacity, then there exists a coding technique that enables transmission over the channel with arbitrarily small error. We can transmit information at any rate H(X) <= C.
- Shannon's theorem tells us the asymptotically maximum rate. It does not tell us the code we must use to obtain this rate, and achieving a high rate may require a prohibitively long code.

Slide 30: Error-correction codes
- Error-correcting codes allow us to detect and correct errors in symbol streams. They are used in all digital signal communications (digital phones, etc.) and in quantum computing to ameliorate the effects of decoherence.
- Many techniques and algorithms exist: block codes, Hamming codes, BCH codes, Reed-Solomon codes, turbo codes.

Slide 31: Hamming codes
An example: construct a code that corrects a single error.
- We add m check bits to our message; these can encode at most 2^m - 1 error positions (the remaining syndrome indicates "no error").
- Errors can occur in the message bits and/or in the check bits, so if n is the length of the original message, then 2^m - 1 >= n + m.
- Examples: if n = 11 and m = 4, then 2^4 - 1 = 15 >= n + m = 15; if n = 1013 and m = 10, then 2^10 - 1 = 1023 >= n + m = 1023.

Slide 32: Hamming codes (cont.)
Example: an 11/15 SEC (single-error-correcting) Hamming code.
- Idea: calculate parity over subsets of the input bits; four subsets give four parity bits.
- Check bit x stores the parity of the bit positions whose binary representation holds a "1" in position x:
  - Check bit c1: bits 1, 3, 5, 7, 9, 11, 13, 15
  - Check bit c2: bits 2, 3, 6, 7, 10, 11, 14, 15
  - Check bit c3: bits 4, 5, 6, 7, 12, 13, 14, 15
  - Check bit c4: bits 8, 9, 10, 11, 12, 13, 14, 15
- The pattern of parity-check bits is called the syndrome; the syndrome tells us the location of the error.
(The slide also shows a table listing message positions 1-15 alongside their 4-bit binary representations, 0001 through 1111.)

Slide 33: Hamming codes (cont.)
The check bits specify the error location. Suppose the check bits turn out as follows:
- Check c1 = 1 (bits 1, 3, 5, 7, 9, 11, 13, 15): the error is in one of bits 1, 3, 5, 7, 9, 11, 13, 15.
- Check c2 = 1 (bits 2, 3, 6, 7, 10, 11, 14, 15): the error is in one of bits 3, 7, 11, 15.
- Check c3 = 0 (bits 4, 5, 6, 7, 12, 13, 14, 15): the error is in one of bits 3, 11.
- Check c4 = 0 (bits 8, 9, 10, 11, 12, 13, 14, 15): so the error is in bit 3!

Slide 34: Hamming codes (cont.)
Example: encode 10111011011.
- Code position: 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
- Code symbols: 1 0 1 1 1 0 1 c4 1 0 1 c3 1 c2 c1
- Codeword: 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1
- Notice that we can generate the check bits on the fly!
What if we receive 101100111011101?
- Computing each check bit over its subset of received positions gives c4 = 1, c3 = 0, c2 = 1, c1 = 1.
- The error is in location 1011 (binary) = 11 (decimal).
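
The check-bit bookkeeping above is easy to get wrong by hand, so here is a small Python sketch (my own, using 1-based bit positions as on the slides, with position 1 stored first in the list) that encodes an 11-bit message into a 15-bit Hamming codeword and locates a single flipped bit from the syndrome.

```python
def hamming15_encode(msg_bits):
    """Encode 11 message bits into a 15-bit codeword.
    Positions 1..15; check bits live at the power-of-two positions 1, 2, 4, 8."""
    assert len(msg_bits) == 11
    code = [0] * 16                      # index 0 unused; positions 1..15
    data_positions = [p for p in range(1, 16) if p not in (1, 2, 4, 8)]
    for pos, bit in zip(data_positions, msg_bits):
        code[pos] = bit
    for c in (1, 2, 4, 8):               # each check bit covers positions with that bit set
        code[c] = sum(code[p] for p in range(1, 16) if p & c and p != c) % 2
    return code[1:]

def hamming15_syndrome(received):
    """Return the error position (1..15), or 0 if all parity checks pass."""
    code = [0] + list(received)
    syndrome = 0
    for c in (1, 2, 4, 8):
        if sum(code[p] for p in range(1, 16) if p & c) % 2:
            syndrome += c
    return syndrome

msg = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1]    # hypothetical 11-bit message
codeword = hamming15_encode(msg)
corrupted = codeword[:]
corrupted[10] = 1 - corrupted[10]          # flip the bit at position 11
print(hamming15_syndrome(corrupted))       # prints 11: the flipped position
```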

Slide 35: Kolmogorov Complexity (Algorithmic Information)
- Computers represent information as stored symbols, which are not probabilistic in the Shannon sense. Can we quantify information from an algorithmic standpoint?
- The Kolmogorov complexity K(s) of a finite binary string s is the minimum length (in bits) of a program p that generates s when run on a universal Turing machine U. K(s) is the algorithmic information content of s and quantifies the "algorithmic randomness" of the string.
- K(s) is an uncomputable function; the argument is similar to the halting problem (how do we know when we have found the shortest program?).

Slide 36: Kolmogorov Complexity: Example
- The randomness of a string is defined by the shortest algorithm that can print it out.
- Suppose you were given the binary string x: "11111111111111...11111" (1000 1's). Instead of 1000 bits, you can compress this string to a few tens of bits, the length |P| of the program:
  For i = 1 to 1000: print "1"
- So K(x) <= |P|.
- Possible project topic: quantum Kolmogorov complexity?
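
As a rough illustration (mine, not the slides'), the 1000-bit string above can be regenerated by a program far shorter than the string itself, which is exactly the sense in which K(x) is small:

```python
# A short "program" that regenerates the 1000-bit string of 1s.
program = 'print("1" * 1000)'
x = "1" * 1000

# The program text is 17 characters, while the string is 1000 bits long;
# K(x) is upper-bounded by the length of any such generating program.
print(len(x), len(program))
exec(program)                 # reproduces the string of 1000 ones
```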

Slide 37: 5-minute break... Next: Thermodynamics and Reversible Computing

Slide 38: Thermodynamics and the Physics of Computation
- Physics imposes fundamental limitations on computing: computers are physical machines that manipulate physical quantities, and those physical quantities represent information.
- The limitations are both technological and theoretical:
  - Physical limitations on what we can build (example: silicon-technology scaling); a major limiting factor in the future is power consumption.
  - Theoretical limitations on the energy consumed during computation: thermodynamics and computation.

Slide 39: Principal Questions of Interest
- How much energy must we use to carry out a computation? (The theoretical minimum energy.)
- Is there a minimum energy for a certain rate of computation? (A relationship between computing speed and energy consumption.)
- What is the link between energy and information, i.e., between information entropy and thermodynamic entropy?
- Is there a physical definition of information content: the information content of a message in physical units?

Slide 40: Main Results
- Computation has no inherent thermodynamic cost: a reversible computation that proceeds at an infinitesimal rate consumes no energy.
- Destroying information requires kT ln 2 joules per bit (information-theoretic bits, not binary digits).
- Driving a computation forward requires kT ln(r) joules per step, where r is the rate of going forward rather than backward.
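
To give the kT ln 2 figure a concrete scale, here is a small numerical aside of mine (not from the slides): the per-bit erasure cost evaluated at roughly room temperature.

```python
import math

k_B = 1.380649e-23      # Boltzmann's constant, joules per kelvin
T = 300.0               # roughly room temperature, kelvin

erasure_cost = k_B * T * math.log(2)    # minimum energy to destroy one bit
print(erasure_cost)                     # ~2.87e-21 joules per bit erased
```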

Slide 41: Basic thermodynamics
- First law: conservation of energy. (Heat put into the system) + (work done on the system) = (increase in energy of the system): ΔQ + ΔW = ΔU. The total energy of the universe is constant.
- Second law: it is not possible to have heat flow from a colder region to a hotter region. The change in entropy is ΔS = ΔQ/T, and the total ΔS >= 0; equality holds only for reversible processes. The entropy of the universe is always increasing.

Slide 42: Heat engines
- A basic heat engine takes in heat Q1 at temperature T1 and expels heat Q2 = Q1 - W at temperature T2, where T1 > T2.
- Reversible heat engines are those with no friction and only infinitesimal heat gradients.
- The Carnot cycle (the original motivation was the steam engine) is reversible: it pumps heat from T1 to T2 and does work W = Q1 - Q2.

Slide 43: Heat engines (cont.) — figure only on this slide.

Slide 44: The Second Law
- No engine that takes heat Q1 at T1 and delivers heat Q2 at T2 can do more work than a reversible engine: W = Q1 - Q2 = Q1 (T1 - T2) / T1.
- Heat will not, by itself, flow from a cold object to a hot object.

Slide 45: Thermodynamic entropy
- If we add heat ΔQ reversibly to a system at fixed temperature T, the increase in entropy of the system is ΔS = ΔQ/T.
- S is a measure of degrees of freedom: the probability of a configuration (the probability of a point in phase space).
- In a reversible system, the total entropy is constant; in an irreversible system, the total entropy always increases.

Slide 46: Thermodynamic versus Information Entropy
- Assume a gas containing N atoms occupying a volume V1 (an ideal gas: no attraction or repulsion between particles).
- Now shrink the volume: isothermally (at constant temperature, immersed in a bath) and reversibly, with no friction.
- How much work does this require?

Slide 47: Compressing the gas
- From mechanics: work = force × distance, force = pressure × (area of piston), and volume change = (area of piston) × distance. Combining these, the work done by the gas is W = the integral of p dV from V1 to V2.
- From gas theory: the ideal gas law gives pV = NkT, where N is the number of molecules and k is Boltzmann's constant (in joules/kelvin).
- Solving: W = NkT ln(V2/V1).

Slide 48: A few notes
- W is negative because we are doing work on the gas (V2 < V1); W would be positive if the gas did work for us.
- Where did the work go? The compression is isothermal: the temperature is the same before and after. By the first law, the work went into heating the bath; by the second law, we decreased the entropy of the gas and increased the entropy of the bath.

Slide 49: Free energy and entropy
- The total energy of the gas, U, remains unchanged: same number of particles, same temperature.
- The "free energy" F_e and the entropy S both change; both are related to the number of states (degrees of freedom): F_e = U - TS.
- For our experiment, the change in free energy equals the work done on the gas while U remains unchanged; ΔF_e is the (negative) heat siphoned off into the bath.

Slide 50: Special Case: N = 1
- Imagine that our gas contains only one molecule; we take statistical averages of the same molecule over time rather than over a population of particles.
- Halve the volume: F_e increases by +kT ln 2, S decreases by k ln 2, but U is constant.
- What's going on? Our knowledge of the possible locations of the particle has changed: there are fewer places the molecule can be now that the volume has been halved. The entropy, a measure of the uncertainty of a configuration, has decreased.

Slide 51: Thermodynamic entropy revisited
- Take the probability of a gas configuration to be P; then S ~ k ln P. Random configurations (molecules moving haphazardly) have large P and large S; ordered configurations (all molecules moving in one direction) have small P and small S.
- The less we know about a gas, the more states it could be in, and the greater the entropy.
- There is a clear analogy with information theory.

Slide 52: The fuel value of knowledge
- The analysis is from Bennett: tape cells contain particles coding 0 (particle on the left side of the cell) or 1 (right side).
- If we know the message on a tape, then randomizing the tape can do useful work, increasing the tape's entropy.
- What is the fuel value of the tape (i.e., what is the fuel value of our knowledge)?

Slide 53: Bennett's idea
The procedure:
- A tape cell comes in with a known particle location.
- Orient a piston depending on whether the cell holds a 0 or a 1.
- The particle pushes the piston outward, increasing the entropy by k ln 2 and providing free energy of kT ln 2 joules per bit.
- The tape cell goes out with a randomized particle location.

Slide 54: The energy value of knowledge
- Define the fuel value of a tape as (N - I) kT ln 2, where N is the number of tape cells and I is the (Shannon) information on the tape.
- Examples: a random tape (I = N) has no fuel value; a known tape (I = 0) has maximum fuel value.

Slide 55: Feynman's tape-erasing machine
- Define the information in the tape as the amount of free energy required to reset the tape: the energy required to compress each bit to a known state.
- Only the "surprise" bits cost us energy; it doesn't take any energy to reset known bits (for known bits, just move the partition without changing the volume).
- Cost to erase the tape: I kT ln 2 joules.

Slide 56: Reversible Computing
- A reversible computation that proceeds at an infinitesimal rate, destroying no information, consumes no energy, regardless of the complexity of the computation. The only cost is in resetting the machine at the end: erasing information costs energy.
- Reversible computers are like heat engines: if we run a reversible heat engine at an infinitesimal pace, it consumes no energy other than the work that it does.

Slide 57: Energy cost versus speed
- We want our computations to run in finite time, so we need to drive the computation forward, which dissipates energy (kinetic, thermal, etc.).
- Assume we are driving the computation forward at a rate r: the computation is r times as likely to go forward as to go backward.
- What is the minimum energy per computational step?

Slide 58: Energy-driven computation
- Computation is a transition between states, and state transitions have an associated energy diagram. Assume the forward state has energy E2, lower than the backward state's energy E1; "A" is the activation energy for a state transition.
- Thermal fluctuations cause the computer to move between states, whenever the energy exceeds "A".
- We also used this model in neural networks (e.g., Hopfield networks).

Slide 59: State transitions
- The probability of a transition between states differing in positive energy ΔE is proportional to exp(-ΔE/kT).
- Our state transitions therefore have unequal probabilities: the energy required for a forward step is (A - E1), and the energy required for a backward step is (A - E2).

Slide 60: Driving computation by energy differences
- The (reaction) rate r depends only on the energy difference between successive states: the bigger (E1 - E2), the more likely the forward transitions, and the faster the computation.
- Energy expended per step = E1 - E2 = kT ln r.
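
Taking the ratio of the two Boltzmann factors from the previous slide gives r = exp((E1 - E2)/kT), hence the kT ln r cost per step. A tiny numeric sketch of mine (not from the slides) at room temperature:

```python
import math

k_B = 1.380649e-23   # Boltzmann's constant, J/K
T = 300.0            # kelvin

def energy_per_step(r: float) -> float:
    """Minimum energy dissipated per step to bias the computation forward by a factor r."""
    return k_B * T * math.log(r)

print(energy_per_step(2))    # ~2.9e-21 J: barely biased (forward twice as likely as backward)
print(energy_per_step(100))  # ~1.9e-20 J: strongly driven forward
```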

Slide 61: Driving computation by state availability
- We can drive a computation even if the forward and backward states have the same energy, as long as there are more forward states than backward states.
- The computation then proceeds by diffusion: it is more likely to move into a state with greater availability, so thermodynamic entropy drives the computation.

Slide 62: Rate-Driven Reversible Computing: A Biological Example
- Protein synthesis is an example of (nearly) reversible computation, of the copy computation, and of a computation driven forward by thermodynamic entropy.
- Protein synthesis is a two-stage process: (1) DNA forms mRNA; (2) mRNA forms a protein. We will consider step 1.

Slide 63: DNA
- DNA comprises a double-stranded helix. Each strand comprises alternating phosphate and sugar groups, and one of four bases attaches to each sugar: adenine (A), thymine (T), cytosine (C), or guanine (G). A (base + sugar + phosphate) group is called a nucleotide.
- DNA provides a template for protein synthesis: the sequence of nucleotides forms a code.

Slide 64: RNA polymerase
- RNA polymerase attaches itself to a DNA strand and moves along it, building an mRNA strand one base at a time.
- RNA polymerase catalyzes the copying reaction. Within the nucleus there are DNA, RNA polymerase, and triphosphates (nucleotides with two extra phosphates), plus other material. The triphosphates are adenosine triphosphate (ATP), cytosine triphosphate (CTP), guanine triphosphate (GTP), and uracil triphosphate (UTP).

Slide 65: mRNA
- The mRNA strand is complementary to the DNA. The matching pairs (DNA -> RNA) are: A -> U, T -> A, C -> G, G -> C.
- As each nucleotide is added, two phosphates are released, bound as a pyrophosphate.

Slide 66: The process — figure only on this slide.

Slide 67: RNA polymerase is a catalyst
- Catalysts influence the rate of a biochemical reaction, but not its direction.
- Chemical reactions are reversible: RNA polymerase can unmake an mRNA strand just as easily as it can make one (grab a pyrophosphate, attach it to a base, and release).
- The direction of the reaction depends on the relative concentrations of pyrophosphates and triphosphates: more triphosphates than pyrophosphates makes RNA; more pyrophosphates than triphosphates unmakes RNA.

Slide 68: DNA, entropy, and states
- The relative concentrations of pyrophosphate and triphosphate define the number of available states; cells hydrolyze pyrophosphate to keep the reaction going forward.
- How much energy does a cell use to drive this reaction? Energy = kT ln r = (S2 - S1) T ~ 100 kT per bit.

Slide 69: Efficiency of a representation
- Cells create protein engines (mRNA) for ~100 kT per bit; 0.03 µm transistors consume ~100 kT per switching event.
- Think of representational efficiency: what does each system get for its 100 kT?
- Digital logic uses an impoverished representation: about 10^4 switching events to perform an 8-bit multiply, and semiconductor scaling doesn't improve the representation. We pay a huge thermodynamic cost to use discrete math.

Slide 70: Example 2: Computing using Reversible Logic Gates
Two reversible gates: the controlled-NOT (CN) and the controlled-controlled-NOT (CCN, or Toffoli) gate.
- CN truth table (A is the control, B the target):
  A B -> A' B': 00 -> 00, 01 -> 01, 10 -> 11, 11 -> 10
- CCN truth table (A and B are controls, C the target):
  A B C -> A' B' C': 000 -> 000, 001 -> 001, 010 -> 010, 011 -> 011, 100 -> 100, 101 -> 101, 110 -> 111, 111 -> 110
- CCN is complete: we can form any Boolean function using only CCN gates. For example, with C = 0 the output C' = A AND B.
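
A minimal Python sketch (my addition) of the two gates as functions on bits, checking that each gate is its own inverse and that CCN computes AND when the target C is fixed at 0:

```python
def cn(a, b):
    """Controlled-NOT: flips b iff a is 1; (a, b) -> (a, b XOR a)."""
    return a, b ^ a

def ccn(a, b, c):
    """Controlled-controlled-NOT (Toffoli): flips c iff both a and b are 1."""
    return a, b, c ^ (a & b)

# Reversibility: applying each gate twice returns the original inputs.
for a in (0, 1):
    for b in (0, 1):
        assert cn(*cn(a, b)) == (a, b)
        for c in (0, 1):
            assert ccn(*ccn(a, b, c)) == (a, b, c)

# AND from CCN: with the target initialized to 0, the third output is a AND b.
for a in (0, 1):
    for b in (0, 1):
        assert ccn(a, b, 0)[2] == (a & b)
print("CN and CCN are reversible; CCN with C = 0 computes AND")
```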

Slide 71: Next Week: Quantum Computing
- Reversible logic gates and quantum computing: quantum versions of the CN and CCN gates; quantum superposition of states allows exponential speedup.
- Shor's fast algorithm for factoring and breaking the RSA cryptosystem.
- Grover's database search algorithm.
- Physical substrates for quantum computing.

Slide 72: Next Week...
- Guest lecturer: Dan Simon, Microsoft Research — an introductory lecture on quantum computing and Shor's algorithm, with discussion and review afterwards.
- Homework #4 due: submit code and results electronically by Thursday (let us know if you have problems meeting the deadline).
- Sign up for project and presentation times; feel free to contact the instructor and TA if you want to discuss your project.
- Have a great weekend!

