# Introduction to Algorithms

## Presentation on theme: "Introduction to Algorithms" — Presentation transcript:

Introduction to Algorithms
Instructor: Dr. Bin Fu. Office: ENGR (Third Floor). Textbook: Introduction to Algorithms (Second Edition) by Cormen, Leiserson, Rivest, and Stein

Why study algorithms and performance?
Performance often draws the line between what is feasible and what is impossible. Algorithmic mathematics provides a language for talking about program behavior (e.g., big-O notation). In practice, many algorithms, though different from each other, fall into one of several paradigms (discussed shortly), and these paradigms can be studied.

Why these particular algorithms?
In this course, we will discuss problems, and algorithms for solving these problems.

Why these algorithms (cont.)
1. Main paradigms:
   a) Greedy algorithms
   b) Divide-and-conquer
   c) Dynamic programming
   d) Branch-and-bound (mostly in AI)
   e) etc.
2. Other reasons:
   a) Relevance to many areas, e.g., networking, the internet, search engines, …

Topics
Recursive Equations
Divide and Conquer Method
Dynamic Programming Method
Basic and Advanced Data Structures
Graph Algorithms
Approximation Algorithms
NP-Completeness Theory
Randomized Algorithms

Grading
4-5 assignments (35%)
Midterm (20%)
Final (25%)
Exercises and class attendance (20%)

Advantage for a good algorithm designer
It helps you develop efficient software. It makes it easy to switch from one area to another within computer science.

2.2 Analyzing Algorithms
RAM: random-access machine. Accessing memory once takes one step, and instructions are executed one by one, sequentially. Running time: the total number of steps, expressed as a function of the input size.

The problem of sorting
Input: a sequence ⟨a1, a2, …, an⟩ of numbers.
Output: a permutation ⟨a'1, a'2, …, a'n⟩ of the input numbers such that a'1 ≤ a'2 ≤ … ≤ a'n.

Running time
The running time depends on the input. Parameterize the running time by the size n of the input, and seek upper bounds on the running time T(n), because everybody likes a guarantee.

Kinds of analyses
Worst-case (usually): T(n) = maximum time of the algorithm on any input of size n.
Average-case (sometimes): T(n) = expected time of the algorithm over all inputs of size n; needs an assumption about the statistical distribution of inputs.
Best-case (bogus): cheat with a slow algorithm that happens to work fast on some input.

Machine-independent time
What is insertion sort's worst-case time? It depends on the speed of the computer: relative speed (on the same machine) and absolute speed (on different machines). BIG IDEA: ignore machine-dependent constants and look at the growth of T(n) as n → ∞: "asymptotic analysis".

Bubble Sort Algorithm Compare the first two elements and exchange them if they are out of order. Move down one element and compare the 2nd and 3rd elements; exchange if necessary. Continue until the end of the array. Pass through the array again, repeating the process and exchanging as necessary. Repeat until a pass is made with no exchanges. See pr9-04.cpp

Bubble Sort Example Array numlist3 contains 17 23 5 11
Compare values 17 and 23. In correct order, so no exchange. Compare values 23 and 5. Not in correct order, so exchange them.

Bubble Sort Example Array numlist3 contains 17 5 23 11
Compare values 23 and 11. Not in correct order, so exchange them.

Bubble Sort Example Array numlist3 contains 17 5 11 23
End of the first pass: 23 has bubbled to its final position.

Bubble Sort Example (continued)
After the first pass, array numlist3 contains 17 5 11 23 (23 in order from the previous pass). Compare values 17 and 5. Not in correct order, so exchange them. Compare values 17 and 11. Not in correct order, so exchange them. Compare values 17 and 23. In correct order, so no exchange.

Bubble Sort Example (continued)
After the second pass, array numlist3 contains 5 11 17 23 (17 and 23 in order from previous passes). Compare values 5 and 11. In correct order, so no exchange. Compare values 11 and 17. In correct order, so no exchange. Compare values 17 and 23. In correct order, so no exchange. No exchanges, so the array is in order.

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.


Sorting Problem
Given a sequence of integers 7, 2, 5, 3, 6, 9, 8, 1, arrange them in increasing order: 1 < 2 < 3 < 5 < 6 < 7 < 8 < 9.

Merge for Sorting
Convert sorting 7, 2, 5, 3, 6, 9, 8, 1 into sorting 7, 2, 5, 3 and sorting 6, 9, 8, 1.
Sort 7, 2, 5, 3 into 2 < 3 < 5 < 7.
Sort 6, 9, 8, 1 into 1 < 6 < 8 < 9.

Merge for Sorting
Merge 2 < 3 < 5 < 7 with 1 < 6 < 8 < 9 by repeatedly taking the smaller head of the two lists:
take 1 → 1
take 2 → 1 < 2
take 3 → 1 < 2 < 3
take 5 → 1 < 2 < 3 < 5
take 6 → 1 < 2 < 3 < 5 < 6
take 7 → 1 < 2 < 3 < 5 < 6 < 7
take 8, then 9 → 1 < 2 < 3 < 5 < 6 < 7 < 8 < 9

Time analysis
Let T(n) be the number of steps to sort n elements. Then
T(1) = 1
T(n) = 2T(n/2) + n for n > 1

Time analysis
T(n) = 2T(n/2) + n
     = 2(2T(n/4) + n/2) + n = 4T(n/4) + n + n
     = 4(2T(n/8) + n/4) + n + n = 8T(n/8) + n + n + n
     = …
     = O(n log n)

Every layer costs n steps
The total number of layers is log n, so the total time is O(n log n). (Recursion-tree view: the root node has size n, its two children have size n/2, and so on; the merge costs in each layer add up to n.)

Exponentiation Problem
Compute a^n.

Polynomial Problem Compute x^n. Compute a general polynomial: p(x) = a_n x^n + a_{n-1} x^{n-1} + … + a_1 x + a_0.

Exercise Draw the tree for merge sorting
15, 11, 4, 22, 31, 55, 71, 12, 7, 2, 5, 3, 6, 9, 8, 1. Point out the number of comparisons that you use.

Growth of Functions and Recursion Equations
Chapters 3-4: Growth of Functions and Recursion Equations. How do we judge how good an algorithm is? How do we tell which of two algorithms is better? This chapter introduces the notation that will be used from now on to express the time complexity of algorithms.

O-notation: f(n) = O(g(n)) means g(n) is an asymptotic upper bound for f(n).
O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0 }
Saying O(n^2) means: once n is large enough, the time needed is, "at worst", proportional to n^2.

Example: 3n^2 - 6n = O(n^2).

O-notation Drop low-order terms; ignore leading constants. Example:
we say that T(n) = O(g(n)) iff there exist positive constants c and n0 such that 0 ≤ T(n) ≤ c·g(n) for all n ≥ n0. Usually T(n) is the running time, and n is the size of the input.

Simplified Master Theorem
Let T(n) = aT(n/b) + cn^r be a recursive equation on the nonnegative integers, where a > 0, b > 1, c > 0, and r ≥ 0 are constants. Then:
1. If a > b^r, then T(n) = Θ(n^(log_b a)).
2. If a = b^r, then T(n) = Θ(n^r log n).
3. If a < b^r, then T(n) = Θ(n^r).
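Assuming the reconstructed statement above, two quick applications to recurrences from these slides:

```latex
\[
T(n) = 2T(n/2) + n:\quad a = 2,\ b = 2,\ r = 1,\ a = b^r
\;\Rightarrow\; T(n) = \Theta(n\log n).
\]
\[
T(n) = 4T(n/2) + n:\quad a = 4,\ b = 2,\ r = 1,\ a > b^r
\;\Rightarrow\; T(n) = \Theta(n^{\log_2 4}) = \Theta(n^2).
\]
```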


Sum
Let S = 1 + d + d^2 + … + d^k.

Sum
From dS - S = d^{k+1} - 1, we get S = (d^{k+1} - 1)/(d - 1) for d ≠ 1.

Sum
Assume that d is a constant.
Case 1: d > 1 (the main term is at the right end): S = Θ(d^k).
Case 2: d = 1 (each term is the same): S = k + 1.
Case 3: d < 1 (the main term is at the left end): S = Θ(1).

Order Assume that d and r are constants. Case 1: d > 1. Case 2: d = 1.

Simplified Master Theorem
Let T(n) = aT(n/b) + cn^r be a recursive equation on the nonnegative integers, where a > 0, b > 1, c > 0, and r > 0 are constants. Then:
1. If a > b^r, then T(n) = Θ(n^(log_b a)).
2. If a = b^r, then T(n) = Θ(n^r log n).
3. If a < b^r, then T(n) = Θ(n^r).

Layers of the recursion tree (diagram)

Proof. The number of nodes in the j-th layer is a^j.
The size of each subproblem in the j-th layer is n/b^j. The cost of each node in the j-th layer is c(n/b^j)^r. The total cost at the j-th layer is a^j · c(n/b^j)^r = cn^r (a/b^r)^j.

Proof. The number of layers is log_b n. The total cost over all layers is the sum over j = 0, …, log_b n of cn^r (a/b^r)^j.

Proof. The total cost over all layers is cn^r Σ_{j=0}^{log_b n} (a/b^r)^j.

Proof. The total cost over all layers is cn^r Σ_{j=0}^{log_b n} d^j, where d = a/b^r is a constant.

Proof. Case 1: d > 1 (i.e., a > b^r). So Σ_{j} d^j = Θ(d^(log_b n)). Therefore the total cost is Θ(n^r · d^(log_b n)) = Θ(n^(log_b a)).

Proof. Case 2: d = 1 (i.e., a = b^r). So every layer costs cn^r. Therefore the total cost is Θ(n^r log n).

Proof. Case 3: d < 1 (i.e., a < b^r). So Σ_{j} d^j = Θ(1). Therefore the total cost is Θ(n^r).

Simplified Master Theorem
Let T(n) = aT(n/b) + cn^r. Then:
1. If a > b^r, then T(n) = Θ(n^(log_b a)) (the main cost is in the bottom-layer region).
2. If a = b^r, then T(n) = Θ(n^r log n) (every layer has roughly the same cost).
3. If a < b^r, then T(n) = Θ(n^r) (the main cost is in the top-layer region).

3.1 Asymptotic notation Θ-notation: f(n) = Θ(g(n)) means g(n) is an asymptotically tight bound for f(n). Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0 } If an algorithm's running time is Θ(n^2), then once n is large enough, its running time is proportional to n^2.

Example: Prove 3n^2 - 6n = Θ(n^2).

Example: Prove 3n^2 - 6n = Θ(n^2).
Proof: We need to find constants c1, c2, and n0 such that c1·n^2 ≤ 3n^2 - 6n ≤ c2·n^2 for all n ≥ n0. Dividing by n^2: c1 ≤ 3 - 6/n ≤ c2. Select c1 = 2, c2 = 3, and n0 = 6.

Note: f(n) = Θ(g(n)) iff g(n) = Θ(f(n)). For example: n^2 = Θ(3n^2 - 6n).
O-notation: f(n) = O(g(n)) means g(n) is an asymptotic upper bound for f(n). O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0 } Saying O(n^2) means: once n is large enough, the time needed is, "at worst", proportional to n^2.

Θ(g(n)) ⊆ O(g(n)): f(n) = Θ(g(n)) implies f(n) = O(g(n)). For example, 6n = O(n) and 6n = O(n^2). Saying the computational time is O(n^2) means the time in the worst case is O(n^2).

Ω-notation: f(n) = Ω(g(n)) means g(n) is an asymptotic lower bound for f(n).
Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0 } Note: f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)). Saying Ω(n^2) means: once n is large enough, the time needed is "at least" proportional to n^2.

The three notations in pictures: Θ is a tight bound, O is an upper bound, and Ω is a lower bound.

o-notation: f(n) = o(g(n)) (little-oh of g of n)
o(g(n)) = { f(n) : for every positive constant c there exists a constant n0 > 0 such that 0 ≤ f(n) < c·g(n) for all n ≥ n0 } 2n = o(n^2), but 2n^2 ≠ o(n^2). f(n) = o(g(n)) can also be written lim_{n→∞} f(n)/g(n) = 0.

Comparison of functions
Functions: Ω, Θ, O, o correspond to the relations ≥, =, ≤, < on numbers. Transitivity, reflexivity, and symmetry carry over. But while two real numbers are always comparable, two functions may not be. Example: f(n) = n and g(n) = n^(1 + sin n); sometimes f(n) is larger, sometimes g(n) is larger.

Appendix A: Summation formulas

Simplified Master Theorem
Let Then, 1. If , then (The main cost is the bottom layers region) 2. If , then (Every layer has roughly the same cost) 3. If , then (The main cost is the top layers region)

Exercise For each of the following two recurrences, identify the main cost region and the solution with the simplified master theorem:


Divide and Conquer Method
The best-known algorithm design strategy:
1. Divide an instance of the problem into two or more smaller instances.
2. Solve the smaller instances recursively.
3. Obtain the solution to the original (larger) instance by combining these solutions.

Simplified Master Theorem
Let T(n) = aT(n/b) + cn^r be a recursive equation on the nonnegative integers, where a > 0, b > 1, c > 0, and r > 0 are constants. Then:
1. If a > b^r, then T(n) = Θ(n^(log_b a)).
2. If a = b^r, then T(n) = Θ(n^r log n).
3. If a < b^r, then T(n) = Θ(n^r).

Layers of the recursion tree (diagram)

Integer Multiplication
12 × 47 = 564 by the algorithm below: 12 × 7 = 84, 12 × 4 = 48 (shifted one place left), and 84 + 480 = 564.

Integer Multiplication
A = a1 a2 … an, B = b1 b2 … bn. The elementary-school algorithm forms one row of partial products d_{i1} d_{i2} … d_{in} per digit of B, each row shifted one place, and adds the rows. Efficiency: n^2 one-digit multiplications.

Karatsuba’s Algorithm
Using the classical pen-and-paper algorithm, two n-digit integers can be multiplied in O(n^2) operations. Karatsuba came up with a faster algorithm. Let A and B be two integers with
A = A1·10^k + A0, A0 < 10^k
B = B1·10^k + B0, B0 < 10^k
C = A·B = (A1·10^k + A0)(B1·10^k + B0) = A1B1·10^{2k} + (A1B0 + A0B1)·10^k + A0B0

A trivial analysis T(n) = 4T(n/2) + O(n), so T(n) = O(n^2).

Simplified Master Theorem
Let Then, 1. If , then (The main cost is the bottom layers region) 2. If , then (Every layer has roughly the same cost) 3. If , then (The main cost is the top layers region)

3 Multiplications Instead, C can be computed with only 3 multiplications:
T0 = A0B0
T1 = (A1 + A0)(B1 + B0)
T2 = A1B1
C = T2·10^{2k} + (T1 - T0 - T2)·10^k + T0

Complexity of Algorithm
Let T(n) be the time to compute the product of two n-digit numbers using Karatsuba’s algorithm. Assume n = 2^k. Then T(n) ≤ 3T(n/2) + cn, which gives T(n) = Θ(n^(lg 3)), lg 3 ≈ 1.58.

Matrix Multiplication
The straightforward method takes O(n^3) time.

IDEA: compute the 2×2 block product
[ r s ]   [ a b ]   [ e f ]
[ t u ] = [ c d ] × [ g h ]
r = ae + bg, s = af + bh, t = ce + dg, u = cf + dh

Strassen’s algorithm Multiply 2×2 matrices with only 7 recursive multiplications:
P1 = a ⋅ (f − h)
P2 = (a + b) ⋅ h
P3 = (c + d) ⋅ e
P4 = d ⋅ (g − e)
P5 = (a + d) ⋅ (e + h)
P6 = (b − d) ⋅ (g + h)
P7 = (a − c) ⋅ (e + f)

Strassen’s Algorithm
r = P5 + P4 − P2 + P6
s = P1 + P2
t = P3 + P4
u = P5 + P1 − P3 − P7

Problem Verify one of the four equations in Strassen’s algorithm:
u = P5 + P1 − P3 − P7

Strassen observed [1969] that the product of two matrices can be computed as follows:
[ C00 C01 ]   [ A00 A01 ]   [ B00 B01 ]
[ C10 C11 ] = [ A10 A11 ] * [ B10 B11 ]
C00 = M1 + M4 − M5 + M7
C01 = M3 + M5
C10 = M2 + M4
C11 = M1 + M3 − M2 + M6

M1 = (A00 + A11) × (B00 + B11)
M2 = (A10 + A11) × B00
M3 = A00 × (B01 − B11)
M4 = A11 × (B10 − B00)
M5 = (A00 + A01) × B11
M6 = (A10 − A00) × (B00 + B01)
M7 = (A01 − A11) × (B10 + B11)

Analysis of Strassen’s Algorithm
If n is not a power of 2, the matrices can be padded with zeros. Number of multiplications: T(n) = 7T(n/2) + O(n^2), T(1) = 1. Solution: T(n) = Θ(n^(log_2 7)) ≈ n^2.807, vs. n^3 for the brute-force algorithm.

Problem Verify two of the four equations in Strassen’s algorithm:
t = P3 + P4 and u = P5 + P1 − P3 − P7

Heap 3 7 5 10 11 6
Parent ≤ children, so the root is the smallest (a min-heap).

Heap 3 7 5 10 11 6
Parent ≤ children; the root is the smallest. Complete binary tree (every layer except the bottom is filled up).

Heap Operations Insertion Deletion: Remove root (take the least)

Heap Insertion 3

Insertion Adjustment 3 Adjust it on the path from new leaf to root

Heap Insertion 1 Except for the new leaf, every adjusted node gets a smaller value, so insertion does not damage the heap.

Deletion (Remove Root)

Deletion Adjustment 5 7 3 10 11 6 Take the last leaf to the root,
then adjust along a path from the root down to a leaf.

Deletion Adjustment 3 Deletion does not damage heap

Heap Representation A heap with no more than n elements uses an array h of size n. The children of h[i] are h[2i] and h[2i+1]; h[⌊i/2⌋] is the father of h[i].

Heap Representation A heap with no more than n elements uses an array h of size n. Prove by induction: left(i) = 2i, right(i) = 2i+1, father(i) = ⌊i/2⌋.

Bottom-Up-Adjust(i){
  if (i > 1 and h[i] < h[father(i)]) {
    swap h[i] and h[father(i)];
    Bottom-Up-Adjust(father(i))
  }
}

Insert
Insert(data, heapsize){
  heapsize = heapsize + 1;
  put data at h[heapsize];
  Bottom-Up-Adjust(heapsize)
}

Top-Down-Adjust(i){
  let h[child] be the smaller of h[left(i)] and h[right(i)] (considering only children within the heap);
  if (such a child exists and h[i] > h[child]) {
    swap h[i] and h[child];
    Top-Down-Adjust(child)
  }
}

Delete
Delete(heapsize){
  move h[heapsize] to h[1];
  heapsize = heapsize - 1;
  Top-Down-Adjust(1)
}

Complexity of Heap Operations
A heap with n elements has depth O(log n). An insertion takes O(log n) steps. A deletion takes O(log n) steps.

Heap Sorting Input: a1, a2, …, an. Build the heap with n insertions.
Cost: O(n log n). Remove the minimum from the heap n times (n deletions). Total cost: O(n log n).

ATM Traffic Shaping
Incoming traffic is split into per-VC queues (vc1 queue, vc2 queue, …, vcn queue); a scheduler decides which queue may send on the outgoing link.

Tele-communication
Each VC is considered as one phone connection between phones, through a switch/gateway.

Traffic in one virtual circuit
Incoming packets p1 … p7 may arrive bunched together in time; after shaping, the outgoing packets p1 … p7 are spaced apart.

Inter-packet Delay The inter-packet time gap must be big enough:
every virtual circuit i has a minimal inter-packet delay constant inter_delay_i, and the time gap between consecutive packets (e.g., packet 3 and packet 4) must exceed inter_delay_i.

A Trivial Algorithm After sending a packet on one VC, set the ready time for its next packet: ready_time = current_time + inter_delay. Periodically check all queues, and send any packet that does not violate its inter-packet delay.

Drawback of the Algorithm
Much time is wasted checking VCs that have no packet ready.

Another Algorithm Check whether at least one queue is ready to send
(current_time > ready_time). Use a heap to select the queue with the least ready time.

Heap for Selecting Queue
Each VC enters the heap as a pair <ready_time, vc_i>. The heap is ordered by ready_time.

Apply Heap to Traffic Control

When to insert into the heap? When a queue gets its first new packet, or
when a queue has just sent out a packet and still has packets waiting.

When to delete from the heap?
When outgoing bandwidth is available, and the least ready_time in the heap has expired.

Drawback of the One-Heap Solution
It cannot prevent a greedy VC. It may starve some VCs. Traffic control is not predictable.

Two-Heaps Design: a fairness heap and a timing heap over the queues vc1 … vc1000.

Two Heaps' Functions Time heap: controls the inter-packet delay,
keyed by <time_ready, vc_i>. Fairness heap: balances service among all VCs, keyed by <service_got, vc_i>.

Adjust service_got for fairness
Each VC has a weight w_i > 1. When a packet is sent, service_got = service_got + w_i. When a queue gets its first packet, service_got = max(service_got, time_stamp). The service received is inversely proportional to the weight w_i.

Problem Heap: 1 7 3 10 11 6 5.
Draw the steps to insert element 2. After 2 is inserted, draw the steps to remove the root.

Dynamic Programming

Dynamic Programming P(n) → P(m1), P(m2), …, P(mk) with answers S1, S2, …, Sk
DP, like divide-and-conquer, is an algorithm-design method built on recursion. As the diagram shows, a problem P(n) (whose parameter may be one number or several) is decomposed into k (k ≥ 1) subproblems of the same type but with smaller parameters, P(m1), P(m2), …, P(mk). By the recursive assumption, their answers are S1, S2, …, Sk; if we can combine S1, S2, …, Sk into the answer S of P(n), the recursive design succeeds. "Subproblems are not independent but overlapping" means that when two subproblems P(mi) and P(mj) are decomposed recursively, some subproblem of P(mi) may be identical to some subproblem of P(mj). A main concern of DP design is therefore to avoid recomputing these identical subproblems. Recursion: like divide-and-conquer. Overlapping subproblems: unlike divide-and-conquer.

Matrix Multiplication (definition)
Given a series of matrices A1, A2, …, An, where matrix Ai has size p_{i-1} × p_i, find a way to compute A1A2…An with the least number of scalar multiplications. Example: A1 × A2 × A3 × A4. There are 5 ways to compute the product of 4 matrices: (A1(A2(A3A4))), (A1((A2A3)A4)), ((A1A2)(A3A4)), ((A1(A2A3))A4), (((A1A2)A3)A4). Since matrix multiplication is associative, the order of computation does not affect the result. But when the matrices have different sizes, different orders can require very different amounts of computation. The problem is to find an order with the least computation.

Matrix Multiplication (Example)
(A1(A2(A3A4))): compute A3A4, then A2(A3A4), then A1(A2A3A4), adding the scalar multiplications of each step (the slide's dimension numbers are partly lost). For A1 × A2 × A3 × A4:
(A1(A2(A3A4))) costs 26418
(A1((A2A3)A4)) costs 4055
((A1A2)(A3A4)) costs 54201
((A1(A2A3))A4) costs 2856
(((A1A2)A3)A4) costs 10582
For these four matrices, the best and worst of the five orders differ many times over (2856 vs. 54201).

Catalan Number For any n, the number of ways to fully parenthesize the product of a chain of n+1 matrices = the number of binary trees with n nodes = the number of n pairs of fully matched parentheses = the n-th Catalan number = C(2n, n)/(n + 1) = Θ(4^n/n^(3/2)). How many ways are there to compute the product of n chained matrices? This count is a famous number in combinatorics, the Catalan number, which also counts many other combinatorial structures (as listed above). The number of orders for multiplying n+1 chained matrices equals the n-th Catalan number, which grows exponentially in n. Conclusion: brute force, enumerating all possible orders and computing the cost of each, is infeasible.

Multiplication Tree A1  A2  A3  A4 (A1(A2(A3A4))) (A1((A2A3)A4))

Multiplication Design (1)
If T is an optimal solution for A1, A2, …, An, with root split k and subtrees T1 (over 1, …, k) and T2 (over k+1, …, n): suppose T is the optimal multiplication tree for A1A2…An, its root is labeled k (i.e., the last multiplication joins the product of the first k matrices with the product of the rest), and its left and right subtrees are T1 and T2. Then it is easy to prove by contradiction that T1 and T2 are optimal multiplication trees for the subproblems A1A2…Ak and A(k+1)A(k+2)…An, respectively. This observation applies recursively to the two subproblems, down to chains of a single matrix. That is, T1 (resp. T2) is an optimal solution for A1, A2, …, Ak (resp. Ak+1, Ak+2, …, An).

Multiplication Design (2)
Let m[i, j] be the minimum number of scalar multiplications needed to compute the product Ai…Aj, for 1 ≤ i ≤ j ≤ n. If the optimal solution splits the product Ai…Aj = (Ai…Ak)(Ak+1…Aj) for some k, i ≤ k < j, then m[i, j] = m[i, k] + m[k+1, j] + p_{i-1} p_k p_j. Since k is unknown (we only know i ≤ k < j), we try every possible k and take the minimum. Hence:
m[i, j] = 0 if i = j
m[i, j] = min_{i ≤ k < j} { m[i, k] + m[k+1, j] + p_{i-1} p_k p_j } if i < j
The base case i = j is a chain of one matrix, which needs no multiplication.

Matrix Multiplication (Example)
Consider an example with the sequence of dimensions <5, 2, 3, 4, 6, 7, 8>, using m[i, j] = min_{i ≤ k < j} { m[i, k] + m[k+1, j] + p_{i-1} p_k p_j }. Compute m[i, j] in increasing order of j − i, from 1 up to n − 1, since this goes from smaller subproblems to larger ones; this is the bottom-up order. When the computation finishes, the optimal cost is in m[1, 6]; in this example its value is 348.

Matrix Multiplication (Find Solution)
m[i, j] = min_{i ≤ k < j} { m[i, k] + m[k+1, j] + p_{i-1} p_k p_j }
s[i, j] = a value of k that gives the minimum.
The recurrence alone yields only the optimal value (the least computation amount); recovering an optimal order takes a further step, although the information is already hidden in the earlier computation. Record in s[i, j] the k that achieved the minimum when computing m[i, j]. With s[i, j], the optimal multiplication tree can be read off top-down; in this example it is A1((((A2A3)A4)A5)A6).

Analysis
m[i, j] = min_{i ≤ k < j} { m[i, k] + m[k+1, j] + p_{i-1} p_k p_j }
Filling the entry m[i, j] takes Θ(j − i) operations. Hence the execution time of the algorithm is Θ(n^3). Space: Θ(n^2).

Steps for Developing a DP Algorithm
1. Characterize the structure of an optimal solution.
2. Derive a recursive formula for computing the values of optimal solutions.
3. Compute the value of an optimal solution in a bottom-up fashion (top-down is also applicable).
4. Construct an optimal solution in a top-down fashion.
These four basic steps can be matched one by one against the previous example. Step 1, finding the structure of an optimal solution, relates to the optimal-substructure property on the next slide. Step 2's recursive formula is the core product of a DP design; the subsequent programming stages (steps 3 and 4) are almost mechanical. Step 3's bottom-up computation is the usual practice, but computing top-down is also possible (see the next slide).

Elements of Dynamic Programming
Optimal substructure (a problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems)
Overlapping subproblems
Memoization
Optimal substructure, illustrated by the earlier "Multiplication Design (1)" slide, is the key requirement for solving optimization problems recursively. It is not exclusive to DP: other recursive design strategies such as divide-and-conquer and greedy algorithms also need it. Some textbooks call this property the principle of optimality.
Overlapping subproblems, mentioned in the opening slides, is the distinguishing feature of DP. Although DP algorithms are designed recursively, we usually do not implement them with plain recursive programs: a naive recursive implementation may solve the same subproblem many times and be very inefficient. Instead, we use an array so that each subproblem is solved only once: the answer to each solved subproblem is stored in the array and simply looked up when needed again.
Memoization: since an array "remembers" the answers to solved subproblems, in some situations we can implement the DP recurrence top-down (i.e., with a recursive program) using the array. This matters because the bottom-up implementation is, strictly speaking, also a kind of brute force: it computes all possible subproblems whether or not they are ever needed. When the number of subproblems is small (the matrix-chain problem has only C(n, 2) distinct subproblems), this is fine, but when the subproblem space is huge it is inefficient. In the top-down approach, to solve a subproblem we first check the array: if it has been solved, return the stored answer; otherwise solve it recursively and store the result. This guarantees that each subproblem is solved at most once, and that only the subproblems actually needed are solved.

Longest Common Subsequence (Def.)
Given two sequences X = <x1, x2, …, xm> and Y = <y1, y2, …, yn>, find a maximum-length common subsequence of X and Y. Example 1: Input: ABCBDAB and BDCABA. Common subsequences: AB, ABA, BCB, BCAB, BCBA, … Longest: BCAB, BCBA, …; length = 4. Another way to view LCS: write the two sequences one above the other and draw as many line segments between them as possible, subject to: (1) the two endpoints of a segment must be the same letter; (2) no two segments may cross or share an endpoint.
A B C B D A B
B D C A B A
Example 2: vintner / writers.

Longest Common Subsequence (Design 1)
Let Z = <z1, z2, …, zk> be an LCS of X = <x1, x2, …, xm> and Y = <y1, y2, …, yn>.
If zk ≠ xm, then Z is an LCS of <x1, x2, …, x_{m-1}> and Y.
If zk ≠ yn, then Z is an LCS of X and <y1, y2, …, y_{n-1}>.
If zk = xm = yn, then <z1, z2, …, z_{k-1}> is an LCS of <x1, …, x_{m-1}> and <y1, …, y_{n-1}>.
This slide argues that the LCS problem has the optimal-substructure property.

Longest Common Subsequence (Design 2)
Let L[i, j] be the length of an LCS of the prefixes Xi = <x1, x2, …, xi> and Yj = <y1, y2, …, yj>, for 1 ≤ i ≤ m and 1 ≤ j ≤ n. We have:
L[i, j] = 0 if i = 0 or j = 0
L[i, j] = L[i-1, j-1] + 1 if i, j > 0 and xi = yj
L[i, j] = max(L[i, j-1], L[i-1, j]) if i, j > 0 and xi ≠ yj
This recurrence follows from the optimal-substructure property on the previous slide.

Longest Common Subsequence
L[i, j] = 0 if i = 0 or j = 0
L[i, j] = L[i-1, j-1] + 1 if i, j > 0 and xi = yj
L[i, j] = max(L[i, j-1], L[i-1, j]) if i, j > 0 and xi ≠ yj
Example: X = ABCBDAB, Y = BDCABA. Time: Θ(mn). Space: Θ(mn).
The recurrence is implemented with a two-dimensional array of size m × n, where m and n are the lengths of the input strings. Each entry [i, j] is computed from its three neighbors: left [i, j-1], up [i-1, j], and upper-left [i-1, j-1]; hence the time complexity equals the space complexity. For the two strings of lengths 7 and 6 above, applying the recurrence left to right and top to bottom gives LCS length 4 (row 0 and column 0, all zeros, are omitted here).
The computation also yields an actual LCS, not just its length: record, at each position, which neighbor the value of [i, j] came from. Entries that came from the upper-left (when xi = yj) are "match" entries; the others take the maximum of the left and up neighbors. To extract an LCS, start at [m, n] and walk to any [0, j] or [i, 0] (where L = 0): at a match entry, always move diagonally; otherwise move to the larger of the up and left neighbors. The xi at the match entries along the path, in order, form an LCS. When up and left are equal, either move is allowed, so there may be several paths: the solution is not unique. LCS: BCBA.


Optimal Polygon Triangulation
v0 v1 v2 v3 v4 v5 v6. A line segment between any two nonadjacent vertices of a convex polygon P passes through its interior; such a segment is a chord of the polygon. A triangulation T of a convex polygon P is a set of chords that partition P into disjoint triangles; each triangle's vertices are vertices of P and its sides are sides of P or chords in T. A convex polygon has many triangulations, but by induction any triangulation of an n-gon has exactly n−3 chords and n−2 triangles. Suppose each triangle has a weight determined by its vertices or sides (for example, the sum of its side lengths). The optimal triangulation problem: find a triangulation such that the sum of the weights of the triangles in the triangulation is minimized.

Optimal Polygon Triangulation (Design 1)
If T is an optimal solution for v0, v1, …, vn, with the triangle v0 vk vn splitting it into T1 and T2: consider an (n+1)-gon P with vertices numbered counterclockwise v0, v1, …, vn, and let T be an optimal triangulation of P. Some triangle of T has the segment v0vn as one side; let vk be its third vertex. This triangle splits P into two smaller polygons. Removing segments v0vk and vkvn from T, the remaining chords naturally split into two groups, each an optimal triangulation of one of the two smaller polygons. Hence the problem has optimal substructure. That is, T1 (resp. T2) is an optimal solution for v0, v1, …, vk (resp. vk, vk+1, …, vn), 1 ≤ k < n.

Optimal Polygon Triangulation (Design 2)
Let t[i, j] be the weight of an optimal triangulation of the polygon v_{i-1}, vi, …, vj, for 1 ≤ i < j ≤ n. If the triangulation splits this polygon into v_{i-1}, …, vk and vk, vk+1, …, vj via the triangle v_{i-1} vk vj for some k, then t[i, j] = t[i, k] + t[k+1, j] + w(v_{i-1} vk vj). Hence:
t[i, j] = 0 if i = j
t[i, j] = min_{i ≤ k < j} { t[i, k] + t[k+1, j] + w(v_{i-1} vk vj) } if i < j
This recurrence is almost identical to the matrix-chain recurrence: map each side v_{i-1}vi to a p_{i-1} × p_i matrix Ai and set w(v_{i-1} vk vj) = p_{i-1} p_k p_j, and the two formulas coincide, so the matrix-chain problem is a special case of this one. The remaining steps (computing the optimal weight and recovering an optimal triangulation) mirror the earlier discussion and are omitted.

Catalan Number Segner's recurrence formula gives: C_{n+1} = Σ_{i=0}^{n} C_i C_{n-i}, with C_0 = 1.

Problem Draw the dynamic programming table to find the longest common subsequence of BACAC and CABC.

Data Structures
Linked List
Heap
Application of heap in an industry product
Program = Data Structure + Algorithm

Linked List 8 10 15

Node Structure
struct listnode {
    type data;
    struct listnode *nextPtr;   /* self-referential pointer */
};
Dynamic memory allocation: apply for memory when it is needed; release memory when it is no longer needed.

Insertion: add a new node to the linked list. Deletion: remove a node from the current linked list.

Linked List to Implement
The linked list keeps characters in increasing order: startPtr → c → e → h

Insertion
Create a new node: apply for a new piece of memory.
Find the place to insert.
Adjust the nearby pointers.

After g is inserted, the linked list is in increasing order of characters:
startPtr → c → e → g → h

Find Location to Insert
startPtr → c → e → h; previousPtr and currentPtr walk down the list until g's place is found.

Find Location to Insert
startPtr → c → e → h; the new node g is then linked in between e and h.

Deletion
Find the node to delete.
Adjust the pointers near the deleted node.
Release the memory of the deleted node.

Find the Node to Delete
startPtr → c → e → h

Remove the node and release its memory
startPtr → c → h

After e is deleted, the linked list is in increasing order of characters:
startPtr → c → h

Queue First in, First Out

Three Important Operations
Supporting these 3 operations is the foundation of modern databases:
Search
Insertion
Deletion

Tree A tree is a 2-dimensional structure: 47 25 77 11 43 93
Binary search tree: root, left child, right child; numbers in the left subtree ≤ the node's number ≤ numbers in the right subtree.

Data structure for one node
struct treenode {
    struct treenode *leftPtr;
    int data;
    struct treenode *rightPtr;
};
typedef struct treenode TreeNode;
typedef TreeNode *TreeNodePtr;

47 25 77 11 43 93


Operations
Insertion
Traversal: inorder, preorder, postorder

Insertion
void insertNode(TreeNodePtr *treePtr, int value) {
    if (*treePtr is empty) {
        allocate memory and put value here
    } else if (value < (*treePtr)->data)
        insert into the left subtree
    else
        insert into the right subtree
}

void insertNode(TreeNodePtr *treePtr, int value) {
    if (*treePtr == NULL) {
        *treePtr = malloc(sizeof(TreeNode));
        if (*treePtr != NULL) {
            (*treePtr)->data = value;
            (*treePtr)->leftPtr = NULL;
            (*treePtr)->rightPtr = NULL;
        } else
            printf("No memory available.\n");
    } else if (value < (*treePtr)->data)
        insertNode(&((*treePtr)->leftPtr), value);
    else if (value > (*treePtr)->data)
        insertNode(&((*treePtr)->rightPtr), value);
    else
        printf("dup");
}

void inOrder(TreeNodePtr treePtr)
{
    if (treePtr != NULL) {
        inOrder(treePtr->leftPtr);
        printf("%3d", treePtr->data);
        inOrder(treePtr->rightPtr);
    }
}

void preOrder(TreeNodePtr treePtr)
{
    if (treePtr != NULL) {
        printf("%3d", treePtr->data);
        preOrder(treePtr->leftPtr);
        preOrder(treePtr->rightPtr);
    }
}

void postOrder(TreeNodePtr treePtr)
{
    if (treePtr != NULL) {
        postOrder(treePtr->leftPtr);
        postOrder(treePtr->rightPtr);
        printf("%3d", treePtr->data);
    }
}

Problem Implement a function to find the largest and smallest elements in the binary tree: void max(TreeNodePtr *treePtr, int *largest, int *least)

Binary tree When the binary tree is not balanced, search, insertion, and deletion each take O(n) steps in the worst case.

Maintaining Balance in a Binary Search Tree
Height is governed by the initial order and the sequence of insertions/deletions. We need a structure that tends to maintain balance. How? Grow in "width" first, then height: accommodate horizontal growth with more data at each level. Nodes take two forms: one data member and two children (a "two-node"), or two data members and three children (a "three-node").

2-3 Tree Each node that is not a leaf has either 2 or 3 sons.
Every path from the root to a leaf has the same length.

Depth and Size for a 2-3 Tree
Let d be the depth of a 2-3 tree. The k-th level has between 2^k and 3^k nodes. A 2-3 tree of depth d = Θ(log n) can hold n nodes at the leaf level.

2-3 Tree Nodes
An interior node stores guide values: S = the largest value in its left subtree, L = the largest value in its middle subtree, and (for a three-son node) M = the largest value in its right subtree. Keys ≤ S go left, keys > S and ≤ L go middle, keys > L go right.

2-3 Tree Nodes S:L (a two-son node)

2-3 Tree Nodes S:L:M (a three-son node)

Search in 2-3 Tree
Search(a, r){
  if (r only has leaf children) return r
  else {
    if (a <= S) Search(a, left_child)
    else if (a <= L) Search(a, mid_child)
    else Search(a, right_child)
  }
}

Insertion(36) 40:100 20:40 60:80:100 60 80 100 20 40

Insertion(36) 40:100 20:36:40 60:80:100 60 80 100 20 36 40


Insertion(50) 40:100 20:36:40 60:80:100 60 80 100 20 36 40

Insertion(50) 40:100 20:36:40 60:80:100 80 100 20 36 40 50 60

Insertion(50) 36:60:100 20:36:40 50:60 80:100 80 100 20 36 40 50 60

Insertion insertion(a) { use search(root, a) to find the node r
make a a son of r if (r has four sons) adjust the tree from r up to the root by addson(r) }

Insertion and Splits on the path to root
split stops here split starts here

Addson(v) Addson(v) { create a new node v’
make the two rightmost sons of v sons of v’ if (v has no father) { create a new root r make v and v’ the left and right sons of r } else { make v’ a son of father(v) to the right of v if (father(v) has four sons) then addson(father(v)) } }

Computational steps for insertion
Assume the tree has n nodes at the leaf level Insertion operates on the nodes from the root to a leaf The path from the root to a leaf has Θ(log n) nodes The number of steps for insertion is O(log n)

Deletion(80) 36:50:100 20:36 40:50 60:80:100 80 100 20 36 40 50 60

Deletion(80) 36:50:100 20:36 40:50 60:100 100 20 36 40 50 60

Deletion(4) 5:9 3:5 7:9 1:3 4:5 6:7 8:9 9 1 3 4 5 6 7 8

Deletion(4) 5:9 3:5 7:9 1:3 4:5 6:7 8:9 9 1 3 5 6 7 8

Deletion(4) 5:9 3:5 7:9 1:3:5 6:7 8:9 9 1 3 5 6 7 8

Deletion(4) 5:9 3:7:9 1:3 6:7 8:9 9 1 3 5 6 7 8

Deletion(4) 3:7:9 1:3 6:7 8:9 9 1 3 5 6 7 8

Deletion Stops at this level merge starts at this level

Deletion Delete(r,a){ remove the son of r with value a
call RemoveSon( r) to recursively adjust the tree (roughly along the path from r to root) }

RemoveSon(r) RemoveSon(r) { if (r has one child) {
let r’ be a brother of r if (r’ has 3 sons) let r get a son from r’ else { make the son of r a son of r’ let f be the father of r remove r RemoveSon(f) } } }

Search S:L:M

Problem: show how to delete 7
5:9 3:5 7:9 1:3 4:5 6:7 8:9 9 1 3 4 5 6 7 8

Binomial Heaps

Operations Insert(H,x) Minimum(H) Extract-Min(H) Union(H1, H2)
Decrease-Key(H,x, k) Delete(H,x)

Binomial trees: B0 has a single node; Bk is formed by linking the roots of two Bk-1 trees (figure: B1, B2, B3)

Lemma1 For the binomial tree Bk, 1. there are 2^k nodes,
2. the height of the tree is k, 3. there are exactly C(k, i) nodes at depth i for i = 0, 1, …, k, 4. the root has degree k, which is greater than that of any other node; moreover, if the children of the root are numbered from left to right by k-1, k-2, …, 0, then child i is the root of a subtree Bi Proof: 1) By induction, 2^(k-1) + 2^(k-1) = 2^k 2) By induction, 1 + (k-1) = k 3) C(k-1, i) + C(k-1, i-1) = C(k, i), by induction

Binomial heaps: Corollary:
The maximum degree of any node in an n-node binomial tree is lg(n) Binomial heaps: H: a set of binomial trees satisfying the following: 1. Each binomial tree in H is heap-ordered: the key of a node is greater than or equal to the key of its parent 2. There is at most one binomial tree in H whose root has a given degree Note: By 2, an n-node binomial heap H consists of at most ⌊lg n⌋ + 1 binomial trees

Representation of binomial heaps
head[H] (figure: a binomial heap H with 13 nodes 10, 1, 6, 12, 25, 8, 14, 29, 11, 18, 17, 38, 27, organized as trees B0, B2, and B3)

(b) Detailed representation of the same heap: each node stores key, degree, and p / child / sibling pointers; head[H] points to the first root

Operations on binomial heaps
Creating a new binomial heap Finding the minimum key Uniting 2 binomial heaps Time: O( lg n)

Binomial-Heap-Merge: linking two Bk-1 trees y and z produces one Bk (the root with the larger key becomes a child of the other)

(figure: running Binomial-Heap-Union on two heaps H1 and H2. Panel (a) shows the inputs; (b) shows the root lists merged by Binomial-Heap-Merge into one list sorted by degree; the remaining panels sweep the pointers prev-x, x, next-x along the root list, applying Cases 1–4 to link roots of equal degree until no two roots share a degree)

Case1

Case 3 Case 4

The four cases for Binomial-Heap-Union (x and next-x are consecutive roots in the merged list): Case 1: degree(x) ≠ degree(next-x); advance the pointers. Case 2: x, next-x, and sibling[next-x] all have the same degree; advance the pointers (the pair is handled later). Case 3: degree(x) = degree(next-x) and key[x] ≤ key[next-x]; link next-x under x, forming a Bk+1. Case 4: degree(x) = degree(next-x) and key[x] > key[next-x]; link x under next-x, forming a Bk+1.

Insert a node Extracting the node with minimum key

(figure: extracting the node with minimum key, key 1: (a)–(b) the minimum root x is found and removed from the root list; (c) the order of x's children is reversed to form a new binomial heap H'; (d) H and H' are united)

Decreasing a key Deleting a key

(figure: decreasing a key: the decreased node y repeatedly exchanges keys with its parent z while the heap order is violated, bubbling the key up until heap order is restored)

Minimum Spanning Trees (Greedy Algorithms)

Graph Graph G = (V, E): a set of nodes V and a set of edges E ⊆ V × V (figure: an example graph with its node set V and edge set E)

Path Graph G=(V,E) A path is a series of edges linked one after another A loop is a path that starts and ends at the same node

Tree A graph is connected if every two nodes have a path connecting them A tree is a connected graph without a loop

Connected Graph Tree Every connected graph can be converted into tree by removing some edges Removing one edge on a loop does not damage the connectivity.

A tree is a minimal connected graph
Removing any edge of a tree damages the connectivity Proof. Tree T=(V,E). Let (v1, v2) be removed from T, giving T’ = (V, E − {(v1,v2)}). If T’ were still connected, there would be another path between v1 and v2, so T would have a loop containing v1 and v2. Contradiction!

Number of edges in a tree
Each tree has a node with only one edge (a leaf) Proof. Start from one node and build a path; it must eventually meet a node with only one edge, since otherwise the path revisits a node and the tree has a loop. Each tree of n nodes has n-1 edges Proof. By induction. It is true for n = 1, 2. Assume it is true for n. For n+1 nodes, find a node with one edge and remove it; by the inductive assumption the remaining tree has n-1 edges, so the original tree has n edges.

Unique path on tree Every two nodes in a tree have a unique path.
Proof. If there were two different paths between the same pair of nodes, they would form a loop.

Many graph problems have weighted edges
Weighted Graph G=(V,E) (figure: nodes a–h with edge weights 1–9) Many graph problems have weighted edges All weights are positive values here Quite a few practical problems can be modeled as graph-theoretic problems, and many of these are considered on weighted graphs, e.g. the shortest-path problem and the minimum spanning tree (MST) problem discussed in this chapter. Here we assume that every edge of the given graph carries a weight greater than 0. In many applications, such as circuits, traffic networks, and computer networks, an edge weight represents the cost of building that connection, so the positive-weight assumption is reasonable. Some problems are modeled as graphs with negative edge weights; an algorithm designed under the positive-weight assumption may not apply to that situation, and we return to this issue later. A weighted undirected graph is drawn as on the slide: small circles represent vertices, segments represent edges, and the number beside an edge is its weight.

Minimum Spanning Trees (MST)
Find the lowest-cost way to connect all of the points (the cost is the sum of weights of the selected edges). The solution must be a tree. (Why?) A spanning tree: a subgraph that is a tree and connects all of the points. A graph may contain an exponential number of spanning trees. (e.g. the number of spanning trees of a complete graph is n^(n-2).) The MST problem originally comes from finding the cheapest way to connect all the vertices. The given graph is assumed to be connected, often even a complete graph (any two points are directly joined); an edge's weight is the cost of building a direct connection between its two endpoints. Connecting all the vertices means selecting some edges that, together with all the vertices, form a connected subgraph of the original graph; the cost is the sum of the selected edge weights. Because edge weights are assumed positive, a solution always forms a tree. A subtree of a graph containing all its vertices is called a spanning tree, so this problem is equivalent to finding a spanning tree of minimum total edge weight, the MST. The number of spanning trees of a graph can be an exponential function of the number of vertices n, so brute force is infeasible.

A High-Level Greedy Algorithm for MST
while (T=(V,A) is not a spanning tree of G) { select a safe edge for A; } The algorithm grows a MST one edge at a time and maintains that A is always a subset of some MST. An edge is safe if it can be added to A without destroying this invariant. How to check that T is a spanning tree of G? How to select a “safe edge”? Two famous MST algorithms were designed by Kruskal and Prim. Both are classified as greedy algorithms, and their basic idea is shown by the pseudo code on this slide. The algorithm adds one edge at a time and guarantees that each added edge is part of some MST; such an edge is called a safe edge. This pseudo code is quite high level, and a few details remain to be settled. The first question is easy to solve: a spanning tree has exactly |V|−1 edges, so just check whether |V|−1 edges have been selected. Guaranteeing that an added edge is a safe edge is trickier; the two algorithms do it slightly differently, but the underlying principle is the same, namely the lemma on the next slide.

MST Basic Lemma Let V = V1 + V2, where V1 and V2 have no intersection
Define the cut (V1, V2) = { uv | u ∈ V1 and v ∈ V2 }. If xy ∈ (V1, V2) and w(xy) = min { w(uv) | uv ∈ (V1, V2) }, then xy is contained in some MST. (figure: nodes a–h with edge weights 1–9, partitioned into V1 and V2) This lemma is the basic principle by which the two MST algorithms guarantee they select safe edges. The notation V = V1 + V2 denotes a partition of the vertex set V: the union of V1 and V2 is V, and their intersection is empty. The set of all edges between V1 and V2 is generally called a cut of the graph G; whenever an edge xy has minimum weight in some cut, xy is guaranteed to be an edge of some MST. Draw a few arbitrary partitions on the accompanying figure to get a feel for the lemma.

Proof Let xy have minimal weight w(xy) among the edges connecting V1 and V2, and suppose some MST T does not contain xy. Adding xy to T creates a loop, which must cross the cut at some other edge uv connecting V1 and V2. Replacing uv by xy yields a spanning tree of no greater weight, so xy is contained in some MST.

Kruskal’s Algorithm (pseudo code 1)
for (each edge in order by nondecreasing weight) if (adding the edge to A doesn't create a cycle) { add it to A; if (|A| == n−1) break; } How to check that adding an edge does not create a cycle? Kruskal's algorithm simply considers the edges one by one in order of increasing weight. If an edge would form a cycle together with the previously selected edges, it is clearly not a safe edge; if it creates no cycle, the MST basic lemma shows that it is a safe edge. The remaining question is how to check efficiently whether an added edge creates a cycle.

Kruskal’s Algorithm (Example 1/3)
b 3 5 1 9 4 7 2 f a e c d g h This and the following slides use an example to show Kruskal's algorithm constructing an MST. The four red edges are the four of smallest weight; since adding these four edges creates no cycle, Kruskal's algorithm adds all four to the set A.

Kruskal’s Algorithm (Example 2/3)
b 3 5 1 9 4 7 2 f a e c d g h Next consider the two edges of weight 3, ab and dg. Note that adding ab would create a cycle, so only dg is added to A.

Kruskal’s Algorithm (Example 3/3)
b 3 5 1 9 4 7 2 f a e c d g h The edges considered next have weight 4. Note that adding de would create a cycle, but bf and gh would not. At this point n−1 = 7 edges are in A, so Kruskal's algorithm stops and yields an MST of total weight 17. MST cost = 17

Kruskal’s Algorithm (pseudo code 2)
A = ; initial(n); // for each node x construct a set {x} for( each edge xy in order by nondecreasing weight) if ( ! find(x, y) ) { union(x, y); add xy to A; if( | A| == n1 ) break; } 這張投影片主要描述：如何有效率的檢查加入一個邊 xy 到邊集合 A 中是否會產生一個迴圈。 會產生迴圈的充要條件為 x 與 y 在子圖 (V, A) 的同一 connected component (c.c.) 裡。一般要檢查這個條件是否成立，只要在子圖中做一次 depth-first search 即可。但這個檢查可能會做 |E| 次，因此做這麼多次depth-first search 太沒效率。 一個漂亮的作法為利用一種實做 disjoint sets 的高級資料結構（請參考 “Data structures for disjoint sets” 一章）。 演算法一開始 A 為空集合，所以可以看成在子圖 (V, A) 中，每一個頂點自成一個 c.c.。依序加入邊到後，子圖的 c.c. 個數漸漸減少，一直到最後演算法停時只剩下一個 c.c.。 投影片裡描述的實做方式，即是把每一個 c.c. 看成是一個由 c.c. 內的頂點所成的集合，顯然這些集合為 disjoint。因此加入邊 xy 會產生一個迴圈等價於 x 與 y 在同一個 c.c.，也等價於 x 與 y 在同一個集合裡。又加入一個邊 xy 到 A，相當於將 x 與 y 所在的集合聯集起來。 find(x, y) = true iff. x and y are in the same set union(x, y): unite the two sets that contain x and y, respectively. 257

Prim’s Algorithm (pseudo code 1)
ALGORITHM Prim(G) // Input: A weighted connected graph G=(V,E) // Output: A MST T=(V, A) VT ← { v0 } // Any vertex will do; A ← ∅; for i ← 1 to |V|−1 do find an edge xy ∈ (VT, V−VT) s.t. its weight is minimized among all edges in (VT, V−VT); VT ← VT ∪ { y }; A ← A ∪ { xy }; Prim's algorithm finds safe edges for A in a slightly different way. It maintains a gradually growing vertex set VT, which defines a cut (VT, V−VT), and adds the minimum-weight edge xy of this cut to A. By the MST basic lemma, xy must be an edge of some MST; moreover, VT always remains the set of endpoints of the edges in A, so A is always contained in some MST and xy is a safe edge for A.

Prim’s Algorithm (Example 1/8)
b 3 5 1 9 4 7 2 f a e c d g h The following slides use an example to illustrate how Prim's algorithm works. Purple points are the vertices in VT. The minimum-weight edge between a purple and a blue point is ac, so ac is added to A.

Prim’s Algorithm (Example 2/8)
b 3 5 1 9 4 7 2 f a e c d g h Edge ac has been added to A (red edges denote edges in A). The minimum-weight edge between a purple and a blue point is bc, so bc is added to A.

Prim’s Algorithm (Example 3/8)
b 3 5 1 9 4 7 2 f a e c d g h The minimum-weight edge between a purple and a blue point is bf, so bf is added to A.

Prim’s Algorithm (Example 4/8)
b 3 5 1 9 4 7 2 f a e c d g h The minimum-weight edge between a purple and a blue point is ef, so ef is added to A.

Prim’s Algorithm (Example 5/8)
b 3 5 1 9 4 7 2 f a e c d g h The minimum-weight edge between a purple and a blue point is eg, so eg is added to A.

Prim’s Algorithm (Example 6/8)
b 3 5 1 9 4 7 2 f a e c d g h The minimum-weight edge between a purple and a blue point is dg, so dg is added to A.

Prim’s Algorithm (Example 7/8)
b 3 5 1 9 4 7 2 f a e c d g h The minimum-weight edge between a purple and a blue point is gh, so gh is added to A.

Prim’s Algorithm (Example 8/8)
b 3 5 1 9 4 7 2 f a e c d g h Finally n−1 = 7 edges have been added to A, so the algorithm stops and yields an MST of total weight 17. MST cost = 17

Prim’s Algorithm (pseudo code 2)
Build a priority queue Q for V with key[u] ← ∞ for all u ∈ V; key[v0] ← 0; π[v0] ← NIL; // Any vertex will do While (Q ≠ ∅) { u = Extract-Min(Q); for (each v ∈ Adj(u)) if (v ∈ Q && w(u, v) < key[v]) { π[v] = u; key[v] = w(u, v); Change-Priority(Q, v, key[v]); } } Pseudo code 1 mainly describes the idea of Prim's algorithm; pseudo code 2 here is closer to an implementation. In pseudo code 1, a loop finds the minimum-weight edge between VT and V−VT each time; pseudo code 2 achieves this with a priority queue Q. Note that Q is built over the vertices rather than the edges, with an array key[u] recording each vertex's priority; building Q over the edges is also possible, but the operations become much more complicated. Inside the while loop, extracting a vertex u from Q corresponds to adding u to the set VT in pseudo code 1 (and to the point turning from blue to purple in the example); the endpoints of the minimum-weight edge found between VT and V−VT are u and π[u]. The output MST is recorded in the array π[·]: it can be viewed as a rooted tree with root v0, where π[u] records the parent of u in this rooted tree. A vertex's key[u] value may change before the vertex is taken out of Q, hence the call to the subroutine Change-Priority().
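A minimal C sketch of this loop on an adjacency matrix, with the priority queue replaced by a linear scan (the unsorted-list variant mentioned in the analysis slide); the function name, the fixed capacity NV, and the convention that 0 means "no edge" are assumptions of this sketch:

```c
#include <limits.h>

#define NV 8   /* capacity of this sketch */

/* Returns the MST weight of a connected graph given as an adjacency
   matrix w (0 = no edge); key[] plays the slide's key[u]. */
int prim(int w[NV][NV], int n) {
    int key[NV], inT[NV], cost = 0;
    for (int v = 0; v < n; v++) { key[v] = INT_MAX; inT[v] = 0; }
    key[0] = 0;                            /* any start vertex will do */
    for (int i = 0; i < n; i++) {
        int u = -1;
        for (int v = 0; v < n; v++)        /* Extract-Min by scanning */
            if (!inT[v] && (u < 0 || key[v] < key[u])) u = v;
        inT[u] = 1;                        /* u joins VT */
        cost += key[u];
        for (int v = 0; v < n; v++)        /* lower keys of u's neighbors */
            if (w[u][v] && !inT[v] && w[u][v] < key[v]) key[v] = w[u][v];
    }
    return cost;
}
```

With this linear-scan queue the running time is O(n^2), which is the right choice for dense graphs stored as an adjacency matrix.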

Minimum Spanning Tree (analysis)
Let n = |V(G)|, m = |E(G)|. Execution time of Kruskal's algorithm (using union-find operations): O(m log m) = O(m log n). Running time of Prim's algorithm with adjacency lists + a (binary or ordinary) heap: O((m+n) log n) = O(m log n). The bottleneck of Kruskal's algorithm is sorting the edge weights. Prim's algorithm amounts to a kind of graph search on the given graph, which may be called priority-first search: it searches according to the vertices' key values, and each vertex and edge is probed at most twice, so the time complexity depends heavily on the data structure representing the graph. Following the usual convention for graph algorithms, an adjacency matrix is suitable when the edges are dense, and adjacency lists are recommended when the edges are sparse. The complexity also depends on the concrete data structure implementing the priority queue: by pseudo code 2, the algorithm makes O(n) calls to Extract-Min() and O(m) calls to Change-Priority(). Different implementations give different average costs per operation: binary heap: Extract-Min O(log n), Change-Priority O(log n); unsorted list: O(n) and O(1); Fibonacci heap: O(log n) and O(1). Consequently, when an adjacency matrix represents the graph, the O(n^2) time is unavoidable anyway, so the crudest unsorted list is then the more efficient way to implement the priority queue.

Find the minimum spanning tree with Prim’s algorithm, starting from node a
Find the minimum spanning tree with Prim’s algorithm, starting from node a. Show the steps. b 1 6 2 10 7 4 5 8 3 9 f a e c d g h

Problem 1 1. Give asymptotic upper and lower bounds for T(n) in each of the following recurrences. Assume that T(n) is constant for n ≤ 2. Make your bounds as tight as possible, and justify your answers. a) T(n)=8T(n/2)+ b) T(n)=2T(n/4)+ c) T(n)=T(n-1)+ d) T(n)= T( ) +1

Problem 1. a) Upper bound (by Simplified Master Theorem ) Lower Bound
(by the recursion)

Problem 1. b) Upper bound (by Simplified Master Theorem Case 2)
Lower Bound (by the recursion tree analysis)

Problem 1. c) Upper bound Lower Bound

Problem 1. c) We have

Problem 1.d) Upper bound Lower Bound

Problem 1. d) Upper bound (by Simplified Master Theorem Case 3)
Lower Bound (by the recursion tree)

Problem 2 2. Let A[0...n-1] be an array of n distinct integers. A pair (A[i], A[j]) is said to be an inversion if these numbers are out of order, i.e., i<j but A[i]>A[j]. Design an O(n log n) time algorithm for counting the number of inversions.

Solution Revise the merge sort.
When merging two sorted sub-arrays, compare the two front elements. a) Remove the front left element if it is less than or equal to the front element on the right. b) If the front left element is larger than the front right one, increase the counter by the number of elements remaining in the left half, and remove the front right element.
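The rules above can be sketched in C as a counting merge sort (the function names are illustrative; `buf` is an auxiliary array like the slide's `b`):

```c
#include <string.h>

/* buf is scratch space with the same length as a. */
static long merge_count(int *a, int lo, int mid, int hi, int *buf) {
    long inv = 0;
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi) {
        if (a[i] <= a[j]) buf[k++] = a[i++];            /* rule a) */
        else { inv += mid - i + 1; buf[k++] = a[j++]; } /* rule b): all of
                                                           a[i..mid] invert
                                                           with a[j] */
    }
    while (i <= mid) buf[k++] = a[i++];
    while (j <= hi)  buf[k++] = a[j++];
    memcpy(a + lo, buf + lo, (size_t)(hi - lo + 1) * sizeof(int));
    return inv;
}

/* Counts inversions in a[lo..hi] in O(n log n), sorting a as a side effect. */
long count_inversions(int *a, int lo, int hi, int *buf) {
    if (lo >= hi) return 0;
    int mid = (lo + hi) / 2;
    return count_inversions(a, lo, mid, buf)
         + count_inversions(a, mid + 1, hi, buf)
         + merge_count(a, lo, mid, hi, buf);
}
```

Each merge is O(n) and there are O(log n) levels, giving the required O(n log n) bound.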

Problem 3: Bubble, Merge, and Heap Sorts
a) int bubblesort(int *a, int size) b) int mergesort(int *a, int size) c) int generate(int *a, int size) d) Test both with 10; 100; 1,000; 10,000; 100,000; 1,000,000; and 4,000,000 integers.

Merge void merge(long int *a, long int lo, long int m, long int hi) {
    long int i, j, k;
    i = 0; j = lo;
    // copy first half of array a to auxiliary array b
    while (j <= m) b[i++] = a[j++];
    i = 0; k = lo;
    // copy back next-greatest element at each time
    while (k < j && j <= hi)
        if (b[i] <= a[j]) a[k++] = b[i++];
        else a[k++] = a[j++];
    // copy back remaining elements of first half (if any)
    while (k < j) a[k++] = b[i++];
}

Mergesort void mergesort(long int *a, long int lo, long int hi) {
    if (lo < hi) {
        long int m = (lo + hi) / 2;
        mergesort(&a[0], lo, m);
        mergesort(&a[0], m + 1, hi);
        merge(&a[0], lo, m, hi);
    }
}

The beginning of the program
#include <stdio.h> #include <stdlib.h> #include <time.h> #define ARRAYSIZE 1000000 /* value not given on the slide; set to the number of integers being tested */ long int array[ARRAYSIZE]; long int b[ARRAYSIZE]; void merge(long int a[], long int lo, long int m, long int hi); void mergesort(long int a[], long int lo, long int hi); void swap(long int *element1Ptr, long int *element2Ptr); void bubbleSort(long int *array, const long int size);

Main int main(void) {
    time_t t1, t2;
    long int option, i;
    printf("Enter 1 for merge sort or 2 for bubble sort\n");
    scanf("%ld", &option);
    for (i = 0; i < ARRAYSIZE; i++)
        array[i] = rand(); /* load random values */
    if (option == 1) {
        t1 = time(NULL);
        mergesort(&array[0], 0, ARRAYSIZE - 1);
        t2 = time(NULL);
    } else {
        t1 = time(NULL);
        bubbleSort(&array[0], ARRAYSIZE);
        t2 = time(NULL);
    }
    printf("Elapsed: %ld seconds\n", (long)(t2 - t1));
    return 0;
}

Homework 2 The knapsack problem is that given a set of positive integers {a1,…, an}, and a knapsack of size s, find a subset A of {a1,…, an} such that the sum of elements in A is the largest, but at most s. Part 1. Use the dynamic programming method to design the algorithm for the knapsack problem. Prove the correctness of your algorithm. Show the computational time of your algorithm carefully.

Homework 2 Part 2. Use C++ to implement the function below
int knapsack(int *a, //the input integers int n, //the number of input integers int s, //knapsack size int *subset, //subset elements int &size_of_subset //the number of items in the subset ) Test your program for the following knapsack problem: Input list: 5, 23, 27, 37, 48, 51, 63, 67, 71, 75, 79, 83, 89, 91, 101, 112, 121, 132, 137, 141, 143, 147, 153, 159, 171, 181, 190, 191 with knapsack size 595. Print out the subset and the sum of its elements. Also print out your source code.
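One possible dynamic-programming sketch of the required function (the reachability table, the capacity bound MAXS, and the use of a pointer in place of the C++ reference `int &size_of_subset` are assumptions of this sketch, not part of the assignment text):

```c
#include <string.h>

#define MAXS 1024   /* capacity bound for this sketch (requires s < MAXS) */

/* reach[t] = 1 iff some subset of the items processed so far sums to t;
   from[t] remembers which item first reached sum t, for backtracking. */
int knapsack(const int *a, int n, int s, int *subset, int *size_of_subset) {
    static int reach[MAXS], from[MAXS];
    memset(reach, 0, sizeof(reach));
    reach[0] = 1;
    for (int i = 0; i < n; i++)
        for (int t = s; t >= a[i]; t--)   /* downward: each item used once */
            if (!reach[t] && reach[t - a[i]]) { reach[t] = 1; from[t] = i; }
    int best = s;
    while (!reach[best]) best--;          /* largest reachable sum <= s */
    *size_of_subset = 0;
    for (int t = best; t > 0; t -= a[from[t]])
        subset[(*size_of_subset)++] = a[from[t]];
    return best;                          /* the sum of the chosen subset */
}
```

The table has O(ns) entries, each filled in O(1) time, so the running time is O(ns).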

Single-Source Shortest Paths

Shortest-paths with Source s (Example)
(figure: the original graph G on the left and the shortest paths from source s on the right; nodes s, t, u, v, w, x, y, z with edge weights 1 and 10)

Shortest-path problem
Find the shortest path in a graph. G=(V,E) is a weighted directed graph. The weight function w: E → R assigns a weight to each edge. p=(v0,v1,…,vk) is a path from v0 to vk. Here we introduce a new definition: a weighted graph attaches a real number to each edge as its weight, which may be positive, negative, or zero, with no particular restriction. Also, the definition of a path here is just a convenience for the definitions of path distance and shortest distance below; paths are not defined this way everywhere.

Shortest-path problem
Define the weight of a path p=(v0,v1,…,vk) as w(p) = Σ_{i=1..k} w(v_{i-1}, v_i). Define the shortest distance from node u to node v as δ(u,v) = min { w(p) | p is a path from u to v } (δ(u,v) = ∞ if no such path exists).

Shortest-path tree rooted at s
For Graph G=(V,E), its shortest-path tree rooted at s is G’=(V’,E’), which satisfies: V’ is the set of all nodes reachable from s. G’ is a tree with s as its root. In G’, the path from s to v is a shortest path from s to v in G. V’ being the set of nodes reachable from s means that for every v’ in V’ there exists a path from s to v’. A rooted tree is a kind of directed graph in which exactly one node has in-degree 0 and every other node has in-degree 1; the node with in-degree 0 is called the root.

Shortest-path tree rooted at s (Example)
(figure: the original graph G on the left and the shortest-path tree rooted at s on the right)

Predecessor graph For graph G=(V,E), follow table π to build Gπ=(Vπ,Eπ), which satisfies: π[s]=NIL, and s ∈ Vπ. If π[v] ≠ NIL, then (π[v],v) ∈ Eπ and v ∈ Vπ. The shortest-path tree rooted at s is an example of a predecessor graph. The single-source shortest-path algorithms work by repeatedly updating the predecessor graph until it finally becomes the shortest-path tree rooted at s, thereby obtaining the shortest distances from s to the other nodes.

Predecessor graph Example
Table: π[t]=NIL, π[u]=s, π[v]=t, π[w]=x, π[x]=v, π[y]=s, π[z]=s. The example here is the same as before, with the corresponding table added. (figure: the original graph G on the left and the shortest-path tree rooted at s on the right)

Initialize-Single-Source Algorithm
Define d[v] to be the current estimate of the shortest distance from s to v. Let π[v] be the node before v on the shortest path from s to v. Initially, d[v]=∞, π[v]=NIL, and d[s]=0. Except for the shortest path from s to s, everything else is unknown.

Initialize-Single-Source Algorithm
Initialize-Single-Source(G,s) { for each vertex v ∈ V[G] do { d[v] ← ∞; π[v] ← NIL } d[s] ← 0 }

Relaxation Algorithm Use the edge (u,v) to improve the currently known shortest path. Relax(u,v,w) { if d[v] > d[u]+w(u,v) then { d[v] ← d[u]+w(u,v); π[v] ← u } }

Relaxation Example Before and after Relax(u,v,w) (figure)
If w(u,v)=2 (<3): renew the s→v shortest path and set π[v] ← u. If w(u,v)=4 (>3): do not update the s→v shortest distance. The red line denotes the currently known shortest path. If Relax(u,v,w) finds that reaching v via the known shortest path from s to u plus the edge (u,v) is shorter than the currently known shortest path from s to v, it updates the shortest-path information for v.

Shortest Path and Relaxation
Triangle inequality: For every edge (u,v), δ(s,v) <= δ(s,u)+w(u,v). Upper-bound property: δ(s,v) <= d[v]; d[v] is always an upper bound for the shortest distance from s to v. If d[v]=δ(s,v), then relaxation never changes d[v] again. δ(s,v) denotes the shortest distance from s to v.

Shortest Path and Relaxation
No-path property: If there is no path from s to v, then d[v]=δ(s,v)=∞. Convergence property: If the shortest path from s to v has edge (u,v) and d[u]=δ(s,u), then Relax(u,v,w) makes d[v]=δ(s,v).

Shortest Path and Relaxation
Path-relaxation property: If p=(v0,v1,…,vk) is a shortest path from s=v0 to vk, then executing Relax(v0,v1,w), Relax(v1,v2,w), …, Relax(vk-1,vk,w) achieves d[vk]=δ(s,vk). Predecessor-graph property: After a series of relaxations, once d[v]=δ(s,v) for every node v, the corresponding predecessor graph Gπ is a shortest-path tree rooted at s.

Bellman-Ford Algorithm
It computes the shortest paths for a graph without a negative loop. Bellman-Ford(G,w,s) { Initialize-Single-Source(G,s) for i = 1 to |V|−1 do for each edge (u,v) ∈ E do Relax(u,v,w) for each edge (u,v) ∈ E do if d[v] > d[u]+w(u,v) then return false // Negative loop return true // Success }
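A direct C sketch of this pseudo code on an edge list (the arrays `dist` and `pred` play d[·] and π[·]; names and calling convention are illustrative):

```c
#include <limits.h>

/* Hypothetical directed edge record. */
typedef struct { int u, v, w; } Edge;

/* Returns 0 when a negative loop reachable from s is detected, 1 on
   success; on success dist[v] = delta(s, v) for every reachable v. */
int bellman_ford(const Edge *e, int m, int n, int s, int *dist, int *pred) {
    for (int v = 0; v < n; v++) { dist[v] = INT_MAX; pred[v] = -1; }
    dist[s] = 0;
    for (int i = 1; i <= n - 1; i++)        /* |V|-1 relaxation passes */
        for (int j = 0; j < m; j++)
            if (dist[e[j].u] != INT_MAX &&
                dist[e[j].u] + e[j].w < dist[e[j].v]) {
                dist[e[j].v] = dist[e[j].u] + e[j].w;   /* Relax(u,v,w) */
                pred[e[j].v] = e[j].u;
            }
    for (int j = 0; j < m; j++)             /* negative-loop check */
        if (dist[e[j].u] != INT_MAX &&
            dist[e[j].u] + e[j].w < dist[e[j].v])
            return 0;
    return 1;
}
```

Each of the |V|−1 passes touches every edge once, which is exactly the O(|V||E|) bound derived below.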

Bellman-Ford Algorithm Example
(figure: a five-node example with edge weights 5, −2, 6, 8, −3, 7, −4, 2, 7, 9) (a) is the initial state and (b) is the state after the first iteration; (c) and (d) follow analogously. The blue dashed edges are the edges of the corresponding predecessor graph, and the number inside a node is d[v], the length of the currently known shortest path.

Bellman-Ford Algorithm Example
(figure: states (c), (d), and (e) of the same Bellman-Ford example)

Bellman-Ford Algorithm Analysis
Correctness: For each edge, relaxation computes the shortest path of the next reachable node in the shortest-path tree rooted at s. By the path-relaxation property, after |V|−1 passes, d[v]=δ(s,v) for every destination v of a shortest simple path. The algorithm can detect a negative loop because, if after |V|−1 rounds of relaxation some d[v] can still be decreased by going through an extra edge, the shortest path would have to use more than |V|−1 edges; by the pigeonhole principle a loop must then lie on it, and since a positive loop never lies on a shortest path, a negative loop must exist.

Bellman-Ford Algorithm Analysis
Time complexity: Initialize-Single-Source takes O(|V|) steps. Each of the |V|−1 passes relaxes every edge, costing O(|E||V|) steps in total. Finally, O(|E|) steps check whether there is a negative loop. Total time: O(|V||E|).

Dijkstra Algorithm Can only handle graphs without negative edges.
It is faster than the Bellman-Ford algorithm, and selects an order in which to do relaxation. Use a priority queue for the implementation. Main idea: use the convergence property.

Dijkstra Algorithm Q: Priority queue with d as the key Dijkstra(G,w,s)
{ Initialize-Single-Source(G,s) Q=V[G] while Q is not empty do { u=Extract-Min(Q) for each v ∈ adj[u] do Relax(u,v,w) } } Each time a node u is obtained by Extract-Min, d[u] is already the shortest-path length.
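A minimal C sketch of this loop on an adjacency matrix, with Extract-Min realized as a linear scan (the O(|V|^2) linear-array variant from the analysis slide); the function name, the capacity NV, and the 0-means-no-edge convention are assumptions of this sketch:

```c
#include <limits.h>

#define NV 8   /* capacity of this sketch */

/* Fills dist[v] with the shortest distance from s to v (INT_MAX when
   v is unreachable); all edge weights must be nonnegative. */
void dijkstra(int w[NV][NV], int n, int s, int *dist) {
    int done[NV] = {0};
    for (int v = 0; v < n; v++) dist[v] = INT_MAX;
    dist[s] = 0;
    for (int i = 0; i < n; i++) {
        int u = -1;
        for (int v = 0; v < n; v++)          /* Extract-Min by scanning */
            if (!done[v] && (u < 0 || dist[v] < dist[u])) u = v;
        if (dist[u] == INT_MAX) break;       /* rest is unreachable */
        done[u] = 1;                         /* d[u] is now final */
        for (int v = 0; v < n; v++)          /* Relax(u, v, w) */
            if (w[u][v] && !done[v] && dist[u] + w[u][v] < dist[v])
                dist[v] = dist[u] + w[u][v];
    }
}
```

By the convergence property, the vertex extracted each round already has d[u] = δ(s,u), which is why no vertex ever needs to be processed twice.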

Dijkstra Algorithm Example
(figure: a five-node example with edge weights 10, 5, 1, 2, 3, 7, 9, 4, 6) (a) is the initial state and (b) is the state after the first iteration; (c) and (d) follow analogously. The blue dashed edges are the edges of the corresponding predecessor graph, the number inside a node is d[v], the currently known shortest-path length, and green nodes are those all of whose edges have already been relaxed.

Dijkstra Algorithm Example
(figure: states (c) and (d) of the same Dijkstra example)

Dijkstra Algorithm Example
(figure: state (e), the final state of the Dijkstra example)

Dijkstra Algorithm Analysis
Using a different priority queue gives a different cost. Using a linear array costs O(|V|^2) steps. Using a binary heap costs O(|E| log |V|) steps. Using a Fibonacci heap costs O(|E| + |V| log |V|) steps.

Single-source shortest paths in DAGs
Unlike Bellman-Ford, relaxing the edges in a specific order finds the shortest paths in less time. DAG-Shortest-Path(G,w,s) { Topologically sort V[G] Initialize-Single-Source(G,s) for each u taken in topological order do for each v ∈ adj[u] do Relax(u,v,w) } Costs O(|V|+|E|) steps: the topological sort needs only O(|V|+|E|) time, Initialize-Single-Source takes O(|V|), and the final loop takes O(|V|+|E|) time.

DAG-Shortest-Path Example
(figure: a six-node DAG with edge weights 6, 1, 5, 2, 7, −1, −2, 3, 4, 2; (a) is the initial state and (b) the state after the first iteration; blue dashed edges are the predecessor graph, numbers inside nodes are d[v], and green nodes have had all their edges relaxed)

DAG-Shortest-Path Example
(figure: states (c) and (d) of the DAG-Shortest-Path example)

DAG-Shortest-Path Example
(figure: the final states (e), (f), and (g) of the DAG-Shortest-Path example)

Problem: apply Dijkstra's algorithm to find the shortest paths to all nodes from s. Show how d[v] changes at every step. (figure: a directed graph on nodes s, t, u, v, w, x, y, z with edge weights 1 and 10)

Bipartite Matching Lecture 3: Jan 17

Bipartite Matching A graph is bipartite if its vertex set can be partitioned into two subsets A and B so that each edge has one endpoint in A and the other endpoint in B. A B A matching M is a subset of edges so that every vertex has degree at most one in M.

Maximum Matching The bipartite matching problem:
Find a matching with the maximum number of edges. A perfect matching is a matching in which every vertex is matched. The perfect matching problem: Is there a perfect matching?

First Try Greedy method? (add an edge with both endpoints unmatched)

Key Questions How to tell if a graph does not have a (perfect) matching? How to determine the size of a maximum matching? How to find a maximum matching efficiently?

Existence of Perfect Matching
Hall’s Theorem [1935]: A bipartite graph G=(A,B;E) has a matching that “saturates” A if and only if |N(S)| >= |S| for every subset S of A. N(S) S

Bound for Maximum Matching
What is a good upper bound on the size of a maximum matching? König [1931]: In a bipartite graph, the size of a maximum matching is equal to the size of a minimum vertex cover. This is a min-max theorem (placing the problem in both NP and co-NP), and it implies Hall's theorem.

Algorithmic Idea? Any idea to find a larger matching?

Augmenting Path Given a matching M, an M-alternating path is a path that alternates between edges in M and edges not in M. An M-alternating path whose endpoints are unmatched by M is an M-augmenting path.

Optimality Condition What if there is no more M-augmenting path?
If there is no M-augmenting path, then M is maximum! Prove the contrapositive: a bigger matching M′ implies an M-augmenting path. Consider the symmetric difference M △ M′. Every vertex in M △ M′ has degree at most 2, so each component of M △ M′ is an even cycle or a path. Since |M′| > |M|, some path component contains more M′-edges than M-edges: an M-augmenting path!

Algorithm Key: M is maximum ⇔ there is no M-augmenting path
How do we find an M-augmenting path efficiently?

Finding M-augmenting paths
Orient the edges (edges in M go up, others go down). An M-augmenting path then corresponds to a directed path between two unmatched vertices.

Complexity At most n iterations
An augmenting path can be found in O(m) time by a DFS or a BFS. Total running time: O(nm).
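This augmenting-path scheme can be sketched in C for a bipartite graph given as an adjacency matrix (the DFS-based variant, often attributed to Kuhn; the names, the fixed capacities, and the global arrays are assumptions of this sketch):

```c
#include <string.h>

#define NA 8
#define NB 8

/* adj[a][b] = 1 iff side-A vertex a is joined to side-B vertex b;
   matchB[b] is the A-vertex matched to b, or -1 if b is unmatched. */
static int adj[NA][NB], matchB[NB], seen[NB];

/* DFS for an augmenting path starting at unmatched A-vertex a. */
static int try_augment(int a, int nb) {
    for (int b = 0; b < nb; b++)
        if (adj[a][b] && !seen[b]) {
            seen[b] = 1;
            /* b is free, or b's current partner can be rematched */
            if (matchB[b] < 0 || try_augment(matchB[b], nb)) {
                matchB[b] = a;
                return 1;
            }
        }
    return 0;
}

/* Runs one DFS per A-vertex: at most n iterations of O(m) each. */
int max_matching(int na, int nb) {
    int size = 0;
    memset(matchB, -1, sizeof(matchB));
    for (int a = 0; a < na; a++) {
        memset(seen, 0, sizeof(seen));
        size += try_augment(a, nb);
    }
    return size;
}
```

Each successful DFS flips one augmenting path, growing the matching by exactly one edge, matching the O(nm) total above.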

Minimum Vertex Cover Hall’s Theorem [1935]:
A bipartite graph G=(A,B;E) has a matching that “saturates” A if and only if |N(S)| >= |S| for every subset S of A. König [1931]: In a bipartite graph, the size of a maximum matching is equal to the size of a minimum vertex cover. Idea: consider why the algorithm got stuck…

Faster Algorithms Observation: Many short and disjoint augmenting paths. Idea: Find augmenting paths simultaneously in one search.

Randomized Algorithm Matching Determinants Randomized algorithms
Bonus problem 1 (50%): Given a bipartite graph with red and blue edges, find a deterministic polynomial time algorithm to determine if there is a perfect matching with exactly k red edges.

Application of Bipartite Matching
(figure: persons Isaac, Jerry, Darek, and Tom on one side; jobs Marking, Tutorials, Solutions, and Newsgroup on the other) Job Assignment Problem: Each person is willing to do a subset of jobs. Can you find an assignment so that all jobs are taken care of?

Application of Bipartite Matching
With Hall’s theorem, now you can determine exactly when a partial chessboard can be filled with dominos.

Application of Bipartite Matching
Latin Square: an n×n square; the goal is to fill the square with numbers from 1 to n so that: Each row contains every number from 1 to n. Each column contains every number from 1 to n.

Application of Bipartite Matching
Now suppose you are given a partial Latin Square. Can you always extend it to a Latin Square? With Hall’s theorem, you can prove that the answer is yes.

Homework 2 Problem 1. Bitonic Euclidean Traveling Salesman problem.

Problem 1 Define C(i, j): the minimal cost of a tour from i to 1 (to the leftmost point) and from 1 to j (to the rightmost point). Identify the recursion for C(i,j)

Problem 1 Define C(i, j): the minimal cost of a tour from i to 1 (to the leftmost point) and from 1 to j (to the rightmost point). Identify the recursion for C(i,j) Sort the points by x-coordinate as 1, …, n

Recursion Case i>j+1 i-1 i 1 j

Recursion Case j>i+1 i 1 j j-1

Recursion Case j=i+1 i 1 j i-1

Recursion Case j=i+1 i 1 i-1 j

Recursion Case j=i i-1 1 i=j i-2

Recursion Case j=i i-1 1 i=j i-2

Recursion Case j>i+1 i 1 j-1 j

Recursion Case i=j+1 i-2 i 1 j=i-1

Recursion Case i=j+1 i-2 1 i j=i-1

Time Each C(i,j) needs to deal with O(1) cases. Output C(n,n).
Total time is O(n^2), since there are O(n^2) subproblems C(i,j).

Problem 2 Printing Neatly problem. The extra space on each line is the line capacity left over after the words placed on it.
Minimize the sum of the cubes of the extra space over all lines except the last.

Problem 2 Define the extra-space cube of a line printing word i, word i+1, …, word j. Define C(k) to be the cost of printing word k, word k+1, …, word n.

Recursion If word k, word k+1, …, word n can fit into one row, then C(k)=0. Otherwise, let h be the maximal number of words starting from k that fit into one row, and take the minimum over where the first line ends: C(k) = min over 1 <= i <= h of [ (extra-space cube of the line holding words k, …, k+i-1) + C(k+i) ].

Time Each C(k) takes O(n) time. Total time is O(n^2).

Problem: Find an augmenting path to improve the red matching

Midterm >=90: 2 80-89: 3 70-79: 4 60-70: 5 <60 : 2

Problem 1 Solve the following recursive equations with big-O notation:
T(n)=T(n-2)+n^3, with T(1)=1. T(n)=16T(n/2)+n^2, with T(1)=1.

Simplified Master Theorem
Let T(n) = aT(n/b) + c n^r be a recursive equation on the nonnegative integers, where a > 0, b > 1, c > 0, and r > 0 are constants. Then: 1. If r < log_b a, then T(n) = Θ(n^{log_b a}). 2. If r = log_b a, then T(n) = Θ(n^r log n). 3. If r > log_b a, then T(n) = Θ(n^r).

Problem 1 a) T(n)=T(n-2)+n^3, with T(1)=1. Solution: T(n)=O(n^4)
b) T(n)=16T(n/2)+n^2, with T(1)=1. Solution: T(n)=O(n^4)

Problem 2 Delete 7 5:9 3:5 7:9 1:3 4:5 6:7 8:9 9 1 3 4 5 6 7 8

Problem 2 Delete 7 5:9 3:5 7:9 1:3 4:5 6:7 8:9 9 1 3 4 5 6 8

Problem 2 Delete 7 5:9 3:5 7:9 1:3 4:5 6:8:9 9 1 3 4 5 6 8

Problem 2 Delete 7 3:5:9 1:3 4:5 6:8:9 9 1 3 4 5 6 8

Problem 3 The following is a heap. a) Show the steps to insert a new element 1. b) Show the steps to remove the root after 1 is inserted. 2 7 3 11 8 6 4

Heap Insertion 2 1

Heap Insertion 2 11

Heap Insertion 2 11

Heap Insertion 1 11

Heap Deletion 11

Heap Deletion 2 3 11

Heap Deletion 2 11

Heap Deletion 2

Heap Deletion 2

Problem 5 Apply Prim's algorithm to find the minimum spanning tree. Show each of your steps.

Prim’s Algorithm (pseudo code 1)
ALGORITHM Prim(G) // Input: A weighted connected graph G=(V,E) // Output: A MST T=(V, A) VT  { v0 } // Any vertex will do; A  ; for i  1 to |V|1 do find an edge xy  (VT, VVT ) s.t. its weight is minimized among all edges in (VT, VVT ); VT  VT  { y } ; A  A  { xy } ; Prim’s algorithm 找 A 的 safe edge 的方式稍有不同，此演算法利用一個逐漸增大的頂點集 VT，來構成一個 cut (VT, VVT) ，並找出這 cut 權重最小的邊 xy 加入 A，根據 MST 基本引理 xy 必為某一 MST 的邊，又我們注意到 VT 一直保持為 A 的邊的端點所成的集合，所以 A必包含在某一 MST 內且 xy 為A 的 safe edge。 374

Prim’s Algorithm (Example 1/8)
(Figure: a graph on vertices a–h with edge weights 1–9; purple vertices are in VT, blue vertices are not.) The following example illustrates how Prim’s algorithm runs. The minimum-weight edge between the purple and blue vertices is ac, so ac is added to A.

Prim’s Algorithm (Example 2/8)
Edge ac has been added to A (edges in A are drawn in red). The minimum-weight edge between the purple and blue vertices is bc, so bc is added to A.

Prim’s Algorithm (Example 3/8)
The minimum-weight edge between the purple and blue vertices is bf, so bf is added to A.

Prim’s Algorithm (Example 4/8)
The minimum-weight edge between the purple and blue vertices is ef, so ef is added to A.

Prim’s Algorithm (Example 5/8)
The minimum-weight edge between the purple and blue vertices is eg, so eg is added to A.

Prim’s Algorithm (Example 6/8)
The minimum-weight edge between the purple and blue vertices is dg, so dg is added to A.

Prim’s Algorithm (Example 7/8)
The minimum-weight edge between the purple and blue vertices is gh, so gh is added to A.

Prim’s Algorithm (Example 8/8)
Now n−1 = 7 edges have been added to A, so the algorithm stops and yields an MST of total weight 17. MST cost = 17

Problem 5 5. (20%) Find an O(n log n) time algorithm such that, given two sets of integers A and B, it determines whether B is a subset of A, where n = max(|A|,|B|) is the larger of the sizes of A and B. For example, if A={3, 7, 5} and B={3, 5}, then the algorithm returns “yes”; and if A={3, 7, 5} and B={2, 5}, then the algorithm returns “no”.
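One way to meet the O(n log n) bound is to sort A once and binary-search each element of B (a sketch; the function name and list inputs are assumptions):

```python
from bisect import bisect_left

def b_subset_of_a(A, B):
    # Sort A: O(n log n). Then binary-search each b in B: O(|B| log |A|).
    A_sorted = sorted(A)
    for b in B:
        i = bisect_left(A_sorted, b)
        if i == len(A_sorted) or A_sorted[i] != b:
            return False  # b does not occur in A
    return True
```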

Problem 6 (20%) This is a job scheduling problem with one machine. Each job has a specific time interval during which it must be executed by the machine, and all the jobs assigned to the machine must have disjoint time intervals. For example, suppose the list of input jobs has time intervals [1, 3], [2, 6], [5, 9], [7, 13], [11, 15]. There is an overlap between [1, 3] and [2, 6], so [1, 3] and [2, 6] cannot be assigned to the machine together. Three jobs can be assigned to the machine without overlap: [1, 3], [5, 9], and [11, 15] (all intervals are disjoint). Develop an algorithm for the scheduling problem that maximizes the number of jobs assigned to the machine. Show the time complexity of your algorithm. Hint: you may use a greedy method to solve this problem.
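The hinted greedy method sorts jobs by finishing time and repeatedly takes the first job that does not overlap the last one chosen; sorting dominates, so the time is O(n log n). A minimal sketch (treating intervals as closed, so sharing an endpoint counts as overlap):

```python
def max_jobs(intervals):
    # Greedy: earliest finishing time first.
    count, last_end = 0, float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start > last_end:   # disjoint from every job chosen so far
            count += 1
            last_end = end
    return count
```

On the slide's example it selects [1, 3], [5, 9], and [11, 15], i.e., 3 jobs.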

Improve Midterm by 20 points
Rewrite the solution for problem 6, and implement the algorithm with C++. Submit your solution with test results.

Initialize-Single-Source Algorithm
Initialize-Single-Source(G,s) {
for each vertex v∈V[G] do
d[v] ← ∞
π[v] ← NIL
d[s] ← 0
}

Relaxation Algorithm Use the edge (u,v) to improve the currently known shortest path.
Relax(u,v,w) {
if d[v] > d[u]+w(u,v) then
d[v] ← d[u]+w(u,v)
π[v] ← u
}

Relaxation Example Before and after Relax(u,v,w):
If w(u,v)=2 (< 3), the shortest path from s to v is renewed: d[v] ← d[u]+w(u,v) and π[v] ← u. If w(u,v)=4 (> 3), the shortest distance from s to v is not updated. (Figure: before relaxation d[u]=4 and d[v]=7. Red edges denote the currently known shortest paths; if Relax(u,v,w) finds that reaching v via the known shortest path from s to u plus the edge (u,v) is shorter than the currently known distance from s to v, the shortest-path information for v is updated.)

Shortest Path and Relaxation
Triangle inequality: for every edge (u,v), δ(s,v) ≤ δ(s,u)+w(u,v). Upper-bound property: δ(s,v) ≤ d[v]; d[v] is always an upper bound for the shortest distance from s to v. If d[v]=δ(s,v), then relaxation never updates d[v] again. (δ(s,v) denotes the shortest distance from s to v.)

Shortest Path and Relaxation
No path: if there is no path from s to v, then d[v]=δ(s,v)=∞. Convergence property: if the shortest path from s to v ends with the edge (u,v) and d[u]=δ(s,u), then Relax(u,v,w) makes d[v]=δ(s,v).

Shortest Path and Relaxation
Path-relaxation property: if p=(v0,v1,…,vk) is a shortest path from s=v0 to vk, then executing Relax(v0,v1,w), Relax(v1,v2,w), …, Relax(vk-1,vk,w) in order achieves d[vk]=δ(s,vk). Predecessor-graph property: after a series of relaxations, once d[v]=δ(s,v) for every node v, the corresponding predecessor graph Gπ is a shortest-path tree rooted at s.

Bellman-Ford Algorithm
It computes single-source shortest paths for graphs without a negative cycle.
Bellman-Ford(G,w,s) {
Initialize-Single-Source(G,s)
for i = 1 to |V|−1 do
for each edge (u,v)∈E do
Relax(u,v,w)
for each edge (u,v)∈E do
if d[v] > d[u]+w(u,v) then
return false // negative cycle
return true // success
}
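A minimal Python sketch of Bellman-Ford (assuming vertices are numbered 0..n−1 and the graph is an edge list; it returns the distance array and whether no negative cycle was detected):

```python
def bellman_ford(n, edges, s):
    # edges: list of (u, v, w) triples; s: source vertex.
    INF = float("inf")
    dist = [INF] * n
    dist[s] = 0
    for _ in range(n - 1):            # |V| - 1 rounds
        for u, v, w in edges:
            if dist[u] + w < dist[v]:  # Relax(u, v, w)
                dist[v] = dist[u] + w
    for u, v, w in edges:             # extra pass: negative-cycle check
        if dist[u] + w < dist[v]:
            return dist, False
    return dist, True
```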

Bellman-Ford Algorithm Example
(Figure: a 5-node graph with source s and edge weights 5, −2, 6, 8, −3, 7, −4, 7, 2, 9.) (a) is the initial state; (b) is the state after the first iteration; (c) and (d) follow in the same way. The dashed blue edges are the edges of the corresponding predecessor graph, and the number inside each node is d[v], the length of the currently known shortest path.

Bellman-Ford Algorithm Example
(Figure: states (c), (d), and (e) of the same run.)

Bellman-Ford Algorithm Analysis
Correctness: in each round, relaxing every edge computes the shortest path to the next reachable node in the shortest-path tree rooted at s. By the path-relaxation property, after |V|−1 rounds, d[v]=δ(s,v) for every destination v of a shortest simple path. Negative cycles can be detected because if, after |V|−1 rounds of relaxation, some d[v] can still be decreased through an extra edge, then the corresponding path has more than |V|−1 edges; by the pigeonhole principle it must contain a cycle, and since a positive cycle never lies on a shortest path, a negative cycle must exist.

Bellman-Ford Algorithm Analysis
Time Complexity: Initialize-Single-Source takes O(|V|) steps. Each of the |V|−1 rounds relaxes every edge, for O(|E||V|) steps in total. Finally, the negative-cycle check takes O(|E|) steps. Total time: O(|V||E|).

Dijkstra Algorithm Can only handle graphs without negative edges.
It is faster than the Bellman-Ford algorithm, and it selects an order in which to do the relaxations. Use a priority queue for the implementation. Main idea: use the convergence property.

Dijkstra Algorithm Q: Priority queue with d as the key Dijkstra(G,w,s)
{
Initialize-Single-Source(G,s)
Q ← V[G]
while Q is not empty do
u ← Extract-Min(Q)
for each v∈adj[u] do
Relax(u,v,w)
}
Each node u obtained by Extract-Min already has d[u] equal to its final shortest-path length.
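The algorithm can be sketched with Python's heapq as the priority queue (a sketch; instead of Extract-Min over a queue of all vertices, it pushes updated distances and skips stale entries, a common implementation choice):

```python
import heapq

def dijkstra(graph, s):
    # graph: dict vertex -> list of (neighbor, weight); weights must be non-negative.
    dist = {v: float("inf") for v in graph}
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)      # Extract-Min
        if d > dist[u]:
            continue                  # stale queue entry; skip
        for v, w in graph[u]:
            if d + w < dist[v]:       # Relax(u, v, w)
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist
```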

Dijkstra Algorithm Example
(Figure: a graph with source s and edge weights 10, 5, 1, 2, 3, 7, 9, 4, 6.) (a) is the initial state; (b) is the state after the first iteration; (c) and (d) follow in the same way. The dashed blue edges are the edges of the corresponding predecessor graph, the number inside each node is d[v], the currently known shortest-path estimate, and the green nodes have had all of their edges relaxed.

Dijkstra Algorithm Example
(Figure: states (c) and (d) of the same run.)

Dijkstra Algorithm Example
(Figure: state (e), the final shortest-path tree.)

Problem 6 Design an algorithm to test if an undirected graph is connected. A graph is connected if there exists a path between every two vertices. For example, the left graph is connected, but the right graph is not.

Example Vertices a, b, c, d are reachable from s, but e is not. (Figure: a graph on vertices s, a, b, c, d, e.)

Problem 6 Solution Assign weight one to each edge.
Apply the minimum spanning tree algorithm. The graph is connected iff the minimum spanning tree has n−1 edges, where n is the number of nodes.
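Equivalently, a breadth-first search from any vertex gives the same test in O(|V|+|E|) time (a sketch, assuming an adjacency-dict representation):

```python
from collections import deque

def is_connected(adj):
    # adj: dict vertex -> list of neighbors (undirected graph).
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    q = deque([start])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                q.append(v)
    return len(seen) == len(adj)   # connected iff BFS reaches every vertex
```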

Problem 7 a) Design an O(n log n) time algorithm that, given an array of n integers, finds two elements a and b with |a−b| < 5. b) Improve the algorithm to O(n) time if the n integers in the input are in the range from 1 to 7n.


Problem 7 Solution a) Apply merge sort. O(n log n) time
Check whether any two adjacent elements in the sorted order have difference < 5. O(n) time. Total time is O(n log n)+O(n)=O(n log n)

Problem 7 Solution b) Define an array int a[7n+1], initialized to 0;
Set a[k]=1 for each k in the list; O(n) time. Check whether there exist two 1s at distance less than 5 in the array a[ ]. O(n) time

Problem 8 Suppose you have one machine and a set of n jobs a1, a2, …, an to process on that machine. Each job aj has a processing time tj, a profit pj, and a deadline dj. The machine can process only one job at a time, and job aj must run uninterruptedly for tj consecutive time units. If job aj is completed by its deadline dj, you receive a profit pj, but if it is completed after its deadline, you receive a profit of 0. Give an algorithm to find the schedule that obtains the maximum amount of profit, assuming that all processing times are integers between 1 and n. What is the running time of your algorithm?

Problem 8 Solution Try the dynamic programming method.
Improve your midterm by working on it again. Due March 31 (Tuesday)

NP-completeness

NP Problems blind monkey

Hamiltonian Path Problem
Given n cities, does there exist a path through each city exactly once? (Figure: map with cities ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL.)

Hamiltonian Path Hamiltonian path goes through each node exactly once
HAMPATH={<G,s,t>| G is a directed graph with a Hamiltonian path from s to t}

Polynomial n: input size
n^c is a polynomial of n, where c does not depend on n. Examples: n, n^2, n^3.

Class P P is the complexity class consisting of all decision problems that have polynomial-time algorithms

Polynomial-Time Decision Problems
Decision problems: output is 1 or 0 (“yes” or “no”) Examples: Is a given circuit satisfiable? Does a text T contain a pattern P? Does an instance of 0/1 Knapsack have a solution with benefit at least K? Does a graph G have an MST with weight at most K?

The Complexity Class P A complexity class is a collection of languages
P is the complexity class consisting of all decision problems that have polynomial-time algorithms For each problem L in P, there is a polynomial-time decision algorithm A for L. If n=|x|, for x in L (decision with “yes”), then A runs in p(n) time on input x. The function p(n) is some polynomial

Verifier A verifier for a language L is an algorithm V,
L={w| V accepts <w,c> for some string c}. For the verifier V for L, c is a certificate for w if V accepts <w,c>. If the verifier V for the language L runs in polynomial time (in |w|), V is a polynomial-time verifier for L.

Verifier for Hamiltonian Path
For <G,s,t>, a certificate is a list of nodes v1, v2, …, vm of G. Verifier: check that m is the number of nodes of G; check that the nodes v1, …, vm are all different; check that each (vi, vi+1) is a directed edge of G, for i=1,…,m−1. If all checks pass, accept. Otherwise, reject.

NP example (2) Problem: decide whether a graph has a Hamiltonian tour of weight at most K. Verification algorithm: test that the tour contains all nodes; test that the tour has weight at most K. Analysis: verification takes O(n) time, so this is a polynomial-time verifier (equivalently, a polynomial-time nondeterministic algorithm). Think about it this way: if we have such a tour, we can verify it.

Class NP NP is the class of languages that have polynomial time verifiers. Examples: HAMPATH is in NP

Clique Problem Given undirected graph G, a clique is a set of nodes of G such that every two nodes are connected by an edge. A k-clique is a clique with k nodes

Clique Problem CLIQUE={<G,k>| G is an undirected graph with a k-clique}. CLIQUE is in NP.

Subset Sum Problem SUBSET-SUM={<S,t>| S={x1,…,xk} and, for some {y1,…,yl} ⊆ {x1,…,xk}, we have Σyi = t}

Polynomial Time Computable
A function f is a polynomial time computable function if some polynomial time algorithm A exists that outputs f(w) for every input w.

Polynomial Time Reduction
Assume that A and B are two languages. A is polynomial time mapping reducible to B (written A ≤p B) if a polynomial time computable function f exists such that w ∈ A if and only if f(w) ∈ B.

Transitivity If A ≤p B and B ≤p C, then A ≤p C.

Boolean Formula A literal is either a boolean variable or its negation: A clause is the disjunction of several literals Conjunctive normal form is the conjunction of several clauses

3SAT A 3nd conjunctive normal formula (3nd-formula) is a conjunction form with at most 3 literals at each clause 3SAT={ | is satisfiable 3nd-formula}

3SAT to CLIQUE Example:

Outline P and NP NP-completeness Definition of P Definition of NP
Alternate definition of NP NP-completeness Definition of NP-complete and NP-hard The Cook-Levin Theorem

More Outline Some NP-complete problems Problem reduction
SAT (and CNF-SAT and 3SAT) Vertex Cover Clique Hamiltonian Cycle

What is a problem A language is a set of strings
A problem is a collection of instances, and an instance can be coded into a string, so a language = a problem. The size of the problem refers to the length of the string. An algorithm solves a problem; a Turing machine accepts a language.

Traveling Salesperson Problem
Given n cities with pairwise distances, find a shortest tour through each city exactly once. (Figure: map with cities ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL and mileage labels.)

Running Time Revisited
Input size, n. All the polynomial-time algorithms studied so far in this course run in polynomial time using this definition of input size. (Figure: the same city map as before.)


Problem Given the formula f=
construct a graph G such that f is satisfiable iff G has a clique of size 3.

An Interesting Problem
A Boolean circuit is a circuit of AND, OR, and NOT gates; the CIRCUIT-SAT problem is to determine if there is an assignment of 0’s and 1’s to a circuit’s inputs so that the circuit outputs 1.

CIRCUIT-SAT is in NP Non-deterministically choose a set of inputs and the outcome of every gate, then test each gate’s I/O. If there is an input assignment, we can verify that in polynomial time.

NP-Completeness Reduction: transform one language into (a subset of) another language. A polynomial-time reduction means the transformation of each string can be done in polynomial time. NP-complete class: L is NP-complete if L is in NP and, for each language M in NP, we can take an input x for M and transform it in polynomial time to an input x’ for L such that x is in M if and only if x’ is in L. L is NP-hard if every language in NP is polynomial-time reducible to L (L itself need not be in NP).

Cook-Levin Theorem Cook’s Theorem: CIRCUIT-SAT is NP-complete.
Proof: We already showed it is in NP. To prove it is NP-complete, we have to show that every language in NP can be reduced to it. Let M be in NP, and let x be an input for M. Let y be a certificate that allows us to verify membership in M in polynomial time, p(n), by some algorithm D. Let S be a circuit of size at most O(p(n)^2) that simulates one step of a computer (details omitted…).

Cook-Levin Proof We can build a circuit that simulates the verification of x’s membership in M using y. Let W be the working storage for D (including registers, such as the program counter); let D be given in RAM “machine code.” Simulate the p(n) steps of D by replicating the circuit S once for each step of D. The only free input is y. The circuit is satisfiable if and only if x is accepted by D with some certificate y. The total size is still polynomial: O(p(n)^3). (Figure: p(n) copies of S chained over the cells of W, with inputs x and y and a 0/1 output from D.)

Some Thoughts about P and NP
(Figure: CIRCUIT-SAT and the other NP-complete problems sit at the top of NP.) Belief: P is a proper subset of NP. Implication: the NP-complete problems are the hardest problems in NP. Why: if we could solve an NP-complete problem in polynomial time, we could solve every problem in NP in polynomial time; that is, if an NP-complete problem is solvable in polynomial time, then P=NP. Since so many people have attempted without success to find polynomial-time solutions to NP-complete problems, showing that your problem is NP-complete amounts to showing that a lot of smart people have worked on it and found no polynomial-time algorithm.

Circuit → Formula

Logic De Morgan’s laws: ¬(x ∧ y) = ¬x ∨ ¬y and ¬(x ∨ y) = ¬x ∧ ¬y

Truth table for y y x2

Convert to CNF Conversion:

Convert to CNF Conversion:

3SAT The SAT problem is still NP-complete even if the formula is a conjunction of disjuncts, that is, it is in conjunctive normal form (CNF). The SAT problem is still NP-complete even if it is in CNF and every clause has just 3 literals (a literal is a variable or its negation): (a+b+¬d)(¬a+¬c+e)(¬b+d+e)(a+¬c+¬e). Reduction from SAT.


Showing NP-Completeness
(Figure: the gadget graph for the reduction, with variable pairs x1, ¬x1, …, x4, ¬x4 and clause-triangle vertices.)

Problem Reduction A language M is polynomial-time reducible to a language L if an instance x for M can be transformed in polynomial time to an instance x’ for L such that x is in M if and only if x’ is in L. Denote this by M ≤ L. A problem (language) L is NP-hard if every problem in NP is polynomial-time reducible to L (another way to define NP-hard). A problem (language) is NP-complete if it is in NP and it is NP-hard. CIRCUIT-SAT is NP-complete: CIRCUIT-SAT is in NP, and for every M in NP, M ≤ CIRCUIT-SAT.

Problem Reduction A general problem M is polynomial-time reducible to a general problem L if an instance x of problem M can be transformed in polynomial time to an instance x’ of problem L such that the solution to x is yes if and only if the solution to x’ is yes. Denote this by M ≤ L. A problem (language) L is NP-hard if every problem in NP is polynomial-time reducible to L. A problem (language) is NP-complete if it is in NP and it is NP-hard. CIRCUIT-SAT is NP-complete: CIRCUIT-SAT is in NP, and for every M in NP, M ≤ CIRCUIT-SAT.

Transitivity of Reducibility
If A ≤ B and B ≤ C, then A ≤ C. An input x for A can be converted to x’ for B, such that x is in A if and only if x’ is in B. Likewise, for B to C, convert x’ into x’’ for C such that x’ is in B iff x’’ is in C. Hence, if x is in A, then x’ is in B and x’’ is in C; likewise, if x’’ is in C, then x’ is in B and x is in A. Thus A ≤ C, since polynomials are closed under composition. Types of reductions: Local replacement: show A ≤ B by dividing an input to A into components and showing how each component can be converted to a component for B. Component design: show A ≤ B by building special components for an input of B that enforce properties needed for A, such as “choice” or “evaluate.”

CNF-SAT A Boolean formula is a formula where the variables and operations are Boolean (0/1): (a+b+¬d+e)(¬a+¬c)(¬b+c+d+e)(a+¬c+¬e) OR: +, AND: (times), NOT: ¬ SAT: Given a Boolean formula S, is S satisfiable, that is, can we assign 0’s and 1’s to the variables so that S is 1 (“true”)? Easy to see that CNF-SAT is in NP: Non-deterministically choose an assignment of 0’s and 1’s to the variables and then evaluate each clause. If they are all 1 (“true”), then the formula is satisfiable.

CNF-SAT is NP-complete
Reduce CIRCUIT-SAT to CNF-SAT. Given a Boolean circuit, make a variable for every input and every gate. Create a sub-formula for each gate, characterizing its effect. Form the formula as the output variable AND-ed with all these sub-formulas. Example: m·((a+b)↔e)·(c↔¬f)·(d↔¬g)·(e↔¬h)·((e·f)↔i)·(m↔(k·n))·… The formula is satisfiable if and only if the Boolean circuit is satisfiable. (Figure: a circuit with inputs a, b, c, d, internal gates e–k and n, and output m.)


Vertex Cover A vertex cover of graph G=(V,E) is a subset W of V, such that, for every edge (a,b) in E, a is in W or b is in W. VERTEX-COVER: Given a graph G and an integer K, does G have a vertex cover of size at most K? VERTEX-COVER is in NP: Non-deterministically choose a subset W of size K and check that every edge is covered by W.

Vertex-Cover is NP-complete
Reduce 3SAT to VERTEX-COVER. Let S be a Boolean formula in CNF with each clause having 3 literals. For each variable x, create a node for x and a node for ¬x, and connect these two (a truth-setting component). For each clause Ci = (a+b+c), create a triangle on nodes i1, i2, i3 and connect the three nodes (a clause-satisfying component).

Vertex-Cover is NP-complete
Completing the construction: connect each literal in a clause triangle to its copy in a variable pair. E.g., for a clause Ci = (¬x+y+z), connect i1, i2, i3 to ¬x, y, and z. Let n = # of variables and m = # of clauses; set K = n + 2m. G has 2n + 3m vertices.

Vertex-Cover is NP-complete
Example: (a+b+c)(¬a+b+¬c)(¬b+¬c+¬d). The graph has a vertex cover of size K = 4 + 6 = 10 iff the formula is satisfiable. (Figure: variable pairs a, ¬a, …, d, ¬d and three clause triangles.)

Proof : Vertex-Cover is NP-complete
We need to prove the following two statements: 1) Suppose there is an assignment of Boolean values that satisfies S; then there is a cover of size K. 2) Suppose the special graph has a cover of size k ≤ n+2m; then the Boolean expression is satisfiable.

Why? (satisfiable  cover)
Suppose there is an assignment of Boolean values that satisfies S. Build a subset of vertices that contains each literal that is assigned 1 by the satisfying assignment. For each clause, the satisfying assignment must assign 1 to at least one of the literals (that vertex may be shared with other clauses); include the other two vertices of the clause triangle in the vertex cover (not shared with other clauses). The cover has size n + 2m (as required).

Is What We Described a Cover?
Each edge in a truth-setting component (x, ¬x) is covered. Each edge inside a clause-satisfying component is covered, and two of the three edges incident on each clause-satisfying component are covered by the two chosen triangle vertices. An edge incident to a clause-satisfying component that is not covered by a vertex in the component must be covered by a node in the cover C labeled with a literal, since the corresponding literal is 1 (by how we chose the vertices to be covered in the clause-satisfying components). (Choose two vertices from each clause triangle, and choose the vertex with a true value from each truth-setting component.)

Why? (cover  satisfiable)
Suppose there is a cover C of size at most n + 2m. For this special graph, any cover must contain at least one vertex from each truth-setting component and two from each clause-satisfying component, so its size is at least n + 2m (hence exactly that). So one edge incident to each clause-satisfying component is not covered by a vertex in that component; this edge must be covered by its other endpoint, which is labeled with a literal. We can assign that literal the value 1; then each clause of S is satisfied, hence S is satisfied.

Why? (cover  satisfiable)
This is the complete proof. Bottom line: S is satisfiable iff G has a vertex cover of size at most n + 2m. Bottom line 2: Vertex Cover is NP-Complete

Clique
A clique of a graph G=(V,E) is a subgraph C that is fully connected (every pair in C has an edge). CLIQUE: Given a graph G and an integer K, is there a clique in G of size at least K? CLIQUE is in NP: non-deterministically choose a subset C of size K and check that every pair in C has an edge in G. (Figure: a graph with a clique of size 5.)

CLIQUE is NP-Complete G’ G Reduction from VERTEX-COVER.
A graph G has a vertex cover of size K if and only if its complement has a clique of size n−K. (Figure: G and its complement G’.)
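The reduction rests on a simple fact: W covers every edge of G exactly when no edge of G joins two vertices outside W, i.e., V−W is a clique in the complement. A minimal sketch of building the complement (the helper name is hypothetical):

```python
from itertools import combinations

def complement_edges(n, edges):
    # Edges of the complement of G on vertices 0..n-1.
    e = {frozenset(p) for p in edges}
    return [p for p in combinations(range(n), 2) if frozenset(p) not in e]
```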

Some Other NP-Complete Problems
SET-COVER: Given a collection of m sets, are there K of these sets whose union equals the union of all m sets? NP-complete by reduction from VERTEX-COVER. SUBSET-SUM: Given a set of integers and a distinguished integer K, is there a subset of the integers that sums to K?

Some Other NP-Complete Problems
0/1 Knapsack: Given a collection of items with weights and benefits, is there a subset of weight at most W and benefit at least K? NP-complete by reduction from SUBSET-SUM. Hamiltonian-Cycle: Given a graph G, is there a cycle in G that visits each vertex exactly once? NP-complete by reduction from VERTEX-COVER. Traveling Salesperson Tour: Given a complete weighted graph G, is there a cycle that visits each vertex and has total cost at most K? NP-complete by reduction from Hamiltonian-Cycle.

Beyond NP

Outline and Reading Co-NP PSpace
A language L is in co-NP iff its complement (−L) is in NP. Example: non-satisfiability, the language of all Boolean expressions that are not satisfiable. PSpace: a language is in PSPACE if there is a TM accepting it that uses only polynomial space (on an offline machine).

Some facts Whether co-NP = NP and whether P = PSPACE are open questions. PSPACE = NPSPACE (Savitch’s theorem).
P is a subset of co-NP; P = co-P. Other facts: co-NP ⊆ PSPACE ⊆ EXPTIME. The validity problem for propositional logic is co-NP-complete. Determining whether a position in a generalized checkers game is a winning position for one of the players is PSPACE-complete. ML type checking is EXPTIME-complete.

Turing Machine Write on the tape and read from it
The head can move left and right. The tape is infinite. There are rejecting and accepting states. (Figure: a finite control reading a tape containing a b a b.)

Deterministic Turing Machine
A 7-tuple (Q, Σ, Γ, δ, q0, q_accept, q_reject), where Q is the finite set of states, Σ is the input alphabet not containing the special blank symbol, Γ is the tape alphabet (with the blank in Γ and Σ ⊆ Γ), δ: Q × Γ → Q × Γ × {L, R} is the transition function, q0 ∈ Q is the start state, q_accept ∈ Q is the accept state, and q_reject ∈ Q is the reject state, where q_reject ≠ q_accept.

Nondeterministic Turing Machine
A 7-tuple as before, except that the transition function is δ: Q × Γ → P(Q × Γ × {L, R}). Q is the finite set of states, Γ is the tape alphabet, q0 is the start state, and q_accept is the accept state.

Configuration Current state: q7
Current head position on the tape: 4th cell. Current tape content: abab. (Figure: the control in state q7 over the tape a b a b.)

Configuration A configuration is represented by u q a v,
where u is the left part of the tape content, v is the right part of the tape content, a is the symbol at the head position, and q is the current state

Configuration Transition
For

Configuration Transition
For

Configuration Start configuration: q0 w, where w is the input
Accepting configuration: a configuration whose state is q_accept. Rejecting configuration: a configuration whose state is q_reject.

Accept Computation A Turing machine M accepts input w if a sequence of configurations C1, C2, …, Ck exists where 1. C1 is the start configuration of M on input w, 2. each Ci yields Ci+1, and 3. Ck is an accepting configuration.

Language recognized by TM
For a Turing machine M, L(M) denotes the set of all strings accepted by M. A language is Turing recognizable if some Turing machine recognizes it.

Turing Recognizable Turing machine M recognizes language L

Decidability A language L is Turing decidable if there is a deterministic Turing machine M such that If x is in L, then M accepts x in finite number of steps If x is not in L, then M rejects x in finite number of steps Example: {w#w| w is in {0,1}*} is Turing decidable

Turing Decidable Turing machine M decides language L

Observation If L is Turing decidable, then L is Turing recognizable

NP-completeness A language B is NP-complete if B is in NP, and
Every A in NP is polynomial time reducible to B Theorem. If B is NP-complete and B is in P, then P=NP.

SAT A Boolean formula is satisfiable if there exists an assignment to its variables that makes the formula true. SAT={<φ>| φ is a satisfiable Boolean formula}

Cook-Levin Theorem Theorem: SAT is NP-complete Proof.
1. SAT is in NP. 2. Every problem A in NP is polynomial time reducible to SAT.

Proof The start configuration is legal The final state is accept.
The movement is legal. Each cell takes one legal symbol.

Proof Variables x[i,j,s]: x[i,j,s] = 1 if cell[i,j] holds symbol s, and 0 otherwise.
n^k is the time bound for the NTM M, for a constant k. The movement is legal. NTM M is the machine accepting A.

Proof Each cell holds exactly one symbol. The symbol is selected from the set C (tape symbols and states):
Only one symbol is selected per cell, and this holds for every cell of every configuration:

Proof The start configuration is

Proof The accepting computation is reached.
The formula makes sure the accept state appears among the configuration transitions.

Proof Characterize the legal moves:
the whole move is legal if all (2 × 3) windows are legal. Characterize when one window is legal:

Proof The state transition

Boolean Formula A literal is either a boolean variable or its negation: A clause is the disjunction of several literals Conjunctive normal form is the conjunction of several clauses

Prepare for the Final Regular language and automata
Context free language Decidability Undecidability Complexity theory

Regular Language Concepts: Automata, regular expression
Skills: Design automata to accept a regular language. Disprove that a language is regular.

Context-free Language
Concepts: Context-free grammar, parsing tree Skills: Design automata to accept a context-free language Disprove a language is context-free

Decidability Concepts: Turing machine, algorithm, Church-Turing Thesis, Turing recognizable, Turing Decidable Skills: Prove a language is decidable (design algorithm) Prove a language is Turing recognizable

Undecidability Concepts: Countable, Turing undecidable, reduction
Skills: Diagonal method: prove a language is undecidable. Use reduction to prove a language is undecidable.

Complexity Concepts: Time on Turing machine PTIME(t(n))
NP-completeness Polynomial time reduction Polynomial time verifier

Complexity Skill: Prove a problem is in P Prove a problem is in NP
Use reduction to prove a problem is NP-complete.

Grade A:… B:… C: Miss exam or homework

SAT’ A conjunctive normal form is a conjunction of some clauses
SAT’={<φ>| φ is a satisfiable conjunctive normal form formula}

Cook-Levin Theorem’ Theorem: SAT’ is NP-complete
Proof. The same as the proof that SAT is NP-complete.

3SAT A 3cnf-formula is a conjunctive normal form formula with at most 3 literals in each clause. 3SAT={<φ>| φ is a satisfiable 3cnf-formula}

3SAT is NP-complete Theorem: There is a polynomial time reduction from SAT’ to 3SAT.

3SAT is NP-complete is satisfiable if and only if the following is satisfiable

3SAT is NP-complete is satisfiable if and only if the following is satisfiable

3SAT is NP-complete Convert every clause into 3cnf:

3SAT is NP-complete Conjunctive normal form
Each clause is converted into 3cnf clauses, and the formula is satisfiable if and only if the converted formula is satisfiable.
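The standard clause-splitting trick (with fresh variables z_i introduced here for illustration) converts a long clause into 3-literal clauses while preserving satisfiability:

```latex
(a_1 \lor a_2 \lor \cdots \lor a_m) \;\Longrightarrow\;
(a_1 \lor a_2 \lor z_1)\land(\lnot z_1 \lor a_3 \lor z_2)\land\cdots\land(\lnot z_{m-3} \lor a_{m-1} \lor a_m)
```

Any satisfying assignment of the left side extends to the right side by setting z_i true exactly when none of a_1, …, a_{i+1} is true, and conversely.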

Problem: Convert Circuit C to Formula f such that C is satisfiable iff f is satisfiable

Approximation Algorithms

Outline and Reading Approximation Algorithms for NP-Complete Problems
Approximation ratios Polynomial-Time Approximation Schemes 2-Approximation for Vertex Cover Approximate Scheme for Subset Sum 2-Approximation for TSP special case Log n-Approximation for Set Cover

Approximation Ratios Optimization Problems
We have some problem instance x that has many feasible “solutions”. We are trying to minimize (or maximize) some cost function c(S) for a “solution” S to x. For example, Finding a minimum spanning tree of a graph Finding a smallest vertex cover of a graph Finding a smallest traveling salesperson tour in a graph

Approximation Ratios An approximation produces a solution T
T is a k-approximation to the optimal solution OPT if c(T)/c(OPT) ≤ k (assuming a minimization problem; for a maximization problem the ratio is c(OPT)/c(T))

Polynomial-Time Approximation Schemes
A problem L has a polynomial-time approximation scheme (PTAS) if it has a polynomial-time (1+ε)-approximation algorithm for any fixed ε > 0 (ε can appear in the running time). Subset Sum has a PTAS.

Vertex Cover A vertex cover of a graph G=(V,E) is a subset W of V such that, for every (a,b) in E, a is in W or b is in W. OPT-VERTEX-COVER: Given a graph G, find a vertex cover of G of smallest size. OPT-VERTEX-COVER is NP-hard.

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.


A 2-Approximation for Vertex Cover
Every chosen edge e has both ends in C. But e must be covered by any optimal cover; hence, one end of e must be in OPT. Thus, there are at most twice as many vertices in C as in OPT; that is, C is a 2-approximation of OPT. Running time: O(m).
Algorithm VertexCoverApprox(G)
Input: graph G
Output: a vertex cover C for G
C ← empty set
H ← G
while H has edges
e ← H.removeEdge(H.anEdge())
v ← H.origin(e)
w ← H.destination(e)
C.add(v)
C.add(w)
for each f incident to v or w
H.removeEdge(f)
return C
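The pseudocode amounts to greedily taking a maximal matching and putting both endpoints of every matched edge into the cover; a minimal Python sketch over an edge list (function name hypothetical):

```python
def vertex_cover_approx(edges):
    # Pick any uncovered edge, add both endpoints; edges touching them
    # become covered, so the chosen edges form a matching.
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover
```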

Subset Sum Given a set {x1,x2,…,xn} of integers and an integer t, find a subset {y1,y2,…,yk} of {x1,x2,…,xn} such that y1+y2+…+yk = t.

Approximate Solution for Subset Sum
Find a subset {y1,y2,…,yk} of {x1,x2,…,xn} such that y1+y2+…+yk ≤ t and the ratio (y1+y2+…+yk)/(z1+z2+…+zm) is maximized, where z1+z2+…+zm is the optimal solution: z1+z2+…+zm ≤ t and t−(z1+z2+…+zm) is minimal.

Subset Sum To prove NP-complete: Prove is in NP
Verifiable in polynomial time Give a nondeterministic algorithm Reduction from a known NP-complete problem to subset sum Reduction from 3SAT to subset sum

Subset Sum is in NP
sum ← 0
A ← {x1,x2,…,xn}
for each x in A
y ← choice(A)
sum ← sum + y
if (sum = t) then success
A ← A − {y}
done
fail


Inequality

Inequality Standard formulas Assume that , we have

Scaling factor Select δ = ε/(2n). Each time, the trimmed value differs from the true value by at most a factor of 1 + ε/(2n).
After n times, (1 + ε/(2n))^n ≤ e^(ε/2) ≤ 1 + ε.

Trimming Example: L = <10, 11, 12, 15, 20, 21, 22, 23, 24, 29> and δ = 0.1.
It is trimmed to L' = <10, 12, 15, 20, 23, 29>.
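The trimming step and the resulting approximation scheme can be sketched in Python. This follows the CLRS-style PTAS with trimming parameter δ = ε/(2n); the function names are mine:

```python
def trim(L, delta):
    """Drop any element within a factor (1+delta) of the last kept
    element.  L must be sorted in increasing order."""
    trimmed = [L[0]]
    for y in L[1:]:
        if y > trimmed[-1] * (1 + delta):
            trimmed.append(y)
    return trimmed

def approx_subset_sum(xs, t, eps):
    """Return an achievable sum <= t within factor (1+eps) of optimal."""
    n = len(xs)
    L = [0]
    for x in xs:
        L = sorted(set(L + [y + x for y in L]))  # extend achievable sums
        L = trim(L, eps / (2 * n))               # keep the list short
        L = [y for y in L if y <= t]             # discard sums above t
    return max(L)
```

Trimming keeps the list length polynomial in n and 1/ε, which is what makes the whole scheme run in polynomial time.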

Reduction Goal: Reduce 3SAT to SUBSET-SUM. How:
Let Ф be a 3-CNF formula. Build an instance (S, t) of the SUBSET-SUM problem such that Ф is satisfiable if and only if there is a subset T of S whose elements sum to t. Prove the reduction is polynomial-time.

1. Algorithm Input: Ф, a 3-CNF formula with
variables x1, x2, …, xl and clauses c1, c2, …, ck. Output: (S, t) such that Ф is satisfiable iff there is a subset T of S which sums to t.

1. Algorithm (cont.) [Table: one row for each of y1, z1, y2, z2, …, yl, zl, g1, h1, …, gk, hk, plus a final target row t; one column for each variable x1, …, xl and each clause c1, …, ck. Each yi and zi has digit 1 in column xi; gi and hi have digits 1 and 2 in column ci; t has digit 1 in every variable column and digit 4 in every clause column.]

1. Algorithm (cont.) Each row represents a decimal number.
(yi, xj), (zi, xj) – 1 if i=j, 0 otherwise. (yi, cj) – 1 if cj contains the literal xi, 0 otherwise. (zi, cj) – 1 if cj contains the literal ¬xi, 0 otherwise. (gi, xj), (hi, xj) – 0. (gi, cj) – 1 if i=j, 0 otherwise; (hi, cj) – 2 if i=j, 0 otherwise. S = {y1, z1, …, yl, zl, g1, h1, …, gk, hk}. t is the last row in the table.

2. Reduction ‘’ Given a variable assignment which satisfies
Ф, find T. If xi is true then yi is in T, else zi is in T Add gi and/or hi to T such all last k digits of T to be 4.

3. Reduction ‘’ Given T a subset of S which sums to t, find a
variable assignment which satisfies Ф. If yi is in T then xi is true If zi is in T then xi is false

4. Polynomial The table has 2l+2k+1 rows and l+k columns, so its size is O((k+l)^2) = O(n^2).

Example

[Table for an example with variables x1, x2, x3 and clauses c1, …, c4: rows y1, z1, …, y3, z3, g1, h1, …, g4, h4, and target row t with digits 1 1 1 4 4 4 4.]

Special Case of the Traveling Salesperson Problem
OPT-TSP: Given a complete, weighted graph, find a minimum-cost cycle that visits each vertex. OPT-TSP is NP-hard. Special case: the edge weights satisfy the triangle inequality (which is common in many applications): w(a,b) + w(b,c) ≥ w(a,c). [Figure: triangle on vertices a, b, c with w(a,b)=5, w(b,c)=4, w(a,c)=7.]

A 2-Approximation for TSP Special Case
Algorithm TSPApprox(G)
  Input: weighted complete graph G satisfying the triangle inequality
  Output: a TSP tour T for G
  M ← a minimum spanning tree for G
  P ← an Euler tour traversal of M, starting at some vertex s
  T ← empty list
  for each vertex v in P (in traversal order)
    if this is v's first appearance in P then T.insertLast(v)
  T.insertLast(s)
  return T
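A runnable sketch of TSPApprox for points in the plane, where Euclidean distance satisfies the triangle inequality. The MST is built with Prim's algorithm, and shortcutting the Euler tour amounts to a preorder walk of the tree; function names are mine:

```python
import math
from collections import defaultdict

def tsp_approx(points):
    """2-approximation for metric TSP on points in the plane:
    build an MST (Prim's algorithm), then shortcut a preorder walk."""
    n = len(points)
    dist = lambda a, b: math.dist(points[a], points[b])
    # Prim's algorithm, rooted at vertex 0
    in_tree, parent = {0}, {}
    best = {v: (dist(0, v), 0) for v in range(1, n)}
    while len(in_tree) < n:
        v = min(best, key=lambda u: best[u][0])
        parent[v] = best[v][1]
        in_tree.add(v)
        del best[v]
        for u in best:
            if dist(v, u) < best[u][0]:
                best[u] = (dist(v, u), v)
    children = defaultdict(list)
    for v, p in parent.items():
        children[p].append(v)
    # Preorder walk of the MST = Euler tour with shortcuts
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    return tour + [0]  # return to the start vertex
```

By the proof on the next slide, the returned tour has length at most twice the optimum.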

A 2-Approximation for TSP Special Case - Proof
Deleting one edge from the optimal tour yields a spanning tree; hence |M| ≤ |OPT| (M is a minimum spanning tree). The Euler tour P visits each edge of M twice; hence |P| = 2|M|. Each time we shortcut a vertex in the Euler tour we do not increase the total length, by the triangle inequality (w(a,b) + w(b,c) ≥ w(a,c)); hence |T| ≤ |P|. Therefore |T| ≤ |P| = 2|M| ≤ 2|OPT|.

Problem Convert the following spanning tree into a path so that it provides a 2-approximation for the traveling salesman problem. Point out the edges not in the tree.

Set Cover OPT-SET-COVER: Given a collection of m sets, find the smallest subcollection whose union is the same as the union of all m sets. OPT-SET-COVER is NP-hard. The greedy approach produces an O(log n)-approximation algorithm. See § for details.
Algorithm SetCoverApprox(G)
  Input: a collection of sets S1, …, Sm
  Output: a subcollection C with the same union
  F ← {S1, S2, …, Sm}
  C ← empty set
  U ← union of S1, …, Sm
  while U is not empty
    Si ← the set in F with the most elements in U
    F.remove(Si)
    C.add(Si)
    remove all elements of Si from U
  return C
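The greedy rule above translates directly into Python; a minimal sketch (function name mine):

```python
def set_cover_approx(sets):
    """Greedy O(log n)-approximation for set cover: repeatedly take
    the set covering the most still-uncovered elements."""
    universe = set().union(*sets)
    uncovered, chosen = set(universe), []
    while uncovered:
        best = max(sets, key=lambda s: len(s & uncovered))
        chosen.append(best)
        uncovered -= best
    return chosen
```

Each iteration removes at least a 1/OPT fraction of the remaining elements, which is where the O(log n) bound comes from.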

Final Exam May 11 (Tuesday) 5:45-8:25pm

Randomized Algorithm

Get the apple: the blind monkey example

Randomized algorithm blind monkey

Randomized algorithm Randomly select 4 independent paths.
Each path has chance 1/4 of reaching the apple, so each path has chance 1−1/4 = 3/4 of getting nothing. The chance of failing on all 4 paths is (3/4)^4 = 81/256, roughly 1/3. So the monkey gets an apple from the 4 tries with probability at least 1−(1/3) = 2/3. In the worst case, the monkey gets an apple only after trying 13 paths.

Try 6 Paths The probability of failing on all 6 paths is (3/4)^6 ≈ 0.178.
So the monkey gets an apple from the 6 tries with probability at least 1−(3/4)^6 ≈ 0.822. In the worst case, the monkey gets an apple only after trying 13 paths.
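The probabilities on these two slides are simple arithmetic and can be checked directly:

```python
# Each independent path reaches the apple with probability 1/4,
# so a single path fails with probability 3/4.
fail4 = (3 / 4) ** 4   # failing on all 4 paths: 81/256, about 0.316 (roughly 1/3)
fail6 = (3 / 4) ** 6   # failing on all 6 paths: about 0.178

print(1 - fail4)       # success within 4 tries: about 0.684 (at least 2/3)
print(1 - fail6)       # success within 6 tries: about 0.822
```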

Polynomial Identity Check whether a polynomial is identically equal to zero.

Degree of polynomial The highest exponent among all monomial terms.
A single-variable polynomial can be written in the form P(x) = a_n x^n + a_(n−1) x^(n−1) + … + a_1 x + a_0 with a_n ≠ 0; such a polynomial has degree n. For example, x^3 − 2x + 1 has degree 3.

Degree of Multiple Variable Polynomial
A two-variable polynomial P(x, y) has multi-degree (d1, d2) if the highest exponent of x is d1 and the highest exponent of y is d2. The degree of a variable in a multiple-variable polynomial is its highest exponent. For example, a polynomial whose highest power of x is 30 and whose highest power of y is 100 has multi-degree (30, 100).

Fact Each nonzero single-variable polynomial of degree n has at most n different real roots.

Randomized algorithm Polynomial Identity
Assume the polynomial P(x) has degree at most n. Randomly select n+1 different real numbers r1, …, r(n+1). P(x) is identically zero iff P(r1), …, P(r(n+1)) are all zero.
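A minimal sketch of this test. Any n+1 distinct evaluation points suffice for a degree-n polynomial (by the root-count fact above), so this version simply uses 0, 1, …, n; the function name is mine:

```python
def is_zero_poly(P, n):
    """Decide whether P, a polynomial of degree at most n given as a
    callable, is identically zero by evaluating at n+1 distinct points."""
    return all(P(i) == 0 for i in range(n + 1))
```

A nonzero polynomial of degree at most n has at most n roots, so it cannot vanish at all n+1 points.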

Checking the identity of two lists
Given two lists of integers, check whether they are the same after sorting, e.g., 5, 1, 9, 1, 4 and 1, 4, 1, 9, 5.

Two algorithms 1. Sort both lists and compare. Time: O(n log n).
2. Convert each list a1, …, an into the polynomial (x−a1)(x−a2)…(x−an) and compare the two polynomials at a random point. Time: O(n).

Example For the polynomial P(x) = (x−1)(x−2)(x−3), if we let x = 1, 2, 3, then P(1) = P(2) = P(3) = 0, yet P is not identically zero.

Example For the polynomial P(x) = (x−1)(x−2), if we let x = 1, 2, 3, then P(1) = P(2) = 0, but P(3) = 2.

A two-variable polynomial can have infinitely many roots
The polynomial x^2 + y^2 − 1 has infinitely many roots: its zero set is the circle of radius one centered at the origin.


Randomized Algorithm Randomly select a point (x, y) in the plane; if the point is not on the circle boundary, then x^2 + y^2 − 1 ≠ 0.

Convert the multiple variable polynomial
A polynomial P(x, y) can be written in the form P(x, y) = P0(x) + P1(x)·y + P2(x)·y^2 + … + P(n2)(x)·y^(n2).

Convert two variables polynomial
A polynomial P(x, y) of multi-degree (n1, n2) can be written in the form P(x, y) = P0(x) + P1(x)·y + … + P(n2)(x)·y^(n2), where each Pi(x) has degree at most n1.

Convert the multiple variable polynomial
For a polynomial P(x, y) of multi-degree (n1, n2), replace y by x^(n1+1).

Convert two variables polynomial
The polynomial P(x, y) of multi-degree (n1, n2) can be written in the form P0(x) + P1(x)·y + … + P(n2)(x)·y^(n2), where each term Pi(x)·y^i has multi-degree at most (n1, n2).

Convert two variables polynomial
For the polynomial P(x, y) of multi-degree (n1, n2), replace y by x^(n1+1). This gives Q(x) = P(x, x^(n1+1)), which has degree at most n1 + n2·(n1+1).

Convert multiple variables polynomial into single variable poly.
A polynomial P(x, y) of multi-degree (n1, n2) can be converted into the single-variable polynomial Q(x) = P(x, x^(n1+1)) of degree at most n1 + n2·(n1+1). Furthermore, Q is not identically zero iff P is not identically zero, because distinct monomials x^i y^j map to distinct powers x^(i + j·(n1+1)).
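This substitution (often called Kronecker substitution) is easy to sketch, representing a two-variable polynomial as a dictionary mapping exponent pairs (i, j) to coefficients; the representation and function name are my choices:

```python
def kronecker(poly2, d1):
    """Map a two-variable polynomial {(i, j): coeff} with x-degree <= d1
    to a one-variable polynomial {exponent: coeff} via y -> x^(d1+1).
    Distinct monomials x^i y^j map to distinct exponents i + j*(d1+1),
    so the result is nonzero iff the input is nonzero."""
    out = {}
    for (i, j), c in poly2.items():
        e = i + j * (d1 + 1)
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}
```

For example, x^2·y − x·y with d1 = 2 becomes x^5 − x^4, of degree 5 = n1 + n2·(n1+1) with n1 = 2, n2 = 1.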

Randomized algorithm for multiple variables polynomial
Input: a polynomial P(x, y) of multi-degree (n1, n2). Convert it into a single-variable polynomial Q(x) of degree at most d = n1 + n2·(n1+1). Randomly select an integer z in {1, 2, …, 1000·d} and evaluate Q(z). If Q(z) ≠ 0, then P is not identically zero. If Q(z) = 0, declare P identically zero; since a nonzero Q has at most d roots, the chance of error is < 1/1000.

Big Open Problem Is there a deterministic algorithm such that,
given a polynomial, the algorithm decides whether it is identically zero in n^c steps, where c is a constant and n is the length of the input polynomial?

Degree The degree of a monomial x^a·y^b·z^c is a + b + c. For example, x^3·y^21·z^7 has degree 3+21+7 = 31.
The degree of a multi-variable polynomial is the largest degree of its monomials after sum-of-products expansion.

Schwartz-Zippel Theorem
Let P(x1, …, xn) be a nonzero multivariate polynomial of degree d. Fix a finite set S of integers, and let r1, …, rn be chosen independently and uniformly at random from S. Then, with probability at most d/|S|, P(r1, …, rn) = 0.

Proof Basis: the number of variables is one.
A nonzero polynomial of degree d has at most d different roots. So, with probability at most d/|S|, P(r1) = 0. Hypothesis: if the number of variables is n, then a nonzero polynomial of degree d vanishes at a random point with probability at most d/|S|.

Induction: the number of variables is n+1.
Write P(x1, …, x(n+1)) = Σi Pi(x1, …, xn)·x(n+1)^i, and let k be the largest index with Pk nonzero; Pk has degree at most d−k. With probability at most (d−k)/|S|, Pk(r1, …, rn) = 0. If Pk(r1, …, rn) ≠ 0, then P(r1, …, rn, x(n+1)) is a nonzero single-variable polynomial of degree k, so with probability at most k/|S| it vanishes at r(n+1). Therefore, with probability at most (d−k)/|S| + k/|S| = d/|S|, P(r1, …, r(n+1)) = 0.

Application Find a perfect matching of a bipartite graph:
convert it into a determinant, and check whether the determinant is identically zero.
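A sketch of this approach (the Edmonds-matrix test): put a distinct variable in entry (i, j) of an n×n matrix for each edge (i, j) and 0 elsewhere; the symbolic determinant is identically zero iff no perfect matching exists. Instead of expanding the determinant, we evaluate it at random values modulo a prime. The function names and parameters below are my choices:

```python
import random

def det_mod(A, p):
    """Determinant of square matrix A mod prime p, by Gaussian elimination."""
    A = [row[:] for row in A]
    n, det = len(A), 1
    for c in range(n):
        piv = next((r for r in range(c, n) if A[r][c] % p), None)
        if piv is None:
            return 0
        if piv != c:
            A[c], A[piv] = A[piv], A[c]
            det = -det
        det = det * A[c][c] % p
        inv = pow(A[c][c], p - 2, p)  # modular inverse (Fermat)
        for r in range(c + 1, n):
            f = A[r][c] * inv % p
            for k in range(c, n):
                A[r][k] = (A[r][k] - f * A[c][k]) % p
    return det % p

def has_perfect_matching(n, edges, p=(1 << 31) - 1, trials=5):
    """Randomized test for a perfect matching in a bipartite graph with
    n vertices on each side; edges are pairs (left_i, right_j)."""
    for _ in range(trials):
        A = [[0] * n for _ in range(n)]
        for i, j in edges:
            A[i][j] = random.randrange(1, p)  # random value for each edge variable
        if det_mod(A, p) != 0:
            return True   # nonzero determinant certifies a matching
    return False          # zero in every trial: almost surely no matching
```

If no matching exists the determinant is identically zero, so the answer False is always correct; if one exists, each trial misses with probability at most n/p by Schwartz-Zippel.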

Problem: Convert the multiple variable polynomial
For the polynomial P(x, y), use the previous method to convert it into a one-variable polynomial Q(x) so that P(x, y) is identically zero iff Q(x) is identically zero.