Richard Anderson Lecture 1

Presentation on theme: "Richard Anderson Lecture 1"— Presentation transcript:

1 Richard Anderson Lecture 1
CSE 421 Algorithms Richard Anderson Lecture 1

2 Course Introduction
Instructor: Richard Anderson
Teaching Assistant: Yiannis Giotas

3 All of Computer Science is the Study of Algorithms

4 Mechanics It’s on the web Weekly homework Midterm Final exam
Subscribe to the mailing list

5 Text book Algorithm Design Jon Kleinberg, Eva Tardos
Read Chapters 1 & 2

6 How to study algorithms
Zoology Mine is faster than yours is Algorithmic ideas Where algorithms apply What makes an algorithm work Algorithmic thinking

7 Introductory Problem: Stable Matching
Setting: Assign TAs to Instructors Avoid having TAs and Instructors wanting changes E.g., Prof A. would rather have student X than her current TA, and student X would rather work for Prof A. than his current instructor.

8 Formal notions Perfect matching Ranked preference lists Stability

9 Examples m1: w1 w2 m2: w2 w1 w1: m1 m2 w2: m2 m1 m1: w1 w2 m2: w1 w2

10 Examples m1: w1 w2 m2: w2 w1 w1: m2 m1 w2: m1 m2

11 Intuitive Idea for an Algorithm
m proposes to w
  If w is unmatched, w accepts
  If w is matched to m2
    If w prefers m to m2, w accepts
    If w prefers m2 to m, w rejects
Unmatched m proposes to highest w on its preference list

12 Algorithm Initially all m in M and w in W are free
While there is a free m
  w = highest w on m's list that m has not proposed to
  if w is free, then match (m, w)
  else suppose (m2, w) is matched
    if w prefers m to m2
      unmatch (m2, w)
      match (m, w)
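The proposal algorithm above can be sketched in Python. This is an illustrative implementation, not from the lecture; the dictionary-based preference lists and all names are assumptions.

```python
def stable_matching(m_prefs, w_prefs):
    """Gale-Shapley proposal algorithm (sketch of the slide's pseudocode).

    m_prefs[m] is m's preference list of w's, best first; w_prefs[w] likewise.
    Returns a dict mapping each m to his matched w.
    """
    free = list(m_prefs)                          # initially all m are free
    next_choice = {m: 0 for m in m_prefs}         # next w on m's list to try
    partner = {}                                  # w -> current m
    # Precompute each w's ranking of the m's for O(1) preference tests.
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in w_prefs.items()}
    while free:
        m = free.pop()
        w = m_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in partner:                      # w is free: match (m, w)
            partner[w] = m
        elif rank[w][m] < rank[w][partner[w]]:    # w prefers m to current m2
            free.append(partner[w])               # unmatch (m2, w)
            partner[w] = m                        # match (m, w)
        else:
            free.append(m)                        # w rejects m
    return {mm: ww for ww, mm in partner.items()}
```

On the first example from slide 9 (m1: w1 w2; m2: w1 w2; w1: m1 m2; w2: m2 m1) this produces the matching m1-w1, m2-w2.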

13 Does this work? Does it terminate? Is the result a stable matching?
Begin by identifying invariants and measures of progress
  m's proposals get worse
  Once w is matched, w stays matched
  w's partners get better

14 Claim: The algorithm stops in at most n² steps
Why?

15 The algorithm terminates with a perfect matching
Why?

16 The resulting matching is stable
Suppose m1 prefers w2 to w1, and w2 prefers m1 to m2. How could this happen?

17 Richard Anderson Lecture 2
CSE 421 Algorithms Richard Anderson Lecture 2

18 Announcements Office Hours Homework Reading Richard Anderson, CSE 582
Monday, 10:00 – 11:00; Friday, 11:00 – 12:00. Yiannis Giotas, CSE 220: Monday, 2:30 – 3:20; Friday, 2:30 – 3:20. Homework: Assignment 1, due Wednesday, October 5. Reading: Read Chapters 1 & 2

19 Stable Matching Find a perfect matching with no instabilities
Instability: (m1, w1) and (m2, w2) matched, m1 prefers w2 to w1, and w2 prefers m1 to m2

20 Intuitive Idea for an Algorithm
m proposes to w
  If w is unmatched, w accepts
  If w is matched to m2
    If w prefers m to m2, w accepts
    If w prefers m2 to m, w rejects
Unmatched m proposes to highest w on its preference list

21 Algorithm Initially all m in M and w in W are free
While there is a free m
  w = highest w on m's list that m has not proposed to
  if w is free, then match (m, w)
  else suppose (m2, w) is matched
    if w prefers m to m2
      unmatch (m2, w)
      match (m, w)

22 Does this work? Does it terminate? Is the result a stable matching?
Begin by identifying invariants and measures of progress
  m's proposals get worse
  Once w is matched, w stays matched
  w's partners get better

23 Claim: The algorithm stops in at most n² steps
Why? Each m asks each w at most once

24 The algorithm terminates with a perfect matching
Why? If m is free, there is a w that has not been proposed to

25 The resulting matching is stable
Suppose m1 prefers w2 to w1, and w2 prefers m1 to m2. How could this happen? m1 proposed to w2 before w1, and w2 rejected m1 for some m3, so w2 prefers m3 to m1. Since w's partners only get better, w2 prefers m2 to m3, and hence w2 prefers m2 to m1, a contradiction.

26 Result Simple, O(n²) algorithm to compute a stable matching Corollary
A stable matching always exists

27 A closer look Stable matchings are not necessarily fair m1: w1 w2 w3
w1: m2 m3 m1; w2: m3 m1 m2; w3: m1 m2 m3

28 Algorithm underspecified
Many different ways of picking m's to propose. Surprising result: all orderings of picking free m's give the same result. Proving this type of result: reordering argument, or prove the algorithm is computing something more specific. Show a property of the solution, so it computes a specific stable matching

29 Proposal Algorithm finds the best possible solution for M
And the worst possible for W. (m, w) is valid if (m, w) is in some stable matching. best(m): the highest ranked w for m such that (m, w) is valid. S* = {(m, best(m))}. Every execution of the proposal algorithm computes S*

30 Proof Argument by contradiction
Suppose the algorithm computes a matching S different from S*. Then there must be some m rejected by a valid partner. Let m be the first man rejected by a valid partner w, and say w rejects m for m1. Then w = best(m)

31 S+ stable matching including (m, w) Suppose m1 is paired with w1 in S+
m1 prefers w to w1. w prefers m1 to m. Hence, (m1, w) is an instability in S+. (m1, w1) is valid because it is in S+. m1 could not have been rejected by w1 at this point, because (m, w) was the first valid pair rejected.

32 The proposal algorithm is worst case for W
In S*, each w is paired with its worst valid partner. Suppose (m, w) is in S* but m is not the worst valid partner of w. Let S- be a stable matching pairing w with its worst valid partner, m1. Then (m1, w) is in S-, and w prefers m to m1, because m1 is the worst valid partner. Let (m, w1) be in S-; m prefers w to w1, because S* gives every m his best valid partner. So (m, w) is an instability in S-.

33 Could you do better? Is there a fair matching?
Design a configuration for a problem of size n: M proposal algorithm: all m's get first choice, all w's get last choice. W proposal algorithm: all w's get first choice, all m's get last choice. There is a stable matching where everyone gets their second choice

34 Key ideas Formalizing real world problem
Model: graph and preference lists Mechanism: stability condition Specification of algorithm with a natural operation Proposal Establishing termination of process through invariants and progress measure Underspecification of algorithm Establishing uniqueness of solution

35 Richard Anderson Lecture 3
CSE 421 Algorithms Richard Anderson Lecture 3

36 Classroom Presenter Project
Understand how to use Pen Computing to support classroom instruction Writing on electronic slides Distributed presentation Student submissions Classroom Presenter 2.0, started January 2002 Classroom Presenter 3.0, started June 2005

37 Key ideas for Stable Matching
Formalizing real world problem Model: graph and preference lists Mechanism: stability condition Specification of algorithm with a natural operation Proposal Establishing termination of process through invariants and progress measure Underspecification of algorithm Establishing uniqueness of solution

38 Question Goodness of a stable matching:
Add up the ranks of all the matched pairs: M-rank, W-rank. Suppose that the preferences are completely random. If there are n M's and n W's, what is the expected value of the M-rank and the W-rank?

39 What is the run time of the Stable Matching Algorithm?
Initially all m in M and w in W are free
While there is a free m
  w = highest w on m's list that m has not proposed to
  if w is free, then match (m, w)
  else suppose (m2, w) is matched
    if w prefers m to m2
      unmatch (m2, w)
      match (m, w)
Executed at most n² times

40 O(1) time per iteration Find free m Find next available w
If w is matched, determine m2 Test if w prefer m to m2 Update matching

41 What does it mean for an algorithm to be efficient?

42 Definitions of efficiency
Fast in practice Qualitatively better worst case performance than a brute force algorithm

43 Polynomial time efficiency
An algorithm is efficient if it has a polynomial run time Run time as a function of problem size Run time: count number of instructions executed on an underlying model of computation T(n): maximum run time for all problems of size at most n

44 Polynomial Time Algorithms with polynomial run time have the property that increasing the problem size by a constant factor increases the run time by at most a constant factor (depending on the algorithm)

45 Why Polynomial Time? Generally, polynomial time seems to capture the algorithms which are efficient in practice The class of polynomial time algorithms has many good, mathematical properties

46 Ignoring constant factors
Express run time as O(f(n)) Emphasize algorithms with slower growth rates Fundamental idea in the study of algorithms Basis of Tarjan/Hopcroft Turing Award

47 Why ignore constant factors?
Constant factors are arbitrary Depend on the implementation Depend on the details of the model Determining the constant factors is tedious and provides little insight

48 Why emphasize growth rates?
The algorithm with the lower growth rate will be faster for all but a finite number of cases Performance is most important for larger problem size As memory prices continue to fall, bigger problem sizes become feasible Improving growth rate often requires new techniques

49 Formalizing growth rates
T(n) is O(f(n)) [T : Z⁺ → R⁺]: for all sufficiently large n, T(n) is bounded by a constant multiple of f(n). There exist c, n₀ such that for n > n₀, T(n) < c·f(n). T(n) is O(f(n)) will be written as T(n) = O(f(n)). Be careful with this notation

50 Prove 3n² + 5n + 20 is O(n²). Choose c = 6, n₀ = 5
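The chosen witnesses can be spot-checked numerically. This is not a proof, just a finite-range sanity check of c = 6, n₀ = 5:

```python
# Spot-check the big-O witnesses: 3n^2 + 5n + 20 <= c * n^2 for all n > n0.
c, n0 = 6, 5
violations = [n for n in range(n0 + 1, 10_000) if 3*n*n + 5*n + 20 > c*n*n]
print(violations)  # -> [] (no violations in the checked range)
```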

51 Lower bounds: T(n) is Ω(f(n))
T(n) is at least a constant multiple of f(n): there exist n₀ and ε > 0 such that T(n) > ε·f(n) for all n > n₀. Warning: definitions of Ω vary. T(n) is Θ(f(n)) if T(n) is O(f(n)) and T(n) is Ω(f(n))

52 Useful Theorems If lim_{n→∞} f(n)/g(n) = c for some c > 0, then f(n) = Θ(g(n)). If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)). If f(n) is O(h(n)) and g(n) is O(h(n)), then f(n) + g(n) is O(h(n))

53 Ordering growth rates For b > 1 and x > 0,
log_b n is O(n^x). For r > 1 and d > 0, n^d is O(r^n)

54 Richard Anderson Lecture 4
CSE 421 Algorithms Richard Anderson Lecture 4

55 Announcements Homework 2, Due October 12, 1:30 pm. Reading Chapter 3
Start on Chapter 4

56 Polynomial time efficiency
An algorithm is efficient if it has a polynomial run time Run time as a function of problem size Run time: count number of instructions executed on an underlying model of computation T(n): maximum run time for all problems of size at most n

57 Polynomial Time Algorithms with polynomial run time have the property that increasing the problem size by a constant factor increases the run time by at most a constant factor (depending on the algorithm)

58 Why Polynomial Time? Generally, polynomial time seems to capture the algorithms which are efficient in practice The class of polynomial time algorithms has many good, mathematical properties

59 Constant factors and growth rates
Express run time as O(f(n)) Ignore constant factors Prefer algorithms with slower growth rates Fundamental ideas in the study of algorithms Basis of Tarjan/Hopcroft Turing Award

60 Why ignore constant factors?
Constant factors are arbitrary Depend on the implementation Depend on the details of the model Determining the constant factors is tedious and provides little insight

61 Why emphasize growth rates?
The algorithm with the lower growth rate will be faster for all but a finite number of cases Performance is most important for larger problem size As memory prices continue to fall, bigger problem sizes become feasible Improving growth rate often requires new techniques

62 Formalizing growth rates
T(n) is O(f(n)) [T : Z⁺ → R⁺]: for all sufficiently large n, T(n) is bounded by a constant multiple of f(n). There exist c, n₀ such that for n > n₀, T(n) < c·f(n). T(n) is O(f(n)) will be written as T(n) = O(f(n)). Be careful with this notation

63 Prove 3n² + 5n + 20 is O(n²). Choose c = 6, n₀ = 5

64 Lower bounds: T(n) is Ω(f(n))
T(n) is at least a constant multiple of f(n): there exist n₀ and ε > 0 such that T(n) > ε·f(n) for all n > n₀. Warning: definitions of Ω vary. T(n) is Θ(f(n)) if T(n) is O(f(n)) and T(n) is Ω(f(n))

65 Useful Theorems If lim_{n→∞} f(n)/g(n) = c for some c > 0, then f(n) = Θ(g(n)). If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)). If f(n) is O(h(n)) and g(n) is O(h(n)), then f(n) + g(n) is O(h(n))

66 Ordering growth rates For b > 1 and x > 0,
log_b n is O(n^x). For r > 1 and d > 0, n^d is O(r^n)

67 Graph Theory G = (V, E) Undirected graphs Directed graphs
Explain that there will be some review from 326.
Graph Theory: G = (V, E); V – vertices, E – edges
Undirected graphs: edges are sets of two vertices {u, v}
Directed graphs: edges are ordered pairs (u, v)
Many other flavors: edge/vertex weights, parallel edges, self loops
By default |V| = n and |E| = m

68 Definitions Path: v1, v2, …, vk, with (vi, vi+1) in E Distance
Simple Path Cycle Simple Cycle Distance Connectivity Undirected Directed (strong connectivity) Trees Rooted Unrooted

69 Graph search Find a path from s to t S = {s}
While there exists (u, v) in E with u in S and v not in S
  Pred[v] = u
  Add v to S
  if (v = t) then path found

70 Breadth first search Explore vertices in layers s in layer 1
Neighbors of s in layer 2 Neighbors of layer 2 in layer

71 Key observation All edges go between vertices on the same layer or adjacent layers

72 Bipartite A graph G = (V, E) is bipartite if V can be partitioned into V1, V2 such that all edges go between V1 and V2. A graph is bipartite if it can be two colored

73 Testing Bipartiteness
If a graph contains an odd cycle, it is not bipartite

74 Algorithm Run BFS Color odd layers red, even layers blue
If there are no edges between vertices in the same layer, the graph is bipartite. If there is an edge between two vertices of the same layer, then there is an odd cycle, and the graph is not bipartite
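The BFS two-coloring described above can be sketched as follows. This is an illustrative version; the adjacency-list representation and the function name are assumptions.

```python
from collections import deque

def two_color(adj, s=0):
    """BFS layering from the slide: alternate colors layer by layer.

    Returns a vertex -> color (0/1) map for the component of s if it is
    bipartite, or None if an edge inside a layer reveals an odd cycle.
    adj maps each vertex to a list of neighbors.
    """
    color = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in color:
                color[v] = 1 - color[u]   # next layer gets the other color
                q.append(v)
            elif color[v] == color[u]:    # edge within a layer: odd cycle
                return None
    return color
```

A triangle fails (odd cycle), while a simple path two-colors cleanly.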

75 Richard Anderson Lecture 5 Graph Theory
CSE 421 Algorithms Richard Anderson Lecture 5 Graph Theory

76 Announcements Monday’s class will be held in CSE 305 Reading Chapter 3
Start on Chapter 4

77 Graph Theory G = (V, E) Undirected graphs Directed graphs
Explain that there will be some review from 326.
Graph Theory: G = (V, E); V – vertices, E – edges
Undirected graphs: edges are sets of two vertices {u, v}
Directed graphs: edges are ordered pairs (u, v)
Many other flavors: edge/vertex weights, parallel edges, self loops
By default |V| = n and |E| = m

78 Definitions Path: v1, v2, …, vk, with (vi, vi+1) in E Distance
Simple Path Cycle Simple Cycle Distance Connectivity Undirected Directed (strong connectivity) Trees Rooted Unrooted

79 Graph search Find a path from s to t S = {s}
While there exists (u, v) in E with u in S and v not in S
  Pred[v] = u
  Add v to S
  if (v = t) then path found

80 Breadth first search Explore vertices in layers s in layer 1
Neighbors of s in layer 2 Neighbors of layer 2 in layer

81 Key observation All edges go between vertices on the same layer or adjacent layers

82 Bipartite A graph G = (V, E) is bipartite if V can be partitioned into V1, V2 such that all edges go between V1 and V2. A graph is bipartite if it can be two colored

83 Testing Bipartiteness
If a graph contains an odd cycle, it is not bipartite

84 Algorithm Run BFS Color odd layers red, even layers blue
If there are no edges between vertices in the same layer, the graph is bipartite. If there is an edge between two vertices of the same layer, then there is an odd cycle, and the graph is not bipartite

85 Corollary A graph is bipartite if and only if it has no Odd Length Cycle

86 Depth first search Explore vertices from most recently visited

87 Recursive DFS DFS(u) Mark u as “Explored”
Foreach v in Neighborhood of u
  If v is not "Explored", DFS(v)
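A direct Python transcription of this recursive DFS (illustrative; returning the explored set is an addition for convenience, not part of the slide):

```python
def dfs(adj, u, explored=None):
    """Recursive DFS as on the slide: mark u "Explored", then recurse on
    each unexplored neighbor. Returns the set of vertices reached from u.
    adj maps each vertex to a list of neighbors."""
    if explored is None:
        explored = set()
    explored.add(u)
    for v in adj[u]:
        if v not in explored:
            dfs(adj, v, explored)
    return explored
```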

88 Key observation Each edge goes between vertices on the same branch
No cross edges

89 Connected Components Undirected Graphs

90 Strongly Connected Components
There is an O(n+m) algorithm that we will not be covering Directed Graphs

91 Richard Anderson Lecture 6 Graph Theory
CSE 421 Algorithms Richard Anderson Lecture 6 Graph Theory

92 Draw a picture of David Notkin
To submit your drawing, press the button

93 Describe an algorithm to determine if an undirected graph has a cycle

94 Cycle finding Does a graph have a cycle? Find a cycle
Find a cycle through a specific vertex v Linear runtime: O(n+m)

95 Find a cycle through a vertex v
Not obvious how to do this with BFS from vertex v

96 Depth First Search Each edge goes between vertices on the same branch
No cross edges

97 A DFS from vertex v gives a simple algorithm for finding a cycle containing v
How does this algorithm work and why?

98 Connected Components Undirected Graphs

99 Directed Graphs A Strongly Connected Component is a subset of the vertices with paths between every pair of vertices.

100 Identify the Strongly Connected Components
There is an O(n+m) algorithm that we will not be covering

101 Topological Sort Given a set of tasks with precedence constraints, find a linear order of the tasks (figure: precedence graph of CSE course numbers)

102 Find a topological order for the following graph
(figure: directed graph)

103 If a graph has a cycle, there is no topological sort
Consider the first vertex on the cycle in the topological sort. It must have an incoming edge

104 Lemma: If a graph is acyclic, it has a vertex with in-degree 0
Proof: Pick a vertex v1; if it has in-degree 0, then done. If not, let (v2, v1) be an edge; if v2 has in-degree 0, then done. If not, let (v3, v2) be an edge . . . If this process continues for more than n steps, we have a repeated vertex, so we have a cycle

105 Topological Sort Algorithm
While there exists a vertex v with in-degree 0
  Output vertex v
  Delete the vertex v and all outgoing edges

106 Details for O(n+m) implementation
Maintain a list of vertices of in-degree 0 Each vertex keeps track of its in-degree Update in-degrees and list when edges are removed m edge removals at O(1) cost each
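The O(n+m) implementation described above can be sketched as follows (illustrative; the dict-of-successors representation and names are assumptions):

```python
from collections import deque

def topo_sort(adj):
    """Topological sort from slides 105-106: repeatedly output a vertex of
    in-degree 0 and delete its outgoing edges. adj maps vertex -> successors.
    Returns an order, or None if the graph has a cycle."""
    indeg = {v: 0 for v in adj}
    for u in adj:
        for v in adj[u]:
            indeg[v] += 1
    ready = deque(v for v in adj if indeg[v] == 0)  # list of in-degree-0 vertices
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in adj[u]:                 # "delete" u's outgoing edges
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order if len(order) == len(adj) else None
```

Each edge is removed once at O(1) cost, giving the O(n+m) bound from the slide.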

107 Richard Anderson Lecture 7 Greedy Algorithms
CSE 421 Algorithms Richard Anderson Lecture 7 Greedy Algorithms

108 Greedy Algorithms Solve problems with the simplest possible algorithm
The hard part: showing that something simple actually works Pseudo-definition An algorithm is Greedy if it builds its solution by adding elements one at a time using a simple rule

109 Scheduling Theory Tasks Processors Precedence constraints
Tasks: processing requirements, release times, deadlines. Processors. Precedence constraints. Objective function: jobs scheduled, lateness, total execution time

110 Interval Scheduling Tasks occur at fixed times Single processor
Maximize number of tasks completed. Tasks {1, 2, …, N} with start and finish times s(i), f(i)

111 Simple heuristics Schedule earliest available task
Instructor note: counterexamples. Schedule shortest available task. Schedule task with fewest conflicts

112 Schedule available task with the earliest deadline
Let A be the set of tasks computed by this algorithm, and let O be an optimal set of tasks. We want to show that |A| = |O| Let A = {i1, . . ., ik}, O = {j1, . . ., jm}, both in increasing order of finish times

113 Correctness Proof A always stays ahead of O: f(i_r) ≤ f(j_r)
Induction argument: f(i_1) ≤ f(j_1); if f(i_{r-1}) ≤ f(j_{r-1}), then f(i_r) ≤ f(j_r)
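The greedy rule of slide 112 can be sketched in a few lines. An illustrative version, in which "earliest deadline" is taken as earliest finish time f(i), and tasks are (start, finish) pairs; both are assumptions about representation:

```python
def interval_schedule(tasks):
    """Greedy interval scheduling: repeatedly take the available task with
    the earliest finish time. tasks is a list of (start, finish) pairs."""
    chosen = []
    last_finish = float('-inf')
    for s, f in sorted(tasks, key=lambda t: t[1]):  # order by finish time
        if s >= last_finish:        # task is available (no conflict)
            chosen.append((s, f))
            last_finish = f
    return chosen
```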

114 Scheduling all intervals
Minimize number of processors to schedule all intervals

115 Lower bound In any instance of the interval partitioning problem, the number of processors is at least the depth of the set of intervals

116 Algorithm Sort by start times
Suppose maximum depth is d, create d slots Schedule items in increasing order, assign each item to an open slot Correctness proof: When we reach an item, we always have an open slot
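The slot-assignment algorithm above can be sketched with a heap of slot finish times. An illustrative variant: instead of precomputing the depth d, it opens a new slot only when no scheduled slot is free, which uses exactly depth-many slots.

```python
import heapq

def partition_intervals(tasks):
    """Interval partitioning sketch: sort by start time and assign each item
    to an open slot (processor), reusing the slot that frees up earliest.
    tasks is a list of (start, finish) pairs; returns the number of slots."""
    finish_heap = []                  # finish times of busy slots
    slots = 0
    for s, f in sorted(tasks):        # schedule items in increasing start order
        if finish_heap and finish_heap[0] <= s:
            heapq.heapreplace(finish_heap, f)   # reuse the earliest-free slot
        else:
            heapq.heappush(finish_heap, f)      # open a new slot
            slots += 1
    return slots
```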

117 Scheduling tasks Each task has a length t_i and a deadline d_i
All tasks are available at the start. One task may be worked on at a time. All tasks must be completed. Goal: minimize maximum lateness. Lateness = f_i - d_i if f_i ≥ d_i

118 Example Show the schedule 2, 3, 4, 5 first and compute lateness
(figure: task lengths and deadlines)

119 Greedy Algorithm Earliest deadline first Order jobs by deadline
This algorithm is optimal This result may be surprising, since it ignores the job lengths
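The earliest-deadline-first schedule and its maximum lateness can be sketched as follows (illustrative; jobs as (length, deadline) pairs is an assumed representation, and lateness is reported as max(0, f_i - d_i)):

```python
def edf_max_lateness(jobs):
    """Earliest Deadline First: order jobs by deadline and run them back to
    back; return the maximum lateness. jobs = [(t_i, d_i), ...]."""
    time = 0
    max_late = 0
    for t, d in sorted(jobs, key=lambda j: j[1]):   # order jobs by deadline
        time += t                                   # f_i: finish time of job i
        max_late = max(max_late, time - d)          # lateness = f_i - d_i
    return max_late
```

Note that the job lengths influence the finish times but not the order, matching the slide's observation that the rule ignores lengths.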

120 Analysis Suppose the jobs are ordered by deadlines, d_1 ≤ d_2 ≤ … ≤ d_n. A schedule has an inversion if job j is scheduled before i where j > i. The schedule A computed by the greedy algorithm has no inversions. Let O be the optimal schedule; we want to show that A has the same maximum lateness as O

121 Proof Lemma: There is an optimal schedule with no idle time.
Lemma: There is an optimal schedule with no inversions and no idle time. Let O be an optimal schedule with k inversions; we construct a new optimal schedule with k-1 inversions

122 If there is an inversion, there is an inversion of adjacent jobs
Interchange argument: suppose there is a pair of jobs i and j, with i < j, and j scheduled immediately before i. Interchanging i and j does not increase the maximum lateness. Recall, d_i ≤ d_j

123 Summary Simple algorithms for scheduling problems Correctness proofs
Method 1: Identify an invariant and establish by induction that it holds Method 2: Show that the algorithm’s solution is as good as an optimal one by converting the optimal solution to the algorithm’s solution while preserving value

124 Greedy Algorithms: Homework Scheduling and Optimal Caching
CSE 421 Algorithms Richard Anderson Lecture 8 Greedy Algorithms: Homework Scheduling and Optimal Caching

125 Announcements Monday, October 17: class will meet in CSE 305
Tablets again! Read sections 4.4 and 4.5 before class. Lecture will be designed with the assumption that you have read the text

126 Greedy Algorithms Solve problems with the simplest possible algorithm
The hard part: showing that something simple actually works Today’s problem Homework Scheduling Optimal Caching

127 Homework Scheduling Tasks to perform Deadlines on the tasks
Freedom to schedule tasks in any order

128 Scheduling tasks Each task has a length t_i and a deadline d_i
All tasks are available at the start. One task may be worked on at a time. All tasks must be completed. Goal: minimize maximum lateness. Lateness = f_i - d_i if f_i ≥ d_i

129 Example Show the schedule 2, 3, 4, 5 first and compute lateness
(figure: task lengths and deadlines)

130 Greedy Algorithm Earliest deadline first Order jobs by deadline
This algorithm is optimal This result may be surprising, since it ignores the job lengths

131 Analysis Suppose the jobs are ordered by deadlines, d_1 ≤ d_2 ≤ … ≤ d_n. A schedule has an inversion if job j is scheduled before i where j > i. The schedule A computed by the greedy algorithm has no inversions. Let O be the optimal schedule; we want to show that A has the same maximum lateness as O

132 Proof Lemma: There is an optimal schedule with no idle time.
Lemma: There is an optimal schedule with no inversions and no idle time. Let O be an optimal schedule with k inversions; we construct a new optimal schedule with k-1 inversions

133 If there is an inversion, there is an inversion of adjacent jobs
Interchange argument: suppose there is a pair of jobs i and j, with i < j, and j scheduled immediately before i. Interchanging i and j does not increase the maximum lateness. Recall, d_i ≤ d_j

134 Result Earliest Deadline First algorithm constructs a schedule that minimizes the maximum lateness

135 Extensions What if the objective is to minimize the sum of the lateness? EDF does not seem to work If the tasks have release times and deadlines, and are non-preemptable, the problem is NP-complete What about the case with release times and deadlines where tasks are preemptable?

136 Optimal Caching Caching problem:
Maintain collection of items in local memory Minimize number of items fetched

137 Caching example A, B, C, D, A, E, B, A, D, A, C, B, D, A

138 Optimal Caching If you know the sequence of requests, what is the optimal replacement pattern? Note – it is rare to know what the requests are in advance – but we still might want to do this: Some specific applications, the sequence is known Competitive analysis, compare performance on an online algorithm with an optimal offline algorithm

139 Farthest in the future algorithm
Discard element used farthest in the future A, B, C, A, C, D, C, B, C, A, D
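The farthest-in-the-future rule can be sketched as follows. An illustrative, deliberately simple quadratic version that counts fetches (misses) on a known request sequence; names and the cache-size parameter are assumptions:

```python
def farthest_in_future(requests, cache_size):
    """Evict the cached item whose next use lies farthest in the future
    (never-used-again items count as infinitely far). Returns miss count."""
    cache = set()
    misses = 0
    for i, item in enumerate(requests):
        if item in cache:
            continue                  # hit: nothing to fetch
        misses += 1
        if len(cache) == cache_size:
            def next_use(x):
                rest = requests[i + 1:]
                return rest.index(x) if x in rest else float('inf')
            cache.remove(max(cache, key=next_use))  # evict farthest-in-future
        cache.add(item)
    return misses
```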

140 Correctness Proof Sketch Start with Optimal Solution O
Convert to Farthest in the Future Solution F-F Look at the first place where they differ Convert O to evict F-F element There are some technicalities here to ensure the caches have the same configuration . . .

141 Richard Anderson Lecture 9 Dijkstra’s algorithm
CSE 421 Algorithms Richard Anderson Lecture 9 Dijkstra’s algorithm

142 Who was Dijkstra? What were his major contributions?

143 Edsger Wybe Dijkstra

144 Single Source Shortest Path Problem
Given a graph and a start vertex s: determine the distance of every vertex from s, identify shortest paths to each vertex, and express them concisely as a "shortest paths tree". Each vertex has a pointer to a predecessor on a shortest path

145 Construct Shortest Path Tree from s
(figure: edge-weighted graph)

146 Warmup If P is a shortest path from s to v, and if t is on the path P, the segment from s to t is a shortest path between s and t. WHY? (submit an answer)

147 Careful Proof Suppose s-v is a shortest path
Suppose s-t is not a shortest path. Then replacing the s-t segment with a shorter path gives a shorter path from s to v, so s-v is not a shortest path, a contradiction. Therefore s-t is a shortest path

148 Prove if s-t not a shortest path then s-v is not a shortest path

149 Dijkstra’s Algorithm S = {}; d[s] = 0; d[v] = inf for v != s
While S != V
  Choose v in V-S with minimum d[v]
  Add v to S
  For each w in the neighborhood of v
    d[w] = min(d[w], d[v] + c(v, w))
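The algorithm can be sketched in Python, using a binary heap for the minimum-d[v] selection. An illustrative version; adjacency lists of (neighbor, cost) pairs are an assumed representation:

```python
import heapq

def dijkstra(adj, s):
    """Dijkstra's algorithm: grow the set S of finished vertices in order of
    distance. adj[u] is a list of (v, cost) pairs with non-negative costs.
    Returns the distance map d for vertices reachable from s."""
    d = {s: 0}
    heap = [(0, s)]
    done = set()                      # the set S of committed vertices
    while heap:
        dv, v = heapq.heappop(heap)
        if v in done:
            continue                  # stale heap entry
        done.add(v)
        for w, c in adj[v]:           # d[w] = min(d[w], d[v] + c(v, w))
            if w not in d or dv + c < d[w]:
                d[w] = dv + c
                heapq.heappush(heap, (d[w], w))
    return d
```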

150 Simulate Dijkstra's algorithm (starting from s) on the graph
(figure: edge-weighted graph and table of vertices added per round)

151 Dijkstra’s Algorithm as a greedy algorithm
Elements committed to the solution by order of minimum distance

152 Correctness Proof Elements in S have the correct label
Key to proof: when v is added to S, it has the correct distance label.

153 Proof Let Pv be the path of length d[v], with an edge (u,v)
Let P be some other path to v. Suppose P first leaves S on the edge (x, y). Then P = P_sx + (x, y) + P_yv. Len(P_sx) + c(x, y) ≥ d[y], and Len(P_yv) ≥ 0, so Len(P) ≥ d[y] + 0 ≥ d[v]

154 Negative Cost Edges Draw a small example with a negative cost edge and show that Dijkstra's algorithm fails on this example

155 Bottleneck Shortest Path
Define the bottleneck distance for a path to be the maximum cost edge along the path

156 Compute the bottleneck shortest paths
(figure: edge-weighted graph)

157 How do you adapt Dijkstra’s algorithm to handle bottleneck distances
Does the correctness proof still apply?
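One possible adaptation (my sketch under stated assumptions, not necessarily the lecture's intended answer) replaces the sum in the relaxation step with a maximum, so each label tracks the smallest achievable maximum edge cost on a path from s:

```python
import heapq

def bottleneck_paths(adj, s):
    """Dijkstra variant for bottleneck distances: relax with
    max(b[v], c(v, w)) instead of b[v] + c(v, w).
    adj[u] is a list of (v, cost) pairs; returns the bottleneck map b."""
    b = {s: float('-inf')}            # bottleneck of the empty path
    heap = [(b[s], s)]
    done = set()
    while heap:
        bv, v = heapq.heappop(heap)
        if v in done:
            continue
        done.add(v)
        for w, c in adj[v]:
            cand = max(bv, c)         # path bottleneck = largest edge so far
            if w not in b or cand < b[w]:
                b[w] = cand
                heapq.heappush(heap, (cand, w))
    return b
```

For example, with edges s-u of cost 6, s-v of cost 2, and v-u of cost 3, the path s-v-u has bottleneck 3, which beats the direct edge of cost 6.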

158 Richard Anderson Lecture 10 Minimum Spanning Trees
CSE 421 Algorithms Richard Anderson Lecture 10 Minimum Spanning Trees

159 Announcements Homework 3 is due now
Homework 4, due 10/26, available now Reading Chapter 5 (Sections 4.8, 4.9 will not be covered in class) Guest lecturers (10/28 – 11/4) Anna Karlin Venkat Guruswami

160 Shortest Paths Negative Cost Edges
Dijkstra’s algorithm assumes positive cost edges For some applications, negative cost edges make sense Shortest path not well defined if a graph has a negative cost cycle a 6 4 -4 4 e -3 c s -2 3 3 2 6 g b f 4 7

161 Negative Cost Edge Preview
Topological Sort can be used for solving the shortest path problem in directed acyclic graphs Bellman-Ford algorithm finds shortest paths in a graph with negative cost edges (or reports the existence of a negative cost cycle).

162 Dijkstra’s Algorithm Implementation and Runtime
S = {}; d[s] = 0; d[v] = inf for v != s
While S != V
  Choose v in V-S with minimum d[v]   (n Extract-Min heap operations)
  Add v to S
  For each w in the neighborhood of v
    d[w] = min(d[w], d[v] + c(v, w))   (m Heap Update operations)
Edge costs are assumed to be non-negative

163 Bottleneck Shortest Path
Define the bottleneck distance for a path to be the maximum cost edge along the path

164 Compute the bottleneck shortest paths
(figure: edge-weighted graph)

165 How do you adapt Dijkstra’s algorithm to handle bottleneck distances
Does the correctness proof still apply?

166 Minimum Spanning Tree
(figure: edge-weighted graph)

167 Greedy Algorithm 1 Prim’s Algorithm
Extend a tree by including the cheapest outgoing edge

168 Greedy Algorithm 2 Kruskal’s Algorithm
Add the cheapest edge that joins disjoint components
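Kruskal's rule can be sketched with a simple union-find structure. An illustrative version; vertices numbered 0..n-1 and (cost, u, v) edge tuples are assumed representations, not from the lecture.

```python
def kruskal(n, edges):
    """Kruskal's algorithm: scan edges cheapest first, adding each edge that
    joins two disjoint components. edges = [(cost, u, v), ...] over
    vertices 0..n-1. Returns the list of MST edges."""
    parent = list(range(n))

    def find(x):                      # union-find root with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for c, u, v in sorted(edges):     # consider edges in increasing cost
        ru, rv = find(u), find(v)
        if ru != rv:                  # u and v are in disjoint components
            parent[ru] = rv
            tree.append((c, u, v))
    return tree
```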

169 Greedy Algorithm 3 Reverse-Delete
Delete the most expensive edge that does not disconnect the graph

170 Why do the greedy algorithms work?
For simplicity, assume all edge costs are distinct. Let S be a subset of V, and let e = (u, v) be the minimum cost edge of E with u in S and v in V-S. Then e is in every minimum spanning tree

171 Proof Suppose T is a spanning tree that does not contain e
Add e to T, this creates a cycle The cycle must have some edge e1 = (u1, v1) with u1 in S and v1 in V-S T1 = T – {e1} + {e} is a spanning tree with lower cost Hence, T is not a minimum spanning tree

172 Optimality Proofs Prim’s Algorithm computes a MST
Kruskal’s Algorithm computes a MST

173 Reverse-Delete Algorithm
Lemma: The most expensive edge on a cycle is never in a minimum spanning tree

174 Dealing with the assumption
Force the edge weights to be distinct Add small quantities to the weights Give a tie breaking rule for equal weight edges

175 Richard Anderson Lecture 11 Minimum Spanning Trees
CSE 421 Algorithms Richard Anderson Lecture 11 Minimum Spanning Trees

176 Announcements Monday – Class in EE1 003 (no tablets)

177 Foreign Exchange Arbitrage
(figure: exchange-rate graph for USD, EUR, CAD)

178 Minimum Spanning Tree
(figure: edge-weighted graph) Temporary Assumption: edge costs are distinct

179 Why do the greedy algorithms work?
For simplicity, assume all edge costs are distinct. Let S be a subset of V, and let e = (u, v) be the minimum cost edge of E with u in S and v in V-S. Then e is in every minimum spanning tree. Or equivalently, if e is not in T, then T is not a minimum spanning tree

180 Proof Suppose T is a spanning tree that does not contain e
Add e to T; this creates a cycle. The cycle must have some edge e1 = (u1, v1) with u1 in S and v1 in V-S. T1 = T – {e1} + {e} is a spanning tree with lower cost. Hence, T is not a minimum spanning tree. Why is e lower cost than e1?

181 Optimality Proofs Prim’s Algorithm computes a MST
Kruskal’s Algorithm computes a MST

182 Reverse-Delete Algorithm
Lemma: The most expensive edge on a cycle is never in a minimum spanning tree

183 Dealing with the distinct cost assumption
Force the edge weights to be distinct Add small quantities to the weights Give a tie breaking rule for equal weight edges

184 MST Fun Facts The minimum spanning tree is determined only by the order of the edges – not by their magnitude Finding a maximum spanning tree is just like finding a minimum spanning tree

185 Divide and Conquer
Array Mergesort(Array a){
  n = a.Length;
  if (n <= 1) return a;
  b = Mergesort(a[0 .. n/2]);
  c = Mergesort(a[n/2 + 1 .. n-1]);
  return Merge(b, c);
}

186 Algorithm Analysis Cost of Merge Cost of Mergesort

187 T(n) = 2T(n/2) + cn; T(1) = c;

188 Recurrence Analysis Solution methods Unrolling recurrence
Guess and verify Plugging in to a “Master Theorem”

189 A better mergesort (?) Divide into 3 subarrays and recursively sort
Apply 3-way merge

190 T(n) = aT(n/b) + f(n)

191 T(n) = T(n/2) + cn

192 T(n) = 4T(n/2) + cn

193 T(n) = 2T(n/2) + n^2

194 T(n) = 2T(n/2) + n^(1/2)

195 Recurrences Three basic behaviors Dominated by initial case
Dominated by base case All cases equal – we care about the depth

196 Richard Anderson Lecture 12 Recurrences
CSE 421 Algorithms Richard Anderson Lecture 12 Recurrences

197 Announcements Wednesday class will meet in CSE 305.

198 Divide and Conquer Array Mergesort(Array a){ n = a.Length;
if (n <= 1) return a; b = Mergesort(a[0 .. n/2]); c = Mergesort(a[n/2+1 .. n-1]); return Merge(b, c); }

199 Algorithm Analysis Cost of Merge Cost of Mergesort

200 T(n) = 2T(n/2) + cn; T(1) = c;

201 Recurrence Analysis Solution methods Unrolling recurrence
Guess and verify Plugging in to a “Master Theorem”


203 A better mergesort (?) Divide into 3 subarrays and recursively sort
Apply 3-way merge

204 T(n) = aT(n/b) + f(n)

205 T(n) = T(n/2) + cn

206 T(n) = 4T(n/2) + cn

207 T(n) = 2T(n/2) + n^2

208 T(n) = 2T(n/2) + n^(1/2)

209 Recurrences Three basic behaviors Dominated by initial case
Dominated by base case All cases equal – we care about the depth

210 Richard Anderson Lecture 13 Divide and Conquer
CSE 421 Algorithms Richard Anderson Lecture 13 Divide and Conquer

211 Announcements Guest Lecturers Homework 5 and Homework 6 are available
Anna Karlin (10/31, 11/2) Venkat Guruswami (10/28, 11/4) Homework 5 and Homework 6 are available I’m going to try to be clear when student submissions are expected Instructor Example Student Submission

212 What is the solution to: [equation shown in slide image]
What are the asymptotic bounds for x < 1 and x > 1? Student Submission

213 Solve by unrolling T(n) = n + 3T(n/4)
Instructor Example

214 Solve by unrolling T(n) = n + 5T(n/2)
Student Submission Answer: Θ(n^(log_2 5))
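One way to sanity-check the answer numerically (a quick sketch, not from the slides): unroll the recurrence for powers of two and compare against n^(log_2 5). The ratio should settle at a constant.

```python
import math

def T(n):
    # unroll T(n) = n + 5*T(n/2) with T(1) = 1, for n a power of two
    if n <= 1:
        return 1
    return n + 5 * T(n // 2)

# the ratio T(n) / n^(log2 5) settles at a constant (it works out to 5/3)
for k in (10, 14, 18):
    n = 2 ** k
    print(round(T(n) / n ** math.log2(5), 3))   # prints 1.667 each time
```

This matches the "geometrically increasing, bottom level wins" behavior described two slides later.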

215 A non-linear additive term
Instructor Example T(n) = n^2 + 3T(n/2)

216 What you really need to know about recurrences
Work per level changes geometrically with the level Geometrically increasing (x > 1) The bottom level wins Geometrically decreasing (x < 1) The top level wins Balanced (x = 1) Equal contribution

217 Classify the following recurrences (Increasing, Decreasing, Balanced)
Student Submission T(n) = n + 5T(n/8) T(n) = n + 9T(n/8) T(n) = n^2 + 4T(n/2) T(n) = n^3 + 7T(n/2) T(n) = n^(1/2) + 3T(n/4)

218 Divide and Conquer Algorithms
Split into sub problems Recursively solve the problem Combine solutions Make progress in the split and combine stages Quicksort – progress made at the split step Mergesort – progress made at the combine step

219 Closest Pair Problem Given a set of points find the pair of points p, q that minimizes dist(p, q)

220 Divide and conquer If we solve the problem on two subsets, does it help? (Separate by median x coordinate) 1 2

221 Student Submission Packing Lemma Suppose that the minimum distance between points is at least δ; what is the maximum number of points that can be packed in a ball of radius δ?

222 Combining Solutions Suppose the minimum separation from the sub problems is δ In looking for cross set closest pairs, we only need to consider points within δ of the boundary How many cross border interactions do we need to test?

223 A packing lemma bounds the number of distances to check

224 Details Preprocessing: sort points by y Merge step
Select points in boundary zone For each point in the boundary Find highest point on the other side that is at most δ above Find lowest point on the other side that is at most δ below Compare with the points in this interval (there are at most 6)

225 Identify the pairs of points that are compared in the merge step following the recursive calls
Student Submission

226 Algorithm run time After preprocessing: T(n) = cn + 2 T(n/2)

227 Counting Inversions Count inversions on lower half
11 12 4 1 7 2 3 15 9 5 16 8 6 13 10 14 Count inversions on lower half Count inversions on upper half Count the inversions between the halves

228 Count the Inversions 11 12 4 1 7 2 3 15 9 5 16 8 6 13 10 14 9 5 16 8 6 13 10 14 11 12 4 1 7 2 3 15 11 12 4 1 7 2 3 15 9 5 16 8 6 13 10 14

229 Problem – how do we count inversions between sub problems in O(n) time?
Solution – Count inversions while merging 1 2 3 4 7 11 12 15 5 6 8 9 10 13 14 16 Standard merge algorithms – add to inversion count when an element is moved from the upper array to the solution Instructor Example
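The merge-and-count idea above, as a Python sketch (the helper name sort_and_count is mine, not from the slides): every time an element moves from the upper (right) array to the output, it is inverted with every element still remaining in the lower (left) array.

```python
def sort_and_count(a):
    # returns (sorted copy of a, number of inversions in a)
    n = len(a)
    if n <= 1:
        return list(a), 0
    b, x = sort_and_count(a[:n // 2])     # inversions within the lower half
    c, y = sort_and_count(a[n // 2:])     # inversions within the upper half
    out, i, j, cross = [], 0, 0, 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            out.append(b[i]); i += 1
        else:
            out.append(c[j]); j += 1
            cross += len(b) - i           # c[j] precedes every remaining b element
    out += b[i:] + c[j:]
    return out, x + y + cross
```

The merge still costs O(n), so the whole count runs in O(n log n).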

230 Use the merge algorithm to count inversions
1 4 11 12 2 3 7 15 5 8 9 16 6 10 13 14 Student Submission

231 Richard Anderson Lecture 18 Dynamic Programming
CSE 421 Algorithms Richard Anderson Lecture 18 Dynamic Programming


234 Announcements Wednesday class will meet in CSE 305.

235 Dynamic Programming The most important algorithmic technique covered in CSE 421 Key ideas Express solution in terms of a polynomial number of sub problems Order sub problems to avoid recomputation

236 Today - Examples Examples Optimal Billboard Placement
Text, Solved Exercise, Pg 307 Linebreaking with hyphenation Compare with HW problem 6, Pg 317 String concatenation Text, Solved Exercise, Page 309

237 Billboard Placement Maximize income in placing billboards Constraint:
(pi, vi), vi: value of placing billboard at position pi Constraint: At most one billboard every five miles Example {(6,5), (8,6), (12, 5), (14, 1)}

238 Opt[k] What are the sub problems?

239 Opt[k] = fun(Opt[0],…,Opt[k-1])
How is the solution determined from sub problems?

240 Solution
j = 0;  // j trails five miles behind the current position:
        // the last valid location for a billboard, if one is placed at P[k]
for k := 1 to n
    while (P[j] < P[k] – 5)
        j := j + 1;
    j := j – 1;
    Opt[k] = Max(Opt[k-1], V[k] + Opt[j]);
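A Python transcription of the billboard recurrence (a sketch: it uses 0-based lists for P and V and treats sites at least five miles apart as compatible; the exact boundary convention in the lecture's pseudocode may differ):

```python
def max_billboard_revenue(P, V):
    # P: sorted mile positions, V: values at those positions
    n = len(P)
    opt = [0] * (n + 1)                   # opt[k]: best revenue using sites 1..k
    for k in range(1, n + 1):
        # j = number of sites at least 5 miles behind site k
        j = k - 1
        while j >= 1 and P[j - 1] > P[k - 1] - 5:
            j -= 1
        opt[k] = max(opt[k - 1],          # either skip site k,
                     V[k - 1] + opt[j])   # or take it plus the best compatible prefix
    return opt[n]
```

On the slide's example {(6,5), (8,6), (12,5), (14,1)} this picks the billboards at miles 6 and 12 for a revenue of 10.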

241 Optimal line breaking and hyphenation
Problem: break lines and insert hyphens to make lines as balanced as possible Typographical considerations: Avoid excessive white space Limit number of hyphens Avoid widows and orphans Etc.

242 Penalty Function Pen(i, j) – penalty of starting a line at position i and ending at position j Key technical idea Number the breaks between words/syllables Opt-i-mal line break-ing and hyph-en-a-tion is com-put-ed with dy-nam-ic pro-gram-ming

243 Opt[k] What are the sub problems?

244 Opt[k] = fun(Opt[0],…,Opt[k-1])
How is the solution determined from sub problems?

245 Solution
for k := 1 to n
    Opt[k] := infinity;
    for j := 0 to k-1
        Opt[k] := Min(Opt[k], Opt[j] + Pen(j, k));

246 But what if you want to layout the text?
And not just know the minimum penalty?

247 Solution
for k := 1 to n
    Opt[k] := infinity;
    for j := 0 to k-1
        temp := Opt[j] + Pen(j, k);
        if (temp < Opt[k])
            Opt[k] := temp;
            Best[k] := j;
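The same DP with Best[] kept for layout reconstruction, sketched in Python (pen is a caller-supplied penalty function standing in for Pen(j, k); the squared-deviation penalty in the test below is a hypothetical example, not the lecture's):

```python
def line_break(n, pen):
    # n: number of break positions; pen(j, k): penalty of a line from break j to break k
    INF = float('inf')
    opt = [0] + [INF] * n
    best = [0] * (n + 1)
    for k in range(1, n + 1):
        for j in range(k):
            t = opt[j] + pen(j, k)
            if t < opt[k]:
                opt[k], best[k] = t, j    # remember which break started this line
    # recover the actual lines by walking Best[] backwards from n
    lines, k = [], n
    while k > 0:
        lines.append((best[k], k))
        k = best[k]
    return opt[n], lines[::-1]
```

This answers the "what if you want to layout the text?" slide: the backward walk over Best[] yields the (start, end) pair for each line.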

248 String approximation Given a string S, and a library of strings B = {b1, …bk}, construct an approximation of the string S by using copies of strings in B. B = {abab, bbbaaa, ccbb, ccaacc} S = abaccbbbaabbccbbccaabab

249 Formal Model Strings from B assigned to non-overlapping positions of s Strings from B may be used multiple times Cost of δ for each unmatched character in s Cost of γ for each mismatched character in s MisMatch(i, j) – number of mismatched characters of bj when it is aligned starting at position i in s.

250 Opt[k] What are the sub problems?

251 Opt[k] = fun(Opt[0],…,Opt[k-1])
How is the solution determined from sub problems?

252 Solution
for k := 1 to n
    Opt[k] = Opt[k-1] + δ;
    for j := 1 to |B|
        p = k – len(bj) + 1;
        Opt[k] = min(Opt[k], Opt[p-1] + γ · MisMatch(p, j));

253 Richard Anderson Lecture 19 Longest Common Subsequence
CSE 421 Algorithms Richard Anderson Lecture 19 Longest Common Subsequence

254 Longest Common Subsequence
C=c1…cg is a subsequence of A=a1…am if C can be obtained by removing elements from A (but retaining order) LCS(A, B): A maximum length sequence that is a subsequence of both A and B ocurranec occurrence attacggct tacgacca Instructor Example

255 Determine the LCS of the following strings
BARTHOLEMEWSIMPSON KRUSTYTHECLOWN Student Submission

256 String Alignment Problem
Align sequences with gaps Charge x if character x is unmatched Charge xy if character x is matched to character y CAT TGA AT CAGAT AGGA

257 LCS Optimization A = a1a2…am B = b1b2…bn
Opt[j, k] is the length of LCS(a1a2…aj, b1b2…bk)

258 Optimization recurrence
If aj = bk, Opt[j,k] = 1 + Opt[j-1, k-1] If aj != bk, Opt[j,k] = max(Opt[j-1,k], Opt[j,k-1])

259 Give the Optimization Recurrence for the String Alignment Problem
Charge x if character x is unmatched Charge xy if character x is matched to character y Student Submission

260 Dynamic Programming Computation

261 Write the code to compute Opt[j,k]
Student Submission

262 Storing the path information
A[1..m], B[1..n] for i := 1 to m Opt[i, 0] := 0; for j := 1 to n Opt[0,j] := 0; Opt[0,0] := 0; for i := 1 to m for j := 1 to n if A[i] = B[j] { Opt[i,j] := 1 + Opt[i-1,j-1]; Best[i,j] := Diag; } else if Opt[i-1, j] >= Opt[i, j-1] { Opt[i, j] := Opt[i-1, j], Best[i,j] := Left; } else { Opt[i, j] := Opt[i, j-1], Best[i,j] := Down; } b1…bn a1…am

263 How good is this algorithm?
Is it feasible to compute the LCS of two strings of length 100,000 on a standard desktop PC? Why or why not. Student Submission

264 Observations about the Algorithm
The computation can be done in O(m+n) space if we only need one column of the Opt values or Best Values The algorithm can be run from either end of the strings

265 Divide and Conquer Algorithm
Where does the best path cross the middle column? For a fixed i, and for each j, compute the LCS that has ai matched with bj

266 Constrained LCS
LCSi,j(A,B): the LCS such that a1,…,ai are paired with elements of b1,…,bj and ai+1,…,am are paired with elements of bj+1,…,bn Example: LCS4,3(abbacbb, cbbaa)

267 A = RRSSRTTRTS B=RTSRRSTST
Compute LCS5,1(A,B), LCS5,2(A,B),…,LCS5,9(A,B) Student Submission

268 A = RRSSRTTRTS B=RTSRRSTST
Compute LCS5,1(A,B), LCS5,2(A,B),…,LCS5,9(A,B) j left right 3 1 2 4 5 6 7 8 9 Instructor Example

269 Computing the middle column
From the left, compute LCS(a1…am/2,b1…bj) From the right, compute LCS(am/2+1…am,bj+1…bn) Add values for corresponding j’s Note – this is space efficient

270 Divide and Conquer A = a1,…,am B = b1,…,bn Find j such that Recurse
LCS(a1…am/2, b1…bj) and LCS(am/2+1…am,bj+1…bn) yield optimal solution Recurse

271 Algorithm Analysis T(m,n) = T(m/2, j) + T(m/2, n-j) + cnm

272 Prove by induction that T(m,n) <= 2cmn
Instructor Example

273 Richard Anderson Lecture 20 Space Efficient LCS
CSE 421 Algorithms Richard Anderson Lecture 20 Space Efficient LCS

274 Longest Common Subsequence
C=c1…cg is a subsequence of A=a1…am if C can be obtained by removing elements from A (but retaining order) LCS(A, B): A maximum length sequence that is a subsequence of both A and B Wednesday’s Result: O(mn) time, O(mn) space LCS Algorithm Today’s Result: O(mn) time, O(m+n) space LCS Algorithm

275 Digression: String Alignment Problem
Align sequences with gaps Charge x if character x is unmatched Charge xy if character x is matched to character y Find alignment to minimize sum of costs CAT TGA AT CAGAT AGGA

276 Optimization Recurrence for the String Alignment Problem
Charge x if character x is unmatched Charge xy if character x is matched to character y A = a1a2…am; B = b1b2…bn Opt[j, k] is the value of the minimum cost alignment a1a2…aj and b1b2…bk

277 Dynamic Programming Computation

278 Storing the path information
A[1..m], B[1..n] for i := 1 to m Opt[i, 0] := 0; for j := 1 to n Opt[0,j] := 0; Opt[0,0] := 0; for i := 1 to m for j := 1 to n if A[i] = B[j] { Opt[i,j] := 1 + Opt[i-1,j-1]; Best[i,j] := Diag; } else if Opt[i-1, j] >= Opt[i, j-1] { Opt[i, j] := Opt[i-1, j], Best[i,j] := Left; } else { Opt[i, j] := Opt[i, j-1], Best[i,j] := Down; } b1…bn a1…am

279 Observations about the Algorithm
The computation can be done in O(m+n) space if we only need one column of the Opt values or Best Values The algorithm can be run from either end of the strings

280 Divide and Conquer Algorithm
Where does the best path cross the middle column? For a fixed i, and for each j, compute the LCS that has ai matched with bj

281 Constrained LCS
LCSi,j(A,B): the LCS such that a1,…,ai are paired with elements of b1,…,bj and ai+1,…,am are paired with elements of bj+1,…,bn Example: LCS4,3(abbacbb, cbbaa)

282 A = RRSSRTTRTS B=RTSRRSTST
Compute LCS5,0(A,B), LCS5,1(A,B), LCS5,2(A,B),…,LCS5,9(A,B)

283 A = RRSSRTTRTS B=RTSRRSTST
Compute LCS5,0(A,B), LCS5,1(A,B), LCS5,2(A,B),…,LCS5,9(A,B) j left right 4 1 2 3 5 6 7 8 9

284 Computing the middle column
From the left, compute LCS(a1…am/2,b1…bj) From the right, compute LCS(am/2+1…am,bj+1…bn) Add values for corresponding j’s Note – this is space efficient

285 Divide and Conquer A = a1,…,am B = b1,…,bn Find j such that Recurse
LCS(a1…am/2, b1…bj) and LCS(am/2+1…am,bj+1…bn) yield optimal solution Recurse

286 Algorithm Analysis T(m,n) = T(m/2, j) + T(m/2, n-j) + cnm

287 Prove by induction that T(m,n) <= 2cmn

288 Shortest Path Problem Dijkstra's Single Source Shortest Paths Algorithm O(m log n) time, positive cost edges General case – handling negative edges If there exists a negative cost cycle, the shortest path is not defined Bellman-Ford Algorithm O(mn) time for graphs with negative cost edges

289 Lemma If a graph has no negative cost cycles, then the shortest paths are simple paths Shortest paths have at most n-1 edges

290 Shortest paths with a fixed number of edges
Find the shortest path from v to w with exactly k edges

291 Express as a recurrence
Opt_k(w) = min_x [ Opt_{k-1}(x) + c_{x,w} ] Opt_0(w) = 0 if w = v, and infinity otherwise

292 Algorithm, Version 1 foreach w: M[0, w] = infinity; M[0, v] = 0;
for i = 1 to n-1 foreach w: M[i, w] = min_x(M[i-1, x] + cost[x, w]);

293 Algorithm, Version 2 foreach w: M[0, w] = infinity; M[0, v] = 0;
for i = 1 to n-1 foreach w: M[i, w] = min(M[i-1, w], min_x(M[i-1, x] + cost[x, w]))

294 Algorithm, Version 3 foreach w: M[w] = infinity; M[v] = 0;
for i = 1 to n-1 foreach w: M[w] = min(M[w], min_x(M[x] + cost[x, w]))
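Version 3 in runnable Python (a sketch; P records the pointer graph used on the next slides for path reconstruction). Relaxing every edge n-1 times gives the O(mn) Bellman-Ford bound:

```python
def bellman_ford(n, edges, v):
    # edges: list of (x, w, cost); v: source; vertices are 0..n-1
    INF = float('inf')
    M = [INF] * n
    P = [None] * n                    # P[w] = x whenever M[w] was updated from x
    M[v] = 0
    for _ in range(n - 1):            # shortest simple paths have <= n-1 edges
        for x, w, c in edges:
            if M[x] + c < M[w]:
                M[w] = M[x] + c
                P[w] = x
    return M, P
```

A further pass over the edges that still finds an improvement would signal a negative cost cycle, matching the later "finding negative cost cycles" slide.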

295 Correctness Proof for Algorithm 3
Key lemma – at the end of iteration i, for all w, M[w] <= M[i, w]; Reconstructing the path: Set P[w] = x, whenever M[w] is updated from vertex x

296 If the pointer graph has a cycle, then the graph has a negative cost cycle
If P[w] = x then M[w] >= M[x] + cost(x,w) (equal immediately after the update, but M[x] could later be reduced) Let v1, v2,…,vk be a cycle in the pointer graph with (vk,v1) the last edge added Just before the update M[vj] >= M[vj+1] + cost(vj+1, vj) for j < k M[vk] > M[v1] + cost(v1, vk) Adding everything up 0 > cost(v1,v2) + cost(v2,v3) + … + cost(vk, v1)

297 Negative Cycles If the pointer graph has a cycle, then the graph has a negative cycle Therefore: if the graph has no negative cycles, then the pointer graph has no negative cycles

298 Finding negative cost cycles
What if you want to find negative cost cycles?

299 Richard Anderson Lecture 21 Shortest Path Network Flow Introduction
CSE 421 Algorithms Richard Anderson Lecture 21 Shortest Path Network Flow Introduction

300 Announcements Friday, 11/18, Class will meet in CSE 305
Reading: Section 7.4 will not be covered

301 Find the shortest paths from v with exactly k edges
[Figure: graph on vertices v, x, y, z with edge costs 7, 5, 3, 1, and -2]

302 Express as a recurrence
Opt_k(w) = min_x [ Opt_{k-1}(x) + c_{x,w} ] Opt_0(w) = 0 if w = v, and infinity otherwise

303 Algorithm, Version 1 foreach w: M[0, w] = infinity; M[0, v] = 0;
for i = 1 to n-1 foreach w: M[i, w] = min_x(M[i-1, x] + cost[x, w]);

304 Algorithm, Version 2 foreach w: M[0, w] = infinity; M[0, v] = 0;
for i = 1 to n-1 foreach w: M[i, w] = min(M[i-1, w], min_x(M[i-1, x] + cost[x, w]))

305 Algorithm, Version 3 foreach w: M[w] = infinity; M[v] = 0;
for i = 1 to n-1 foreach w: M[w] = min(M[w], min_x(M[x] + cost[x, w]))

306 Algorithm 2 vs Algorithm 3
[Figure: the example graph on v, x, y, z, with a table of M[i, w] values for each algorithm version]

307 Correctness Proof for Algorithm 3
Key lemma – at the end of iteration i, for all w, M[w] <= M[i, w] Reconstructing the path: Set P[w] = x whenever M[w] is updated from vertex x [Figure: the example graph on v, x, y, z]

308 Negative Cost Cycle example
[Figure: example graph on v, x, y, z containing a negative cost cycle, with the table of M values]

309 If the pointer graph has a cycle, then the graph has a negative cost cycle
If P[w] = x then M[w] >= M[x] + cost(x,w) (equal immediately after the update, but M[x] could later be reduced) Let v1, v2,…,vk be a cycle in the pointer graph with (vk,v1) the last edge added Just before the update M[vj] >= M[vj+1] + cost(vj+1, vj) for j < k M[vk] > M[v1] + cost(v1, vk) Adding everything up 0 > cost(v1,v2) + cost(v2,v3) + … + cost(vk, v1)

310 Negative Cycles If the pointer graph has a cycle, then the graph has a negative cycle Therefore: if the graph has no negative cycles, then the pointer graph has no negative cycles

311 Finding negative cost cycles
What if you want to find negative cost cycles?

312 Network Flow

313 Network Flow Definitions
Capacity Source, Sink Capacity Condition Conservation Condition Value of a flow

314 Flow Example u 20 10 30 s t 10 20 v

315 Residual Graph [Figure: flow graph on s, u, v, t with flows 15/20, 0/10, 15/30, 5/10, 20/20, and the corresponding residual graph]

316 Richard Anderson Lecture 22 Network Flow
CSE 421 Algorithms Richard Anderson Lecture 22 Network Flow

317 Outline Network flow definitions Flow examples Augmenting Paths
Residual Graph Ford Fulkerson Algorithm Cuts Maxflow-MinCut Theorem

318 Network Flow Definitions
Flowgraph: Directed graph with distinguished vertices s (source) and t (sink) Capacities on the edges, c(e) >= 0 Problem, assign flows f(e) to the edges such that: 0 <= f(e) <= c(e) Flow is conserved at vertices other than s and t Flow conservation: flow going into a vertex equals the flow going out The flow leaving the source is a large as possible

319 Flow Example [Figure: flow network on source s, sink t, and internal vertices a–i with edge capacities between 5 and 30]

320 Find a maximum flow [Figure: the same flow network] Student Submission

321 Find a maximum flow [Figure: the same flow network with a maximum flow marked] Discussion slide

322 Augmenting Path Algorithm
Vertices v1, v2, …, vk with v1 = s, vk = t Possible to add b units of flow between vj and vj+1 for j = 1 … k-1 [Figure: flow graph on s, u, v, t with flows 10/20, 0/10, 10/30, 5/10, 15/20]

323 Find two augmenting paths
[Figure: flow network with current flows marked on each edge, e.g. 2/5, 3/4, 1/3] Student Submission

324 Residual Graph Flow graph showing the remaining capacity
Flow graph G, Residual Graph GR G: edge e from u to v with capacity c and flow f GR: edge e’ from u to v with capacity c – f GR: edge e’’ from v to u with capacity f

325 Residual Graph [Figure: flow graph on s, u, v, t with flows 15/20, 0/10, 15/30, 5/10, 20/20, and the corresponding residual graph]

326 Build the residual graph
[Figure: flow graph on s, d, e, g, h, t with flows such as 3/5, 2/4, 1/5] Student Submission

327 Augmenting Path Lemma Let P = v1, v2, …, vk be a path from s to t with minimum capacity b in the residual graph. Then b units of flow can be added along the path P in the flow graph. [Figure: example flow graph]

328 Richard Anderson Lecture 23 Network Flow
CSE 421 Algorithms Richard Anderson Lecture 23 Network Flow

329 Review Network flow definitions Flow examples Augmenting Paths
Residual Graph Ford Fulkerson Algorithm Cuts Maxflow-MinCut Theorem

330 Network Flow Definitions
Flowgraph: Directed graph with distinguished vertices s (source) and t (sink) Capacities on the edges, c(e) >= 0 Problem: assign flows f(e) to the edges such that: 0 <= f(e) <= c(e) Flow is conserved at vertices other than s and t Flow conservation: flow going into a vertex equals the flow going out The flow leaving the source is as large as possible

331 Find a maximum flow [Figure: the flow network with a maximum flow marked, e.g. 20/20, 25/30, 15/25]

332 Residual Graph Flow graph showing the remaining capacity
Flow graph G, Residual Graph GR G: edge e from u to v with capacity c and flow f GR: edge e’ from u to v with capacity c – f GR: edge e’’ from v to u with capacity f

333 Augmenting Path Lemma Let P = v1, v2, …, vk be a path from s to t with minimum capacity b in the residual graph. Then b units of flow can be added along the path P in the flow graph. [Figure: flow graph and its residual graph]

334 Proof Add b units of flow along the path P
What do we need to verify to show we have a valid flow after we do this? Student Submission

335 Ford-Fulkerson Algorithm (1956)
while not done Construct residual graph GR Find an s-t path P in GR with capacity b > 0 Add b units of flow along P in G If the sum of the capacities of edges leaving the source is at most C, then the algorithm takes at most C iterations
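The loop above, as a compact Python sketch. It finds each augmenting path by BFS over positive-capacity residual edges (the Edmonds-Karp choice of path, rather than an arbitrary one), augments by the bottleneck b, and updates forward and back residual edges:

```python
from collections import deque

def max_flow(cap, s, t):
    # cap: dict of dicts of residual capacities, mutated in place
    flow = 0
    while True:
        # BFS for an s-t path with positive residual capacity
        prev = {s: None}
        q = deque([s])
        while q and t not in prev:
            u = q.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in prev:
                    prev[v] = u
                    q.append(v)
        if t not in prev:
            return flow                   # no augmenting path remains
        # collect the path edges and their bottleneck capacity b
        path, v = [], t
        while prev[v] is not None:
            path.append((prev[v], v)); v = prev[v]
        b = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= b                # use up forward residual capacity
            cap.setdefault(v, {})[u] = cap.get(v, {}).get(u, 0) + b  # back edge
        flow += b
```

On the earlier four-edge example (capacities 20, 10, 30, 10, 20) this returns 30.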

336 Cuts in a graph Cut: Partition of V into disjoint sets S, T with s in S and t in T. Cap(S,T) – sum of the capacities of edges from S to T Flow(S,T) – net flow out of S Sum of flows out of S minus sum of flows into S Flow(S,T) <= Cap(S,T)

337 What is Cap(S,T) and Flow(S,T)
S = {s, a, b, e, h}, T = {c, f, i, d, g, t} [Figure: the flow network with this cut marked] Student Submission

338 Minimum value cut u 10 40 10 s t 10 40 v

339 Find a minimum value cut
[Figure: flow network with edge capacities between 2 and 10] Student Submission

340 Find a minimum value cut
[Figure: the same network] Duplicate slide for discussion

341 MaxFlow – MinCut Theorem
There exists a flow which has the same value as the minimum cut Proof: Consider a flow where the residual graph has no s-t path with positive capacity Let S be the set of vertices in GR reachable from s with paths of positive capacity

342 Let S be the set of vertices in GR reachable from s with paths of positive capacity
[Figure: cut (S, T) with edges (u, v) and (v, u) crossing it] What is Cap(u,v) in GR? What can you say about Cap(u,v) and Flow(u,v) in G? What can you say about Cap(v,u) and Flow(v,u) in G? Student Submission

343 Max Flow - Min Cut Theorem
Ford-Fulkerson algorithm finds a flow where the residual graph is disconnected, hence FF finds a maximum flow. If we want to find a minimum cut, we begin by looking for a maximum flow.

344 Performance The worst case performance of the Ford-Fulkerson algorithm is horrible [Figure: graph on s, u, v, t with four edges of capacity 1000 and a middle edge (u, v) of capacity 1]

345 Better methods of finding augmenting paths
Find the maximum capacity augmenting path O(m^2 log C) time Find the shortest augmenting path O(m^2 n) Find a blocking flow in the residual graph O(mn log n)

346 Richard Anderson Lecture 24 Maxflow MinCut Theorem
CSE 421 Algorithms Richard Anderson Lecture 24 Maxflow MinCut Theorem

347 Find a minimum value cut
[Figure: the network with a maximum flow marked, e.g. 5/6, 3/10, 2/2]

348 MaxFlow – MinCut Theorem
There exists a flow which has the same value as the minimum cut Proof: Consider a flow where the residual graph has no s-t path with positive capacity Let S be the set of vertices in GR reachable from s with paths of positive capacity

349 Let S be the set of vertices in GR reachable from s with paths of positive capacity
[Figure: cut (S, T) with edges (u, v) and (v, u) crossing it] What is Cap(u,v) in GR? What can you say about Cap(u,v) and Flow(u,v) in G? What can you say about Cap(v,u) and Flow(v,u) in G?

350 Max Flow - Min Cut Theorem
Ford-Fulkerson algorithm finds a flow where the residual graph is disconnected, hence FF finds a maximum flow. If we want to find a minimum cut, we begin by looking for a maximum flow.

351 Performance The worst case performance of the Ford-Fulkerson algorithm is horrible [Figure: graph on s, u, v, t with four edges of capacity 1000 and a middle edge (u, v) of capacity 1]

352 Better methods of finding augmenting paths
Find the maximum capacity augmenting path O(m^2 log C) time Find the shortest augmenting path O(m^2 n) Find a blocking flow in the residual graph O(mn log n)

353 Bipartite Matching A graph G=(V,E) is bipartite if the vertices can be partitioned into disjoint sets X, Y A matching M is a subset of the edges such that no two edges share a vertex Find a matching as large as possible

354 Application A collection of teachers A collection of courses
And a graph showing which teachers can teach which course RA 303 PB 321 CC 326 DG 401 AK 421

355 Converting Matching to Network Flow
s t

356 Finding edge disjoint paths

357 Theorem The maximum number of edge disjoint paths equals the minimum number of edges whose removal separates s from t

358 Richard Anderson Lecture 25 Network Flow Applications
CSE 421 Algorithms Richard Anderson Lecture 25 Network Flow Applications

359 Today’s topics Problem Reductions Circulations
Lowerbound constraints on flows Survey design Airplane scheduling

360 Problem Reduction Reduce Problem A to Problem B Practical Theoretical
Convert an instance of Problem A to an instance of Problem B Use a solution of Problem B to get a solution to Problem A Practical: Use a program for Problem B to solve Problem A Theoretical: Show that Problem B is at least as hard as Problem A

361 Problem Reduction Examples
Reduce the problem of finding the Maximum of a set of integers to finding the Minimum of a set of integers
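For this exercise, the reduction can be sketched in a couple of lines: convert the Max instance to a Min instance by negating the inputs, solve Min, and convert the answer back by negating it.

```python
def maximum(xs):
    # reduce Max to Min: negate the inputs, call Min, negate the answer
    return -min(-x for x in xs)
```

This is the "convert the instance, then convert the solution back" pattern from the previous slide in its simplest form.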

362 Undirected Network Flow
Undirected graph with edge capacities Flow may go either direction along the edges (subject to the capacity constraints) u 10 20 5 s t 20 10 v

363 Circulation Problem Directed graph with capacities c(e) on the edges and demands d(v) on the vertices Find a flow function that satisfies the capacity constraints and the vertex demands: 0 <= f(e) <= c(e) f_in(v) – f_out(v) = d(v) Circulation facts: Feasibility problem d(v) < 0: source; d(v) > 0: sink Must have Σv d(v) = 0 to be feasible [Figure: example circulation network]

364 Find a circulation in the following graph
[Figure: circulation network on vertices a–h with edge capacities and vertex demands, including -4, -2, and -5]

365 Reducing the circulation problem to Network flow

366 Formal reduction Add source node s, and sink node t
For each node v with d(v) < 0, add an edge from s to v with capacity -d(v) For each node v with d(v) > 0, add an edge from v to t with capacity d(v) Find a maximum s-t flow. If this flow has value Σv cap(s,v), then the flow gives a circulation satisfying the demands

367 Circulations with lower bounds on flows on edges
Each edge has a lower bound l(e). The flow f must satisfy l(e) <= f(e) <= c(e) [Figure: example network with edge labels l(e), c(e) such as 1,2 and 2,4 and vertex demands -2, 3, -4, 3]

368 Removing lower bounds on edges
Lower bounds can be shifted to the demands [Figure: edge (b, y) with bounds 2,5 and demands -4, 3 becomes an edge of capacity 3 with demands -2, 1]

369 Formal reduction l_in(v): sum of lower bounds on incoming edges
l_out(v): sum of lower bounds on outgoing edges Create new demands d' and capacities c' on vertices and edges: d'(v) = d(v) + l_out(v) – l_in(v) c'(e) = c(e) – l(e)

370 Application Customized surveys Ask customers about products
Only ask customers about products they use Limited number of questions you can ask each customer Need to ask a certain number of customers about each product Information available about which products each customer has used

371 Details Customer C1, . . ., Cn Products P1, . . ., Pm
Si is the set of products used by Ci Customer Ci can be asked between ci and c’i questions Questions about product Pj must be asked on between pj and p’j surveys

372 Circulation construction

373 Airplane Scheduling Given an airline schedule, and starting locations for the planes, is it possible to use a fixed set of planes to satisfy the schedule? Schedule [segments] Departure, arrival pairs (cities and times)

374 Compatible segments Segments S1 and S2 are compatible if the same plane can be used on S1 and S2 End of S1 equals start of S2, and enough time for turn around between arrival and departure times End of S1 is different from the start of S2, but there is enough time to fly between cities

375 Graph representation Each segment, Si, is represented as a pair of vertices (di, ai), for departure and arrival, with an edge between them. Add an edge from ai to dj if Si is compatible with Sj.

376 Setting up a flow problem
[Figure: circulation construction, edge (di, ai) with bounds 1,1, connection edges (ai, dj) with bounds 0,1, and demands -1 and 1 at the plane nodes Pi and P'i]

377 Result The planes can satisfy the schedule iff there is a feasible circulation

378 Richard Anderson Lecture 26 Open Pit Mining
CSE 421 Algorithms Richard Anderson Lecture 26 Open Pit Mining

379 Today’s topics Open Pit Mining Problem Task Selection Problem
Reduction to Min cut problem

380 Open Pit Mining Each unit of earth has a profit (possibly negative)
Getting to the ore below the surface requires removing the dirt above Test drilling gives reasonable estimates of costs Plan an optimal mining operation

381 Determine an optimal mine
[Figure: grid of profit values for the open pit mine, with positive values (ore) below negative surface values] Student Submission

382 Generalization Precedence graph G=(V,E) Each v in V has a profit p(v)
A set F is feasible if whenever w is in F and (v,w) is in E, then v is in F. Find a feasible set that maximizes the profit [Figure: example precedence graph with profits -4, 6, -2, -3, 5, -1, -10, 4]

383 Min cut algorithm for profit maximization
Construct a flow graph where the minimum cut identifies a feasible set that maximizes profit

384 Precedence graph construction
Precedence graph G=(V,E) Each edge in E has infinite capacity Add vertices s, t Each vertex in V is attached to s and t with finite capacity edges

385 Show a finite value cut with at least two vertices on each side of the cut
Student Submission

386 The sink side of the cut is a feasible set
A finite cut has no infinite capacity (precedence) edges from S to T If a vertex is in T, all of its ancestors are in T

387 Setting the costs
If p(v) > 0: cap(v,t) = p(v), cap(s,v) = 0 If p(v) < 0: cap(s,v) = -p(v), cap(v,t) = 0 If p(v) = 0: cap(s,v) = cap(v,t) = 0 [Figure: example flow graph with vertex profits 3, 1, -3, -1, 2]

388 Enumerate all finite s,t cuts and show their capacities
[Figure: example flow graph with vertex profits 2, 2, -2, 1, 2, -2, 1, 1] Student Submission

389 Minimum cut gives optimal solution Why?
[Figure: the same example flow graph]

390 Computing the Profit Cost(W) = Σ{w in W : p(w) < 0} –p(w)
Benefit(W) = Σ{w in W : p(w) > 0} p(w) Profit(W) = Benefit(W) – Cost(W) Maximum cost and benefit: C = Cost(V), B = Benefit(V)

391 Express Cap(S,T) in terms of B, C, Cost(T), Benefit(T), and Profit(T)
Student Submission

392 Cap(S,T) = B – Profit(T) Cap(S,T) = Cost(T) + Ben(S) = Cost(T) + Ben(S) + Ben(T) – Ben(T) = B + Cost(T) – Ben(T) = B – Profit(T)

393 Summary Construct flow graph Finite cut gives a feasible set of tasks
Infinite capacity for precedence edges Capacities to source/sink based on cost/benefit Finite cut gives a feasible set of tasks Minimizing the cut corresponds to maximizing the profit Find minimum cut with a network flow algorithm

394 Richard Anderson Lecture 27 Network Flow Applications
CSE 421 Algorithms Richard Anderson Lecture 27 Network Flow Applications

395 Today’s topics More network flow reductions Airplane scheduling
Image segmentation Baseball elimination

396 Airplane Scheduling Given an airline schedule, and starting locations for the planes, is it possible to use a fixed set of planes to satisfy the schedule? Schedule [segments] Departure, arrival pairs (cities and times) Approach Construct a circulation problem where paths of flow give segments flown by each plane

397 Compatible segments Segments S1 and S2 are compatible if the same plane can be used on S1 and S2 End of S1 equals start of S2, and enough time for turn around between arrival and departure times End of S1 is different from the start of S2, but there is enough time to fly between cities

398 Graph representation Each segment, Si, is represented as a pair of vertices (di, ai), for departure and arrival, with an edge between them. Add an edge from ai to dj if Si is compatible with Sj.

399 Setting up a flow problem
[Figure: circulation construction, edge (di, ai) with bounds 1,1, connection edges (ai, dj) with bounds 0,1, and demands -1 and 1 at the plane nodes Pi and P'i]

400 Result The planes can satisfy the schedule iff there is a feasible circulation

401 Image Segmentation Separate foreground from background


403 Image analysis ai: value of assigning pixel i to the foreground
bi: value of assigning pixel i to the background pij: penalty for assigning i to the foreground and j to the background, or vice versa A: foreground, B: background Q(A,B) = Σ{i in A} ai + Σ{j in B} bj – Σ{(i,j) in E, i in A, j in B} pij

404 Pixel graph to flow graph
s t

405 Mincut Construction [Figure: edge (s, v) with capacity av, edge (v, t) with capacity bv, and opposite edges with capacities puv and pvu between neighboring pixels u and v]

406 Baseball elimination Can the Dung Beetles win the league?
Remaining games: AB, AC, AD, AD, AD, BC, BC, BC, BD, CD W L Ants 4 2 Bees Cockroaches 3 Dung Beetles 1 5

407 Baseball elimination Can the Fruit Flies win the league?
Remaining games: AC, AD, AD, AD, AF, BC, BC, BC, BC, BC, BD, BE, BE, BE, BE, BF, CE, CE, CE, CF, CF, DE, DF, EF, EF W L Ants 17 12 Bees 16 7 Cockroaches Dung Beetles 14 13 Earthworms 10 Fruit Flies 15

408 Assume Fruit Flies win remaining games
Fruit Flies are tied for first place if no team wins more than 19 games Allowable wins Ants (2) Bees (3) Cockroaches (3) Dung Beetles (5) Earthworms (5) 18 games to play AC, AD, AD, AD, BC, BC, BC, BC, BC, BD, BE, BE, BE, BE, CE, CE, CE, DE W L Ants 17 13 Bees 16 8 Cockroaches 9 Dung Beetles 14 Earthworms 12 Fruit Flies 19 15
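One way to check the elimination claim is to build the flow network from these slides (source to game nodes to team nodes to sink) and run any max-flow routine; the Edmonds-Karp implementation and node naming below are my own sketch. If the max flow is less than 18, the remaining games cannot all be assigned within the allowable win totals.

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp max flow on a nested capacity map cap[u][v]."""
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # collect the path edges and augment by the bottleneck capacity
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
        flow += bottleneck

# multiplicities of the 18 remaining games listed on the slide
games = {("A", "C"): 1, ("A", "D"): 3, ("B", "C"): 5, ("B", "D"): 1,
         ("B", "E"): 4, ("C", "E"): 3, ("D", "E"): 1}
allowable = {"A": 2, "B": 3, "C": 3, "D": 5, "E": 5}

cap = defaultdict(lambda: defaultdict(int))
for (x, y), n in games.items():
    g = x + y                          # one node per game pair
    cap["s"][g] += n                   # source supplies that many games
    cap[g][x] = cap[g][y] = 10 ** 9    # each game is won by one of the two teams
for team, wins in allowable.items():
    cap[team]["t"] = wins              # at most this many additional wins

eliminated = max_flow(cap, "s", "t") < sum(games.values())
print(eliminated)  # True: the 18 games cannot all be assigned
```

Here the total sink capacity is exactly 18, so full flow would force the Dung Beetles to win all five of their games, leaving the Ants unable to reach their two allowed wins.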

409 Remaining games AC, AD, AD, AD, BC, BC, BC, BC, BC, BD, BE, BE, BE, BE, CE, CE, CE, DE [Diagram: flow network with source s, a node for each game pair (AC, AD, BC, BD, BE, CE, DE), a node for each team (A, B, C, D, E), and sink T.]

410 Network flow applications summary
Bipartite Matching Disjoint Paths Airline Scheduling Survey Design Baseball Elimination Project Selection Image Segmentation

411 Richard Anderson Lecture 28 NP Completeness
CSE 421 Algorithms Richard Anderson Lecture 28 NP Completeness

412 Announcements Final Exam HW 10, due Friday, 1:30 pm This week’s topic
Monday, December 12, 2:30-4:20 pm, EE1 003 Closed book, closed notes Practice final and answer key available HW 10, due Friday, 1:30 pm This week’s topic NP-completeness Reading: Skim the chapter, and pay more attention to particular points emphasized in class

413 Algorithms vs. Lower bounds
Algorithmic Theory What we can compute I can solve problem X with resources R Proofs almost always give an algorithm that meets the resource bounds

414 Lower bounds How do you show that something can’t be done?
Establish evidence that a whole bunch of smart people can’t do something!

415 Theory of NP Completeness
Most significant mathematical theory associated with computing

416 The Universe [Diagram: the problem universe, with P contained in NP and the NP-Complete problems inside NP.]

417 Polynomial Time P: Class of problems that can be solved in polynomial time Corresponds with problems that can be solved efficiently in practice Right class to work with “theoretically”

418 Polynomial time reductions
Y Polynomial Time Reducible to X Solve problem Y with a polynomial number of computation steps and a polynomial number of calls to a black box that solves X Notation: Y <P X

419 Lemma Suppose Y <P X. If X can be solved in polynomial time, then Y can be solved in polynomial time.

420 Lemma Suppose Y <P X. If Y cannot be solved in polynomial time, then X cannot be solved in polynomial time.

421 Sample Problems Independent Set
Graph G = (V, E), a subset S of the vertices is independent if there are no edges between vertices in S 1 2 3 5 4 6 7

422 Vertex Cover Vertex Cover
Graph G = (V, E), a subset S of the vertices is a vertex cover if every edge in E has at least one endpoint in S 1 2 3 5 4 6 7

423 Decision Problems Theory developed in terms of yes/no problems
Independent set Given a graph G and an integer K, does G have an independent set of size at least K Vertex cover Given a graph G and an integer K, does the graph have a vertex cover of size at most K.

424 IS <P VC Lemma: A set S is independent iff V-S is a vertex cover
To reduce IS to VC, we show that we can determine if a graph has an independent set of size K by testing for a vertex cover of size n - K
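The lemma is easy to sanity-check by brute force on a small made-up graph (a 4-cycle here):

```python
from itertools import combinations

def is_independent(edges, S):
    """No edge has both endpoints in S."""
    return all(not (u in S and v in S) for u, v in edges)

def is_vertex_cover(edges, S):
    """Every edge has at least one endpoint in S."""
    return all(u in S or v in S for u, v in edges)

V = {1, 2, 3, 4}
E = [(1, 2), (2, 3), (3, 4), (4, 1)]   # a 4-cycle

# the lemma: S is independent exactly when V - S is a vertex cover
assert all(
    is_independent(E, set(S)) == is_vertex_cover(E, V - set(S))
    for k in range(len(V) + 1)
    for S in combinations(V, k)
)
print("lemma verified on all subsets")
```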

425 Satisfiability Given a boolean formula, does there exist a truth assignment to the variables that makes the expression true?

426 Definitions Boolean variable: x1, …, xn Term: xi or its negation !xi
Clause: disjunction of terms t1 or t2 or … tj Problem: Given a collection of clauses C1, …, Ck, does there exist a truth assignment that makes all the clauses true? Example: (x1 or !x2), (!x1 or !x3), (x2 or !x3)
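A small sketch of the satisfaction test for this clause form; the (variable, is_positive) term encoding is my own choice.

```python
def satisfies(clauses, assignment):
    """True if every clause has at least one true term."""
    return all(
        any(assignment[var] == positive for var, positive in clause)
        for clause in clauses
    )

# (x1 or !x2), (!x1 or !x3), (x2 or !x3) -- the example from the slide
clauses = [[("x1", True), ("x2", False)],
           [("x1", False), ("x3", False)],
           [("x2", True), ("x3", False)]]

print(satisfies(clauses, {"x1": True, "x2": True, "x3": False}))  # True
```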

427 3-SAT Each clause has exactly 3 terms Variables x1, . . ., xn
Clauses C1, . . ., Ck Cj = (tj1 or tj2 or tj3) Fact: Every instance of SAT can be converted in polynomial time to an equivalent instance of 3-SAT

428 Theorem: 3-SAT <P IS Build a graph that represents the 3-SAT instance Vertices yi, zi with edges (yi, zi) Truth setting Vertices uj1, uj2, and uj3 with edges (uj1, uj2), (uj2,uj3), (uj3, uj1) Truth testing Connections between truth setting and truth testing: If tjl = xi, then put in an edge (ujl, zi) If tjl = !xi, then put in an edge (ujl, yi)
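The construction above can be sketched as follows; the vertex naming and the (variable, is_positive) term encoding are my own choices.

```python
def three_sat_to_graph(clauses, variables):
    edges = set()
    for x in variables:
        # truth-setting gadget: ("y", x) means "x true", ("z", x) means "x false"
        edges.add((("y", x), ("z", x)))
    for j, clause in enumerate(clauses):
        u = [("u", j, l) for l in range(3)]
        # truth-testing triangle: an IS can pick at most one term per clause
        edges |= {(u[0], u[1]), (u[1], u[2]), (u[2], u[0])}
        for l, (x, positive) in enumerate(clause):
            # consistency edge: a term vertex conflicts with the opposite literal
            edges.add((u[l], ("z", x) if positive else ("y", x)))
    return edges

# C1 = (x1 or x2 or !x3), the first clause of the example slide
edges = three_sat_to_graph([[("x1", True), ("x2", True), ("x3", False)]],
                           ["x1", "x2", "x3"])
print(len(edges))  # 3 gadget edges + 3 triangle edges + 3 consistency edges = 9
```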

429 Example C1 = x1 or x2 or !x3 C2 = x1 or !x2 or x3 C3 = !x1 or x2 or x3

430 Thm: 3-SAT instance is satisfiable iff there is an IS of size n + k

431 Richard Anderson Lecture 29 NP-Completeness
CSE 421 Algorithms Richard Anderson Lecture 29 NP-Completeness

432 Sample Problems Independent Set
Graph G = (V, E), a subset S of the vertices is independent if there are no edges between vertices in S 1 2 3 5 4 6 7

433 Satisfiability Given a boolean formula, does there exist a truth assignment to the variables that makes the expression true?

434 Definitions Boolean variable: x1, …, xn Term: xi or its negation !xi
Clause: disjunction of terms t1 or t2 or … tj Problem: Given a collection of clauses C1, …, Ck, does there exist a truth assignment that makes all the clauses true? Example: (x1 or !x2), (!x1 or !x3), (x2 or !x3)

435 3-SAT Each clause has exactly 3 terms Variables x1, . . ., xn
Clauses C1, . . ., Ck Cj = (tj1 or tj2 or tj3) Fact: Every instance of SAT can be converted in polynomial time to an equivalent instance of 3-SAT

436 Theorem: 3-SAT <P IS Build a graph that represents the 3-SAT instance Vertices yi, zi with edges (yi, zi) Truth setting Vertices uj1, uj2, and uj3 with edges (uj1, uj2), (uj2,uj3), (uj3, uj1) Truth testing Connections between truth setting and truth testing: If tjl = xi, then put in an edge (ujl, zi) If tjl = !xi, then put in an edge (ujl, yi)

437 Example C1 = x1 or x2 or !x3 C2 = x1 or !x2 or x3 C3 = !x1 or x2 or x3

438 Thm: 3-SAT instance is satisfiable iff there is an IS of size n + k

439 What is NP? Problems solvable in non-deterministic polynomial time . . . Problems where “yes” instances have polynomial time checkable certificates

440 Certificate examples Independent set of size K Satisfiable formula
The Independent Set Satisfiable formula Truth assignment to the variables Hamiltonian Circuit Problem A cycle including all of the vertices K-coloring a graph Assignment of colors to the vertices

441 NP-Completeness A problem X is NP-complete if
X is in NP For every Y in NP, Y <P X X is a “hardest” problem in NP If X is NP-Complete, Z is in NP, and X <P Z, then Z is NP-Complete

442 Cook’s Theorem The Circuit Satisfiability Problem is NP-Complete

443 Garey and Johnson

444 History Jack Edmonds Steve Cook Dick Karp Leonid Levin Identified NP
Cook’s Theorem – NP-Completeness Dick Karp Identified “standard” collection of NP-Complete Problems Leonid Levin Independent discovery of NP-Completeness in USSR

445 Populating the NP-Completeness Universe
Circuit Sat <P 3-SAT 3-SAT <P Independent Set Independent Set <P Vertex Cover 3-SAT <P Hamiltonian Circuit Hamiltonian Circuit <P Traveling Salesman 3-SAT <P Integer Linear Programming 3-SAT <P Graph Coloring 3-SAT <P Subset Sum Subset Sum <P Scheduling with Release times and deadlines

446 Hamiltonian Circuit Problem
Hamiltonian Circuit – a simple cycle including all the vertices of the graph

447 Traveling Salesman Problem
Minimum cost tour highlighted Traveling Salesman Problem Given a complete graph with edge weights, determine the shortest tour that includes all of the vertices (visit each vertex exactly once, and get back to the starting point) 3 7 7 2 2 5 4 1 1 4

448 Thm: HC <P TSP
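The reduction behind this theorem gives each edge of G weight 1 and each non-edge weight 2, so that G has a Hamiltonian circuit exactly when the complete graph has a tour of total cost n. A brute-force sketch (illustration only, since the tour search is exponential in n):

```python
from itertools import permutations

def has_hamiltonian_circuit(n, edges):
    """HC via the TSP reduction: edges get weight 1, non-edges weight 2."""
    E = {frozenset(e) for e in edges}
    w = lambda u, v: 1 if frozenset((u, v)) in E else 2
    best = min(
        sum(w(tour[i], tour[(i + 1) % n]) for i in range(n))
        for tour in permutations(range(n))
    )
    # cost exactly n is achievable only by a tour using graph edges alone
    return best == n

# a 4-cycle has a Hamiltonian circuit; a star does not
print(has_hamiltonian_circuit(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # True
print(has_hamiltonian_circuit(4, [(0, 1), (0, 2), (0, 3)]))          # False
```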

449 Richard Anderson Lecture 30 NP-Completeness
CSE 421 Algorithms Richard Anderson Lecture 30 NP-Completeness

450 NP-Completeness A problem X is NP-complete if
X is in NP For every Y in NP, Y <P X X is a “hardest” problem in NP To show X is NP complete, we must show how to reduce every problem in NP to X

451 Cook’s Theorem The Circuit Satisfiability Problem is NP-Complete
Given a boolean circuit, determine if there is an assignment of boolean values to the input to make the output true

452 Circuit SAT Satisfying assignment x1 = T, x2 = F, x3 = F
x4 = T, x5 = T [Diagram: a boolean circuit of AND, OR, and NOT gates over inputs x1, x2, x3, x4, x5.]

453 Proof of Cook’s Theorem
Reduce an arbitrary problem Y in NP to X Let A be a non-deterministic polynomial time algorithm for Y Convert A to a circuit, so that Y is a Yes instance if and only if the circuit is satisfiable

454 Garey and Johnson

455 History Jack Edmonds Steve Cook Dick Karp Leonid Levin Identified NP
Cook’s Theorem – NP-Completeness Dick Karp Identified “standard” collection of NP-Complete Problems Leonid Levin Independent discovery of NP-Completeness in USSR

456 Populating the NP-Completeness Universe
Circuit Sat <P 3-SAT 3-SAT <P Independent Set Independent Set <P Vertex Cover 3-SAT <P Hamiltonian Circuit Hamiltonian Circuit <P Traveling Salesman 3-SAT <P Integer Linear Programming 3-SAT <P Graph Coloring 3-SAT <P Subset Sum Subset Sum <P Scheduling with Release times and deadlines

457 Hamiltonian Circuit Problem
Hamiltonian Circuit – a simple cycle including all the vertices of the graph

458 Traveling Salesman Problem
Minimum cost tour highlighted Traveling Salesman Problem Given a complete graph with edge weights, determine the shortest tour that includes all of the vertices (visit each vertex exactly once, and get back to the starting point) 3 7 7 2 2 5 4 1 1 4

459 Thm: HC <P TSP

460 Number Problems Subset sum problem Subset sum problem is NP-Complete
Given natural numbers w1, …, wn and a target number W, is there a subset that adds up to exactly W? Subset sum problem is NP-Complete Subset Sum problem can be solved in O(nW) time
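The O(nW) algorithm mentioned here is the standard subset-sum dynamic program; a sketch:

```python
def subset_sum(weights, W):
    """reachable[w] is True if some subset of the numbers seen so far sums to w."""
    reachable = [True] + [False] * W
    for x in weights:
        for w in range(W, x - 1, -1):  # iterate downward so each x is used once
            reachable[w] = reachable[w] or reachable[w - x]
    return reachable[W]

print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True: 4 + 5 = 9
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # False
```

The table has n rows of W + 1 entries, giving the O(nW) time bound; the next slide's point is that W can be exponential in the input size.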

461 Subset sum problem The reduction to show Subset Sum is NP-complete involves numbers with n digits. In that case, the O(nW) algorithm is an exponential time and space algorithm

462 Course summary What did we cover in the last 30 lectures?
Stable Matching Models of computation and efficiency Basic graph algorithms BFS, Bipartiteness, SCC, Cycles, Topological Sort Greedy Algorithms Interval Scheduling, HW Scheduling Correctness proofs Dijkstra’s Algorithm Minimum Spanning Trees Recurrences Divide and Conquer Algorithms Closest Pair, FFT Dynamic Programming Weighted interval scheduling, subset sum, knapsack, longest common subsequence, shortest paths Network Flow Ford Fulkerson, Maxflow/mincut, Applications NP-Completeness

