# CSE 830: Design and Theory of Algorithms


CSE 830: Design and Theory of Algorithms
Dr. Eric Torng Welcome to CSE 830, the Design and Theory of Algorithms. My name is Charles, and I will be your instructor for this class. The goal of the class is to teach you the fundamentals of creating and analyzing computer algorithms. Before we get to the actual material, however, I need to go through a few administrative details. While I take attendance, I have some handouts for you to be reading over.

Overview

- Administrative stuff…
- Lecture schedule overview
- What is an algorithm? What is a problem?
- How we study algorithms
- Some examples

What is an Algorithm? According to the Academic American Encyclopedia:
An algorithm is a procedure for solving a usually complicated problem by carrying out a precisely determined sequence of simpler, unambiguous steps. Such procedures were originally used in mathematical calculations (the name is a variant of algorism, which originally meant the Arabic numerals and then "arithmetic") but are now widely used in computer programs and in programmed learning.

What is an Algorithm? Algorithms are the ideas behind computer programs. An algorithm is the thing that stays the same whether the program is in C++ running on a Cray in New York or is in BASIC running on a Macintosh in Katmandu! To be interesting, an algorithm has to solve a general, specified problem.

What is a problem?

Definition: A mapping/relation between a set of input instances (domain) and an output set (range).

Problem Specification:

- Specify what a typical input instance is
- Specify what the output should be in terms of the input instance

Example: Sorting

- Input: A sequence of $n$ numbers $a_1, \dots, a_n$
- Output: The permutation (reordering) $a'_1, \dots, a'_n$ of the input sequence such that $a'_1 \le a'_2 \le \dots \le a'_n$
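The sorting specification can be read as a check on any proposed output. The following is a minimal sketch; the function name `is_valid_sort_output` is hypothetical and chosen only for illustration.

```python
from collections import Counter

def is_valid_sort_output(inp, out):
    """Return True if `out` satisfies the sorting specification for `inp`."""
    # The output must be a permutation (reordering) of the input.
    same_elements = Counter(inp) == Counter(out)
    # The output must be in nondecreasing order.
    nondecreasing = all(out[k] <= out[k + 1] for k in range(len(out) - 1))
    return same_elements and nondecreasing
```

Note that the check says nothing about how the output was produced; it only captures the relation between input instances and acceptable outputs.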

Types of Problems

- Search: Find X in the input satisfying property Y
- Structuring: Transform input X to satisfy property Y
- Construction: Build X satisfying Y
- Optimization: Find the best X satisfying property Y
- Decision: Does X satisfy Y?
- Adaptive: Maintain property Y over time

Next… go over schedule!

Two desired properties of algorithms

- Correctness: Always provides correct output when presented with legal input
- Efficiency: What does efficiency mean?

Example: Odd or Even?

What is the best way to determine if a number is odd or even? Some of the algorithms we can try are:

- Count up to that number from one and alternate naming each number as odd or even.
- Factor the number and see if there are any twos in the factorization.
- Keep a lookup table of all numbers from 0 to the maximum integer.
- Look at the last bit (or digit) of the number.
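To make the contrast concrete, here is a small sketch of the first and last approaches for non-negative integers. The function names are illustrative only. Both return the same answer, but the first takes roughly n steps while the second takes one.

```python
def is_even_by_counting(n):
    """Count up to n, alternating the odd/even label at each number."""
    even = True  # 0 is even; counting from one, 1 is named odd, 2 even, ...
    for _ in range(n):
        even = not even
    return even

def is_even_by_last_bit(n):
    """Look only at the last bit of n."""
    return (n & 1) == 0
```

Both are correct algorithms for the same problem; they differ only in efficiency.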

Example: TSP

- Input: A sequence of $n$ cities with the distances $d_{ij}$ between each pair of cities
- Output: A permutation (ordering) of the cities $\langle c_{1'}, \dots, c_{n'} \rangle$ that minimizes the expression $\sum_{j=1}^{n-1} d_{j',(j+1)'} + d_{n',1'}$
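The expression being minimized is the length of the closed tour: the sum of distances between consecutive cities in the ordering, plus the distance from the last city back to the first. A minimal sketch, assuming cities are numbered from 0 and distances are given as a matrix `dist` (both are assumptions made for this illustration):

```python
def tour_cost(order, dist):
    """Length of the closed tour that visits cities in `order`.

    dist[a][b] is the distance between cities a and b.
    """
    n = len(order)
    # (k + 1) % n wraps around, adding the final leg back to the first city.
    return sum(dist[order[k]][order[(k + 1) % n]] for k in range(n))
```

For example, with four cities at the corners of a unit-sided square (diagonals counted as 2 for simplicity), visiting the corners in order costs 4, while crossing the square costs 6.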

Example Circuit

Not Correct!

A better algorithm? What if we always connect the closest pair of points that do not result in a cycle or a three-way branch? We finish when we have a single chain.
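The closest-pair idea can be sketched as follows. This is one possible implementation, not necessarily the one used in lecture: points are assumed to be coordinate tuples, and the names `closest_pair_chain`, `degree`, and `component` are hypothetical. Because a pair that is rejected once can never become valid later, it is enough to sort all pairs by distance and scan them once.

```python
from itertools import combinations
from math import dist

def closest_pair_chain(points):
    """Greedy closest-pair heuristic; returns the chosen edges as (i, j) index pairs."""
    n = len(points)
    degree = [0] * n              # number of chosen edges touching each point
    component = list(range(n))    # label of the chain each point belongs to
    edges = []
    pairs = sorted(combinations(range(n), 2),
                   key=lambda e: dist(points[e[0]], points[e[1]]))
    for i, j in pairs:
        no_branch = degree[i] < 2 and degree[j] < 2   # no three-way branch
        no_cycle = component[i] != component[j]       # joins two different chains
        if no_branch and no_cycle:
            edges.append((i, j))
            degree[i] += 1
            degree[j] += 1
            old, new = component[j], component[i]
            component = [new if c == old else c for c in component]
            if len(edges) == n - 1:   # a single chain through all points
                break
    return edges
```

The result is a chain; joining its two endpoints closes the tour. This is a heuristic, so it carries no guarantee of producing the shortest tour.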

Now it works…

Still Not Correct!

A Correct Algorithm

We could try all possible orderings of the points, then select the ordering which minimizes the total length:

    d = ∞
    For each of the n! permutations P_i of the n points:
        if cost(P_i) < d then
            d = cost(P_i)
            P_min = P_i
    return P_min
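A direct Python sketch of this exhaustive search is given here, assuming distances are supplied as a matrix `dist` indexed by city number (an assumption made for this illustration). It is correct but examines all n! orderings, so it is only usable for very small n.

```python
from itertools import permutations
from math import inf

def brute_force_tsp(dist):
    """Try every ordering of the cities and return (best_order, best_cost)."""
    n = len(dist)
    best_cost, best_order = inf, None
    for order in permutations(range(n)):
        # Cost of the closed tour, including the leg back to the start.
        cost = sum(dist[order[k]][order[(k + 1) % n]] for k in range(n))
        if cost < best_cost:
            best_cost, best_order = cost, order
    return best_order, best_cost
```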

Areas of study of algorithms

- How to devise correct and efficient algorithms for solving a given problem
- How to express algorithms
- How to validate/verify algorithms
- How to analyze algorithms
- How to prove (or at least indicate) no correct, efficient algorithm exists for solving a given problem

How to devise algorithms

- Something of an art form
- Cannot be fully automated
- We will describe some general techniques and try to illustrate when each is appropriate
- Major focus of course

Expressing Algorithms

- Implementations
- Pseudo-code
- English

NOT our point of emphasis in this course. In this class, most algorithms will be expressed in plain English with pseudo-code used to clear up any ambiguities.

Verifying algorithm correctness

- Proving an algorithm generates correct output for all inputs
- One technique covered in textbook: loop invariants
- We will do some of this in the course, but it is not emphasized as much as algorithm design or algorithm analysis

Analyzing algorithms

- The “process” of determining how many resources (time, space) are used by a given algorithm
- We want to be able to make quantitative assessments about the value (goodness) of one algorithm compared to another
- We want to do this WITHOUT implementing and running an executable version of an algorithm
- Major point of emphasis of course

Question: How can we study the time complexity of an algorithm if we don’t run it or even choose a specific machine to measure it on?

The RAM Model

- Each "simple" operation (+, -, =, if, call) takes exactly 1 step.
- Loops and subroutine calls are not simple operations, but depend upon the size of the data and the contents of a subroutine. We do not want “sort” to be a single step operation.
- Each memory access takes exactly 1 step.

Basically, any basic operation that takes a small, fixed amount of time we assume to take just one step. We measure the run time of an algorithm by counting the number of steps it takes.

Why does this work? For the same reason that the “Flat Earth” model works: in our day-to-day lives we assume the Earth to be flat! Now, let's look at how we can use this.
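As a rough illustration of step counting, the sketch below sums a list while tallying RAM-model steps. The charging scheme is an assumption made for this example: one step for the initial assignment, and one memory access plus one addition per element. Loop bookkeeping is ignored.

```python
def sum_with_step_count(values):
    """Return (total, steps) under a simple, assumed RAM-model charging scheme."""
    steps = 0
    total = 0
    steps += 1          # one assignment
    for v in values:
        steps += 1      # one memory access to read the element
        total += v
        steps += 1      # one addition
    return total, steps
```

For a list of n elements this counts 2n + 1 steps, regardless of the machine or the language the loop is eventually written in.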

Measuring Complexity

- The worst case complexity of the algorithm is the function defined by the maximum number of steps taken on any instance of size n.
- The best case complexity of the algorithm is the function defined by the minimum number of steps taken on any instance of size n.
- The average-case complexity of the algorithm is the function defined by the average number of steps taken over all instances of size n.
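For small n, these three functions can be computed by brute force. The sketch below does this for insertion sort, under two assumptions made for illustration: the cost of a run is measured by the number of element shifts rather than a full step count, and the average is taken over all permutations of n distinct keys with equal weight.

```python
from itertools import permutations

def count_shifts(a):
    """Run insertion sort on a copy of `a` and return the number of element shifts."""
    a = list(a)
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]   # shift one element to the right
            j -= 1
            shifts += 1
        a[j + 1] = key
    return shifts

def case_complexities(n):
    """Return (best, worst, average) shift counts over all permutations of size n."""
    counts = [count_shifts(p) for p in permutations(range(n))]
    return min(counts), max(counts), sum(counts) / len(counts)
```

For n = 4 this gives a best case of 0 shifts (already sorted), a worst case of 6 (reverse sorted), and an average of 3.0.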

Best, Worst, and Average Case

Case Study: Insertion Sort
One way to sort an array of n elements is to start with an empty list, then successively insert new elements into their proper position, scanning from the “right end” back to the “left end” Loop invariant used to prove correctness How efficient is insertion sort?

Correctness Analysis

What is the loop invariant? That is, what is true before each execution of the loop?

    for i = 2 to n
        key = A[i]
        j = i - 1
        while j > 0 AND A[j] > key
            A[j+1] = A[j]
            j = j - 1
        A[j+1] = key
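One way to answer the question is the standard textbook invariant: before each iteration of the outer loop, the prefix to the left of position i holds the same elements it held originally, now in sorted order. The sketch below is a 0-based Python version of the pseudo-code that checks this invariant with an assertion. The assertion is for illustration only and would be removed in real use, since it makes each iteration much more expensive.

```python
def insertion_sort(a):
    """Sort the list `a` in place and return it (0-based indices)."""
    original = list(a)
    for i in range(1, len(a)):
        # Loop invariant: a[0:i] contains the original first i elements, sorted.
        assert a[:i] == sorted(original[:i])
        key = a[i]
        j = i - 1
        # Shift larger elements one position right to make room for key.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a
```

When the loop ends, the invariant applied to the whole array says the array is a sorted permutation of the input, which is exactly the sorting specification.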

Exact Analysis

Count the number of times each line will be executed:

| Statement | Num. exec. |
|---|---|
| for i = 2 to n | (n-1) + 1 |
| key = A[i] | n-1 |
| j = i - 1 | n-1 |
| while j > 0 AND A[j] > key | ? |
| A[j+1] = A[j] | ? |
| j = j - 1 | ? |
| A[j+1] = key | n-1 |
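The unknown counts depend on the input. Following the usual textbook treatment, let $t_i$ be the number of times the while test runs for a given value of i (the symbol $t_i$ is introduced here and does not appear on the slide). The while test then runs $\sum t_i$ times in total and each line of the loop body runs one time fewer per iteration. A sketch of the two extreme cases:

```latex
\begin{align*}
\text{while test:} &\quad \sum_{i=2}^{n} t_i \\
\text{each loop-body line:} &\quad \sum_{i=2}^{n} (t_i - 1) \\
\text{sorted input } (t_i = 1): &\quad \sum_{i=2}^{n} 1 = n - 1,
  \qquad \sum_{i=2}^{n} 0 = 0 \\
\text{reverse-sorted input } (t_i = i): &\quad \sum_{i=2}^{n} i = \frac{n(n+1)}{2} - 1,
  \qquad \sum_{i=2}^{n} (i - 1) = \frac{n(n-1)}{2}
\end{align*}
```

So the total step count is linear in n on sorted input and quadratic in n on reverse-sorted input.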

Best Case

Worst Case

Average Case

Average Case (Zoom Out)

Insertion sort trace on the array [3] [7] [4] [9] [4], using these line numbers:

    1: for i = 2 to n
    2:     key = A[i]
    3:     j = i - 1
    4:     while j > 0 AND A[j] > key
    5:         A[j+1] = A[j]
    6:         j = j - 1
    7:     A[j+1] = key

Trace:

    [3] [7] [4] [9] [4]
    1: i = 2   2: key = 7   3: j = 1   4: (while false!)   7: A[2] = 7
    [3] [7] [4] [9] [4]
    1: i = 3   2: key = 4   3: j = 2   4: (while true!)   5: A[3] = A[2]
    [3] [7] [7] [9] [4]
    6: j = 1   4: (while false!)   7: A[2] = 4
    [3] [4] [7] [9] [4]
    1: i = 4   2: key = 9   3: j = 3   4: (while false!)   7: A[4] = 9
    [3] [4] [7] [9] [4]
    1: i = 5   2: key = 4   3: j = 4   4: (while true!)   5: A[5] = A[4]
    [3] [4] [7] [9] [9]
    6: j = 3   4: (while true!)   5: A[4] = A[3]
    [3] [4] [7] [7] [9]
    6: j = 2   4: (while false!)   7: A[3] = 4
    [3] [4] [4] [7] [9]

Measuring Complexity

What is the best way to measure the time complexity of an algorithm?

- Best case run time?
- Worst case run time?
- Average case run time?

Which should we try to optimize?

Best Case Analysis How can we modify almost any algorithm to have a good best-case running time? Solve one instance of it efficiently.
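A sketch of this trick for sorting: check up front for one easy instance (here, input that is already sorted) and answer it immediately. The wrapper name is hypothetical, and Python's built-in `sorted` stands in for whatever general algorithm is being wrapped.

```python
def sort_with_fast_best_case(a):
    """Return a sorted copy of `a`, special-casing already-sorted input."""
    # One linear scan detects the easy instance.
    if all(a[k] <= a[k + 1] for k in range(len(a) - 1)):
        return list(a)
    # Every other instance falls through to the general algorithm.
    return sorted(a)
```

The best case is now linear no matter how slow the general algorithm is, which is why best-case running time says little about an algorithm's quality.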

Average case analysis

- Based on a probability distribution of input instances
- Distribution may not be accurate
- Often more complicated to compute than worst case analysis
- Often worst case analysis is a pretty good indicator for average case, though not always: quicksort, simplex method

Best, Worst, and Average Case

Worst case analysis

- Need to find and understand input that causes worst case performance
- Typically much simpler to compute as we do not need to “average” out performance
- Provides guarantee that is independent of any assumptions about the input
- The standard analysis performed