1 Algorithm Design and Analysis
Algorithm Design and Analysis (算法设计与分析)
Ye Deshi (叶德仕), College of Computer Science

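2 Course Information
Course number: 21190120
Class time and place: Fall term 2011; Tuesday (periods 9-10), Cao Guangbiao Building, West 104; Thursday (periods 7-8), Cao Guangbiao Building, West 104
Lab sessions: Thursday (periods 9-10), School of Software computer lab
Exam time: ?
Exam format: ?
Hours/credits: 4-2 hours per week / 2-1 credits
9/19/2018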

3 Office & Homepage
Office: Gongshang Building (工商楼), Room 215
My Homepage:
Course Home:

4 Examination
Grading Policies:
Class attendance: 10%
Homework (or quiz): 25%
Programming project: 20%
Final exam: 45%
New Grading Policies:
Class discussion, in-class quizzes, or reports: 20%
Homework: 25%
Programming project: 15%
Final exam: 40%

5 Algorithms Programming

6 What is the position of algorithms in CS?
1. Linguists: How shall we talk to the machines?
2. Algorithms: What is a good method for solving a problem fast on my computer?
3. Architects: Can I build a better computer?
4. Sculptors of Machine Intelligence: Can I write a computer program that can find its own solution?

7 Algorithms in Computer Science
Hardware
Algorithms
Compilers, Programming languages
Networking, Distributed systems, Fault tolerance, Security
Machine learning, Statistics, Information retrieval, AI
Bioinformatics

8 MIT Undergraduate Programs

9 What is an algorithm?
Oxford Dictionary: A set of rules that must be followed when solving a particular problem.
MathWorld: A specific set of instructions for carrying out a procedure or solving a problem, usually with the requirement that the procedure terminate at some point.
An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output.

10 2009 ACM Turing Award: Charles P. Thacker
For his pioneering design and realization of the Alto, the first modern personal computer, and in addition for his contributions to the Ethernet and the Tablet PC.

11 What will CS be?
"Computer Science: Past, Present, and Future", Ed Lazowska (Washington)
"Computer Science is the new Math", Christos H. Papadimitriou (Berkeley)

12 Algorithm
Problem definition
Objective (very important)
Evaluation
Methods

13 Algorithm evaluation
Quality: How far away is the result from the optimal solution?
Cost: running time, space needed
Our goal is to design algorithms with high quality but at low cost.

14 Reasonable times
Poly(|I|): time polynomial in |I|, where |I| is the size of the problem instance.
Input size: the size(x) of an instance x with rational data is the total number of bits needed for its binary representation.
Integer? Rational?

15 Time complexity
Logarithmic time: T(n) = O(log n)
Sub-linear time: T(n) = o(n)
Linear time: T(n) = O(n)
Linearithmic time: T(n) = O(n log n)
Quasilinear time: T(n) = O(n log^k n) for some constant k
Polynomial time: T(n) = O(n^k) for some constant k
Strongly polynomial time: the number of operations in the arithmetic model of computation is bounded by a polynomial in the number of integers in the input instance, and the space used by the algorithm is bounded by a polynomial in the size of the input.
Weakly polynomial time: polynomial but not strongly polynomial

16 Time complexity
Quasi-polynomial time: T(n) = 2^(O(log^c n)) for some fixed c
Sub-exponential time: T(n) = 2^(o(n))
Exponential time: T(n) is upper bounded by 2^(poly(n))

17 Hardness of problems
Easy
Polynomial (e.g. n^2, n log n, n^3, n^1000)
Quasi-polynomial (e.g. n^(log n), n^(log^2 n), c^(log^7 n))
Sub-exponential (e.g. 2^√n, 5^(n^0.98))
Exponential (e.g. 2^n, 8^n, n!, n^n)
Hard

18 Running time
Computer A is 100 times faster than computer B (10^9 versus 10^7 instructions per second).
Task: sort n numbers.
Computer A requires 2n^2 instructions.
Computer B requires 50 n lg n instructions.
n = 1,000,000
Computer A: 2(10^6)^2 / 10^9 = 2000 seconds
Computer B: 50 * 10^6 * lg 10^6 / 10^7 ≈ 100 seconds
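The arithmetic on this slide can be checked with a short script. This is only a sketch: the instruction counts 2n^2 and 50 n lg n and the speeds 10^9 and 10^7 instructions per second are read off the slide's own calculation, while the function name `seconds_needed` is an illustrative choice.

```python
import math


def seconds_needed(instructions: float, instructions_per_second: float) -> float:
    """Time to execute a given number of instructions at a given speed."""
    return instructions / instructions_per_second


n = 1_000_000

# Computer A: fast machine, quadratic algorithm (2 n^2 instructions).
time_a = seconds_needed(2 * n**2, 1e9)

# Computer B: slow machine, n lg n algorithm (50 n lg n instructions).
time_b = seconds_needed(50 * n * math.log2(n), 1e7)

print(f"Computer A: {time_a:.0f} s")
print(f"Computer B: {time_b:.1f} s")
```

The slower machine wins by a factor of about 20 because the growth rate of the algorithm dominates the constant-factor hardware advantage at this input size.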

19 Running time
[Table: running times for input sizes n = 10, 100, 1,000, 10,000 and larger under the growth rates n, n log n, n^2, n^3, 2^n and n!; the surviving entries range from "< 1 s" through "18 min", "12 days" and "31710 years" to "very long".]

20 Sorting
Input: a sequence of n numbers <a_1, a_2, ..., a_n>
Output: a permutation <a'_1, a'_2, ..., a'_n> such that a'_1 <= a'_2 <= ... <= a'_n
Example:
Input: 8 2 4 9 3 6
Output: 2 3 4 6 8 9

30 Example of insertion sort
8 2 4 9 3 6
2 8 4 9 3 6
2 4 8 9 3 6
2 4 8 9 3 6
2 3 4 8 9 6
2 3 4 6 8 9 done

31 Insertion sort
"pseudocode"
INSERTION-SORT (A, n)    ⊳ A[1 . . n]
    for j ← 2 to n
        do key ← A[j]
           i ← j - 1
           while i > 0 and A[i] > key
               do A[i+1] ← A[i]
                  i ← i - 1
           A[i+1] ← key
Figure: array A with index markers 1, i, j, n; the part before position j is sorted, and key is the element being inserted.
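The pseudocode translates almost line by line into Python. The version below is a minimal sketch, assuming 0-based indexing (the slide uses 1-based) and returning a sorted copy rather than sorting in place; the name `insertion_sort` is an illustrative choice.

```python
def insertion_sort(values):
    """Return a sorted copy of values using insertion sort."""
    a = list(values)
    for j in range(1, len(a)):
        key = a[j]          # element to insert into the sorted prefix a[0..j-1]
        i = j - 1
        # Shift every element larger than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key      # drop key into the gap that opened up
    return a


print(insertion_sort([8, 2, 4, 9, 3, 6]))
```

Run on the example input 8 2 4 9 3 6, it passes through the same intermediate arrays as the worked example.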

32 Analyzing algorithms
Need a computational model: the random-access machine (RAM) model.
Instructions are executed one after another. No concurrent operations.
Arithmetic: add, subtract, multiply, divide, remainder, floor, ceiling
Data movement: load, store, copy
Control: conditional/unconditional branch, subroutine call and return
Each of these instructions takes a constant amount of time.

33 Running time
Running time: The running time of an algorithm on a particular input is the number of primitive operations or "steps" executed. Each line consists only of primitive operations and takes constant time.
Input size: depending on the problem, the number of items or the total number of bits. Some inputs need more than one number: for a graph, the number of vertices and the number of edges.

34 Example
The input size of the sorting problem is n. The worst-case running time of insertion sort is O(n^2).

35 Running time
The running time depends on the input: an already sorted sequence is easier to sort.
Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones.
Generally, we seek upper bounds on the running time, because everybody likes a guarantee.
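To see how strongly the running time of insertion sort depends on the input, one can count the element shifts it performs. The sketch below assumes that the number of shifts is a fair proxy for the number of steps; `count_shifts` is a hypothetical helper name.

```python
def count_shifts(values):
    """Run insertion sort on a copy of values and count element shifts."""
    a = list(values)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]   # one shift of a larger element to the right
            shifts += 1
            i -= 1
        a[i + 1] = key
    return shifts


n = 10
already_sorted = list(range(n))
reverse_sorted = list(range(n, 0, -1))

print(count_shifts(already_sorted))   # no element ever moves
print(count_shifts(reverse_sorted))   # every pair is out of order
```

A reverse-sorted input forces n(n-1)/2 shifts, which is where the O(n^2) worst-case bound comes from, while a sorted input needs none.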

36 Map of Algorithm Design
[Diagram with the nodes: New problem; Off-line problem, On-line problem; Polynomial, NP-C problem; Exact Algorithm, Approximate Algorithm, Heuristic; Quality (approximation ratio); Improve cost (running time)]

37 Course Content
1. Mathematical foundations
1.1 Algorithm basics
1.2 Sums (SUMS), set operations (Sets)
1.3 Special numbers (Stirling numbers, Harmonic numbers, Eulerian numbers, et al.)
2. Basic algorithms
2.1 Divide-and-Conquer *
2.1.1 Mergesort *
2.1.2 Multiplication of natural numbers *
2.1.3 Matrix multiplication
2.1.4 Discrete Fourier transform and Fast Fourier transform

38 Course Content
2.2 Dynamic Programming
2.2.1 Knapsack problem
2.2.2 Longest increasing subsequence
2.2.3 Sequence alignment
2.2.4 Longest common subsequence
2.2.5 Matrix-chain multiplication
2.2.6 Max independent set in a tree

39 Course Content
2.3 Greedy algorithms
2.3.1 Interval scheduling
2.3.2 Set cover
2.3.3 Matroids
2.4 NP-completeness
2.4.1 The classes P and NP
2.4.2 NP-completeness and reducibility
2.4.3 NP-complete problems *

40 Course Content
2.5 Approximation algorithms
2.5.1 Vertex cover
2.5.2 Load balancing
2.5.3 Traveling salesman problem
2.5.4 Subset sum problem

41 Course Content
3. Applications of algorithms
3.1 Local Search
3.1.1 The Metropolis Algorithm and Simulated Annealing
3.1.2 Local Search to Hopfield Neural Networks (Nash Equilibria)
3.1.3 Maximum Cut Approximation via Local Search

42 Course Content
3.2 Graph Theory *
3.2.1 Fundamentals of graph theory
3.2.2 Linear Programming
Network flow, bipartite graphs, matching in complete graphs
3.3 Computational Geometry *
3.3.1 Basic concepts and properties of line segments
3.3.2 Segment intersection
3.3.3 Convex hull
3.3.4 The closest pair of points
3.3.5 Polygon triangulation

43 Course Content
3.4 Randomized Algorithms
3.4.1 Random variables and expectation
3.4.2 A Randomized MAX-3-SAT
3.4.3 Randomized Divide-and-Conquer
3.5 Online Algorithms
3.5.1 Online Skiing
3.5.2 Online Hiring
*: optional content

44 Course Content

45 Textbooks
Textbook: Introduction to Algorithms, Second Edition. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. The MIT Press, ISBN:
Recommended: Algorithm Design. Jon Kleinberg, Éva Tardos. Addison Wesley, ISBN:
Clifford Stein: Columbia University. Thomas H. Cormen: Dartmouth. Charles Leiserson, Ronald L. Rivest: MIT.
The Rolf Nevanlinna Prize has been awarded every four years since 1982. Rolf Nevanlinna Prize, 06

47 Reference Textbooks
Algorithms. S. Dasgupta, C.H. Papadimitriou, and U. V. Vazirani. May 2006.
Combinatorial Algorithms. Jeff Erickson. University of Illinois, Urbana-Champaign. Lecture Notes. Fall 2002.
Concrete Mathematics. Ronald L. Graham, Donald E. Knuth, Oren Patashnik. Addison-Wesley Publishing Company, ISBN:

48 Algorithms in Computer Science
P = NP? Can we solve a problem efficiently?
Tradeoff between the quality of the solution and the running time:
Solve a problem optimally, although it might take a long time
Solve a problem approximately in a short time

49 $1,000,000 problem
P = NP?
http://www.claymath.org/millennium/
Solved???!!!!

50 Algorithms in Computer Science
Selfish routing
Privacy preservation in databases
TSP
Ad auction

51 Perspective
Algorithms can be found everywhere.
They have been developed to ease our daily life: train/airplane timetable scheduling, routing.
We live in the age of information: text, numbers, images, video, audio.
Human Genome Project
Internet
Electronic commerce
Manufacturing

52 Selfish routing
Pigou's Example: a suburb s and a nearby train station t.
Assume that all drivers aim to minimize the driving time from s to t.
[Figure: two roads from s to t. The upper road has cost C(x) = 1; the lower road has cost C(x) = x, with x in [0, 1].]

53 Selfish routing
We have good reason to expect all traffic to follow the lower road.
Socially optimal? Send ½ of the traffic to the long, wide highway and ½ to the lower road.
Selfish behavior need not produce a socially optimal outcome.
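The gap between the selfish and the socially optimal outcome in Pigou's example can be computed directly. The sketch below assumes that x is the fraction of all traffic using the lower road and that the social cost is the average travel time over all drivers; the function name `average_travel_time` is an illustrative choice.

```python
def average_travel_time(lower_fraction: float) -> float:
    """Average s-t travel time when lower_fraction of drivers take the lower road.

    Upper road: C(x) = 1 regardless of load.
    Lower road: C(x) = x, where x is the fraction of traffic on it.
    """
    upper_fraction = 1.0 - lower_fraction
    return upper_fraction * 1.0 + lower_fraction * lower_fraction


selfish = average_travel_time(1.0)   # everyone takes the lower road
optimal = average_travel_time(0.5)   # half of the traffic on each road

# Search a grid of splits to confirm that the even split minimizes the average.
best_split = min((i / 100 for i in range(101)), key=average_travel_time)

print(selfish, optimal, best_split)
```

With everyone on the lower road the average time is 1, while the even split achieves 3/4, so selfish routing costs 4/3 of the optimum in this instance.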

54 Braess's Paradox
[Figure: network with edges s→v with C(x) = x, v→t with C(x) = 1, s→w with C(x) = 1, and w→t with C(x) = x.]

55 Braess's Paradox
[Figure: the same network with an added link v→w with C(x) = 0. Edges: s→v with C(x) = x, v→t with C(x) = 1, s→w with C(x) = 1, w→t with C(x) = x.]

56 Braess's Paradox
The paradox shows that the intuitively helpful action of adding a new zero-cost link can negatively impact all of the traffic!
With selfish routing, network improvements can degrade network performance.
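The effect can be verified numerically. The sketch below assumes the usual reading of the network: edges s→v and w→t cost C(x) = x, edges v→t and s→w cost C(x) = 1, the new link v→w costs 0, and x is the fraction of traffic on an edge. The function `path_times` and the path labels are illustrative names.

```python
def path_times(f_svt: float, f_swt: float, f_svwt: float = 0.0) -> dict:
    """Travel time of each s-t path, given the fraction of traffic on each path."""
    load_sv = f_svt + f_svwt   # traffic on edge s->v, which costs C(x) = x
    load_wt = f_swt + f_svwt   # traffic on edge w->t, which costs C(x) = x
    return {
        "s-v-t": load_sv + 1.0,
        "s-w-t": 1.0 + load_wt,
        "s-v-w-t": load_sv + 0.0 + load_wt,   # only usable once v->w exists
    }


# Without the link, traffic splits evenly and every driver needs 1.5.
before = path_times(0.5, 0.5)

# With the link, the new path looks cheaper (1.0 < 1.5), so drivers switch
# until everyone uses s-v-w-t. Then all three paths cost 2, and no single
# driver can improve by deviating.
after = path_times(0.0, 0.0, 1.0)

print(before)
print(after)
```

Every driver's travel time rises from 1.5 to 2 after the zero-cost link is added, which is exactly the degradation the slide describes.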

57 Link attack example
Re-identify the medical record of the governor of Massachusetts.
MA collects and publishes sanitized medical data for state employees (microdata): left circle.
Voter registration list of MA (publicly available data): right circle.
Looking for the governor's record, join the tables: 6 people had his birth date, 3 were men, 1 in his zipcode.
According to the 1990 US census data, 87% of the population are unique based on (zipcode, gender, dob).

58 Privacy in microdata
The role of attributes in microdata:
explicit identifiers are removed
quasi identifiers can be used to re-identify individuals
sensitive attributes (may not exist!) carry sensitive information
Name | Birthdate | Sex | Zipcode | Disease
Andre | 21/1/79 | male | 53715 | Flu
Beth | 10/1/81 | female | 55410 | Hepatitis
Carol | 1/10/44 | | 90210 | Bronchitis
Dan | 21/2/84 | | 02174 | Sprained Ankle
Ellen | 19/4/72 | | 02237 | AIDS
Column roles: Name is the identifier; Birthdate, Sex, and Zipcode are quasi identifiers; Disease is the sensitive attribute.

59 k-anonymity
k-anonymity: intuitively, hide each individual among k-1 others.
Each QI set of values should appear at least k times in the released microdata.
Linking cannot be performed with confidence > 1/k.
Sensitive attributes are not considered (going to revisit this...).
How to achieve this? Generalization and suppression.
Value perturbation is not considered (we should remain truthful to original values).
Privacy vs utility tradeoff: do not anonymize more than necessary.

60 Advertisement Auction
Dutch auction
Vickrey auction
Ad placement

61 k-anonymity example
Tools for anonymization:
generalization: publish more general values, i.e., given a domain hierarchy, roll-up
suppression: remove tuples, i.e., do not publish outliers; often the number of suppressed tuples is bounded
Original microdata:
Birthdate | Sex | Zipcode
21/1/79 | male | 53715
10/1/79 | female | 55410
1/10/44 | female | 90210
21/2/83 | | 02274
19/4/82 | | 02237
2-anonymous data:
Group | Birthdate | Sex | Zipcode
group 1 | */1/79 | person | 5****
suppressed | 1/10/44 | female | 90210
group 2 | */*/8* | male | 022**
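The k-anonymity condition (each quasi-identifier combination appears at least k times) is easy to check mechanically. The sketch below is an illustration under two assumptions: the quasi-identifier is restricted to (birthdate, zipcode) for brevity, and each released group stands for two tuples. The function name `is_k_anonymous` is hypothetical.

```python
from collections import Counter


def is_k_anonymous(rows, k):
    """True if every quasi-identifier tuple in rows occurs at least k times."""
    counts = Counter(rows)
    return all(count >= k for count in counts.values())


# Quasi-identifier tuples (birthdate, zipcode) from the original microdata.
original = [
    ("21/1/79", "53715"),
    ("10/1/79", "55410"),
    ("1/10/44", "90210"),
    ("21/2/83", "02274"),
    ("19/4/82", "02237"),
]

# Released data after generalization; the 1/10/44 tuple is suppressed.
# Assumption: each group covers two of the original tuples.
released = [
    ("*/1/79", "5****"),
    ("*/1/79", "5****"),
    ("*/*/8*", "022**"),
    ("*/*/8*", "022**"),
]

print(is_k_anonymous(original, 2))
print(is_k_anonymous(released, 2))
```

The original table fails the check because every tuple is unique, while the generalized release passes for k = 2 at the price of one suppressed tuple and coarser values.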

62 TSP
A trucking company has a central warehouse.
Each day, it loads up the truck at the warehouse and sends it around to several locations to make deliveries.
At the end of the day, the truck must end up back at the warehouse so that it is ready to be loaded for the next day.
To reduce costs, the company wants to select an order of delivery stops that yields the lowest overall distance traveled by the truck.
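For a handful of stops, the delivery problem can be solved by trying every order. The sketch below is a brute-force illustration, not a practical method: it assumes straight-line distances between points in the plane, and the coordinates and the name `shortest_tour` are made up for the example. Its running time grows like n!, which is why the problem is considered hard.

```python
import math
from itertools import permutations


def shortest_tour(warehouse, stops):
    """Try every delivery order and return (best_order, best_length)."""
    best_order, best_length = None, math.inf
    for order in permutations(stops):
        route = [warehouse, *order, warehouse]   # start and end at the warehouse
        length = sum(math.dist(a, b) for a, b in zip(route, route[1:]))
        if length < best_length:
            best_order, best_length = order, length
    return list(best_order), best_length


# Hypothetical instance: warehouse and three stops on the corners of a unit square.
order, length = shortest_tour((0, 0), [(0, 1), (1, 1), (1, 0)])
print(order, length)
```

Here the best tour simply walks around the square; any order that crosses the diagonal twice is longer.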

66 Pizza delivery
One can order a pizza for dinner by phone or via the internet.
We want hot, fresh, and tasty pizzas.
How should the shop deliver the pizzas upon receiving orders?
Immediately, or wait some minutes for further orders from nearby places?

67 The Ski problem
The Ski problem [Karp 92]: Every day she goes skiing, a skier must decide whether to rent or buy skis, unless or until she decides to buy them. The skier doesn't know how many days she will go skiing before she gets tired of this hobby. The cost to rent skis for a day is 1 unit, while the cost to buy the skis is B units. How can she save money?
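One well-known answer, which the slide does not state, is the break-even strategy: rent for the first B - 1 days and buy on day B. The sketch below compares its cost with the best cost achievable in hindsight; the function names and the choice B = 10 are assumptions made for illustration.

```python
def break_even_cost(days_skied: int, buy_price: int) -> int:
    """Cost of renting for buy_price - 1 days and buying on day buy_price."""
    if days_skied < buy_price:
        return days_skied                     # she quit before ever buying
    return (buy_price - 1) + buy_price        # rentals so far plus the purchase


def offline_optimum(days_skied: int, buy_price: int) -> int:
    """Best possible cost if the number of skiing days were known in advance."""
    return min(days_skied, buy_price)


B = 10
worst_ratio = max(
    break_even_cost(d, B) / offline_optimum(d, B) for d in range(1, 101)
)
print(worst_ratio)
```

Whatever the number of skiing days turns out to be, this strategy never pays more than (2B - 1)/B times the optimum, that is, less than twice as much.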

68 Lost cow problem
A short-sighted cow (or assume it's dark, or foggy, or ...) is standing in front of a fence and does not know in which direction the only gate in the fence might be. How can the cow find the gate without walking too great a detour?
How can two soldiers find each other when lost on a battlefield?
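A classic approach to this problem, not spelled out on the slide, is the doubling strategy: walk 1 unit one way, return, walk 2 units the other way, return, then 4, and so on. The sketch below simulates it under the assumptions that the cow starts at position 0, the gate sits at an unknown nonzero position on a line, and the first sweep goes to the right; `doubling_search_distance` is a hypothetical name.

```python
def doubling_search_distance(gate: float) -> float:
    """Total distance walked by the doubling strategy until the gate is reached."""
    if gate == 0:
        return 0.0
    total = 0.0
    reach = 1.0        # how far the current sweep goes from the start
    direction = 1      # +1 = right, -1 = left
    while True:
        if gate * direction > 0 and abs(gate) <= reach:
            return total + abs(gate)   # gate lies on this sweep
        total += 2 * reach             # walk out and come back
        reach *= 2
        direction = -direction


for gate in (1, -1, 3, -3):
    print(gate, doubling_search_distance(gate))
```

If the gate is at distance at least 1, the cow walks less than 9 times the direct distance, no matter on which side the gate is.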

69 Erdős project – shortest path
Paul Erdős ( ) has an Erdős number of zero.
If the lowest Erdős number of a coauthor is X, then the author's Erdős number is X + 1.
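Because every coauthorship counts as one step, Erdős numbers are shortest-path distances in the coauthor graph and can be computed with breadth-first search. The sketch below uses a made-up graph: the authors A, B, C and D and their links are purely hypothetical, as is the function name `erdos_numbers`.

```python
from collections import deque


def erdos_numbers(coauthors, source="Erdős"):
    """Breadth-first search from source; returns {author: Erdős number}."""
    numbers = {source: 0}
    queue = deque([source])
    while queue:
        author = queue.popleft()
        for other in coauthors.get(author, ()):
            if other not in numbers:            # first visit is the shortest path
                numbers[other] = numbers[author] + 1
                queue.append(other)
    return numbers


# Hypothetical coauthor graph (undirected, stored as adjacency lists).
graph = {
    "Erdős": ["A", "B"],
    "A": ["Erdős", "C"],
    "B": ["Erdős", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
print(erdos_numbers(graph))
```

Authors who are not connected to the source do not appear in the result, which corresponds to an undefined (infinite) Erdős number.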

70 Nevanlinna Prize winners
Columns: NAME, YEAR, COUNTRY, ERDŐS NUMBER
Robert Tarjan, USA
Leslie Valiant, Hungary/Gt Brtn, 3
Alexander Razborov, Russia
Avi Wigderson, Israel
Peter Shor, USA
Madhu Sudan, India/USA
Jon Kleinberg, USA

71 Other famous people
Albert Einstein, 1921, Physics, 2
Chen Ning Yang, Physics
Tsung-dao Lee, Physics
John F. Nash, Economics, 4
Edmund S. Phelps, Economics, 4
Shing-Tung Yau, China
Shiing Shen Chern, China
Alan Turing, computer science
John von Neumann, mathematics
David Hilbert, mathematics
Donald E. Knuth

72 Extensions of shortest path
On k-skip Shortest Paths (SIGMOD 2011)

73 History of Algorithms
The word algorithm comes from the name of the 9th century Persian mathematician Abu Abdullah Muhammad ibn Musa al-Khwarizmi, whose works introduced Arabic numerals and algebraic concepts. The word algorism originally referred only to the rules of performing arithmetic using Arabic numerals but evolved into algorithm by the 18th century. The word has now evolved to include all definite procedures for solving problems or performing tasks.

74 History (cont.)
The first case of an algorithm written for a computer was Ada Byron's notes on the analytical engine written in 1842, for which she is considered by many to be the world's first programmer. However, since Charles Babbage never completed his analytical engine, the algorithm was never implemented on it.
The problem of defining a "well-defined procedure" rigorously was largely solved with the description of the Turing machine, an abstract model of a computer formulated by Alan Turing, and the demonstration that every method yet found for describing "well-defined procedures" advanced by other mathematicians could be emulated on a Turing machine (a statement known as the Church-Turing thesis).

75 Why do you come here?

76 Requirements
Come to class (*)
Ask questions
Thinking: Why is it OK now? How about other methods?

77 Kinds of analyses
Worst-case (usually): T(n) = maximum time of the algorithm on any input of size n.
Average-case (sometimes): T(n) = expected time of the algorithm over all inputs of size n. Needs an assumption about the statistical distribution of inputs.
Best-case (bogus): Cheat with a slow algorithm that works fast on some input.
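The three kinds of analysis can be illustrated on a small example. Linear search is not discussed in these slides; it is used here only because its costs are easy to enumerate. The sketch assumes that the target is equally likely to be at any position (a uniform distribution) and counts comparisons as the measure of time; the name `comparisons` is an illustrative choice.

```python
def comparisons(items, target):
    """Number of comparisons linear search makes before finding target."""
    for count, item in enumerate(items, start=1):
        if item == target:
            return count
    return len(items)   # target absent: every item was compared


items = list(range(1, 11))                        # n = 10
costs = [comparisons(items, t) for t in items]    # one cost per possible target

best = min(costs)                    # target is the first item
worst = max(costs)                   # target is the last item
average = sum(costs) / len(costs)    # uniform distribution over positions

print(best, worst, average)
```

The average of (n + 1)/2 comparisons only holds under the stated uniform assumption, which is why average-case analysis always needs a distribution over inputs.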

78 Uniform distribution

79 Performance Measures for On-line Algorithms
Competitive ratio
Max/Max ratio
Smoothed Competitiveness

