
1 Operations Research: Applications and Algorithms
Chapter 1 An Introduction to Model Building to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston 1

2 1.1 An Introduction to Modeling
Operations research (management science) is a scientific approach to decision making that seeks to best design and operate a system, usually under conditions requiring the allocation of scarce resources. Term coined during WW II when leaders asked scientists and engineers to analyze several military problems. A system is an organization of interdependent components that work together to accomplish the goal of the system. 2

3 The scientific approach to decision making requires the use of one or more mathematical models.
A mathematical model is a mathematical representation of the actual situation that may be used to make better decisions or clarify the situation. 3

4 Example 1: A Modeling Example
Eli Daisy produces the drug Wozac in batches by heating a chemical mixture in a pressurized container. Each time a batch is produced, a different amount of Wozac is produced. The amount produced is the process yield (measured in pounds). Daisy is interested in understanding the factors that influence the yield of the Wozac production process. Describe a model-building process for this situation. 4

5 Example 1: Solution Daisy is first interested in determining the factors that influence the process yield. This is a descriptive model since it describes the behavior of the actual yield as a function of various factors. Daisy might determine that the following factors influence yield: Container volume in liters (V) Container pressure in milliliters (P) Container temperature in degrees centigrade (T) Chemical composition of the processed mixture 5

6 Ex. 1: Solution continued
Letting A, B, and C be the percentages of the mixture made up of chemicals A, B, and C, Daisy might find, for example, that: To determine this relationship, the yield of the process would have to be measured for many different combinations of the factors. Knowledge of this equation would enable Daisy to describe the yield of the production process once volume, pressure, temperature, and chemical composition were known. Yield = 300 + 0.8V + 0.01P + 0.06T + 0.001T*P – 0.01T2 – 0.001P2 + 11.7A + 9.4B + 16.4C + 19A*B + 11.4A*C – 9.6B*C 6

7 Prescriptive models “prescribe” behavior for an organization that will enable it to best meet its goals. Components of this model include: objective function(s) decision variables constraints An optimization model seeks to find values of the decision variables that optimize (maximize or minimize) an objective function among the set of all values for the decision variables that satisfy the given constraints. 7

8 Ex. 1: Solution continued
The Daisy example seeks to maximize the yield for the production process. In most models, there will be a function we wish to maximize or minimize. This function is called the model’s objective function. To maximize the process yield we need to find the values of V, P, T, A, B, and C that make the yield equation (below) as large as possible. Yield = 300 + 0.8V + 0.01P + 0.06T + 0.001T*P – 0.01T2 – 0.001P2 + 11.7A + 9.4B + 16.4C + 19A*B + 11.4A*C – 9.6B*C 8

9 In many situations, an organization may have more than one objective.
For example, in assigning students to the two high schools in Bloomington, Indiana, the Monroe County School Board stated that the assignment of students involve the following objectives: equalize the number of students at the two high schools minimize the average distance students travel to school have a diverse student body at both high schools 9

10 Variables whose values are under our control and influence system performance are called decision variables. In the Daisy example, V, P, T, A, B, and C are decision variables. In most situations, only certain values of the decision variables are possible. For example, certain volume, pressure, and temperature conditions might be unsafe. Also, A, B, and C must be nonnegative numbers that sum to one. These restrictions on the decision variable values are called constraints. 10

11 Ex. 1: Solution continued
Suppose the Daisy example has the following constraints: Volume must be between 1 and 5 liters Pressure must be between 200 and 400 milliliters Temperature must be between 100 and 200 degrees centigrade Mixture must be made up entirely of A, B, and C For the drug to perform properly, only half the mixture at most can be product A. 11

12 Ex. 1: Solution continued
Mathematically, these constraints can be expressed: V ≤ 5 V ≥ 1 P ≤ 400 P ≥ 200 T ≤ 200 T ≥ 100 A ≥ 0 B ≥ 0 C ≥ 0 A + B + C = 1.0 A ≤ 0.5 12

13 Ex. 1: Solution continued
The Complete Daisy Optimization Model Letting z represent the value of the objective function (the yield), the entire optimization model may be written as: Maximize z = 300 + 0.8V + 0.01P + 0.06T + 0.001T*P – 0.01T2 – 0.001P2 + 11.7A + 9.4B + 16.4C + 19A*B + 11.4A*C – 9.6B*C Subject to (s.t.) V ≤ 5 V ≥ 1 P ≤ 400 P ≥ 200 T ≤ 200 T ≥ 100 A + B + C = 1.0 A ≤ 0.5 A ≥ 0 B ≥ 0 C ≥ 0 13
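The model above can be expressed as a short program. This is a minimal sketch, not part of the original slides; the numeric coefficients of the yield equation are assumptions reconstructed from Winston's text, since the transcript dropped several of them.

```python
def yield_lbs(V, P, T, A, B, C):
    """Process yield in pounds; coefficients assumed, see the note above."""
    return (300 + 0.8 * V + 0.01 * P + 0.06 * T + 0.001 * T * P
            - 0.01 * T ** 2 - 0.001 * P ** 2
            + 11.7 * A + 9.4 * B + 16.4 * C
            + 19 * A * B + 11.4 * A * C - 9.6 * B * C)

def is_feasible(V, P, T, A, B, C, tol=1e-9):
    """Check every constraint of the Daisy optimization model."""
    return (1 <= V <= 5 and 200 <= P <= 400 and 100 <= T <= 200
            and A >= 0 and B >= 0 and C >= 0
            and A <= 0.5 and abs(A + B + C - 1.0) <= tol)

print(is_feasible(2, 300, 150, 0.4, 0.3, 0.3))  # True: a feasible point
print(round(yield_lbs(5, 200, 100, 0.294, 0.0, 0.706), 3))  # objective value there
```

Under these assumed coefficients, the second line evaluates the objective at the solution the slides attribute to LINGO.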

14 Ex. 1: Solution continued
Any specification of the decision variables that satisfies all the model’s constraints is said to be in the feasible region. For example, V = 2, P = 300, T = 150, A = 0.4, B = 0.3, and C = 0.3 is in the feasible region. An optimal solution to an optimization model is any point in the feasible region that optimizes (in this case maximizes) the objective function. Using LINGO, it can be determined that the optimal solution to this model is V = 5, P = 200, T = 100, A = 0.294, B = 0, C = 0.706, and z = 209.384 14

15 Ex. 1: Solution continued
Daisy problem formulation in LINGO 7.0 15

16 Ex. 1: Solution continued
Solution extract of Daisy example (shown without slack, surplus, or dual prices) using LINGO 7.0 16

17 A static model is one in which the decision variables do not involve sequences of decisions over multiple periods. A dynamic model is a model in which the decision variables do involve sequences of decisions over multiple periods. A linear model is one in which the decision variables appear only linearly (never multiplied together or raised to a power) in the objective function and in the constraints of an optimization model; otherwise the model is nonlinear. The Daisy example is a nonlinear model. In general, nonlinear models are much harder to solve. 17

18 Integer models are much harder to solve than noninteger models.
If one or more of the decision variables must be integer, then we say that an optimization model is an integer model. If all the decision variables are free to assume fractional values, then an optimization model is a noninteger model. The Daisy example is a noninteger example since volume, pressure, temperature, and percentage composition are all decision variables which may assume fractional values. Integer models are much harder to solve than noninteger models. 18

19 A deterministic model is a model in which for any value of the decision variables the value of the objective function and whether or not the constraints are satisfied is known with certainty. If this is not the case, then we have a stochastic model. If we view the Daisy example as a deterministic model, then we are making the assumption that for given values of V, P, T, A, B, and C the process yield will always be the same. Since this is unlikely, the objective function can be viewed as the average yield of the process for given decision variable values. 19

20 1.2 The Seven-Step Model-Building Process
Formulate the Problem Define the problem. Specify objectives. Determine parts of the organization to be studied. Observe the System Determine parameters affecting the problem. Collect data to estimate values of the parameters. Formulate a Mathematical Model of the Problem 20

21 Verify the Model and Use the Model for Prediction
Does the model yield results for values of decision variables not used to develop the model? What eventualities might cause the model to become invalid? Select a Suitable Alternative Given a model and a set of alternative solutions, determine which solution best meets the organization’s objectives. 21

22 Present the Results and Conclusion(s) of the Study to the Organization
Present the results to the decision maker(s). If necessary, prepare several alternative solutions and permit the organization to choose the one that best meets its needs. Any non-approval of the study’s recommendations may have stemmed from an incorrect problem definition or failure to involve the decision maker(s) from the start of the project. In such a case, return to step 1, 2, or 3. 22

23 Implement and Evaluate Recommendations
Assist in implementing the recommendations. Monitor and dynamically update the system as the environment and parameters change to ensure that recommendations enable the organization to meet its goals. 23

24 1.3 CITGO Petroleum Klingman et al. (1987) applied a variety of management-science techniques to CITGO Petroleum. Their work saved the company an estimated $70 million per year. Focus on two aspects of the CITGO team’s work: A mathematical model to optimize the operation of CITGO’s refineries. A mathematical model – the supply, distribution and marketing (SDM) system – used to develop an 11-week supply, distribution and marketing plan for the entire business. 24

25 Optimizing Refinery Operations
Step 1 (Formulate the Problem) Klingman et al. wanted to minimize the cost of operating CITGO’s refineries. Step 2 (Observe the System) The Lake Charles, Louisiana, refinery was closely observed in an attempt to estimate key relationships such as: How the cost of producing each of CITGO’s products (motor fuel, no. 2 fuel oil, turbine fuel, naphtha, and several blended motor fuels) depends upon the inputs used to produce each product. The amount of energy needed to produce each product (requiring the installation of a new metering system). 25

26 The yield associated with each input – output combination
The yield associated with each input–output combination. For example, if 1 gallon of crude oil would yield 0.52 gallons of motor fuel, then the yield would be 52%. To reduce maintenance costs, data was collected on parts inventories and equipment breakdowns. Obtaining accurate data required the installation of a new database management system and an integrated maintenance information system. Additionally, a process control system was installed to accurately monitor the inputs and resources used to manufacture each product. 26

27 Step 3 (Formulate a Mathematical Model of the Problem) Using linear programming (LP), a model was developed to optimize refinery operations. The model: Determines the cost-minimizing method for mixing or blending inputs to obtain the desired outputs. Contains constraints that ensure inputs are blended to produce the desired results. Includes constraints that ensure each output is of the desired quality. Ensures that plant capacities are not exceeded and allows for inventory of each end product. 27

28 Step 4 (Verify the Model and Use the Model for Prediction) To validate the model, inputs and outputs from the refinery were collected for one month. Given actual inputs used at the refinery during that month, actual outputs were compared to those predicted by the model. After extensive changes, the model’s predicted outputs were close to actual outputs. Step 5 (Select Suitable Alternative Solutions) Running the LP yielded a daily strategy for running the refinery. For instance, the model might say, produce 400,000 gallons of turbine fuel using 300,000 gallons of crude 1 and 200,000 of crude 2. 28

29 Step 6 (Present the Results and Conclusions) and Step 7 (Implement and Evaluate Recommendations) Once the data base and process control were in place, the model was used to guide day-to-day operations. CITGO estimated that the overall benefits of the refinery system exceeded $50 million annually. 29

30 The Supply Distribution Marketing (SDM) System
Step 1 (Formulate the Problem) CITGO wanted a mathematical model that could be used to make supply, distribution, and marketing decisions such as: Where should the crude oil be purchased? Where should products be sold? What price should be charged for products? How much of each product should be held in inventory? The goal was to maximize profitability associated with these decisions. 30

31 Step 2 (Observe the System) A database that kept track of sales, inventory, trades, and exchanges of all refined goods was installed. Also, regression analysis was used to develop forecasts of wholesale prices and wholesale demand for each CITGO product. Step 3 (Formulate a Mathematical Model of the Problem) and Step 5 (Select Suitable Alternative Solutions) A minimum-cost network flow model (MCNFM) is used to determine an 11-week supply, marketing, and distribution strategy. The model makes all the decisions discussed in Step 1. A typical model run involved 3,000 equations and 15,000 decision variables and required only 30 seconds on an IBM 4381. 31

32 Step 4 (Verify the Model and Use the Model for Prediction)
The forecasting models are continuously evaluated to ensure that they continue to give accurate forecasts. Step 6 (Present the Results and Conclusions) and Step 7 (Implement and Evaluate Recommendations) Implementing the SDM required several organizational changes. A new vice-president was appointed to coordinate the operation of the SDM and refinery LP model. The product supply and product scheduling departments were combined to improve communications and information flow. 32

33 1.4 San Francisco Police Department Scheduling
Taylor and Huxley (1989) developed a police patrol scheduling system (PPSS). All San Francisco (SF) police precincts use the PPSS to schedule their officers. It is estimated that the PPSS saves the SF police more than $5 million annually. Other cities have also adopted the PPSS. Following the seven-step model, here is a description of the PPSS. 33

34 Step 1 The SFPD wanted a method to schedule patrol officers in each precinct that would:
Quickly produce a schedule and graphically display it. Minimize the sum of hourly shortages and surpluses. Allow precinct captains to easily fine-tune a schedule to produce the optimal schedule. Step 2 The SFPD had a sophisticated computer-aided dispatch (CAD) system to keep track of calls for police help, police travel time, police response time, and so on. 34

35 Step 3 An LP model was formulated
Primary objective was to minimize the sum of hourly shortages and surpluses. The constraints in the PPSS model reflected the limited number of officers available and the relationship of the number of officers working each hour to the shortages and surpluses for that hour. PPSS would produce a schedule that would tell the precinct captains how many officers should start work at each possible shift time. The fact that this number was required to be an integer made it far more difficult to find an optimal solution. 35

36 Step 4 Before implementing the PPSS, the SFPD tested PPSS schedules against manually created schedules. The PPSS produced approximately a 50% reduction in both shortages and surpluses. Step 5 The PPSS can be used to minimize shortages and surpluses as well as to experiment with shift times and work rules. Officers had wanted to switch from a 5/8 schedule to a 4/10 schedule for years but were not permitted to because the city felt it would reduce productivity. The PPSS showed that the change would not reduce but actually improve productivity. 36

37 Step 6 and Step 7 It is estimated that the PPSS created an extra 170,000 productive hours per year, thereby saving the city of San Francisco $5.2 million per year. 96% of workers preferred PPSS-generated schedules. The PPSS enabled the SFPD to make strategic changes which made officers happier and increased productivity. Response times to calls improved 20% after the PPSS was adopted. Precinct captains could fine-tune the computer-generated schedule and obtain a new schedule in 1 minute. 37

38 1.5 GE Capital GE Capital provides credit card service to 50 million accounts with an outstanding balance of $12 billion. A GE Capital team led by Makuch et al. (1989) developed the PAYMENT system to reduce delinquent accounts and the cost of collecting from delinquent accounts. Step 1 The company’s goal was to reduce delinquent accounts and the cost of processing them. To do this, GE Capital needed a method of assigning scarce labor resources to delinquent accounts. 38

39 Step 2 The key to modeling delinquent accounts is the concept of a delinquency movement matrix (DMM). The DMM determines how the probability of the payment on a delinquent account during the current month depends on the following factors: size of unpaid balance action taken performance score 39

40 Step 3 GE developed a linear optimization model.
The objective function for the PAYMENT model was to maximize the expected delinquent accounts collected during the next six months. The decision variables represented the fraction of each type of delinquent account that should receive each collection action. The constraints in the PAYMENT model ensure that available resources are not overused. Constraints also relate the number of each type of delinquent account present in, say, January to the number of delinquent accounts of each type present in the next month. This dynamic aspect is crucial to the model’s success. 40

41 Step 4 PAYMENT was piloted on a $62 million portfolio for a single department store.
GE Capital managers came up with their own strategies for allocating resources (called CHAMPION). The store’s accounts were randomly assigned to the CHAMPION and PAYMENT strategies. PAYMENT used more live calls and more “no action”. PAYMENT showed a 5–7% improvement over CHAMPION. Step 5 For each type of account, PAYMENT tells the credit managers the fraction that should receive each type of contact. 41

42 Steps 6 and 7 PAYMENT was next applied to the 18 million accounts of the $4.6 billion Montgomery Ward store portfolio. Comparing accounts to the previous year, PAYMENT increased collections by $1.6 million per month. Overall, GE Capital estimated PAYMENT increased collections by $37 million per year and used fewer resources than previous strategies. 42

43 Operations Research: Applications and Algorithms
Chapter 2 Basic Linear Algebra to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 43

44 2.1 - Matrices and Vectors A matrix is any rectangular array of numbers. If a matrix A has m rows and n columns, it is referred to as an m x n matrix, and m x n is the order of the matrix. It is typically written as 44

45 A = B if and only if x = 1, y = 2, w = 3, and z = 4
The number in the ith row and jth column of A is called the ijth element of A and is written aij. Two matrices A = [aij] and B = [bij] are equal if and only if A and B are the same order and for all i and j, aij = bij. A = B if and only if x = 1, y = 2, w = 3, and z = 4 45

46 Rm will denote the set of all m-dimensional column vectors
Any matrix with only one column is a column vector. The number of rows in a column vector is the dimension of the column vector. Rm will denote the set of all m-dimensional column vectors. Any matrix with only one row (a 1 x n matrix) is a row vector. The dimension of a row vector is the number of columns. 46

47 Any m-dimensional vector (either row or column) in which all the elements equal zero is called a zero vector (written 0). Any m-dimensional vector corresponds to a directed line segment in the m-dimensional plane. For example, the two-dimensional vector u corresponds to the line segment joining the point (0,0) to the point (1,2) 47

48 The directed line segments (vectors u, v, w) are shown.
48

49 The scalar product of u and v is written:
The scalar product is the result of multiplying two vectors where one vector is a column vector and the other is a row vector. For the scalar product to be defined, the dimensions of both vectors must be the same. The scalar product of u and v is written: 49
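The worked formula did not survive the transcript, but the definition can be sketched directly; the vectors below are illustrative, not from the slides.

```python
def scalar_product(u, v):
    """Scalar product: sum of products of corresponding components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(ui * vi for ui, vi in zip(u, v))

print(scalar_product([1, 2, 3], [2, 1, 2]))  # 1*2 + 2*1 + 3*2 = 10
```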

50 The Scalar Multiple of a Matrix
Given any matrix A and any number c, the matrix cA is obtained from the matrix A by multiplying each element of A by c. Addition of Two Matrices Let A = [aij] and B = [bij] be two matrices with the same order. Then the matrix C = A + B is defined to be the m x n matrix whose ijth element is aij + bij. Thus, to obtain the sum of two matrices A and B, we add the corresponding elements of A and B. 50

51 This rule for matrix addition may be used to add vectors of the same dimension.
Vectors may be added using the parallelogram law or by using matrix addition. 51

52 Line segments can be defined using scalar multiplication and the addition of matrices.
If u=(1,2) and v=(2,1), the line segment joining u and v (called uv) is the set of all points in the m-dimensional plane corresponding to the vectors cu + (1-c)v, where 0 ≤ c ≤ 1. 52
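Using the slide's own u = (1, 2) and v = (2, 1), the segment definition can be sketched as:

```python
def segment_point(u, v, c):
    """The point c*u + (1 - c)*v on the line segment uv, for 0 <= c <= 1."""
    return tuple(c * ui + (1 - c) * vi for ui, vi in zip(u, v))

u, v = (1, 2), (2, 1)
print(segment_point(u, v, 1.0))  # (1.0, 2.0): endpoint u
print(segment_point(u, v, 0.5))  # (1.5, 1.5): midpoint of uv
print(segment_point(u, v, 0.0))  # (2.0, 1.0): endpoint v
```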

53 The Transpose of a Matrix
Given any m x n matrix A, the transpose of A (written AT) is the n x m matrix whose ijth element is the jith element of A. For any matrix A, (AT)T=A. 53

54 Matrix Multiplication
Given two matrices A and B, the matrix product of A and B (written AB) is defined if and only if the number of columns in A = the number of rows in B. The matrix product C = AB of A and B is the m x n matrix C whose ijth element is determined as follows: ijth element of C = scalar product of (row i of A) x (column j of B) 54
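A sketch of this definition in code (the matrices of Example 1 did not survive the transcript, so the ones below are illustrative):

```python
def mat_mult(A, B):
    """C = AB, where C[i][j] = scalar product of row i of A and column j of B."""
    m, r = len(A), len(A[0])
    if len(B) != r:
        raise ValueError("number of columns in A must equal number of rows in B")
    n = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(r)) for j in range(n)]
            for i in range(m)]

A = [[1, 1, 2],
     [2, 1, 3]]        # a 2 x 3 matrix
B = [[1, 2],
     [3, 1],
     [1, 0]]           # a 3 x 2 matrix
print(mat_mult(A, B))  # [[6, 3], [8, 5]], a 2 x 2 matrix
```

The same computation is what Excel's MMULT array formula (mentioned later in these slides) performs.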

55 Example 1: Matrix Multiplication
Compute C = AB for Solution Because A is a 2x3 matrix and B is a 3x2 matrix, AB is defined, and C will be a 2x2 matrix. 55

56 Some important properties of matrix multiplications are:
Many computations that commonly occur in operations research can be concisely expressed by using matrix multiplication. Some important properties of matrix multiplications are: Row i of AB = (row i of A)B Column j of AB = A(column j of B) 56

57 Use the EXCEL MMULT function to multiply the matrices:
Enter matrix A into cells B1:D2 and matrix B into cells B4:C6. Select the output range (B8:C9) into which the product will be computed. In the upper left-hand corner (B8) of this selected output range type the formula: = MMULT(B1:D2,B4:C6). Press Control-Shift-Enter 57

58 2.2 Matrices and Systems of Linear Equations
Consider a system of linear equations. The variables, or unknowns, are referred to as x1, x2, …, xn while the aij’s and bj’s are constants. A set of such equations is called a linear system of m equations in n variables. A solution to a linear set of m equations in n unknowns is a set of values for the unknowns that satisfies each of the system’s m equations. 58

59 Example 5: Solution to Linear System
Show that x1 = 1, x2 = 2 is a solution to the linear system and that x1 = 3, x2 = 1 is not a solution. 59

60 Example 5 Solution To show that x1 = 1, x2 = 2 is a solution, these values must be substituted in both equations, and both equations must be satisfied. The vector x1 = 3, x2 = 1 is not a solution, because these values fail to satisfy 2x1-x2=0 60
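The check can be sketched in code. Only the equation 2x1 - x2 = 0 survives in the transcript; the first equation of the example system is an assumption here (x1 + x2 = 3, which x1 = 1, x2 = 2 satisfies).

```python
def satisfies(equations, x):
    """Each equation is (coefficients, rhs); test whether x satisfies them all."""
    return all(abs(sum(c * xi for c, xi in zip(coeffs, x)) - rhs) < 1e-9
               for coeffs, rhs in equations)

system = [([1, 1], 3),    # x1 + x2 = 3  (assumed for illustration)
          ([2, -1], 0)]   # 2*x1 - x2 = 0  (from the slide)
print(satisfies(system, [1, 2]))  # True
print(satisfies(system, [3, 1]))  # False: 2*3 - 1 = 5, not 0
```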

61 This will sometimes be abbreviated A|b.
Matrices can simplify and compactly represent a system of linear equations. This system of linear equations may be written as Ax=b and is called its matrix representation. This will sometimes be abbreviated A|b. 61

62 2.3 – The Gauss-Jordan Method
Using the Gauss-Jordan method, it can be shown that any system of linear equations must satisfy one of the following three cases: Case 1 The system has no solution. Case 2 The system has a unique solution. Case 3 The system has an infinite number of solutions. The Gauss-Jordan method is important because many of the manipulations used in this method are used when solving linear programming problems by the simplex algorithm. 62

63 An elementary row operation (ERO) transforms a given matrix A into a new matrix A’ via one of the following operations: Type 1 ERO – A’ is obtained by multiplying any row of A by a nonzero scalar. Type 2 ERO – Multiply any row of A (say, row i) by a nonzero scalar c. For some j ≠ i, let row j of A’ = c*(row i of A) + row j of A and let the other rows of A’ be the same as the rows of A. Type 3 ERO – Interchange any two rows of A. 63

64 The steps to using the Gauss-Jordan method
The Gauss-Jordan method solves a linear equation system by utilizing EROs in a systematic fashion. The steps to using the Gauss-Jordan method The augmented matrix representation is A|b = 64

65 Step 1 Multiply row 1 by ½. This Type 1 ERO yields
Step 2 Replace row 2 of A1|b1 by -2(row 1 A1|b1) + row 2 of A1|b1. The result of this Type 2 ERO is A1|b1 = A2|b2 = 65

66 Step 3 Replace row 3 of A2|b2 by -1(row 1 of A2|b2) + row 3 of A2|b2 The result of this Type 2 ERO is The first column has now been transformed into A3|b3 = 66

67 Step 4 Multiply row 2 of A3|b3 by -1/3
Step 4 Multiply row 2 of A3|b3 by -1/3. The result of this Type 1 ERO is Step 5 Replace row 1 of A4|b4 by -1(row 2 of A4|b4) + row 1 of A4|b4. The result of this Type 2 ERO is A4|b4 = A5|b5 = 67

68 Step 6 Replace row 3 of A5|b5 by 2(row 2 of A5|b5) + row 3 of A5|b5
Step 6 Replace row 3 of A5|b5 by 2(row 2 of A5|b5) + row 3 of A5|b5. The result of this Type 2 ERO is Column 2 has now been transformed into A6|b6 = 68

69 Step 7 Multiply row 3 of A6|b6 by 6/5. The result of this Type 1 ERO is
Step 8 Replace row 1 of A7|b7 by -5/6(row 3 of A7|b7)+ row 1 of A7|b7. The result of this Type 2 ERO is A7|b7 = A8|b8 = 69

70 Step 9 Replace row 2 of A8|b8 by 1/3(row 3 of A8|b8)+ row 2 of A8|b8
Step 9 Replace row 2 of A8|b8 by 1/3(row 3 of A8|b8)+ row 2 of A8|b8. The result of this Type 2 ERO is A9|b9 represents the system of equations and thus the unique solution A9|b9 = 70
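Steps 1–9 above can be sketched as one routine that applies the three ERO types systematically. The augmented matrix itself did not survive the transcript; the system below was reconstructed from the step multipliers (row 1 scaled by 1/2, a pivot of -3 in row 2, and so on), so treat it as an assumption.

```python
def gauss_jordan(A, b):
    """Solve Ax = b by reducing the augmented matrix A|b with EROs."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    n = len(M)
    for col in range(n):
        # Type 3 ERO: swap in a row with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if abs(M[r][col]) > 1e-12)
        M[col], M[pivot] = M[pivot], M[col]
        # Type 1 ERO: scale the pivot row so the pivot entry becomes 1.
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Type 2 ERO: add multiples of the pivot row to zero out the column.
        for r in range(n):
            if r != col:
                f = -M[r][col]
                M[r] = [vr + f * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

A = [[2, 2, 1], [2, -1, 2], [1, -1, 2]]  # reconstructed system (assumed)
b = [9, 6, 5]
print([round(x, 6) for x in gauss_jordan(A, b)])  # [1.0, 2.0, 3.0]
```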

71 After the Gauss-Jordan method has been applied to any linear system, a variable that appears with a coefficient of 1 in a single equation and a coefficient of 0 in all other equations is called a basic variable (BV). Any variable that is not a basic variable is called a nonbasic variable (NBV). 71

72 Summary of Gauss-Jordan Method
72

73 2.4 Linear Independence and Linear Dependence
A linear combination of the vectors in V is any vector of the form c1v1 + c2v2 + … + ckvk, where c1, c2, …, ck are arbitrary scalars. A set V of m-dimensional vectors is linearly independent if the only linear combination of vectors in V that equals 0 is the trivial linear combination. A set V of m-dimensional vectors is linearly dependent if there is a nontrivial linear combination of vectors in V that adds up to 0. 73

74 Example 10: LD Set of Vectors
Show that V = {[ 1 , 2 ] , [ 2 , 4 ]} is a linearly dependent set of vectors. Solution Since 2([ 1 , 2 ]) – 1([ 2 , 4 ]) = (0, 0), there is a nontrivial linear combination with c1 = 2 and c2 = -1 that yields 0. Thus V is a linearly dependent set of vectors. 74

75 What does it mean for a set of vectors to be linearly dependent?
A set of vectors is linearly dependent only if some vector in V can be written as a nontrivial linear combination of other vectors in V. If the vectors in V are linearly dependent, the vectors in V are, in some way, NOT all “different” vectors. By “different” we mean that the direction specified by any vector in V cannot be expressed by adding together multiples of other vectors in V. For example, in two dimensions, two linearly dependent vectors lie on the same line. 75

76 Let A be any m x n matrix, and denote the rows of A by r1, r2, …, rm
Let A be any m x n matrix, and denote the rows of A by r1, r2, …, rm. Define R = {r1, r2, …, rm}. The rank of A is the number of vectors in the largest linearly independent subset of R. To find the rank of matrix A, apply the Gauss- Jordan method to matrix A. Let A’ be the final result. It can be shown that the rank of A’ = rank of A. The rank of A’ = the number of nonzero rows in A’. Therefore, the rank A = rank A’ = number of nonzero rows in A’. 76
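The rank procedure above can be sketched directly: row-reduce A, then count the nonzero rows (equivalently, the pivots found along the way).

```python
def rank(A, tol=1e-9):
    """Rank of A: number of nonzero rows after Gauss-Jordan reduction."""
    M = [row[:] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if abs(M[i][c]) > tol), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        p = M[r][c]
        M[r] = [v / p for v in M[r]]
        for i in range(rows):
            if i != r:
                f = M[i][c]
                M[i] = [vi - f * vr for vi, vr in zip(M[i], M[r])]
        r += 1
    return r

print(rank([[1, 2], [2, 4]]))  # 1: the rows of Example 10 are linearly dependent
print(rank([[1, 0], [0, 1]]))  # 2: these rows are linearly independent
```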

77 If the rank A = m, then V is a linearly independent set of vectors.
A method of determining whether a set of vectors V = {v1, v2, …, vm} is linearly dependent is to form a matrix A whose ith row is vi. If the rank A = m, then V is a linearly independent set of vectors. If the rank A < m, then V is a linearly dependent set of vectors. 77

78 2.5 – The Inverse of a Matrix
A square matrix is any matrix that has an equal number of rows and columns. The diagonal elements of a square matrix are those elements aij such that i=j. A square matrix for which all diagonal elements are equal to 1 and all non-diagonal elements are equal to 0 is called an identity matrix. An identity matrix is written as Im. 78

79 The Gauss-Jordan Method for inverting an m x m Matrix A is
For any given m x m matrix A, the m x m matrix B is the inverse of A if BA=AB=Im. Some square matrices do not have inverses. If there does exist an m x m matrix B that satisfies BA=AB=Im, then we write B=A-1. The Gauss-Jordan Method for inverting an m x m Matrix A is Step 1 Write down the m x 2m matrix A|Im Step 2 Use EROs to transform A|Im into Im|B. This will be possible only if rank A=m. If rank A<m, then A has no inverse. 79
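The two steps above can be sketched in code (the example matrix is illustrative, not from the slides):

```python
def inverse(A):
    """Invert the m x m matrix A by reducing A | I_m to I_m | B."""
    m = len(A)
    # Step 1: write down the m x 2m augmented matrix A | I_m.
    M = [list(map(float, row)) + [float(i == j) for j in range(m)]
         for i, row in enumerate(A)]
    # Step 2: use EROs to transform A | I_m into I_m | B.
    for col in range(m):
        pivot = next((r for r in range(col, m) if abs(M[r][col]) > 1e-12), None)
        if pivot is None:
            raise ValueError("rank A < m, so A has no inverse")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(m):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[m:] for row in M]

print(inverse([[2, 1], [5, 3]]))  # [[3.0, -1.0], [-5.0, 2.0]]
```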

80 Matrix inverses can be used to solve linear systems.
The Excel command =MINVERSE makes it easy to invert a matrix. Enter the matrix into cells B1:D3 and select the output range (B5:D7 was chosen) where you want A-1 computed. In the upper left-hand corner of the output range (cell B5), enter the formula = MINVERSE(B1:D3). Press Control-Shift-Enter and A-1 is computed in the output range. 80

81 2.6 – Determinants Associated with any square matrix A is a number called the determinant of A (often abbreviated as det A or |A|). If A is an m x m matrix, then for any values of i and j, the ijth minor of A (written Aij) is the (m - 1) x (m - 1) submatrix of A obtained by deleting row i and column j of A. Determinants can be used to invert square matrices and to solve linear equation systems. 81
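The minor and determinant definitions can be sketched recursively, using cofactor expansion along the first row (the matrices below are illustrative):

```python
def minor(A, i, j):
    """The ijth minor of A: delete row i and column j of A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant of a square matrix by cofactor expansion along row 0."""
    m = len(A)
    if m == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(m))

print(det([[1, 2], [3, 4]]))                    # 1*4 - 2*3 = -2
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))   # 2*3*4 = 24
```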

82 Operations Research: Applications and Algorithms
Chapter 3 Introduction to Linear Programming to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 82

83 3.1 What Is a Linear Programming Problem?
Linear Programming (LP) is a tool for solving optimization problems. Linear programming problems involve a number of important terms, introduced through the example that follows. 83

84 Example 1: Giapetto’s Woodcarving
Giapetto’s, Inc., manufactures wooden soldiers and trains. Each soldier built: Sells for $27 and uses $10 worth of raw materials. Increases Giapetto’s variable labor/overhead costs by $14. Requires 2 hours of finishing labor. Requires 1 hour of carpentry labor. Each train built: Sells for $21 and uses $9 worth of raw materials. Increases Giapetto’s variable labor/overhead costs by $10. Requires 1 hour of finishing labor. Requires 1 hour of carpentry labor. 84

85 Ex. 1 - continued Each week Giapetto can obtain:
All needed raw material. Only 100 finishing hours. Only 80 carpentry hours. Demand for the trains is unlimited. At most 40 soldiers are bought each week. Giapetto wants to maximize weekly profit (revenues – costs). Formulate a mathematical model of Giapetto’s situation that can be used to maximize weekly profit. 85

86 Example 1: Solution The Giapetto solution model incorporates the characteristics shared by all linear programming problems. Decision variables should completely describe the decisions to be made. x1 = number of soldiers produced each week x2 = number of trains produced each week The decision maker wants to maximize (usually revenue or profit) or minimize (usually costs) some function of the decision variables. This function to be maximized or minimized is called the objective function. For the Giapetto problem, fixed costs do not depend upon the values of x1 or x2. 86

87 Ex. 1 - Solution continued
Giapetto’s weekly profit can be expressed in terms of the decision variables x1 and x2: Weekly profit = weekly revenue – weekly raw material costs – weekly variable costs = 3x1 + 2x2 Thus, Giapetto’s objective is to choose x1 and x2 to maximize weekly profit. The variable z denotes the objective function value of any LP. Giapetto’s objective function is Maximize z = 3x1 + 2x2 The coefficient of an objective function variable is called an objective function coefficient. 87

88 Ex. 1 - Solution continued
As x1 and x2 increase, Giapetto’s objective function grows larger. For Giapetto, the values of x1 and x2 are limited by the following three restrictions (often called constraints): Each week, no more than 100 hours of finishing time may be used. (2 x1 + x2 ≤ 100) Each week, no more than 80 hours of carpentry time may be used. (x1 + x2 ≤ 80) Because of limited demand, at most 40 soldiers should be produced. (x1 ≤ 40) 88

89 The coefficients of the decision variables in the constraints are called the technological coefficients. The number on the right-hand side of each constraint is called the constraint’s right-hand side (or rhs). To complete the formulation of a linear programming problem, the following question must be answered for each decision variable. Can the decision variable only assume nonnegative values, or is the decision variable allowed to assume both positive and negative values? If the decision variable can assume only nonnegative values, the sign restriction xi ≥ 0 is added. If the variable can assume both positive and negative values, the decision variable xi is unrestricted in sign (often abbreviated urs). 89

90 Ex. 1 - Solution continued
For the Giapetto problem model, combining the sign restrictions x1 ≥ 0 and x2 ≥ 0 with the objective function and constraints yields the following optimization model: Max z = 3x1 + 2x2 (objective function) Subject to (s.t.) 2x1 + x2 ≤ 100 (finishing constraint) x1 + x2 ≤ 80 (carpentry constraint) x1 ≤ 40 (constraint on demand for soldiers) x1 ≥ 0 (sign restriction) x2 ≥ 0 (sign restriction) 90
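As a quick numerical check (not part of the original slides), the Giapetto model can be solved with SciPy's linprog. Since linprog minimizes by default, the objective is negated:

```python
from scipy.optimize import linprog

# Giapetto LP: max z = 3x1 + 2x2  ->  minimize -z (linprog minimizes)
c = [-3, -2]
A_ub = [[2, 1],   # finishing: 2x1 + x2 <= 100
        [1, 1],   # carpentry: x1 + x2 <= 80
        [1, 0]]   # demand:    x1 <= 40
b_ub = [100, 80, 40]

# linprog's default bounds (0, None) encode the sign restrictions x1, x2 >= 0.
res = linprog(c, A_ub=A_ub, b_ub=b_ub)
print(res.x, -res.fun)   # optimal production plan and weekly profit
```

The solver returns x1 = 20 soldiers, x2 = 60 trains, with weekly profit z = 180, the optimum found graphically later in the chapter.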

91 A function f(x1, x2, …, xn) of x1, x2, …, xn is a linear function if and only if for some set of constants, c1, c2, …, cn, f(x1, x2, …, xn) = c1x1 + c2x2 + … + cnxn. For any linear function f(x1, x2, …, xn) and any number b, the inequalities f(x1, x2, …, xn) ≤ b and f(x1, x2, …, xn) ≥ b are linear inequalities. 91

92 A linear programming problem (LP) is an optimization problem for which we do the following:
Attempt to maximize (or minimize) a linear function (called the objective function) of the decision variables. The values of the decision variables must satisfy a set of constraints. Each constraint must be a linear equation or inequality. A sign restriction is associated with each variable. For any variable xi, the sign restriction specifies either that xi must be nonnegative (xi ≥ 0) or that xi may be unrestricted in sign. 92

93 The fact that the objective function for an LP must be a linear function of the decision variables has two implications: The contribution to the objective function from each decision variable is proportional to the value of the decision variable. For example, the contribution to the objective function for 4 soldiers is exactly four times the contribution of 1 soldier. The contribution to the objective function from any variable is independent of the other decision variables. For example, no matter what the value of x2, the manufacture of x1 soldiers will always contribute 3x1 dollars to the objective function. 93

94 Analogously, the fact that each LP constraint must be a linear inequality or linear equation has two implications: The contribution of each variable to the left-hand side of each constraint is proportional to the value of the variable. For example, it takes exactly 3 times as many finishing hours to manufacture 3 soldiers as it does 1 soldier. The contribution of a variable to the left-hand side of each constraint is independent of the values of the other variables. For example, no matter what the value of x1, the manufacture of x2 trains uses x2 finishing hours and x2 carpentry hours. 94

95 The first item in each list is called the Proportionality Assumption of Linear Programming.
The second item in each list is called the Additivity Assumption of Linear Programming. The divisibility assumption requires that each decision variable be permitted to assume fractional values. 95

96 The certainty assumption is that each parameter (objective function coefficients, right-hand sides, and technological coefficients) is known with certainty. The feasible region of an LP is the set of all points satisfying all the LP’s constraints and sign restrictions. For a maximization problem, an optimal solution to an LP is a point in the feasible region with the largest objective function value. Similarly, for a minimization problem, an optimal solution is a point in the feasible region with the smallest objective function value. 96

97 3.2 – The Graphical Solution to a Two-Variable LP Problem
Any LP with only two variables can be solved graphically. The variables are always labeled x1 and x2, and the coordinate axes are the x1 and x2 axes. (Figure: the line 2x1 + 3x2 = 6 divides the plane into the half-plane satisfying 2x1 + 3x2 ≥ 6 and the half-plane satisfying 2x1 + 3x2 ≤ 6.) 97

98 Since the Giapetto LP has two variables, it may be solved graphically.
The feasible region is the set of all points satisfying the constraints 2x1 + x2 ≤ 100 (finishing constraint) x1 + x2 ≤ 80 (carpentry constraint) x1 ≤ 40 (demand constraint) x1 ≥ 0 (sign restriction) x2 ≥ 0 (sign restriction) 98

99 The set of points satisfying the Giapetto LP is bounded by the five-sided polygon DGFEH. Any point on or in the interior of this polygon (the shaded area) is in the feasible region. 99

100 Having identified the feasible region for the Giapetto LP, a search can begin for the optimal solution, which will be the point in the feasible region with the largest z-value. To find the optimal solution, graph a line on which all points have the same z-value. In a max problem, such a line is called an isoprofit line, while in a min problem it is called an isocost line. The figure shows the isoprofit lines for z = 60, z = 100, and z = 180. 100

101 A constraint is binding if the left-hand and right-hand side of the constraint are equal when the optimal values of the decision variables are substituted into the constraint. In the Giapetto LP, the finishing and carpentry constraints are binding. A constraint is nonbinding if the left-hand side and the right-hand side of the constraint are unequal when the optimal values of the decision variables are substituted into the constraint. In the Giapetto LP, the demand constraint for wooden soldiers is nonbinding since at the optimal solution (x1 = 20), x1 < 40. 101
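The binding/nonbinding classification can be verified by substituting the optimal point into each constraint. The value x2 = 60 below is derived from the binding finishing constraint (2·20 + x2 = 100); the check itself is a small illustrative sketch:

```python
# Classify Giapetto's constraints at the optimum (x1, x2) = (20, 60).
x1, x2 = 20, 60
constraints = {
    "finishing": (2 * x1 + x2, 100),   # lhs vs rhs
    "carpentry": (x1 + x2, 80),
    "demand":    (x1, 40),
}
for name, (lhs, rhs) in constraints.items():
    status = "binding" if lhs == rhs else "nonbinding"
    print(f"{name}: {lhs} vs {rhs} -> {status}")
```

Finishing and carpentry come out binding (lhs equals rhs) and the demand constraint nonbinding (20 < 40), matching the text.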

102 A set of points S is a convex set if the line segment joining any pair of points in S is wholly contained in S. For any convex set S, a point P in S is an extreme point if each line segment that lies completely in S and contains P has P as an endpoint of the line segment. Extreme points are sometimes called corner points, because if the set S is a polygon, the extreme points will be the vertices, or corners, of the polygon. The feasible region for the Giapetto LP is a convex set. 102

103 Example 2: Dorian Auto Dorian Auto manufactures luxury cars and trucks. The company believes that its most likely customers are high-income women and men. To reach these groups, Dorian Auto has embarked on an ambitious TV advertising campaign and will purchase 1-minute commercial spots on two types of programs: comedy shows and football games. 103

104 Ex. 2: continued Each comedy commercial is seen by 7 million high-income women and 2 million high-income men and costs $50,000. Each football commercial is seen by 2 million high-income women and 12 million high-income men and costs $100,000. Dorian Auto would like for commercials to be seen by at least 28 million high-income women and 24 million high-income men. Use LP to determine how Dorian Auto can meet its advertising requirements at minimum cost. 104

105 Example 2: Solution Dorian must decide how many comedy and football ads should be purchased, so the decision variables are x1 = number of 1-minute comedy ads x2 = number of 1-minute football ads Dorian wants to minimize total advertising cost. Dorian’s objective function (in thousands of dollars) is min z = 50x1 + 100x2 Dorian faces the following constraints: Commercials must reach at least 28 million high-income women. (7x1 + 2x2 ≥ 28) Commercials must reach at least 24 million high-income men. (2x1 + 12x2 ≥ 24) The sign restrictions are necessary, so x1, x2 ≥ 0. 105

106 Ex. 2 – Solution continued
Like the Giapetto LP, the Dorian LP has a convex feasible region. The feasible region for the Dorian problem, however, contains points for which the value of at least one variable can assume arbitrarily large values. Such a feasible region is called an unbounded feasible region. 106

107 Ex. 2 – Solution continued
To solve this LP graphically begin by graphing the feasible region. 107

108 Ex. 2 – Solution continued
Since Dorian wants to minimize total advertising costs, the optimal solution to the problem is the point in the feasible region with the smallest z-value. An isocost line with the smallest z-value passes through point E, the optimal solution, at x1 = 3.6 and x2 = 1.4. Because both the high-income women and high-income men constraints are satisfied with equality there, both constraints are binding. 108
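As with Giapetto, the Dorian LP can be checked numerically; a sketch with SciPy's linprog (costs in thousands of dollars, and the ≥ constraints negated into linprog's ≤ form):

```python
from scipy.optimize import linprog

# Dorian LP: min z = 50x1 + 100x2  (cost in $1000s)
c = [50, 100]
A_ub = [[-7, -2],    # women: 7x1 + 2x2 >= 28, flipped by negation
        [-2, -12]]   # men:   2x1 + 12x2 >= 24, flipped by negation
b_ub = [-28, -24]

res = linprog(c, A_ub=A_ub, b_ub=b_ub)
print(res.x, res.fun)
```

The solver returns x1 = 3.6 comedy ads and x2 = 1.4 football ads at a cost of z = 320 (i.e., $320,000), with both viewer constraints binding.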

109 Does the Dorian model meet the four assumptions of linear programming?
The Proportionality Assumption is violated because at a certain point advertising yields diminishing returns. Even though the Additivity Assumption was used in writing: (Total viewers) = (Comedy viewer ads) + (Football ad viewers) many of the same people might view both ads, double-counting of such people would occur thereby violating the assumption. The Divisibility Assumption is violated if only 1- minute commercials are available. Dorian is unable to purchase 3.6 comedy and 1.4 football commercials. The Certainty assumption is also violated because there is no way to know with certainty how many viewers are added by each type of commercial. 109

110 3.3 Special Cases The Giapetto and Dorian LPs each had a unique optimal solution. Some types of LPs do not have unique solutions. Some LPs have an infinite number of solutions (alternative or multiple optimal solutions). Some LPs have no feasible solutions (infeasible LPs). Some LPs are unbounded: There are points in the feasible region with arbitrarily large (in a maximization problem) z-values. The technique of goal programming is often used to choose among alternative optimal solutions. 110

111 It is possible for an LP’s feasible region to be empty, resulting in an infeasible LP.
Because the optimal solution to an LP is the best point in the feasible region, an infeasible LP has no optimal solution. For a max problem, an unbounded LP occurs if it is possible to find points in the feasible region with arbitrarily large z-values, which corresponds to a decision maker earning arbitrarily large revenues or profits. 111

112 For a minimization problem, an LP is unbounded if there are points in the feasible region with arbitrarily small z-values. Every LP with two variables must fall into one of the following four cases. The LP has a unique optimal solution. The LP has alternative or multiple optimal solutions: Two or more extreme points are optimal, and the LP will have an infinite number of optimal solutions. The LP is infeasible: The feasible region contains no points. The LP is unbounded: There are points in the feasible region with arbitrarily large z-values (max problem) or arbitrarily small z-values (min problem). 112

113 Example 6: Diet Problem My diet requires that all the food I eat come from one of the four “basic food groups”. At present, the following four foods are available for consumption: brownies, chocolate ice cream, cola, and pineapple cheesecake. Each brownie costs 50¢, each scoop of ice cream costs 20¢, each bottle of cola costs 30¢, and each piece of pineapple cheesecake costs 80¢. Each day, I must ingest at least 500 calories, 6 oz of chocolate, 10 oz of sugar, and 8 oz of fat. 113

114 Ex. 6 - continued The nutritional content per unit of each food is given. Formulate a linear programming model that can be used to satisfy my daily nutritional requirements at minimum cost. 114

115 Example 6: Solution As always, begin by determining the decisions that must be made by the decision maker: how much of each type of food should be eaten daily. Thus we define the decision variables: x1 = number of brownies eaten daily x2 = number of scoops of chocolate ice cream eaten daily x3 = bottles of cola drunk daily x4 = pieces of pineapple cheesecake eaten daily 115

116 Ex. 6 – Solution continued
My objective is to minimize the cost of my diet. The total cost of any diet may be determined from the following relation: (total cost of diet) = (cost of brownies) + (cost of ice cream) +(cost of cola) + (cost of cheesecake). The decision variables must satisfy the following four constraints: Daily calorie intake must be at least 500 calories. Daily chocolate intake must be at least 6 oz. Daily sugar intake must be at least 10 oz. Daily fat intake must be at least 8 oz. 116

117 Ex. 6 – Solution continued
Express each constraint in terms of the decision variables. The optimal solution to this LP is found by combining the objective function, constraints, and sign restrictions. The chocolate and sugar constraints are binding, but the calorie and fat constraints are nonbinding. A version of the diet problem with a more realistic list of foods and nutritional requirements was one of the first LPs to be solved by computer. 117
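These slides do not reproduce the nutrition table, so the coefficients below are taken from the textbook's version of the problem and should be treated as assumptions; with them, the LP confirms the binding pattern described above:

```python
from scipy.optimize import linprog

# Costs in cents: brownie, ice cream scoop, cola bottle, cheesecake piece.
# Nutrition data is NOT on these slides -- textbook values assumed here.
c = [50, 20, 30, 80]
A_ub = [[-400, -200, -150, -500],   # calories >= 500 (negated to <= form)
        [-3,   -2,    0,    0],     # chocolate >= 6 oz
        [-2,   -2,   -4,   -4],     # sugar >= 10 oz
        [-2,   -4,   -1,   -5]]     # fat >= 8 oz
b_ub = [-500, -6, -10, -8]

res = linprog(c, A_ub=A_ub, b_ub=b_ub)
print(res.x, res.fun)
```

Under these assumed figures the optimum is 3 scoops of ice cream and 1 bottle of cola at 90¢ per day; chocolate (6 oz) and sugar (10 oz) come out exactly at their minimums (binding), while calories and fat are oversatisfied (nonbinding), as the slide states.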

118 3.5 A Work-Scheduling Problem
Many applications of linear programming involve determining the minimum-cost method for satisfying workforce requirements. The results may differ when solved manually versus with LINDO, LINGO, or Excel Solver. One type of work-scheduling problem is a static scheduling problem. In reality, demands change over time, workers take vacations in the summer, and so on, so the post office does not face the same situation each week. This is a dynamic scheduling problem. 118

119 3.6 A Capital Budgeting Problem
Linear programming can be used to determine optimal financial decisions. The concept of net present value (NPV) can be used to compare the desirability of different investments. The total value of the cash flows for any investment is called the net present value, or NPV, of the investment. The NPV of an investment is the amount by which the investment will increase the firm’s value. 119

120 The Excel =NPV function makes this computation easy
Syntax is =NPV(r, range of cash flows). Often cash flows occur at irregular intervals, which makes it difficult to compute the NPV of these cash flows. The Excel XNPV function makes computing the NPV of irregularly timed cash flows a snap. In many capital budgeting problems, it is unreasonable to allow the xi to be fractions: each xi should be restricted to 0 or 1. Thus many capital budgeting problems violate the Divisibility Assumption. 120
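For readers outside Excel, the =NPV convention (the first cash flow arriving at the end of period 1) is a one-line discounting formula; a minimal sketch:

```python
def npv(rate, cashflows):
    """Discount cashflows, the t-th flow arriving at the end of period t+1
    (this mirrors Excel's =NPV convention, not a flow at time 0)."""
    return sum(cf / (1 + rate) ** (t + 1) for t, cf in enumerate(cashflows))

# Two $100 flows at 10%: 100/1.1 + 100/1.21
print(round(npv(0.10, [100, 100]), 2))   # -> 173.55
```

Note that a cash flow received today would need to be added undiscounted, which is exactly the quirk of Excel's =NPV that XNPV avoids by taking explicit dates.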

121 3.7 Short-Term Financial Planning
LP models can often be used to aid a firm in short- or long-term financial planning. 121

122 3.8 Blending Problems Situations in which various inputs must be blended in some desired proportion to produce goods for sale are often amenable to linear programming analysis. Such problems are called blending problems. Some examples of how linear programming has been used to solve blending problems: blending various types of crude oils to produce different types of gasoline and other outputs; blending various chemicals to produce other chemicals; blending various types of metal alloys to produce various types of steel. 122

123 3.9 Production Process Models
An LP model can be formulated to represent a simple production process. The key step is to determine how the outputs from a later stage of the process are related to the outputs from an earlier stage. A common mistake in these types of LP models involves the units of measurement in constraints. If there are doubts about a constraint, make sure that all terms in the constraint have the same units. 123

124 3.10 Using Linear Programming to Solve Multiperiod Decision Problems: An Inventory Model
Up until now all the LP examples have been static, or one-period, models. Linear programming can also be used to determine optimal decisions in multiperiod, or dynamic, models. Dynamic models arise when the decision maker makes decisions at more than one point in time. In a dynamic model, decisions made during the current period influence decisions made during future periods. 124

125 Example 14: Sailco Inventory
Sailco Corporation must determine how many sailboats should be produced during each of the next four quarters. Demand is known for the next four quarters. Sailco must meet demands on time. At the beginning of the first quarter Sailco has an inventory of 10 boats. Sailco must decide at the beginning of each quarter how many boats should be produced during that quarter. Sailco can produce up to 40 boats per quarter (at a cost of $400 per boat) with regular-time labor. 125

126 Ex. 14 - continued They can produce more if they have workers work overtime; these boats cost $450 each to produce. At the end of each quarter a carrying or holding cost of $20 per boat is incurred. Use linear programming to determine a production schedule to minimize the sum of production and inventory costs during the next four quarters. 126

127 Example 14: Solution For each quarter Sailco must determine the number of sailboats that should be produced by regular-time and by overtime labor. The decision variables are xt = number of sailboats produced by regular-time labor during quarter t yt = number of boats produced by overtime labor during quarter t it = number of sailboats on hand at the end of quarter t Sailco’s objective function is min z = 400x1 + 400x2 + 400x3 + 400x4 + 450y1 + 450y2 + 450y3 + 450y4 + 20i1 + 20i2 + 20i3 + 20i4 127

128 Ex. 14 – Solution continued
Two observations: Let dt be the demand during period t. Then it = it-1 + (xt + yt) – dt (t = 1, 2, 3, 4). Note that quarter t’s demand will be met on time if and only if (iff) it ≥ 0. The sign restriction it ≥ 0 ensures that each quarter’s demand will be met on time. Sailco’s constraints: The first four constraints ensure that each period’s regular-time production does not exceed 40. Inventory constraints for each time period are added. 128
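Putting the pieces together, the Sailco model can be solved numerically. The quarterly demands are not listed on these slides; the textbook's values of 40, 60, 75, and 25 boats are assumed below:

```python
import numpy as np
from scipy.optimize import linprog

# Variables, in order: x1..x4 (regular), y1..y4 (overtime), i1..i4 (inventory).
# Demands are NOT on these slides; 40, 60, 75, 25 are the textbook's values.
d = [40, 60, 75, 25]
c = [400] * 4 + [450] * 4 + [20] * 4

# Inventory balance: x_t + y_t + i_{t-1} - i_t = d_t, with i_0 = 10.
A_eq = np.zeros((4, 12))
b_eq = np.array(d, dtype=float)
b_eq[0] -= 10                       # fold in the initial inventory of 10 boats
for t in range(4):
    A_eq[t, t] = 1                  # x_t
    A_eq[t, 4 + t] = 1              # y_t
    A_eq[t, 8 + t] = -1             # -i_t
    if t > 0:
        A_eq[t, 8 + t - 1] = 1      # +i_{t-1}

bounds = [(0, 40)] * 4 + [(0, None)] * 8   # regular time capped at 40/quarter
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.fun)
```

Under these assumed demands the minimum total cost is $78,450: regular time runs at capacity in the first three quarters, overtime of 10 and 35 boats covers quarters 2 and 3, and only the initial 10 boats are ever carried in inventory.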

129 Ex. 14 – Solution continued
After the optimal solution is reached, some things to consider: The Proportionality Assumption is violated because production cost may not be a linear function of the quantity produced. Future demands may not be known with certainty, thus violating the Certainty Assumption. Sailco is required to meet all demands on time. Demands could possibly be backlogged (demand is met during a later period) and incur penalty costs. We also ignored the fact that quarter-to-quarter variations in the quantity produced may result in extra costs (production-smoothing costs). 129

130 Ex. 14 – Solution continued
In an inventory model with a finite horizon, the inventory left at the end of the last period should be assigned a salvage value that is indicative of the worth of the final period’s inventory. 130

131 3.11 Multiperiod Financial Models
Linear programming can be used to model multiperiod cash management problems. The key is to determine the relations of cash on hand during different periods. One real-world example is to use LP to determine a bond model that maximizes profit from bond purchases and sales subject to constraints that minimize the firm’s risk exposure. 131

132 3.12 Multiperiod Work Scheduling
We have seen that LP can be used to schedule employees in a static environment. LP can also be used to schedule employee training when a firm faces a demand that changes over time. 132

133 Operations Research: Applications and Algorithms
Chapter 4 The Simplex Algorithm and Goal Programming to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 133

134 4.1 How to Convert an LP to Standard Form
Before the simplex algorithm can be used to solve an LP, the LP must be converted into a problem where all the constraints are equations and all variables are nonnegative. An LP in this form is said to be in standard form. 134

135 Example 1: Leather Limited
Leather Limited manufactures two types of leather belts: the deluxe model and the regular model. Each type requires 1 square yard of leather. A regular belt requires 1 hour of skilled labor and a deluxe belt requires 2 hours of skilled labor. Each week, 40 square yards of leather and 60 hours of skilled labor are available. Each regular belt contributes $3 profit and each deluxe belt $4. Write an LP to maximize profit. 135

136 Example 1: Solution The decision variables are: The appropriate LP is:
x1 = number of deluxe belts produced weekly x2 = number of regular belts produced weekly The appropriate LP is: max z = 4x1 + 3x2 s.t. x1 + x2 ≤ 40 (leather constraint) 2x1 +x2 ≤ 60 (labor constraint) x1, x2 ≥ 0 136

137 To convert a ≤ constraint to an equality, define for each constraint a slack variable si (si = slack variable for the ith constraint). A slack variable is the amount of the resource unused in the ith constraint. If a constraint i of an LP is a ≤ constraint, convert it to an equality constraint by adding a slack variable si to the ith constraint and adding the sign restriction si ≥ 0. 137

138 To convert the ith ≥ constraint to an equality constraint, define an excess variable (sometimes called a surplus variable) ei (ei will always be the excess variable for the ith ≥ constraint). We define ei to be the amount by which the ith constraint is oversatisfied. Subtracting the excess variable ei from the ith constraint and adding the sign restriction ei ≥ 0 will convert the constraint. If an LP has both ≤ and ≥ constraints, apply the previous procedures to the individual constraints. 138
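The slack/excess recipe can be automated; a minimal sketch (the function name and interface below are illustrative, not from the text):

```python
import numpy as np

def to_standard_form(A, b, senses):
    """Convert constraints A x {<=, >=, =} b into equalities A' x' = b by
    appending a slack (+1) or excess (-1) column per inequality.
    `senses` is a list of '<=', '>=', or '='."""
    A = np.asarray(A, dtype=float)
    cols = [A]
    for i, s in enumerate(senses):
        if s == '=':
            continue                      # equality rows need no new variable
        col = np.zeros((A.shape[0], 1))
        col[i, 0] = 1.0 if s == '<=' else -1.0   # slack s_i or excess e_i
        cols.append(col)
    return np.hstack(cols), np.asarray(b, dtype=float)

# Leather Limited: x1 + x2 <= 40 and 2x1 + x2 <= 60 gain slacks s1, s2.
A_std, b_std = to_standard_form([[1, 1], [2, 1]], [40, 60], ['<=', '<='])
print(A_std)
```

For Leather Limited this yields the standard-form system x1 + x2 + s1 = 40, 2x1 + x2 + s2 = 60 with s1, s2 ≥ 0.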

139 4.2 Preview of the Simplex Algorithm
Consider a system Ax = b of m linear equations in n variables (where n ≥ m). A basic solution to Ax = b is obtained by setting n – m variables equal to 0 and solving for the remaining m variables. This assumes that setting the n – m variables equal to 0 yields a unique value for the remaining m variables, or equivalently, the columns for the remaining m variables are linearly independent. Any basic solution in which all variables are nonnegative is called a basic feasible solution (or bfs). 139

140 The following theorem explains why the concept of a basic feasible solution is of great importance in linear programming. Theorem 1 The feasible region for any linear programming problem is a convex set. Also, if an LP has an optimal solution, there must be an extreme point of the feasible region that is optimal. 140

141 4.3 Direction of Unboundedness
Consider an LP in standard form with feasible region S and constraints Ax = b and x ≥ 0. Assuming that our LP has n variables, 0 represents an n-dimensional column vector consisting of all 0’s. A nonzero vector d is a direction of unboundedness if for all x ∈ S and any c ≥ 0, x + cd ∈ S. 141

142 Any feasible x may be written as a convex combination of the LP’s bfs.
Theorem 2 Consider an LP in standard form, having bfs b1, b2, …, bk. Any point x in the LP’s feasible region may be written in the form x = d + σ1b1 + σ2b2 + … + σkbk, where d is 0 or a direction of unboundedness, σ1 + σ2 + … + σk = 1, and each σi ≥ 0. Any feasible x may be written as a convex combination of the LP’s bfs. 142

143 4.4 Why Does LP Have an Optimal bfs?
143 Theorem 3 If an LP has an optimal solution, then it has an optimal bfs. For any LP with m constraints, two basic feasible solutions are said to be adjacent if their sets of basic variables have m – 1 basic variables in common. The set of points satisfying a linear inequality in three (or any number of) dimensions is a half-space. The intersection of half-spaces is called a polyhedron. 143

144 4.5 The Simplex Algorithm The simplex algorithm can be used to solve LPs in which the goal is to maximize the objective function. Step 1 Convert the LP to standard form Step 2 Obtain a bfs (if possible) from the standard form Step 3 Determine whether the current bfs is optimal Step 4 If the current bfs is not optimal, determine which nonbasic variable should become a basic variable and which basic variable should become a nonbasic variable to find a bfs with a better objective function value. Step 5 Use EROs to find a new bfs with a better objective function value. Go back to Step 3. 144

145 Example 2: Dakota Furniture Company
The Dakota Furniture Company manufactures desks, tables, and chairs. The manufacture of each type of furniture requires lumber and two types of skilled labor: finishing and carpentry. The amount of each resource needed to make each type of furniture is given in the table below.
Resource         Desk        Table       Chair
Lumber           8 board ft  6 board ft  1 board ft
Finishing hours  4 hours     2 hours     1.5 hours
Carpentry hours  2 hours     1.5 hours   0.5 hours
145

146 Ex. 2 - continued At present, 48 board feet of lumber, 20 finishing hours, and 8 carpentry hours are available. A desk sells for $60, a table for $30, and a chair for $20. Dakota believes that demand for desks and chairs is unlimited, but at most 5 tables can be sold. Since the available resources have already been purchased, Dakota wants to maximize total revenue. 146

147 Example 2: Solution Define: The LP is: max z = 60x1 + 30x2 + 20x3
x1 = number of desks produced x2 = number of tables produced x3 = number of chairs produced The LP is: max z = 60x1 + 30x2 + 20x3 s.t. 8x1 + 6x2 + x3 ≤ 48 (lumber constraint) 4x1 + 2x2 + 1.5x3 ≤ 20 (finishing constraint) 2x1 + 1.5x2 + 0.5x3 ≤ 8 (carpentry constraint) x2 ≤ 5 (table demand constraint) x1, x2, x3 ≥ 0 147

148 Ex. 2: Solution continued
Begin the simplex algorithm by converting the constraints of the LP to the standard form. Then convert the LP’s objective function to the row 0 format. To put the constraints in standard form, simply add slack variables s1, s2, s3, s4, respectively to the four constraints. Label the constraints row 1, row 2, row 3, row 4, and add the sign restrictions si ≥ 0. 148

149 Ex. 2: Solution continued
Putting rows 1-4 together with row 0 and the sign restrictions yields these equations and basic variables.
Canonical Form 0 Basic Variable
Row 0 z – 60x1 – 30x2 – 20x3 = 0 z = 0
Row 1 8x1 + 6x2 + x3 + s1 = 48 s1 = 48
Row 2 4x1 + 2x2 + 1.5x3 + s2 = 20 s2 = 20
Row 3 2x1 + 1.5x2 + 0.5x3 + s3 = 8 s3 = 8
Row 4 x2 + s4 = 5 s4 = 5
149

150 Ex. 2: Solution continued
To perform the simplex algorithm, we need a basic (although not necessarily nonnegative) variable for row 0. Since z appears in row 0 with a coefficient of 1, and z does not appear in any other row, we use z as the basic variable. With this convention, the basic feasible solution for our initial canonical form has BV = {z, s1, s2, s3, s4} NBV = {x1, x2, x3}. For this initial bfs, z=0, s1=48, s2= 20, s3=8, s4=5, x1=x2=x3=0. A slack variable can be used as a basic variable if the rhs of the constraint is nonnegative. 150

151 Ex. 2: Solution continued
Once we have obtained a bfs, we need to determine whether it is optimal. To do this, we try to determine if there is any way z can be increased by increasing some nonbasic variable from its current value of zero while holding all other nonbasic variables at their current values of zero. Solving for z in row 0 yields: z = 60x1 + 30x2 + 20x3 151

152 Ex. 2: Solution continued
For each nonbasic variable, we can use this equation to determine if increasing a nonbasic variable will increase z. Increasing any of the nonbasic variables will cause an increase in z. However increasing x1 causes the greatest rate of increase in z. If x1 increases from its current value of zero, it will have to become a basic variable. For this reason, x1 is called the entering variable. Observe x1 has the most negative coefficient in row 0. 152

153 Ex. 2: Solution continued
Next choose the entering variable to be the nonbasic variable with the most negative coefficient in row 0. The goal is to make x1 as large as possible, but as it increases, the current basic variables will change value. Thus, increasing x1 may cause a basic variable to become negative. This means that to keep all the basic variables nonnegative, the largest we can make x1 is min {6, 5, 4} = 4. 153

154 Ex. 2: Solution continued
Rule for determining how large an entering variable can be: when entering a variable into the basis, compute the ratio (rhs of constraint)/(coefficient of entering variable in constraint) for every constraint in which the entering variable has a positive coefficient. The constraint with the smallest ratio is called the winner of the ratio test. The smallest ratio is the largest value of the entering variable that will keep all the current basic variables nonnegative. 154

155 Ex. 2: Solution continued
Always make the entering variable a basic variable in a row that wins the ratio test. To make x1 a basic variable in row 3, we use elementary row operations (EROs) to make x1 have a coefficient of 1 in row 3 and a coefficient of 0 in all other rows. This procedure is called pivoting on row 3; and row 3 is called the pivot row. The final result is that x1 replaces s3 as the basic variable for row 3. The term in the pivot row that involves the entering basic variable is called the pivot term. 155

156 Ex. 2: Solution continued
The result is
Canonical Form 1 Basic Variable
Row 0 z + 15x2 – 5x3 + 30s3 = 240 z = 240
Row 1 – x3 + s1 – 4s3 = 16 s1 = 16
Row 2 – x2 + 0.5x3 + s2 – 2s3 = 4 s2 = 4
Row 3 x1 + 0.75x2 + 0.25x3 + 0.5s3 = 4 x1 = 4
Row 4 x2 + s4 = 5 s4 = 5
The procedure used to go from one bfs to a better adjacent bfs is called an iteration of the simplex algorithm. 156

157 Ex. 2: Solution continued
Attempt to determine if the current bfs is optimal. Rearranging row 0 from Canonical Form 1 and solving for z yields z = 240 – 15x2 + 5x3 – 30s3 The current bfs is NOT optimal because increasing x3 to 1 (while holding the other nonbasic variables at zero) will increase the value of z. Making either x2 or s3 basic will cause the value of z to decrease. 157

158 Ex. 2: Solution continued
Recall that the rule for determining the entering variable is to choose the variable with the most negative row 0 coefficient. Since x3 is the only variable with a negative coefficient, x3 should enter the basis. Performing the ratio test using x3 as the entering variable yields the following results (holding the other NBVs at zero): From row 1, s1 ≥ 0 for all values of x3, since s1 = 16 + x3. From row 2, s2 ≥ 0 if x3 ≤ 4 / 0.5 = 8. From row 3, x1 ≥ 0 if x3 ≤ 4 / 0.25 = 16. From row 4, s4 ≥ 0 for all values of x3, since s4 = 5. 158

159 Ex. 2: Solution continued
This means that to keep all the basic variables nonnegative, the largest we can make x3 is min {8, 16} = 8. So row 2 becomes the pivot row. The result of using EROs to make x3 a basic variable in row 2:
Canonical Form 2 Basic Variable
Row 0 z + 5x2 + 10s2 + 10s3 = 280 z = 280
Row 1 – 2x2 + s1 + 2s2 – 8s3 = 24 s1 = 24
Row 2 – 2x2 + x3 + 2s2 – 4s3 = 8 x3 = 8
Row 3 x1 + 1.25x2 – 0.5s2 + 1.5s3 = 2 x1 = 2
Row 4 x2 + s4 = 5 s4 = 5
159

160 Ex. 2: Solution continued
In Canonical Form 2 BV = {z, s1, x3, x1, s4} NBV = {s3, s2, x2} Yielding the bfs z = 280, s1 = 24, x3 = 8, x1 = 2, s4 = 5, s2 = s3 = x2 = 0 Solving for z in row 0 yields z = 280 – 5x2 – 10s2 – 10s3 We can see that increasing x2, s2, or s3 (while holding the other NBVs at zero) will not cause the value of z to increase. 160

161 Ex. 2: Solution continued
The solution at the end of iteration 2 is therefore optimal. The following rule can be applied to determine whether a canonical form’s bfs is optimal. A canonical form is optimal (for a max problem) if each nonbasic variable has a nonnegative coefficient in the canonical form’s row 0. 161
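The five steps of Section 4.5 can be sketched as a small tableau routine. This is an illustrative toy (Dantzig's entering rule, no degeneracy or cycling safeguards), not the book's code; applied to the Dakota LP it reproduces the optimal z = 280 found above:

```python
import numpy as np

def simplex_max(A, b, c):
    """Maximize c @ x subject to A x <= b, x >= 0 (assumes b >= 0).
    Bare-bones tableau simplex: no anti-cycling safeguards."""
    m, n = A.shape
    # Tableau: row 0 holds [-c | 0 | z]; rows 1..m hold [A | I | b].
    T = np.zeros((m + 1, n + m + 1))
    T[0, :n] = -c
    T[1:, :n] = A
    T[1:, n:n + m] = np.eye(m)
    T[1:, -1] = b
    basis = list(range(n, n + m))          # the slack variables start basic

    while True:
        j = int(np.argmin(T[0, :-1]))      # entering: most negative row-0 coeff
        if T[0, j] >= -1e-9:
            break                          # row 0 nonnegative -> optimal
        ratios = np.array([T[i, -1] / T[i, j] if T[i, j] > 1e-9 else np.inf
                           for i in range(1, m + 1)])
        if np.all(np.isinf(ratios)):
            raise ValueError("LP is unbounded")
        r = 1 + int(np.argmin(ratios))     # pivot row: winner of the ratio test
        T[r] /= T[r, j]                    # EROs: coefficient 1 in pivot row...
        for i in range(m + 1):
            if i != r:
                T[i] -= T[i, j] * T[r]     # ...and 0 in every other row
        basis[r - 1] = j

    x = np.zeros(n + m)
    x[basis] = T[1:, -1]
    return T[0, -1], x[:n]

# Dakota LP: lumber, finishing, carpentry, table-demand constraints.
A = np.array([[8, 6, 1.0], [4, 2, 1.5], [2, 1.5, 0.5], [0, 1, 0.0]])
z, x = simplex_max(A, np.array([48., 20., 8., 5.]), np.array([60., 30., 20.]))
print(z, x)
```

The routine performs exactly the two pivots traced in the slides (x1 enters on row 3, then x3 enters on row 2) and stops at z = 280 with x1 = 2 desks, x2 = 0 tables, x3 = 8 chairs.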

162 4.6 Using the Simplex Algorithm to solve Minimization Problems
Two different ways the simplex method can be used to solve minimization problems. Method 1 - Consider this LP: min z = 2x1 – 3x2 s.t. x1 + x2 ≤ 4 x1 – x2 ≤ 6 x1, x2 ≥ 0 The optimal solution is the point (x1, x2) that makes z = 2x1 – 3x2 the smallest. Equivalently, this point makes max -z = -2x1 + 3x2 the largest. 162

163 This means we can find the optimal solution to the LP by solving this modified LP.
In solving this modified LP, use –z as the basic variable in row 0. After adding slack variables s1 and s2 to the constraints, the initial tableau is obtained. max -z = -2x1 + 3x2 s.t. x1 + x2 ≤ 4 x1 – x2 ≤ 6 x1, x2 ≥ 0 163

164 The optimal solution (to the max problem) is −z = 12, x2 = 4, s2 = 10, x1 = s1 = 0.
Then the optimal solution to the min problem is z = −12, x2 = 4, s2 = 10, x1 = s1 = 0. The min LP objective function confirms this: z = 2x1 − 3x2 = 2(0) − 3(4) = −12. 164

165 In summary, multiply the objective function of the min problem by −1 and solve the problem as a maximization problem with objective function −z. The optimal solution to the max problem gives the optimal solution to the min problem. Remember that (optimal z-value for the min problem) = −(optimal z-value for the max problem). 165
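The negation relationship can be checked numerically. Below is a minimal sketch using scipy.optimize.linprog (an assumption of this note; the slides solve the tableau by hand). Since linprog itself minimizes, the min LP is passed directly and the equivalent max problem's optimal value is simply the negation:

```python
from scipy.optimize import linprog

# min z = 2x1 - 3x2  s.t.  x1 + x2 <= 4,  x1 - x2 <= 6,  x1, x2 >= 0
# (nonnegativity is linprog's default bound on every variable)
res = linprog(c=[2, -3], A_ub=[[1, 1], [1, -1]], b_ub=[4, 6], method="highs")

min_z = res.fun          # optimal value of the min problem
max_minus_z = -min_z     # optimal value of the equivalent max problem
print(min_z, res.x)      # -12.0 at x1 = 0, x2 = 4, matching the slides
```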

166 Method 2 - A simple modification of the simplex algorithm can be used to solve min problems directly. Modify step 3 of the simplex algorithm If all nonbasic variables (NBV) in row 0 have nonpositive coefficients, the current bfs is optimal. If any nonbasic variable has a positive coefficient, choose the variable with the “most positive” coefficient in row 0 as the entering variable. This modification of the simplex algorithm works because increasing a nonbasic variable (NBV) with a positive coefficient in row 0 will decrease z. 166

167 4.7 Alternate Optimal Solutions
For some LPs, more than one extreme point is optimal. If an LP has more than one optimal solution, it has multiple optimal solutions or alternative optimal solutions. If there is no nonbasic variable (NBV) with a zero coefficient in row 0 of the optimal tableau, the LP has a unique optimal solution. Even if there is a nonbasic variable with a zero coefficient in row 0 of the optimal tableau, it is possible that the LP may not have alternative optimal solutions. 167

168 4.8 – Unbounded LPs For some LPs, there exist points in the feasible region for which z assumes arbitrarily large (in max problems) or arbitrarily small (in min problems) values. When this occurs, we say the LP is unbounded. An unbounded LP occurs in a max problem if there is a nonbasic variable with a negative coefficient in row 0 and there is no constraint that limits how large we can make this NBV. Specifically, an unbounded LP for a max problem occurs when a variable with a negative coefficient in row 0 has a nonpositive coefficient in each constraint. 168
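Solvers report unboundedness with a status code. A toy example (not from the slides), sketched with scipy.optimize.linprog:

```python
from scipy.optimize import linprog

# Toy max problem: max z = x1 subject to x1 - x2 <= 1, x1, x2 >= 0.
# Since x2 can grow along with x1, nothing caps x1 and the LP is
# unbounded. linprog minimizes, so we pass the negated objective.
res = linprog(c=[-1, 0], A_ub=[[1, -1]], b_ub=[1], method="highs")
print(res.status)   # 3 is scipy's status code for an unbounded problem
```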

169 4.9 The LINDO Computer Package
LINDO (Linear Interactive and Discrete Optimizer) is a user-friendly computer package that can be used to solve linear, integer, and quadratic programming problems. LINDO assumes all variables are nonnegative, so nonnegativity constraints are not necessary. To be consistent with LINDO, the objective function row is labeled row 1 and the constraints rows 2-5. View the LINDO Help file for syntax questions. 169

170 The first statement in a LINDO model is always the objective function.
To enter this problem into LINDO make sure the screen contains a blank window (work area) with the word “untitled” at the top of the work area. If necessary, a new window can be opened by selecting New from the file menu or by clicking on the File button. The first statement in a LINDO model is always the objective function. MAX or MIN directs LINDO to solve a maximization or minimization problem. 170

171 Enter the constraints by typing SUBJECT TO (or st) on the next line
Enter the constraints by typing SUBJECT TO (or st) on the next line. Then enter the constraints. LINDO assumes the < symbol means ≤ and the > symbol means ≥. There is no need to insert the asterisk symbol between coefficients and variables to indicate multiplication. 171

172 To save the file for future use, select SAVE from the File menu and, when asked, replace the * symbol with the name of your choice. The file will be saved with the name you select with the suffix .LTX. DO NOT type over the .LTX suffix. Save using any available path. To solve the model, select the SOLVE command or click the red bull's-eye button. When asked if you want to do a range (sensitivity) analysis, choose no. When the solution is complete, a display showing the status of the Solve command will be present. View the displayed information and select CLOSE. 172

173 The data input window will now be overlaying the Reports Window.
Click anywhere in the Reports Window to bring it to the foreground. 173

174 The LINDO output shows:
LINDO found the optimum solution after 2 iterations (pivots). The optimal z-value = 280. VALUE gives the value of the variable in the optimal LP solution. Thus the optimal solution calls for production of 2 desks, 0 tables, and 8 chairs. SLACK OR SURPLUS gives (by constraint row) the value of slack or excess in the optimal solution. REDUCED COST gives the coefficient in row 0 of the optimal tableau (in a max problem). The reduced cost of each basic variable must be 0. 174

175 4.10 Matrix Generators, LINGO and Scaling LPs
Most actual applications of LP use a matrix generator to simplify inputting the LP. A matrix generator allows the user to input the relevant parameters that determine the LP's objective function and constraints; it then generates the LP formulation from that information. The package LINGO is an example of a sophisticated matrix generator. LINGO is an optimization modeling language that enables users to create many constraints or objective function terms by typing one line. 175

176 An LP package may have trouble solving LPs in which there are nonzero coefficients that are either very small or very large in absolute value. In these cases, LINDO will respond with a message that the LP is poorly scaled. 176

177 4.11 Degeneracy and the Convergence of the Simplex Algorithm
Theoretically, the simplex algorithm can fail to find an optimal solution to an LP. However, LPs arising from actual applications seldom exhibit this unpleasant behavior. The following are facts: If (value of entering variable in new bfs) > 0, then (z-value for new bfs) > (z-value for current bfs). If (value of entering variable in new bfs) = 0, then (z-value for new bfs) = (z-value for current bfs). Assume that in each of the LP’s basic feasible solutions all basic variables are positive. 177

178 An LP with this property is a nondegenerate LP.
An LP is degenerate if it has at least one bfs in which a basic variable is equal to zero. Any bfs that has at least one basic variable equal to zero is a degenerate bfs. When the same bfs is encountered twice it is called cycling. If cycling occurs, then we will loop, or cycle, forever among a set of basic feasible solutions and never get to an optimal solution. 178

179 Some degenerate LPs have a special structure that enables us to solve them by methods other than the simplex. 179

180 4.12 The Big M Method The simplex algorithm requires a starting bfs. In previous problems, a starting bfs was found by using the slack variables as the basic variables. If an LP has ≥ or = constraints, however, a starting bfs may not be readily apparent. In such a case, the Big M method may be used to solve the problem. 180

181 Example 4: Bevco Bevco manufactures an orange-flavored soft drink called Oranj by combining orange soda and orange juice. Each ounce of orange soda contains 0.5 oz of sugar and 1 mg of vitamin C. Each ounce of orange juice contains 0.25 oz of sugar and 3 mg of vitamin C. It costs Bevco 2¢ to produce an ounce of orange soda and 3¢ to produce an ounce of orange juice. Bevco’s marketing department has decided that each 10-oz bottle of Oranj must contain at least 20 mg of vitamin C and at most 4 oz of sugar. Use linear programming to determine how Bevco can meet the marketing department’s requirements at minimum cost. 181

182 Example 4: Solution Letting x1 = number of ounces of orange soda in a bottle of Oranj and x2 = number of ounces of orange juice in a bottle of Oranj, the LP is:
min z = 2x1 + 3x2
s.t. 0.5x1 + 0.25x2 ≤ 4 (sugar constraint)
x1 + 3x2 ≥ 20 (vitamin C constraint)
x1 + x2 = 10 (10 oz in 1 bottle of Oranj)
x1, x2 ≥ 0 182

183 Ex. 4 – Solution continued
The LP in standard form has z and s1 available as basic variables for rows 0 and 1, but row 2 has no basic variable that satisfies the sign restriction, and row 3 has no readily apparent basic variable. In order to use the simplex method, a bfs is needed. To remedy the predicament, artificial variables are created. The variables will be labeled according to the row in which they are used.
Row 0: z − 2x1 − 3x2 = 0
Row 1: 0.5x1 + 0.25x2 + s1 = 4
Row 2: x1 + 3x2 − e2 = 20
Row 3: x1 + x2 = 10 183

184 Ex. 4 – Solution continued
Row 0: z − 2x1 − 3x2 = 0
Row 1: 0.5x1 + 0.25x2 + s1 = 4
Row 2: x1 + 3x2 − e2 + a2 = 20
Row 3: x1 + x2 + a3 = 10
In the optimal solution, all artificial variables must be set equal to zero. To accomplish this in a min LP, a term Mai is added to the objective function for each artificial variable ai. For a max LP, the term −Mai is added to the objective function for each ai. M represents some very large number. 184

185 Ex. 4 – Solution continued
The modified Bevco LP in standard form then becomes: Modifying the objective function this way makes it extremely costly for an artificial variable to be positive. The optimal solution should force a2 = a3 = 0.
Row 0: z − 2x1 − 3x2 − Ma2 − Ma3 = 0
Row 1: 0.5x1 + 0.25x2 + s1 = 4
Row 2: x1 + 3x2 − e2 + a2 = 20
Row 3: x1 + x2 + a3 = 10 185

186 Description of the Big M Method
Modify the constraints so that the rhs of each constraint is nonnegative. Identify each constraint that is now an = or ≥ constraint. Convert each inequality constraint to standard form (add a slack variable for ≤ constraints, add an excess variable for ≥ constraints). For each ≥ or = constraint, add artificial variables. Add sign restriction ai ≥ 0. Let M denote a very large positive number. Add (for each artificial variable) Mai to min problem objective functions or -Mai to max problem objective functions. 186

187 Since each artificial variable will be in the starting basis, all artificial variables must be eliminated from row 0 before beginning the simplex. Remembering M represents a very large number, solve the transformed problem by the simplex. If all artificial variables in the optimal solution equal zero, the solution is optimal. If any artificial variables are positive in the optimal solution, the problem is infeasible. 187
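The Big M construction for the Bevco LP of Example 4 can be sketched numerically with a finite stand-in for M and a generic solver (scipy.optimize.linprog here, an assumption of this sketch; the slides pivot by hand):

```python
from scipy.optimize import linprog

M = 1e6                      # finite stand-in for the "very large" M
# Variable order: x1, x2, s1, e2, a2, a3
c = [2, 3, 0, 0, M, M]       # min 2x1 + 3x2 + M*a2 + M*a3
A_eq = [
    [0.5, 0.25, 1,  0, 0, 0],   # sugar:     0.5x1 + 0.25x2 + s1 = 4
    [1.0, 3.0,  0, -1, 1, 0],   # vitamin C: x1 + 3x2 - e2 + a2  = 20
    [1.0, 1.0,  0,  0, 0, 1],   # bottle:    x1 + x2 + a3        = 10
]
res = linprog(c=c, A_eq=A_eq, b_eq=[4, 20, 10], method="highs")
x1, x2, s1, e2, a2, a3 = res.x
print(round(res.fun, 6), round(x1, 6), round(x2, 6))   # 25.0 5.0 5.0
print(a2, a3)   # both driven to 0, as the method requires
```

The huge M coefficient makes any positive artificial variable prohibitively expensive, so the solver drives a2 = a3 = 0 and recovers the feasible optimum z = 25.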

188 Ex. 4 – Solution continued
Since this is a min problem, the variable with the most positive coefficient in row 0 should enter the basis. The ratio test indicates that x2 should enter the basis in row 2, which means the artificial variable a2 will leave the basis. Use EROs to eliminate x2 from row 0, row 1, and row 3. The next ratio test indicates that x1 should enter the basis in row 3, which means that a3 will leave the basis. 188

189 Ex. 4 – Solution continued
The optimal solution for Bevco is z=25, x1=x2 = 5, s1=1/4, e2=0. This means that Bevco can hold the cost of producing a 10-oz. bottle of Oranj to $.25 by mixing 5 oz of orange soda and 5 oz of orange juice. 189

190 If any artificial variable is positive in the optimal Big M tableau, then the original LP has no feasible solution. 190

191 4.13 The Two-Phase Simplex Method
When a basic feasible solution is not readily available, the two-phase simplex method may be used as an alternative to the Big M method. In this method, artificial variables are added to the same constraints as in the Big M method; then a bfs to the original LP is found by solving the Phase I LP. In the Phase I LP, the objective is to minimize w′, the sum of all the artificial variables. At its completion, reintroduce the original LP's objective function and determine the optimal solution to the original LP (Phase II). 191

192 Because each ai ≥ 0, solving the Phase I LP will result in one of the following three cases:
The optimal value of w’ is greater than zero. In this case, the original LP has no feasible solution. The optimal value of w’ is equal to zero, and no artificial variables are in the optimal Phase I basis. In this case, drop all columns in the optimal Phase I tableau that correspond to the artificial variables. Now combine the original objective function with the constraints from the optimal Phase I tableau. This yields the Phase II LP. The optimal value of w’ is equal to zero and at least one artificial variable is in the optimal Phase I basis. 192
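Phase I for the Bevco constraints of Example 4 can be sketched with a generic solver (scipy.optimize.linprog, an assumption of this sketch): minimize w′, the sum of the artificial variables, with the original objective ignored entirely.

```python
from scipy.optimize import linprog

# Variable order: x1, x2, s1, e2, a2, a3; Phase I objective w' = a2 + a3.
c_phase1 = [0, 0, 0, 0, 1, 1]
A_eq = [
    [0.5, 0.25, 1,  0, 0, 0],   # sugar constraint
    [1.0, 3.0,  0, -1, 1, 0],   # vitamin C constraint
    [1.0, 1.0,  0,  0, 0, 1],   # bottle constraint
]
res = linprog(c=c_phase1, A_eq=A_eq, b_eq=[4, 20, 10], method="highs")
print(res.fun)   # optimal w' = 0.0: the original LP is feasible (case 2 or 3)
```

An optimal w′ of zero means the artificials can all be dropped (or handled per case 3) and Phase II can proceed with the original objective.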

193 4.14 Unrestricted-in-Sign Variables
If some variables are allowed to be unrestricted in sign (urs), the ratio test and therefore the simplex algorithm are no longer valid. An LP with an unrestricted-in-sign variable can be transformed into an LP in which all variables are required to be nonnegative. For each urs variable xi, define two new variables xi′ and xi′′. Then substitute xi′ − xi′′ for xi in each constraint and in the objective function. Also add the sign restrictions xi′ ≥ 0 and xi′′ ≥ 0. 193

194 The effect of this substitution is to express xi as the difference of the two nonnegative variables xi′ and xi′′. No basic feasible solution can have both xi′ > 0 and xi′′ > 0. For any basic feasible solution, each urs variable xi must fall into one of the following three cases: xi′ > 0 and xi′′ = 0 xi′ = 0 and xi′′ > 0 xi′ = xi′′ = 0 194
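The substitution can be sketched on a toy LP (not from the slides) where the urs variable is forced negative, using scipy.optimize.linprog (an assumption of this sketch):

```python
from scipy.optimize import linprog

# Toy LP with x1 unrestricted in sign:
#   min z = x2   s.t.   x1 + x2 >= 2,   x1 <= -1,   x2 >= 0
# Substituting x1 = x1p - x1m (both >= 0) gives variables [x1p, x1m, x2].
c = [0, 0, 1]
A_ub = [
    [-1,  1, -1],   # -(x1p - x1m) - x2 <= -2, i.e. x1 + x2 >= 2
    [ 1, -1,  0],   #  (x1p - x1m)      <= -1, i.e. x1 <= -1
]
res = linprog(c=c, A_ub=A_ub, b_ub=[-2, -1], method="highs")
x1 = res.x[0] - res.x[1]   # recover the urs variable as the difference
print(round(x1, 6), round(res.fun, 6))   # -1.0 3.0
```

The optimum has x1 = −1 even though every solver variable is nonnegative, which is exactly the point of the xi′ − xi′′ device.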

195 4.15 Karmarkar’s Method for Solving LPs
Karmarkar’s method requires that the LP be placed in the following form and that: The point (1/n, 1/n, …, 1/n) be feasible for this LP. The optimal z-value for the LP equals 0. Karmarkar’s method has been shown to be a polynomial-time algorithm. This implies that there exist positive numbers a and b such that any LP of size n can be solved by Karmarkar’s method in a time of at most an^b.
min z = cx
s.t. Kx = 0
x1 + x2 + ∙∙∙ + xn = 1
xi ≥ 0 195

196 4.16 Multiattribute Decision Making in the Absence of Uncertainty: Goal Programming
In some situations, a decision maker may face multiple objectives, and there may be no point in an LP’s feasible region satisfying all objectives. In such a case, how can the decision maker choose a satisfactory decision? Goal programming is one technique that can be used. 196

197 Example 10: Burnit Goal Programming
The Leon Burnit Advertising Agency is trying to determine a TV advertising schedule for Priceler Auto Company. Priceler has three goals: Its ads should be seen by at least 40 million high-income men (HIM). Its ads should be seen by at least 60 million low-income people (LIP). Its ads should be seen by at least 35 million high-income women (HIW). Leon Burnit can purchase two types of ads: those shown during football games and those shown during soap operas. 197

198 Ex. 10 - continued At most, $600,000 can be spent on ads.
The advertising costs and potential audiences of a one-minute ad of each type are shown. Leon Burnit must determine how many football ads and soap opera ads to purchase for Priceler.
Millions of Viewers
Ad | HIM | LIP | HIW | Cost ($)
Football | 7 | 10 | 5 | 100,000
Soap Opera | 3 | 5 | 4 | 60,000 198

199 Example 10: Solution Let x1 = number of minutes of ads shown during football games and x2 = number of minutes of ads shown during soap operas. Then any feasible solution to the following LP would meet Priceler’s goals. From the graph it can be seen that no point satisfying the budget constraint meets all three of Priceler’s goals.
min (or max) z = 0x1 + 0x2 (or any other objective function)
s.t. 7x1 + 3x2 ≥ 40 (HIM constraint)
10x1 + 5x2 ≥ 60 (LIP constraint)
5x1 + 4x2 ≥ 35 (HIW constraint)
100x1 + 60x2 ≤ 600 (budget constraint, in $1,000s)
x1, x2 ≥ 0 199

200 Ex. 10 – Solution continued
Since it is impossible to meet all of Priceler’s goals, Burnit might ask Priceler to identify, for each goal, a cost that is incurred for failing to meet the goal. Burnit can then formulate an LP that minimizes the cost incurred in deviating from Priceler’s three goals. The trick is to transform each inequality constraint that represents one of Priceler’s goals into an equality constraint. 200

201 Ex. 10 – Solution continued
Since it is not known whether a given solution will undersatisfy or oversatisfy a given goal, we need to define the following variables. si+ = amount by which we numerically exceed the ith goal si- = amount by which we are numerically under the ith goal The si+ and si- are referred to as deviational variables. Rewrite the first three constraints using the deviational variables. 201

202 Ex. 10 – Solution continued
Burnit can minimize the penalty from Priceler’s lost sales by solving the following LP. The optimal solution meets goal 1 and goal 2 but fails to meet the least important goal (goal 3).
min z = 200s1− + 100s2− + 50s3−
s.t. 7x1 + 3x2 + s1− − s1+ = 40 (HIM constraint)
10x1 + 5x2 + s2− − s2+ = 60 (LIP constraint)
5x1 + 4x2 + s3− − s3+ = 35 (HIW constraint)
100x1 + 60x2 ≤ 600 (budget constraint)
All variables nonnegative 202
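The goal program can be sketched with scipy.optimize.linprog (an assumption of this sketch; the penalties 200, 100, 50 and audience coefficients follow the standard textbook data for this example, since this transcript's numbers are garbled):

```python
from scipy.optimize import linprog

# Variable order: x1, x2, s1m, s1p, s2m, s2p, s3m, s3p (m = under, p = over).
c = [0, 0, 200, 0, 100, 0, 50, 0]      # only under-achievement is penalized
A_eq = [
    [ 7, 3, 1, -1, 0,  0, 0,  0],      # HIM goal: 40 million viewers
    [10, 5, 0,  0, 1, -1, 0,  0],      # LIP goal: 60 million viewers
    [ 5, 4, 0,  0, 0,  0, 1, -1],      # HIW goal: 35 million viewers
]
A_ub = [[100, 60, 0, 0, 0, 0, 0, 0]]   # budget, in $1,000s
res = linprog(c=c, A_eq=A_eq, b_eq=[40, 60, 35], A_ub=A_ub, b_ub=[600],
              method="highs")
x1, x2 = res.x[:2]
# goals 1 and 2 are met; the HIW goal falls short, as the slide states
print(7*x1 + 3*x2, 10*x1 + 5*x2, 5*x1 + 4*x2)
```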

203 Preemptive goal programming problems can be solved by an extension of the simplex known as the goal programming simplex. The differences between the goal programming simplex and the ordinary simplex are: The ordinary simplex has a single row 0, whereas the goal programming simplex requires n row 0’s (one for each goal). In the goal programming simplex, a different method is used to determine the entering variable. When a pivot is performed, row 0 for each goal must be updated. 203

204 LINDO can be used to solve preemptive goal programming problems.
A tableau will yield the optimal solution if all goals are satisfied, or if each variable that could enter the basis and reduce the value of zi’ for an unsatisfied goal i’ would increase the deviation from some goal i having a higher priority than goal i’. If a computerized goal program is used, the decision maker can have a number of solutions to choose from. When a preemptive goal programming problem involves only two decision variables, the optimal solution can be found graphically. LINDO can be used to solve preemptive goal programming problems. 204

205 4.17 Using the Excel Solver to Solve LPs
Excel has the capability to solve linear programming problems. The key to solving an LP on a spreadsheet is to set up a spreadsheet that tracks everything of interest. Changing cells and target cells need to be identified. 205

206 Operations Research: Applications and Algorithms
Chapter 5 Sensitivity Analysis: An Applied Approach to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 206

207 5.1 – A Graphical Introduction to Sensitivity Analysis
Sensitivity analysis is concerned with how changes in an LP’s parameters affect the optimal solution. Reconsider the Giapetto problem from Chapter 3, where x1 = number of soldiers produced each week and x2 = number of trains produced each week.
max z = 3x1 + 2x2
s.t. 2x1 + x2 ≤ 100 (finishing constraint)
x1 + x2 ≤ 80 (carpentry constraint)
x1 ≤ 40 (demand constraint)
x1, x2 ≥ 0 (sign restriction) 207

208 The optimal solution for this LP was z = 180, x1=20, x2= 60 (point B) and it has x1, x2, and s3 (the slack variable for the demand constraint) as basic variables. How would changes in the problem’s objective function coefficients or right-hand side values change this optimal solution? 208

209 Graphical analysis of the effect of a change in an objective function value for the Giapetto LP shows: By inspection, we can see that making the slope of the isoprofit line more negative than the finishing constraint (slope = -2) will cause the optimal point to switch from point B to point C. Likewise, making the slope of the isoprofit line less negative than the carpentry constraint (slope = -1) will cause the optimal point to switch from point B to point A. Clearly, the slope of the isoprofit line must be between -2 and -1 for the current basis to remain optimal. 209

210 A graphical analysis can also be used to determine whether a change in the rhs of a constraint will make the current basis no longer optimal. For example, let b1 = number of available finishing hours. The current optimal solution (point B) is where the carpentry and finishing constraints are binding. If the value of b1 is changed, then as long as the point where the carpentry and finishing constraints intersect remains feasible, the optimal solution will still occur at that intersection. 210

211 In the Giapetto problem, we see that if b1 > 120, x1 will be greater than 40 and will violate the demand constraint. Also, if b1 < 80, x1 will be less than 0 and the nonnegativity constraint for x1 will be violated. Therefore 80 ≤ b1 ≤ 120. The current basis remains optimal for 80 ≤ b1 ≤ 120, but the decision variable values and z-value will change. 211

212 For the finishing constraint, 100 +  finishing hours are available.
It is often important to determine how a change in a constraint’s rhs changes the LP’s optimal z-value. The shadow price for the ith constraint of an LP is the amount by which the optimal z-value is improved if the rhs of the ith constraint is increased by one. This definition applies only if the change in the rhs of constraint i leaves the current basis optimal. For the finishing constraint, suppose 100 + Δ finishing hours are available. The LP’s optimal solution is then x1 = 20 + Δ and x2 = 60 − Δ, with z = 3x1 + 2x2 = 3(20 + Δ) + 2(60 − Δ) = 180 + Δ. Thus, as long as the current basis remains optimal, a one-unit increase in the number of finishing hours will increase the optimal z-value by $1. So, the shadow price for the first (finishing hours) constraint is $1. 212
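The finishing-hours shadow price can be checked numerically by re-solving the Giapetto LP with one extra hour (a sketch using scipy.optimize.linprog, an assumption of this note; linprog minimizes, so the profit objective is negated):

```python
from scipy.optimize import linprog

def giapetto_z(b1):
    """Optimal weekly profit as a function of available finishing hours."""
    res = linprog(c=[-3, -2],
                  A_ub=[[2, 1], [1, 1], [1, 0]],  # finishing, carpentry, demand
                  b_ub=[b1, 80, 40], method="highs")
    return -res.fun

shadow_price = giapetto_z(101) - giapetto_z(100)
print(giapetto_z(100), shadow_price)   # 180.0 1.0, matching the $1 in the text
```

Re-solving is valid here because a one-unit change keeps b1 inside the 80 to 120 range where the current basis stays optimal.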

213 Sensitivity analysis is important for several reasons:
Values of LP parameters might change. If a parameter changes, sensitivity analysis often shows that it is unnecessary to solve the problem again. For example, in the Giapetto problem, if the profit contribution of a soldier changes to $3.50, sensitivity analysis shows the current solution remains optimal. Uncertainty about LP parameters. In the Giapetto problem, for example, if the weekly demand for soldiers is at least 20, the optimal solution remains 20 soldiers and 60 trains. Thus, even if demand for soldiers is uncertain, the company can be fairly confident that it is still optimal to produce 20 soldiers and 60 trains. 213

214 5.2 The Computer and Sensitivity Analysis
If an LP has more than two decision variables, the range of values for a rhs (or objective function coefficient) for which the basis remains optimal cannot be determined graphically. These ranges can be computed by hand but this is often tedious, so they are usually determined by a packaged computer program. LINDO will be used and the interpretation of its sensitivity analysis discussed. 214

215 Example 1: Winco Products 1
Winco sells four types of products. The resources needed to produce one unit of each are shown, along with the sales prices. To meet customer demand, exactly 950 total units must be produced. Customers demand that at least 400 units of product 4 be produced. Formulate an LP to maximize profit.
| Product 1 | Product 2 | Product 3 | Product 4
Raw material | 2 | 3 | 4 | 7
Hours of labor | 3 | 4 | 5 | 6
Sales price | $4 | $6 | $7 | $8 215

216 Example 1: Solution Let xi = number of units of product i produced by Winco. The Winco LP formulation:
max z = 4x1 + 6x2 + 7x3 + 8x4
s.t. x1 + x2 + x3 + x4 = 950
x4 ≥ 400
2x1 + 3x2 + 4x3 + 7x4 ≤ 4600
3x1 + 4x2 + 5x3 + 6x4 ≤ 5000
x1, x2, x3, x4 ≥ 0 216
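The Winco LP can be cross-checked with scipy.optimize.linprog (an assumption of this sketch; the slides use LINDO). linprog minimizes, so the profit objective is negated, and the ≥ constraint is rewritten as a ≤ constraint:

```python
from scipy.optimize import linprog

res = linprog(c=[-4, -6, -7, -8],            # negated profit coefficients
              A_eq=[[1, 1, 1, 1]], b_eq=[950],
              A_ub=[[0, 0, 0, -1],           # x4 >= 400, as -x4 <= -400
                    [2, 3, 4, 7],            # raw material
                    [3, 4, 5, 6]],           # labor
              b_ub=[-400, 4600, 5000], method="highs")
print(-res.fun, res.x)   # 6650.0 at x = (0, 400, 150, 400)
```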

217 Ex. 1 – Solution continued
The LINDO output. Reduced cost is the amount the objective function coefficient for variable i would have to be increased for there to be an alternative optimal solution in which xi is positive.
MAX 4 X1 + 6 X2 + 7 X3 + 8 X4
SUBJECT TO
2) X1 + X2 + X3 + X4 = 950
3) X4 >= 400
4) 2 X1 + 3 X2 + 4 X3 + 7 X4 <= 4600
5) 3 X1 + 4 X2 + 5 X3 + 6 X4 <= 5000
END
LP OPTIMUM FOUND
OBJECTIVE FUNCTION VALUE
1) 6650.000
VARIABLE VALUE REDUCED COST
X1 0.000000 1.000000
X2 400.000000 0.000000
X3 150.000000 0.000000
X4 400.000000 0.000000
ROW SLACK OR SURPLUS DUAL PRICES
2) 0.000000 3.000000
3) 0.000000 -2.000000
4) 0.000000 1.000000
5) 250.000000 0.000000 217

218 Ex. 1 – Solution continued
LINDO sensitivity analysis output. The allowable range (w/o changing basis) for the x2 coefficient (c2) is 5.500 ≤ c2 ≤ 6.667. The allowable range (w/o changing basis) for the rhs (b1) of the first constraint is 850 ≤ b1 ≤ 1000.
RANGES IN WHICH THE BASIS IS UNCHANGED:
OBJ COEFFICIENT RANGES
VARIABLE CURRENT COEF ALLOWABLE INCREASE ALLOWABLE DECREASE
X1 4.000000 1.000000 INFINITY
X2 6.000000 0.666667 0.500000
X3 7.000000 1.000000 0.500000
X4 8.000000 2.000000 INFINITY
RIGHTHAND SIDE RANGES
ROW CURRENT RHS ALLOWABLE INCREASE ALLOWABLE DECREASE
2 950.000000 50.000000 100.000000
3 400.000000 37.500000 125.000000
4 4600.000000 250.000000 150.000000
5 5000.000000 INFINITY 250.000000 218

219 Ex. 1 – Solution continued
Shadow prices are shown in the DUAL PRICES section of the LINDO output. A shadow price is the amount by which the optimal z-value improves if the rhs of a constraint is increased by one unit (assuming no change in basis).
MAX 4 X1 + 6 X2 + 7 X3 + 8 X4
SUBJECT TO
2) X1 + X2 + X3 + X4 = 950
3) X4 >= 400
4) 2 X1 + 3 X2 + 4 X3 + 7 X4 <= 4600
5) 3 X1 + 4 X2 + 5 X3 + 6 X4 <= 5000
END
LP OPTIMUM FOUND
OBJECTIVE FUNCTION VALUE
1) 6650.000
VARIABLE VALUE REDUCED COST
X1 0.000000 1.000000
X2 400.000000 0.000000
X3 150.000000 0.000000
X4 400.000000 0.000000
ROW SLACK OR SURPLUS DUAL PRICES
2) 0.000000 3.000000
3) 0.000000 -2.000000
4) 0.000000 1.000000
5) 250.000000 0.000000 219

220 Constraints with ≥ symbols will always have nonpositive shadow prices.
Shadow price signs: Constraints with ≥ symbols will always have nonpositive shadow prices. Constraints with ≤ symbols will always have nonnegative shadow prices. Equality constraints may have a positive, a negative, or a zero shadow price. 220

221 Allowable Increase for rhs Allowable Decrease for rhs
For any inequality constraint, the product of the values of the constraint’s slack/excess variable and the constraint’s shadow price must equal zero. This implies that any constraint whose slack or excess variable is > 0 will have a zero shadow price. Similarly, any constraint with a nonzero shadow price must be binding (have slack or excess equal to zero). For constraints with nonzero slack or excess, the relationships are detailed in the table below:
Type of Constraint | Allowable Increase for rhs | Allowable Decrease for rhs
≤ | ∞ | = value of slack
≥ | = value of excess | ∞ 221

222 When the optimal solution is degenerate (a bfs is degenerate if at least one basic variable in the optimal solution equals 0), caution must be used when interpreting the LINDO output. For an LP with m constraints, if the optimal LINDO output indicates that fewer than m variables are positive, then the optimal solution is a degenerate bfs. MAX X1 + 4 X2 + 3 X3 + 2 X4 SUBJECT TO 2) 2 X1 + 3 X2 + X3 + 2 X4 <= 400 3) X1 + X2 + 2 X3 + X4 <= 150 4) 2 X1 + X2 + X X4 <= 200 5) 3 X1 + X2 + X4 <= 250 222

223 Since the LP has four constraints and in the optimal solution only two variables are positive, the optimal solution is a degenerate bfs. LP OPTIMUM FOUND AT STEP OBJECTIVE FUNCTION VALUE 1) VARIABLE VALUE REDUCED COST X X X X ROW SLACK OR SURPLUS DUAL PRICES 2) 3) 4) 5) 223

224 RANGES IN WHICH THE BASIS IS UNCHANGED: OBJ COEFFICIENT RANGES
VARIABLE CURRENT ALLOWABLE ALLOWABLE COEF INCREASE DECREASE X X X X INFINITY RIGHTHAND SIDE RANGES ROW CURRENT ALLOWABLE ALLOWABLE RHS INCREASE DECREASE INFINITY 224

225 The LINDO TABLEAU command indicates the optimal basis is BV = {x1, x2, x3, s4}.
THE TABLEAU ROW (BASIS) X X X X SLK 2 1 ART X X 4 SLK X ROW SLK 3 SLK SLK 5 Author has error in 1st line of this slide. He says basis is x1,x3,s3,x1 (see page 298? of new text) 225

226 Oddities that may occur when the optimal solution found by LINDO is degenerate are:
In the RANGES IN WHICH THE BASIS IS UNCHANGED output, at least one constraint will have a 0 AI (allowable increase) or AD (allowable decrease). This means that for at least one constraint the DUAL PRICE can tell us about the new z-value for either an increase or a decrease in the rhs, but not both. For a nonbasic variable to become positive, its objective function coefficient may have to be improved by more than its REDUCED COST. Increasing a variable’s objective function coefficient by more than its AI or decreasing it by more than its AD may leave the optimal solution the same. 226

227 5.3 Managerial Use of Shadow Prices
The managerial significance of shadow prices is that they can often be used to determine the maximum amount a manager should be willing to pay for an additional unit of a resource. 227

228 Example 5: Winco Products 2
Reconsider the Winco problem. What is the most Winco should be willing to pay for additional units of raw material or labor? 228

229 Example 5: Solution The shadow price for the raw material constraint (row 4) shows that an extra unit of raw material would increase revenue by $1. Winco could pay up to $1 for an extra unit of raw material and be as well off as it is now. The labor constraint’s (row 5) shadow price is 0, meaning that an extra hour of labor will not increase revenue. So, Winco should not be willing to pay anything for an extra hour of labor.
MAX 4 X1 + 6 X2 + 7 X3 + 8 X4
SUBJECT TO
2) X1 + X2 + X3 + X4 = 950
3) X4 >= 400
4) 2 X1 + 3 X2 + 4 X3 + 7 X4 <= 4600
5) 3 X1 + 4 X2 + 5 X3 + 6 X4 <= 5000
END
LP OPTIMUM FOUND
OBJECTIVE FUNCTION VALUE
1) 6650.000
VARIABLE VALUE REDUCED COST
X1 0.000000 1.000000
X2 400.000000 0.000000
X3 150.000000 0.000000
X4 400.000000 0.000000
ROW SLACK OR SURPLUS DUAL PRICES
2) 0.000000 3.000000
3) 0.000000 -2.000000
4) 0.000000 1.000000
5) 250.000000 0.000000 229
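This "most you should pay" reading of the shadow prices can be checked by re-solving the Winco LP with one extra unit of each resource (a sketch with scipy.optimize.linprog, an assumption of this note; the slide reads the values off LINDO's DUAL PRICES column):

```python
from scipy.optimize import linprog

def winco_z(raw, labor):
    """Optimal Winco revenue for given raw-material and labor availability."""
    res = linprog(c=[-4, -6, -7, -8],
                  A_eq=[[1, 1, 1, 1]], b_eq=[950],
                  A_ub=[[0, 0, 0, -1], [2, 3, 4, 7], [3, 4, 5, 6]],
                  b_ub=[-400, raw, labor], method="highs")
    return -res.fun

base = winco_z(4600, 5000)
print(winco_z(4601, 5000) - base)  # 1.0: pay at most $1 per raw-material unit
print(winco_z(4600, 5001) - base)  # 0.0: an extra labor hour is worth nothing
```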

230 5.4 What happens to the Optimal z-Value if the Current Basis Is No Longer Optimal?
Shadow prices were used to determine the new optimal z-value if the rhs of a constraint was changed but remained within the range where the current basis remains optimal. Changing the rhs of a constraint to values where the current basis is no longer optimal can be addressed by the LINDO PARAMETRICS feature. This feature can be used to determine how the shadow price of a constraint and the optimal z-value change. 230

The use of the LINDO PARAMETRICS feature is illustrated by varying the amount of raw material in the Winco example. Suppose we want to determine how the optimal z-value and shadow price change as the amount of raw material varies between 0 and 10,000 units. With 0 units of raw material, the RANGE and SENSITIVITY ANALYSIS results show an ALLOWABLE INCREASE for Row 4 indicating that at least 3900 units of raw material are required to make the problem feasible. 231

232 Raw Material rhs = 3900 optimal solution
RANGES IN WHICH THE BASIS IS UNCHANGED: OBJ COEFFICIENT RANGES VARIABLE CURRENT ALLOWABLE ALLOWABLE COEF INCREASE DECREASE X INFINITY X INFINITY X INFINITY X INFINITY RIGHTHAND SIDE RANGES ROW CURRENT ALLOWABLE ALLOWABLE RHS INCREASE DECREASE INFINITY OBJECTIVE FUNCTION VALUE 1) VARIABLE VALUE REDUCED COST X X X X ROW SLACK OR SURPLUS DUAL PRICES 2) 3) 4) 5) THE TABLEAU ROW (BASIS) X X X X SLK SLK 4 ART X X X SLK ART ART SLK 5 232

233 Changing Row 4’s rhs to 3900, resolving the LP, and selecting the REPORTS PARAMTERICS feature.
In this feature we choose Row 4, set the Value to 10,000, and select text output. RIGHTHANDSIDE PARAMETRICS REPORT FOR ROW: 4 VAR VAR PIVOT RHS DUAL PRICE OBJ OUT IN ROW VAL BEFORE PIVOT VAL X X SLK 5 SLK X3 SLK E E 233

234 LINDO Parametric Feature Graphical Output (z- value vs
LINDO Parametric Feature Graphical Output (z-value vs. Raw Material rhs from 3900 to 10,000) 234

235 For any LP, the graph of the optimal objective function value as a function of a rhs will be a piecewise linear function. The slope of each straight-line segment is just the constraint’s shadow price. For ≤ constraints in a maximization LP, the slope of each segment must be nonnegative, and the slopes of successive line segments will be nonincreasing. For a ≥ constraint in a maximization problem, the graph of the optimal objective function value will again be a piecewise linear function. The slope of each line segment will be nonpositive, and the slopes of successive segments will be nonincreasing. 235
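The piecewise-linear, nonincreasing-slope behavior can be illustrated numerically by sweeping the finishing-hours rhs of the Giapetto LP (a sketch using scipy.optimize.linprog, an assumption of this note):

```python
from scipy.optimize import linprog

def z_of_b1(b1):
    """Optimal Giapetto profit as a function of the finishing-hours rhs."""
    res = linprog(c=[-3, -2], A_ub=[[2, 1], [1, 1], [1, 0]],
                  b_ub=[b1, 80, 40], method="highs")
    return -res.fun

bs = list(range(60, 161, 10))
zs = [z_of_b1(b) for b in bs]
slopes = [(z2 - z1) / 10 for z1, z2 in zip(zs, zs[1:])]
print(slopes)   # segment slopes step down 2 -> 1 -> 0: nonincreasing
```

Each slope is the finishing constraint's shadow price over that rhs interval, dropping to 0 once the constraint stops binding.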

236 A graph of the optimal objective function value as a function of a variable’s objective function coefficient can also be created. This graph is again a piecewise linear function. The slope of each line segment is equal to the value of the corresponding variable (here, x1) in the optimal solution. 236

237 In a maximization LP, the slope of the graph of the optimal z-value as a function of an objective function coefficient will be nondecreasing. In a minimization LP, the slope of the graph of the optimal z-value as a function of an objective function coefficient will be nonincreasing. 237

238 Operations Research: Applications and Algorithms
Chapter 6 Sensitivity Analysis & Duality to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 238

239 6.1 A Graphical Introduction to Sensitivity Analysis
Sensitivity analysis is concerned with how changes in an LP’s parameters affect the optimal solution. The optimal solution to the Giapetto problem was z = 180, x1 = 20, x2 = 60 (Point B in the figure to the right) and it has x1, x2, and s3 (the slack variable for the demand constraint) as basic variables. How would changes in the problem’s objective function coefficients or the constraint’s right- hand sides change this optimal solution? 239

240 If the isoprofit line is flatter than the carpentry constraint, Point A(0,80) is optimal.
Point B(20,60) is optimal if the isoprofit line is steeper than the carpentry constraint but flatter than the finishing constraint. Finally, Point C(40,20) is optimal if the slope of the isoprofit line is steeper than the slope of the finishing constraint. Since a typical isoprofit line is c1x1 + 2x2 = k, we know the slope of the isoprofit line is just -c1/2. 240
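The slope argument can be checked numerically. A minimal sketch, assuming the standard Giapetto constraints (finishing 2x1 + x2 <= 100, carpentry x1 + x2 <= 80, demand x1 <= 40, x >= 0); since an optimal solution occurs at a vertex, we just score each vertex of the feasible region:

```python
# Vertices of the assumed Giapetto feasible region.
VERTICES = [(0, 0), (0, 80), (20, 60), (40, 20), (40, 0)]

def best_vertex(c1, c2=2.0):
    """Return the vertex maximizing z = c1*x1 + c2*x2."""
    return max(VERTICES, key=lambda v: c1 * v[0] + c2 * v[1])

# Point B = (20, 60) stays optimal while the isoprofit slope -c1/2
# lies between the carpentry slope (-1) and the finishing slope (-2),
# i.e. for 2 <= c1 <= 4.
assert best_vertex(3) == (20, 60)   # original coefficient c1 = 3
assert best_vertex(1) == (0, 80)    # line flatter than carpentry -> Point A
assert best_vertex(5) == (40, 20)   # line steeper than finishing -> Point C
```

Outside the interval 2 <= c1 <= 4 the optimal vertex jumps to A or C, exactly as the slope comparison predicts.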

241 A graphical analysis can also be used to determine whether a change in the rhs of a constraint will make the basis no longer optimal. For a constraint with positive slack (or positive excess) in an LP's optimal solution, if we change the rhs of the constraint to a value in the range where the basis remains optimal, the optimal solution to the LP remains the same. 241

242 It is important to determine how a change in a constraint's rhs changes the optimal z-value.
The shadow price for the ith constraint of an LP is the amount by which the optimal z-value is improved if the rhs of the ith constraint is increased by one. This definition applies only if the change in the rhs of constraint i leaves the current basis optimal. 242
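A shadow price can be estimated exactly this way: re-solve the LP with the rhs increased by one and take the difference in z. A brute-force sketch, assuming the standard Giapetto data (max 3x1 + 2x2; finishing 2x1 + x2 <= rhs, carpentry x1 + x2 <= 80, demand x1 <= 40), solved by enumerating intersections of constraint boundaries:

```python
from itertools import combinations

def solve_giapetto(fin_rhs):
    """Maximize 3x1 + 2x2 over the (assumed) Giapetto constraints,
    with the finishing rhs as a parameter, by brute-force vertex
    enumeration: intersect every pair of constraint boundaries and
    keep the feasible points."""
    # each constraint (a1, a2, b) means a1*x1 + a2*x2 <= b
    cons = [(2, 1, fin_rhs), (1, 1, 80), (1, 0, 40),
            (-1, 0, 0), (0, -1, 0)]          # last two encode x1, x2 >= 0
    best = None
    for (a1, a2, b1), (c1, c2, b2) in combinations(cons, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue                          # parallel boundaries
        x = (b1 * c2 - b2 * a2) / det         # Cramer's rule
        y = (a1 * b2 - c1 * b1) / det
        if all(a * x + b * y <= rhs + 1e-9 for a, b, rhs in cons):
            z = 3 * x + 2 * y
            if best is None or z > best:
                best = z
    return best

# Shadow price of the finishing constraint: improvement in z when its
# rhs goes from 100 to 101 (valid because the basis stays optimal).
shadow_price = solve_giapetto(101) - solve_giapetto(100)
assert abs(solve_giapetto(100) - 180) < 1e-6
assert abs(shadow_price - 1) < 1e-6
```

Here the finishing constraint's shadow price comes out as 1: one extra finishing hour is worth $1 of profit.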

243 Sensitivity analysis is important because
Values of LP parameters might change. If a parameter changes, sensitivity analysis often shows that it is unnecessary to solve the problem again. There may be uncertainty about LP parameters. Even if demand is uncertain, the company can be fairly confident that it is still producing optimal amounts of products. 243

244 6.2 Some Important Formulas
An LP's optimal tableau can be expressed in terms of the LP's parameters. These formulas are used in the study of sensitivity analysis, duality, and advanced LP topics. Consider a max problem with m constraints and n variables that has been prepared for solution by the Big M method. Although some of the variables may be slack, excess, or artificial, they are labeled x1, x2, …, xn. 244

245 The LP may then be written as
max z = c1x1 + c2x2 + … + cnxn
s.t. a11x1 + a12x2 + … + a1nxn = b1
a21x1 + a22x2 + … + a2nxn = b2
… … … …
am1x1 + am2x2 + … + amnxn = bm
xi ≥ 0 (i = 1, 2, …, n) 245

246 cBV is the 1 x m row vector [cBV1 cBV2 ∙∙∙ cBVm].
BV = {BV1, BV2, …, BVm} is the set of basic variables in the optimal tableau. NBV = {NBV1, NBV2, …, NBVn-m} is the set of nonbasic variables in the optimal tableau. cBV is the 1 x m row vector [cBV1 cBV2 ∙∙∙ cBVm]. cNBV is the 1 x (n-m) row vector whose elements are the objective coefficients of the nonbasic variables (in the order of NBV). The m x m matrix B is the matrix whose jth column is the column for BVj in the initial tableau. 246

247 Aj is the column (in the constraints) for the variable xj.
N is the m x (n-m) matrix whose columns are the columns for the nonbasic variables (in NBV order) in the initial tableau. The m x 1 column vector b is the right-hand side of the constraints in the initial tableau. Matrix algebra can be used to determine how an LP’s optimal tableau (with the set of basic variables BV) is related to the original LP. 247

248 The initial LP can now be written as
The initial LP can now be written as
z = cBVxBV + cNBVxNBV
s.t. BxBV + NxNBV = b
xBV, xNBV ≥ 0
It can be shown that the column of xj in the optimal tableau's constraints is B-1Aj, that the coefficient of xj in the optimal row 0 is cBVB-1Aj - cj, and that the right-hand side of the optimal tableau's constraints is B-1b. Right-hand side of optimal tableau's row 0 = cBVB-1b 248

249 Right-hand side of optimal row 0 = cBVB-1b
Right-hand side of optimal row 0 = cBVB-1b
Coefficient of slack variable si in optimal row 0 = ith element of cBVB-1
Coefficient of excess variable ei in optimal row 0 = -(ith element of cBVB-1)
Coefficient of artificial variable ai in optimal row 0 = (ith element of cBVB-1) + M (max problem) 249
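These matrix formulas can be verified numerically. A sketch using the Dakota problem referenced later in this chapter (assumed standard data: max 60x1 + 30x2 + 20x3 s.t. 8x1 + 6x2 + x3 <= 48, 4x1 + 2x2 + 1.5x3 <= 20, 2x1 + 1.5x2 + 0.5x3 <= 8, whose optimal basis is BV = {s1, x3, x1}); exact arithmetic via `fractions`:

```python
from fractions import Fraction as F

def solve(A, rhs):
    """Solve the square system A x = rhs by Gaussian elimination."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]   # scale pivot row
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# Columns of B are the initial-tableau columns of s1, x3, x1 (in order).
B = [[F(1), F(1),    F(8)],
     [F(0), F(3, 2), F(4)],
     [F(0), F(1, 2), F(2)]]
c_BV = [F(0), F(20), F(60)]        # objective coefficients of s1, x3, x1
b = [F(48), F(20), F(8)]           # right-hand sides

# cBV B^-1 is a row vector, so solve B^T y = cBV instead of inverting B.
Bt = [[B[r][c] for r in range(3)] for c in range(3)]
y = solve(Bt, c_BV)                # = cBV B^-1, the shadow-price vector
x_BV = solve(B, b)                 # = B^-1 b, the optimal rhs values
z = sum(yi * bi for yi, bi in zip(y, b))   # = cBV B^-1 b, rhs of row 0

assert y == [F(0), F(10), F(10)]   # coefficients of s1, s2, s3 in row 0
assert x_BV == [F(24), F(8), F(2)] # s1 = 24, x3 = 8, x1 = 2
assert z == 280
```

The recovered row-0 slack coefficients (0, 10, 10) and optimal z = 280 match Dakota's well-known solution.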

250 6.3 Sensitivity Analysis How do changes in an LP's parameters (objective function coefficients, right-hand sides, and technological coefficients) change the optimal solution? Let BV be the set of basic variables in the optimal tableau. Given a change in an LP, we must determine whether BV remains optimal. From Chapter 4 we know the simplex tableau (for a max problem) for a set of basic variables is optimal if and only if each constraint has a nonnegative rhs and each variable has a nonnegative coefficient in row 0. 250

We can use the following procedure to determine whether a change in the LP will cause BV to no longer be optimal. Step 1 Using the formulas of Section 6.2, determine how changes in the LP's parameters change the right-hand sides and row 0 of the optimal tableau (the tableau having BV as the set of basic variables). Step 2 If each variable in row 0 has a nonnegative coefficient and each constraint has a nonnegative rhs, BV is still optimal. Otherwise, BV is no longer optimal. If BV is no longer optimal, find the new optimal solution by using the formulas to recreate the entire tableau for BV and then continuing the simplex algorithm with the BV tableau as your starting tableau. 251

252 There are two reasons why a change in an LP's parameters may cause BV to no longer be optimal.
A variable (or variables) in row 0 may have a negative coefficient. In this case, a better (larger z-value) bfs can be obtained by pivoting in a nonbasic variable with a negative coefficient in row 0. If this occurs, we say that BV is now a suboptimal basis. A constraint (or constraints) may now have a negative rhs. In this case, at least one member of BV will be negative and BV will no longer yield a bfs. If this occurs, we say that BV is now an infeasible basis. 252

253 Six types of changes in an LP's parameters may change the optimal solution.
Changing the objective function coefficient of a nonbasic variable. Changing the objective function coefficient of a basic variable. Changing the right-hand side of a constraint. Changing the column of a nonbasic variable. Adding a new variable or activity. Adding a new constraint. 253

254 6.4 Sensitivity Analysis When More Than One Parameter Is Changed: The 100% Rule
LINDO output can be used to determine whether the current basis remains optimal when more than one objective function coefficient or right-hand side is changed. Depending on whether the objective function coefficient of any variable with a zero reduced cost in the optimal tableau is changed, there are two cases to consider: Case 1 – All variables whose objective function coefficients are changed have nonzero reduced costs in the optimal row 0. The current basis remains optimal if and only if the objective function coefficient for each variable remains within the allowable range given on the LINDO printout. If the current basis remains optimal, both the values of the decision variables and the objective function value remain unchanged. If the objective coefficient for any variable is outside the allowable range, the current basis is no longer optimal. 254

255 Case 2 – At least one variable whose objective function coefficient is changed has a reduced cost of zero. In this case the 100% rule is used: for each changed coefficient, compute the ratio of the change to its allowable increase (if raised) or allowable decrease (if lowered); if these ratios sum to at most 1, the current basis remains optimal. For right-hand side changes, depending on whether any of the constraints being modified are binding constraints, there are again two cases to consider. 255

256 Case 1 – All constraints whose right-hand sides are being modified are nonbinding constraints.
The current basis remains optimal if and only if each right-hand side remains within its allowable range. Then the values of the decision variables and the optimal objective function value remain unchanged. If the right-hand side for any constraint is outside its allowable range, the current basis is no longer optimal. Case 2 – At least one of the constraints whose right-hand side is being modified is a binding constraint (that is, has zero slack or excess). In this case the 100% rule is again applied, with the ratios computed from the allowable increases and decreases of the right-hand sides. 256
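The 100% rule check is a one-line computation once the allowable ranges are read off a printout. A minimal sketch; the numbers in the example are purely illustrative, not taken from any particular LINDO output:

```python
def hundred_percent_rule(changes):
    """changes: list of (delta, allowable_increase, allowable_decrease)
    for each coefficient (or rhs) being modified.
    Returns True if the 100% rule guarantees the current basis
    remains optimal under the simultaneous changes."""
    total = 0.0
    for delta, allow_inc, allow_dec in changes:
        if delta >= 0:
            total += delta / allow_inc      # fraction of allowable increase
        else:
            total += -delta / allow_dec     # fraction of allowable decrease
    return total <= 1.0

# Hypothetical example: one coefficient raised by 2 (allowable increase
# 5), another lowered by 1 (allowable decrease 4): 0.4 + 0.25 <= 1.
assert hundred_percent_rule([(2, 5, 4), (-1, 5, 4)]) is True
assert hundred_percent_rule([(4, 5, 4), (-2, 5, 4)]) is False  # 0.8 + 0.5 > 1
```

Note the rule is one-directional: if the ratios sum to more than 1, the basis may or may not remain optimal, and the problem must be re-solved to find out.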

257 6.5 – Finding the Dual of an LP
Associated with any LP is another LP, called the dual. Knowledge of the dual provides interesting economic and sensitivity analysis insights. When taking the dual of a given LP, the given LP is referred to as the primal. If the primal is a max problem, the dual will be a min problem, and vice versa. To find the dual of a max problem in which all the variables are required to be nonnegative and all the constraints are ≤ constraints (called a normal max problem), the problem may be written as follows. 257

258 The dual of a normal max problem is called a normal min problem.
max z = c1x1 + c2x2 + … + cnxn
s.t. a11x1 + a12x2 + … + a1nxn ≤ b1
a21x1 + a22x2 + … + a2nxn ≤ b2
… … … …
am1x1 + am2x2 + … + amnxn ≤ bm
xj ≥ 0 (j = 1, 2, …, n)
The dual of a normal max problem is called a normal min problem.
min w = b1y1 + b2y2 + … + bmym
s.t. a11y1 + a21y2 + … + am1ym ≥ c1
a12y1 + a22y2 + … + am2ym ≥ c2
… … … …
a1ny1 + a2ny2 + … + amnym ≥ cn
yi ≥ 0 (i = 1, 2, …, m) 258
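Mechanically, taking the dual of a normal max problem is just a transpose: the constraint matrix is transposed, and the roles of b and c are swapped. A sketch, illustrated with the assumed Dakota data (max 60x1 + 30x2 + 20x3 with the resource constraints used later in this chapter):

```python
def dual_of_normal_max(c, A, b):
    """Given a normal max problem  max c x  s.t.  A x <= b, x >= 0,
    return (obj, At, rhs) describing the normal min dual:
    min b y  s.t.  A^T y >= c, y >= 0."""
    At = [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]
    return b, At, c

# Assumed Dakota data: lumber, finishing, carpentry constraints.
c = [60, 30, 20]
A = [[8, 6, 1], [4, 2, 1.5], [2, 1.5, 0.5]]
b = [48, 20, 8]

w_obj, D, rhs = dual_of_normal_max(c, A, b)
# Dual: min 48y1 + 20y2 + 8y3  s.t.  8y1 + 4y2 + 2y3 >= 60, etc.
assert w_obj == [48, 20, 8]
assert D[0] == [8, 4, 2] and rhs[0] == 60
```

The jth dual constraint prices out the jth primal activity: the resources consumed by one unit of xj must be worth at least cj.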

259 6.6 Economic Interpretation of the Dual Problem
Suppose an entrepreneur wants to purchase all of Dakota's resources. The entrepreneur must determine the price he or she is willing to pay for a unit of each of Dakota's resources. To determine these prices, define: y1 = price paid for 1 board ft of lumber, y2 = price paid for 1 finishing hour, y3 = price paid for 1 carpentry hour. The resource prices y1, y2, and y3 should be determined by solving the Dakota dual. In setting resource prices, the prices must be high enough to induce Dakota to sell. 259

260 When the primal is a normal max problem, the dual variables are related to the value of resources available to the decision maker. For this reason, dual variables are often referred to as resource shadow prices. 260

261 6.7 The Dual Theorem and Its Consequences
The Dual Theorem states that the primal and dual have equal optimal objective function values (if the problems have optimal solutions). Weak duality states that for any feasible solution to the primal and any feasible solution to the dual, the w-value of the feasible dual solution will be at least as large as the z-value of the feasible primal solution. Thus any feasible solution to the dual can be used to develop a bound on the optimal value of the primal objective function. 261

262 If the primal is unbounded, then the dual problem is infeasible.
Let x-bar be a feasible solution to the primal and y-bar be a feasible solution to the dual. If the z-value for x-bar equals the w-value for y-bar, then x-bar is optimal for the primal and y-bar is optimal for the dual. If the primal is unbounded, then the dual problem is infeasible. If the dual is unbounded, then the primal is infeasible. 262
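This gives a cheap optimality certificate: exhibit one feasible primal solution and one feasible dual solution with equal objective values. A sketch using the assumed Dakota data (primal candidate x = (2, 0, 8), dual candidate y = (0, 10, 10)):

```python
# Assumed Dakota data: max 60x1 + 30x2 + 20x3  s.t.  A x <= b, x >= 0.
A = [[8, 6, 1], [4, 2, 1.5], [2, 1.5, 0.5]]
b = [48, 20, 8]
c = [60, 30, 20]
x = [2, 0, 8]          # candidate primal solution
y = [0, 10, 10]        # candidate dual solution

def feasible_primal(x):
    row = lambda i: sum(A[i][j] * x[j] for j in range(3))
    return all(row(i) <= b[i] + 1e-9 for i in range(3)) and min(x) >= 0

def feasible_dual(y):
    col = lambda j: sum(A[i][j] * y[i] for i in range(3))
    return all(col(j) >= c[j] - 1e-9 for j in range(3)) and min(y) >= 0

z = sum(cj * xj for cj, xj in zip(c, x))   # primal objective value
w = sum(bi * yi for bi, yi in zip(b, y))   # dual objective value
assert feasible_primal(x) and feasible_dual(y)
assert z == w == 280    # equal objectives certify both are optimal
```

By weak duality, w >= z always holds for feasible pairs, so z = w = 280 proves neither side can do better.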

263 6.8 Shadow Prices The shadow price of the ith constraint is the amount by which the optimal z-value is improved (increased in a max problem and decreased in a min problem) if we increase bi by 1 (from bi to bi+1). In short, adding points to the feasible region of a max problem cannot decrease the optimal z-value. 263

264 6.9 Duality and Sensitivity Analysis
Assuming that a set of basic variables BV is primal feasible, then BV is optimal if and only if the associated dual solution (cBVB-1) is dual feasible. This result gives an alternative way of doing the following types of sensitivity analysis: Changing the objective function coefficient of a nonbasic variable. Changing the column of a nonbasic variable. Adding a new activity. 264

265 6.10 Complementary Slackness
Let x-bar = [x1, x2, …, xn] be a feasible primal solution and y-bar = [y1, y2, …, ym] be a feasible dual solution. Let si be the slack in the ith primal constraint and ej the excess in the jth dual constraint. Then x-bar is primal optimal and y-bar is dual optimal if and only if siyi = 0 (i = 1, 2, …, m) and ejxj = 0 (j = 1, 2, …, n) 265

266 6.11 The Dual Simplex Method
The dual simplex method maintains a nonnegative row 0 (dual feasibility) and eventually obtains a tableau in which each right-hand side is nonnegative (primal feasibility). The dual simplex method for a max problem: Step 1 Is the right-hand side of each constraint nonnegative? If so, an optimal solution has been found; if not, go to step 2. Step 2 Choose the basic variable with the most negative value as the variable to leave the basis. The row it is in will be the pivot row. To select the variable that enters the basis, compute the ratio |coefficient of xj in row 0 / coefficient of xj in the pivot row| for each variable xj that has a negative coefficient in the pivot row. 266

267 Step 2 continued Choose the variable with the smallest ratio as the entering variable. Now use EROs to make the entering variable a basic variable in the pivot row. Step 3 If there is any constraint in which the right-hand side is negative and each variable has a nonnegative coefficient, then the LP has no feasible solution. If no such constraint exists, return to step 1. The dual simplex method is often used to find the new optimal solution to an LP after a constraint is added. 267

268 When a constraint is added one of the following three cases will occur.
The current optimal solution satisfies the new constraint. The current optimal solution does not satisfy the new constraint, but the LP still has a feasible solution. The additional constraint causes the LP to have no feasible solutions. If the right-hand side of a constraint is changed and the current basis becomes infeasible, the dual simplex can be used to find the new optimal solution. 268

269 6.12 Data Envelopment Analysis
Often people wonder whether a business is operating efficiently. The Data Envelopment Analysis (DEA) method can be used to find the answer. To learn how DEA works, let's consider a group of three hospitals. Each hospital "converts" two inputs into three different outputs. The two inputs used are Input 1 – capital (measured in number of beds) Input 2 – labor (measured in thousands of hours/month) 269

270 The efficiency of a hospital is defined by
The outputs are Output 1 – hundreds of patient-days during the month for patients under age 14 Output 2 – hundreds of patient-days during the month for patients between ages 14 and 65 Output 3 – hundreds of patient-days during the month for patients over age 65 The efficiency of a hospital is defined by efficiency = (value of hospital's outputs)/(cost of hospital's inputs) 270

271 Use LINDO to solve the hospital DEA problem.
The DEA approach uses the following four ideas to determine whether a hospital is efficient. No hospital can be more than 100% efficient. When evaluating hospital i's efficiency, attempt to choose output prices and input costs that maximize efficiency. To simplify computations, scale the input costs so that the cost of hospital i's inputs equals 1. Ensure that each input cost and output price is strictly positive. Use LINDO to solve the hospital DEA problem. 271

272 The DUAL PRICES section of the LINDO output gives us great insight into Hospital 2’s inefficiency.
272

273 Operations Research: Applications and Algorithms
Chapter 7 Transportation, Assignment & Transshipment Problems to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 273

274 7.1 Formulating Transportation Problems
A transportation problem deals with finding the best way to fulfill the demand of n demand points using the capacities of m supply points. In finding the best way, a variable cost of shipping the product from each supply point to each demand point, or a similar constraint, must be taken into consideration. 274

275 Example 1: Powerco Formulation
Powerco has three electric power plants that supply the electric needs of four cities. The associated supply of each plant and demand of each city is given in Table 1. The cost of sending 1 million kwh of electricity from a plant to a city depends on the distance the electricity must travel. 275

276 Ex. 1 - continued A transportation problem is specified by the supply, the demand, and the shipping costs. The relevant data can be summarized in a transportation tableau:

From \ To             City 1  City 2  City 3  City 4  Supply (million kwh)
Plant 1               $8      $6      $10     $9      35
Plant 2               $9      $12     $13     $7      50
Plant 3               $14     $9      $16     $5      40
Demand (million kwh)  45      20      30      30
276

277 Example 1: Solution Decision Variables Constraints
Powerco must determine how much power is sent from each plant to each city, so xij = amount of electricity produced at plant i and sent to city j; for example, x14 = amount of electricity produced at plant 1 and sent to city 4. Constraints: A supply constraint ensures that the total quantity shipped from a plant does not exceed plant capacity; each plant is a supply point. A demand constraint ensures that each location receives its demand; each city is a demand point. Since a negative amount of electricity cannot be shipped, all xij's must be nonnegative. 277

278 Ex. 1 – Solution continued
LP Formulation of Powerco’s Problem Min Z = 8x11+6x12+10x13+9x14+9x21+12x22+13x23+7x24 +14x31+9x32+16x33+5x34 S.T.: x11+x12+x13+x14 <= 35 (Supply Constraints) x21+x22+x23+x24 <= 50 x31+x32+x33+x34 <= 40 x11+x21+x31 >= 45 (Demand Constraints) x12+x22+x32 >= 20 x13+x23+x33 >= 30 x14+x24+x34 >= 30 xij >= 0 (i= 1,2,3; j= 1,2,3,4) 278

279 In general, a transportation problem is specified by the following information:
A set of m supply points from which a good is shipped. Supply point i can supply at most si units. A set of n demand points to which the good is shipped. Demand point j must receive at least di units of the shipped good. Each unit produced at supply point i and shipped to demand point j incurs a variable cost of cij. 279

280 xij = number of units shipped from supply point i to demand point j
min Σi Σj cijxij
s.t. Σj xij ≤ si (i = 1, 2, …, m) (supply constraints)
Σi xij ≥ dj (j = 1, 2, …, n) (demand constraints)
xij ≥ 0 (i = 1, 2, …, m; j = 1, 2, …, n) 280

If total supply equals total demand, the problem is said to be a balanced transportation problem. If total supply exceeds total demand, we can balance the problem by adding a dummy demand point. Since shipments to the dummy demand point are not real, they are assigned a cost of zero. 281

If a transportation problem has a total supply that is strictly less than total demand, the problem has no feasible solution, and one or more demands will be left unmet. Generally in such situations a penalty cost is associated with unmet demand, and the total penalty cost is to be minimized. 282

283 7.2 Finding Basic Feasible Solution for Transportation Problems
Unlike other linear programming problems, a balanced transportation problem with m supply points and n demand points is easier to solve, even though it has m + n equality constraints. The reason is that if a set of decision variables (xij's) satisfies all but one constraint, the values of the xij's will satisfy the remaining constraint automatically. 283

284 There are three basic methods to find the bfs for a balanced TP
An ordered sequence of at least four different cells is called a loop if: Any two consecutive cells lie in either the same row or same column. No three consecutive cells lie in the same row or column. The last cell in the sequence has a row or column in common with the first cell in the sequence. There are three basic methods to find a bfs for a balanced TP: Northwest Corner Method Minimum Cost Method Vogel's Method 284

285 To find the bfs by the Northwest Corner method:
Begin in the upper left (northwest) corner of the transportation tableau and set x11 as large as possible. Clearly, x11 can be no larger than the smaller of s1 and d1. Cross out the row or column whose supply or demand is exhausted, and reduce the supply or demand of the other by x11. Continue applying this procedure to the most northwest cell in the tableau that does not lie in a crossed-out row or column. Finally, assign the last cell a value equal to its row or column demand, and cross out both the cell's row and column. 285
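The Northwest Corner procedure can be sketched in a few lines. A minimal implementation applied to the Powerco data from this chapter:

```python
def northwest_corner(supply, demand):
    """Return a bfs for a balanced transportation problem as a dict
    {(i, j): amount}, built by the Northwest Corner method."""
    s, d = supply[:], demand[:]
    i = j = 0
    bfs = {}
    while i < len(s) and j < len(d):
        amt = min(s[i], d[j])          # ship as much as possible at (i, j)
        bfs[(i, j)] = amt
        s[i] -= amt
        d[j] -= amt
        if s[i] == 0 and i < len(s) - 1:
            i += 1                     # row exhausted: move south
        else:
            j += 1                     # column exhausted: move east
    return bfs

# Powerco: supplies 35, 50, 40; demands 45, 20, 30, 30 (balanced).
bfs = northwest_corner([35, 50, 40], [45, 20, 30, 30])
assert sum(bfs.values()) == 125       # all supply shipped
assert bfs[(0, 0)] == 35 and bfs[(1, 0)] == 10 and bfs[(1, 1)] == 20
```

For Powerco this produces the expected staircase pattern with m + n - 1 = 6 basic cells.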

286 To find the bfs by the Minimum Cost method:
The Northwest Corner Method does not utilize shipping costs; it can yield an initial bfs easily, but the total shipping cost may be very high. The minimum cost method uses shipping costs in order to come up with a bfs that has a lower cost. To find the bfs by the minimum cost method: Find the decision variable with the smallest shipping cost (xij). Then assign xij its largest possible value, which is the minimum of si and dj. 286

287 Often the minimum cost method will yield a costly bfs.
Next, as in the Northwest Corner Method, cross out row i and column j and reduce the supply or demand of the non-crossed-out row or column by the value of xij. Choose the cell with the minimum shipping cost from the cells that do not lie in a crossed-out row or column, and repeat the procedure. Even so, the minimum cost method will often yield a costly bfs. Vogel's method for finding a bfs usually avoids extremely high shipping costs. 287
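The minimum cost method amounts to a greedy pass over the cells in increasing cost order. A sketch on the Powerco data (visiting cells cheapest first is equivalent to repeatedly choosing the cheapest non-crossed-out cell, with ties broken by row index):

```python
def minimum_cost_bfs(costs, supply, demand):
    """Minimum cost method for a balanced transportation problem."""
    s, d = supply[:], demand[:]
    bfs = {}
    cells = sorted((costs[i][j], i, j)
                   for i in range(len(s)) for j in range(len(d)))
    for cost, i, j in cells:            # cheapest cells first
        if s[i] > 0 and d[j] > 0:       # skip crossed-out rows/columns
            amt = min(s[i], d[j])
            bfs[(i, j)] = amt
            s[i] -= amt
            d[j] -= amt
    return bfs

# Powerco costs, supplies, and demands from this chapter.
costs = [[8, 6, 10, 9], [9, 12, 13, 7], [14, 9, 16, 5]]
bfs = minimum_cost_bfs(costs, [35, 50, 40], [45, 20, 30, 30])
total = sum(costs[i][j] * v for (i, j), v in bfs.items())
assert sum(bfs.values()) == 125
assert total == 1080    # cost of this bfs; not necessarily optimal
```

The greedy bfs costs 1080 here, which illustrates the slide's point: a cheap-looking start is not guaranteed to be anywhere near the optimal shipping plan.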

288 To find the bfs by the Vogel’s method:
Begin by computing a penalty for each row and column, equal to the difference between the two smallest shipping costs in that row or column. Identify the row or column with the largest penalty. In that row or column, choose as the first basic variable the variable with the smallest shipping cost. Assign this variable the highest possible value, and cross out the row or column as in the previous methods. Compute new penalties and repeat the procedure. 288
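Vogel's penalty loop can be sketched directly from the steps above. One caveat hedged in the code: when a row or column has only a single remaining cost, textbooks and implementations differ on what penalty to assign; this sketch uses that cost itself.

```python
def vogel_bfs(costs, supply, demand):
    """Vogel's method: repeatedly pick the row or column with the
    largest penalty (difference between its two smallest remaining
    costs) and ship as much as possible into its cheapest cell.
    Convention: a row/column with one remaining cost uses that cost
    as its penalty (implementations vary on this point)."""
    s, d = supply[:], demand[:]
    m, n = len(s), len(d)
    bfs = {}
    while sum(s) > 0:
        best = None                          # (penalty, is_row, index)
        for is_row in (True, False):
            avail = s if is_row else d
            for k in range(len(avail)):
                if avail[k] == 0:
                    continue                 # crossed out
                if is_row:
                    cs = sorted(costs[k][j] for j in range(n) if d[j] > 0)
                else:
                    cs = sorted(costs[i][k] for i in range(m) if s[i] > 0)
                pen = cs[1] - cs[0] if len(cs) > 1 else cs[0]
                if best is None or pen > best[0]:
                    best = (pen, is_row, k)
        _, is_row, k = best
        if is_row:                           # cheapest open cell in row k
            i, j = k, min((j for j in range(n) if d[j] > 0),
                          key=lambda j: costs[k][j])
        else:                                # cheapest open cell in column k
            i, j = min((i for i in range(m) if s[i] > 0),
                       key=lambda i: costs[i][k]), k
        amt = min(s[i], d[j])
        bfs[(i, j)] = bfs.get((i, j), 0) + amt
        s[i] -= amt
        d[j] -= amt
    return bfs

costs = [[8, 6, 10, 9], [9, 12, 13, 7], [14, 9, 16, 5]]
bfs = vogel_bfs(costs, [35, 50, 40], [45, 20, 30, 30])
total = sum(costs[i][j] * v for (i, j), v in bfs.items())
assert total == 1020    # for Powerco this bfs happens to be optimal
```

On Powerco, this sketch reaches a cost of 1020, illustrating why Vogel's method usually gives a much better starting bfs than the other two.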

289 7.3 The Transportation Simplex Method
The simplex algorithm simplifies when a transportation problem is solved. By using the following procedure, the pivots for a transportation problem may be performed within the confines of the transportation tableau: Step 1 Determine the variable that should enter the basis. Step 2 Find the loop involving the entering variable and some of the basic variables. Step 3 Counting the cells in the loop (with the entering variable's cell as cell 0), label them as even cells or odd cells. 289

290 Step 4 Find the odd cells whose variable assumes the smallest value
Step 4 Find the odd cell whose variable assumes the smallest value. Call this value θ. The variable corresponding to this odd cell will leave the basis. To perform the pivot, decrease the value of each odd cell by θ and increase the value of each even cell by θ. The variables that are not in the loop remain unchanged. The pivot is now complete. If θ = 0, the entering variable will equal 0, and an odd variable that has a current value of 0 will leave the basis; in this case a degenerate bfs existed before the pivot and will result after it. If more than one odd cell in the loop equals θ, you may arbitrarily choose one of these odd cells to leave the basis; again a degenerate bfs will result. 290

291 Two important points to keep in mind in the pivoting procedure
Since each row and column in the loop has as many cells increased by θ as decreased by θ, the new solution will satisfy each supply and demand constraint. By choosing the smallest odd variable in the loop to leave the basis, we ensure that all variables remain nonnegative. 291

292 7.4 Sensitivity Analysis for Transportation Problems
Three aspects of sensitivity analysis for the transportation problems. Changing the objective function coefficient of a nonbasic variable. Changing the objective function coefficient of a basic variable. Increasing a single supply by Δ and a single demand by Δ. 292

293 7.5 Assignment Problems Assignment problems are a certain class of transportation problems for which the transportation simplex is often very inefficient. 293

294 Example 4: Machine Assignment Problem
Machineco has four jobs to be completed. Each machine must be assigned to complete one job. The time required to set up each machine for completing each job is shown below. Machineco wants to minimize the total setup time needed to complete the four jobs.

Time (hours)  Job 1  Job 2  Job 3  Job 4
Machine 1     14     5      8      7
Machine 2     2      12     6      5
Machine 3     7      8      3      9
Machine 4     2      4      6      10
294

295 Example 4: Solution Machineco must determine which machine should be assigned to each job. For i, j = 1, 2, 3, 4: xij = 1 if machine i is assigned to meet the demands of job j, and xij = 0 if machine i is not assigned to meet the demands of job j. 295

296 Ex. 4 – Solution continued
In general, an assignment problem is a balanced transportation problem in which all supplies and demands are equal to 1. The assignment problem's matrix of costs is its cost matrix. All the supplies and demands for this problem are integers, which implies that the optimal solution will be integer-valued. Using the minimum cost method, a highly degenerate bfs is obtained. 296

297 The Hungarian Method is usually used to solve assignment problems.
The steps of the Hungarian Method are as listed below: Step 1 Find the minimum element in each row of the m x m cost matrix. Construct a new matrix by subtracting from each cost the minimum cost in its row. For this new matrix, find the minimum cost in each column. Construct a new matrix (the reduced cost matrix) by subtracting from each cost the minimum cost in its column. 297

Step 2 Draw the minimum number of lines (horizontal and/or vertical) that are needed to cover all zeros in the reduced cost matrix. If m lines are required, an optimal solution is available among the covered zeros in the matrix. If fewer than m lines are required, proceed to step 3. Step 3 Find the smallest nonzero element (call its value k) in the reduced cost matrix that is uncovered by the lines drawn in step 2. Now subtract k from each uncovered element of the reduced cost matrix and add k to each element that is covered by two lines. Return to step 2. 298
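For an instance as small as Machineco's 4 x 4 problem, the optimum that the Hungarian Method finds can be checked by brute force over all 4! assignments. A sketch (not the Hungarian Method itself), using the assumed Machineco setup-time matrix:

```python
from itertools import permutations

# Assumed Machineco setup times (hours): rows = machines, cols = jobs.
times = [[14, 5, 8, 7],
         [2, 12, 6, 5],
         [7, 8, 3, 9],
         [2, 4, 6, 10]]

def brute_force_assignment(cost):
    """Try every one-to-one machine-to-job assignment and keep the best.
    Fine for m = 4 (only 4! = 24 cases); the Hungarian Method is what
    scales to large m."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return best, sum(cost[i][best[i]] for i in range(n))

assignment, total = brute_force_assignment(times)
assert total == 15                 # machine 1->job 2, 2->4, 3->3, 4->1
assert assignment == (1, 3, 2, 0)  # zero-indexed job for each machine
```

The brute-force optimum, 15 hours of total setup time, is what the Hungarian reductions converge to on this matrix.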

299 7.6 Transshipment Problems
A transportation problem allows only shipments that go directly from supply points to demand points. In many situations, shipments are allowed between supply points or between demand points. Sometimes there may also be points (called transshipment points) through which goods can be transshipped on their journey from a supply point to a demand point. Fortunately, the optimal solution to a transshipment problem can be found by solving a transportation problem. 299

The following steps describe how the optimal solution to a transshipment problem can be found by solving a transportation problem. Step 1 If necessary, add a dummy demand point (with a supply of 0 and a demand equal to the problem's excess supply) to balance the problem. Shipments to the dummy point and shipments from a point to itself are assigned a cost of zero. Let s = total available supply. Step 2 Construct a transportation tableau as follows: a row in the tableau will be needed for each supply point and transshipment point, and a column will be needed for each demand point and transshipment point. 300

Each supply point will have a supply equal to its original supply, and each demand point will have a demand equal to its original demand. Let s = total available supply. Then each transshipment point will have a supply equal to (point's original supply) + s and a demand equal to (point's original demand) + s. This ensures that any transshipment point that is a net supplier will have a net outflow equal to the point's original supply, and any transshipment point that is a net demander will have a net inflow equal to the point's original demand. Although we don't know how much will be shipped through each transshipment point, we can be sure that the total amount will not exceed s. 301

302 Operations Research: Applications and Algorithms
Chapter 8 Network Models to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 302

303 8.1 Basic Definitions A graph or network is defined by two sets of symbols: nodes and arcs. The nodes are a set (call it V) of points, or vertices. An arc consists of an ordered pair of vertices and represents a possible direction of motion that may occur between vertices. A sequence of arcs such that every arc has exactly one vertex in common with the previous arc is called a chain. 303

304 For example in the figure below:
A path is a chain in which the terminal node of each arc is identical to the initial node of the next arc. For example, in the figure below (a four-node network with nodes 1, 2, 3, 4): (1,2)-(2,3)-(4,3) is a chain but not a path; (1,2)-(2,3)-(3,4) is a chain and a path, which represents a way to travel from node 1 to node 4. 304

305 8.2 Shortest Path Problems
Assume that each arc in the network has a length associated with it. The problem of finding the shortest path from node 1 to any other node in the network is called a shortest path problem. 305

306 Example 2: Equipment Replacement
Assume that a new car (or machine) has been purchased for $12,000 at time 0. The cost of maintaining the car during a year depends on the age of the car at the beginning of the year, as given in the table below.

Age of Car (Years)  Annual Maintenance Cost  Trade-in Price
0                   $2,000                   –
1                   $4,000                   $7,000
2                   $5,000                   $6,000
3                   $9,000                   $2,000
4                   $12,000                  $1,000
5                   –                        $0
306

307 Ex. 2 - continued In order to avoid the high maintenance cost associated with an older car, we may trade in the car and purchase a new car. To simplify the computations we assume that at any time it costs $12,000 to purchase a new car. The goal is to minimize the net cost incurred during the next five years. 307

308 Example 2: Solution This problem should be formulated as a shortest path problem. The network will have six nodes. Node i is the beginning of year i and, for i < j, an arc (i,j) corresponds to purchasing a new car at the beginning of year i and keeping it until the beginning of year j. The length of arc (i,j) (call it cij) is the total net cost incurred from year i to j: cij = (maintenance cost incurred during years i, i+1, …, j-1) + (cost of purchasing a car at the beginning of year i) - (trade-in value received at the beginning of year j) 308

309 Ex. 2 – Solution continued
Applying this formula to the information in the problem yields (costs in $1,000s)
c12 = 2 + 12 - 7 = 7
c13 = 2 + 4 + 12 - 6 = 12
c14 = 2 + 4 + 5 + 12 - 2 = 21
c15 = 2 + 4 + 5 + 9 + 12 - 1 = 31
c16 = 2 + 4 + 5 + 9 + 12 + 12 - 0 = 44
c23 = 7, c24 = 12, c25 = 21, c26 = 31
c34 = 7, c35 = 12, c36 = 21
c45 = 7, c46 = 12
c56 = 7
(cij depends only on j - i, since the purchase price and cost schedule are the same every year) 309

310 Ex. 2 – Solution continued
From the figure we can see that paths such as 1-2-4-6 (7 + 12 + 12) and 1-3-5-6 (12 + 12 + 7) give us the shortest path, with a value of 31. 310
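Because the nodes are numbered so every arc goes forward, the shortest-path value of 31 can be reproduced with a few lines of forward dynamic programming (a sketch; maintenance and trade-in figures in $1,000s, as reconstructed from the example):

```python
# Equipment-replacement data: maint[k] = maintenance cost when the car
# is age k at the start of the year; trade_in[k] = value at age k.
maint = [2, 4, 5, 9, 12]
trade_in = {1: 7, 2: 6, 3: 2, 4: 1, 5: 0}
BUY = 12

def arc_cost(i, j):
    """Cost of buying at the start of year i and trading in at the
    start of year j (nodes numbered 1..6)."""
    kept = j - i
    return BUY + sum(maint[:kept]) - trade_in[kept]

# Shortest path 1 -> 6: process nodes in increasing order, since every
# arc (i, j) has i < j.
dist = {1: 0}
for j in range(2, 7):
    dist[j] = min(dist[i] + arc_cost(i, j) for i in range(1, j))

assert arc_cost(1, 2) == 7 and arc_cost(1, 6) == 44
assert dist[6] == 31
```

The DP confirms the figure: keeping each car for roughly two years, for a total five-year net cost of $31,000, is optimal.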

311 The transshipment problem can then easily be solved by using LINGO.
Assuming that all arc lengths are non-negative, Dijkstra's algorithm can be used to find the shortest path from a node. Finding the shortest path between node i and node j in a network may also be viewed as a transshipment problem. The transshipment problem can then easily be solved by using LINGO. 311
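Dijkstra's algorithm mentioned above can be sketched with a binary heap; here it is applied to the Example 2 network using the cij arc lengths:

```python
import heapq

def dijkstra(graph, source):
    """Dijkstra's algorithm for nonnegative arc lengths.
    graph: {node: [(neighbor, length), ...]}. Returns shortest
    distances from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry, skip it
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd              # found a shorter path to v
                heapq.heappush(heap, (nd, v))
    return dist

# Example 2 network with the cij arc lengths computed in the solution.
graph = {1: [(2, 7), (3, 12), (4, 21), (5, 31), (6, 44)],
         2: [(3, 7), (4, 12), (5, 21), (6, 31)],
         3: [(4, 7), (5, 12), (6, 21)],
         4: [(5, 7), (6, 12)],
         5: [(6, 7)]}
assert dijkstra(graph, 1)[6] == 31
```

The nonnegativity assumption matters: with negative arc lengths the greedy "settle the closest unlabeled node" step is no longer valid.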

312 8.3 Maximum Flow Problems Many situations can be modeled by a network in which the arcs may be thought of as having a capacity that limits the quantity of a product that may be shipped through the arc. In these situations, it is often desired to transport the maximum amount of flow from a starting point (called the source) to a terminal point (called the sink). These types of problems are called maximum flow problems. 312

313 Example 3: Maximum Flow Sunco Oil wants to ship the maximum possible amount of oil (per hour) via pipeline from node so to node si. The various arcs represent pipelines of different diameters. The maximum number of barrels of oil that can be pumped through each arc is the arc capacity.

Arc      Capacity
(so,1)   2
(so,2)   3
(1,2)    3
(1,3)    4
(3,si)   1
(2,si)   2
313

314 Ex. 3 - Continued Formulate an LP that can be used to determine the maximum number of barrels of oil per hour that can be sent from so to si. 314

Example 3: Solution An artificial arc called a0 is added from the sink to the source. First determine the decision variables: xij = millions of barrels of oil per hour that will pass through arc (i,j) of the pipeline. For a flow to be feasible it must have two characteristics: 0 <= flow through each arc <= arc capacity, and flow into node i = flow out of node i. 315

316 Ex. 3 – Solution continued
Let x0 be the flow through the artificial arc; conservation of flow implies that x0 = total amount of oil entering the sink. Sunco's goal is to maximize x0.
max z = x0
s.t. xso,1 <= 2 (arc capacity constraints)
xso,2 <= 3
x12 <= 3
x2,si <= 2
x13 <= 4
x3,si <= 1
x0 = xso,1 + xso,2 (node so flow constraint)
xso,1 = x12 + x13 (node 1 flow constraint)
xso,2 + x12 = x2,si (node 2 flow constraint)
x13 = x3,si (node 3 flow constraint)
x3,si + x2,si = x0 (node si flow constraint)
xij >= 0
One optimal solution to this LP is z = 3, xso,1 = 2, xso,2 = 1, x12 = 1, x13 = 1, x3,si = 1, x2,si = 2, x0 = 3. 316

317 Important questions – Ford-Fulkerson Method
The maximum flow in a network can be found using LINDO, but LINGO greatly lessens the effort needed to communicate the necessary information to the computer. Important questions addressed by the Ford-Fulkerson method: Given a feasible flow, how can we tell if it is an optimal flow (that is, one that maximizes x0)? If a feasible flow is nonoptimal, how can we modify the flow to obtain a new feasible flow that has a larger flow from the source to the sink? 317

Choose any set of nodes V' that contains the sink but does not contain the source. Then the set of arcs (i,j) with i not in V' and j a member of V' is a cut for the network. The capacity of a cut is the sum of the capacities of the arcs in the cut. The flow from source to sink for any feasible flow is less than or equal to the capacity of any cut. If the sink cannot be labeled in the Ford-Fulkerson labeling procedure, then the capacity of the cut defined by the labeled and unlabeled nodes equals the current flow from source to sink, so the current flow is maximal. 318
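The labeling idea above can be sketched as a Ford-Fulkerson implementation with breadth-first augmenting paths (the Edmonds-Karp variant), run on the Sunco network from Example 3:

```python
from collections import deque

def max_flow(cap, source, sink):
    """Ford-Fulkerson with breadth-first augmenting paths.
    cap: {(u, v): capacity}. Returns the maximum flow value."""
    flow = {e: 0 for e in cap}
    adj = {}
    for u, v in cap:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)   # allow residual (backward) moves
    def residual(u, v):
        return cap.get((u, v), 0) - flow.get((u, v), 0) + flow.get((v, u), 0)
    total = 0
    while True:
        parent = {source: None}           # BFS labeling from the source
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual(u, v) > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:            # sink unlabeled: flow is maximal
            return total
        path, v = [], sink                # recover the augmenting path
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        theta = min(residual(u, v) for u, v in path)
        for u, v in path:                 # push theta units along the path
            forward = cap.get((u, v), 0) - flow.get((u, v), 0)
            push = min(theta, forward)
            if push:
                flow[(u, v)] += push
            if theta - push:
                flow[(v, u)] -= theta - push   # cancel opposing flow
        total += theta

# Sunco pipeline capacities from Example 3.
cap = {("so", 1): 2, ("so", 2): 3, (1, 2): 3,
       (1, 3): 4, (3, "si"): 1, (2, "si"): 2}
assert max_flow(cap, "so", "si") == 3
```

When the BFS fails to label the sink, the labeled nodes define a cut whose capacity equals the current flow, which is exactly the optimality certificate described above.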

319 8.4 CPM and PERT Network models can be used as an aid in the scheduling of large complex projects that consist of many activities. If the duration of each activity is known with certainty, the critical path method (CPM) can be used to determine the length of time required to complete a project. If the duration of activities is not known with certainty, the program evaluation and review technique (PERT) can be used to estimate the probability that the project will be completed by a given deadline. 319

320 CPM and PERT are used in many applications including the following:
Scheduling construction projects such as office buildings, highways, and swimming pools
Developing countdown and “hold” procedures for launching spacecraft
Installing new computer systems
Designing and marketing new products
Completing corporate mergers
Building ships 320

321 To apply CPM and PERT, we need a list of activities that make up the project.
The project is considered to be completed when all activities have been completed. For each activity there is a set of activities (called the predecessors of the activity) that must be completed before the activity begins. A project network is used to represent the precedence relationships between activities. 321

For each activity there is a set of activities (called predecessors) that must be completed before the activity begins. Activities are represented by arcs, and the nodes are used to represent completion of a set of activities. This type of project network is called an activity on arc (AOA) network. Given a list of activities and predecessors, an AOA representation of a project (called a project network or project diagram) can be constructed by using the following rules. 322

323 Node 1 represents the start of the project
Node 1 represents the start of the project. An arc should lead from node 1 to represent each activity that has no predecessors.
A node (called the finish node) representing the completion of the project should be included in the network.
Number the nodes in the network so that the node representing the completion time of an activity always has a larger number than the node representing the beginning of that activity.
An activity should not be represented by more than one arc in the network.
Two nodes can be connected by at most one arc.
To avoid violating rules 4 and 5, it can sometimes be necessary to utilize a dummy activity that takes zero time. 323

324 Example 6: Drawing a Project Network
Widgetco is about to introduce a new product. A list of activities, their predecessors, and the duration of each activity is given. Draw a project diagram for this project.

Activity                      Predecessors   Duration (days)
A: train workers              -              6
B: purchase raw materials     -              9
C: produce product 1          A, B           8
D: produce product 2          B              7
E: test product 2             D              10
F: assemble products 1 & 2    C, E           12
324

325 Example 6: Solution [Project diagram for Widgetco: arc A (6 days) from node 1 to node 2, B (9) from node 1 to node 3, a dummy (0) from node 2 to node 3, D (7) from node 3 to node 4, C (8) from node 3 to node 5, E (10) from node 4 to node 5, and F (12) from node 5 to node 6.] Node 1 = starting node; Node 6 = finish node. 325

326 The early event time for node i, represented by ET(i), is the earliest time at which the event corresponding to node i can occur. The late event time for node i, represented by LT(i), is the latest time at which the event corresponding to node i can occur without delaying the completion of the project. To find the early event time for each node in the project network, we follow the steps below:
Find each prior event to node i that is connected by an arc to node i. These are the immediate predecessors of node i.
To the ET for each immediate predecessor of node i, add the duration of the activity connecting the immediate predecessor to node i.
ET(i) equals the maximum of the sums computed in the previous step. 326

327 To compute LT(i), we begin with the finish node, work backward, and follow the steps below:
Find each node that occurs after node i and is connected to node i by an arc. These events are the immediate successors of node i.
From the LT for each immediate successor of node i, subtract the duration of the activity joining the successor to node i.
LT(i) is the smallest of the differences determined in the previous step. 327
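The forward and backward passes described on the two slides above can be sketched directly. The arc list is reconstructed from the Widgetco diagram's labels (node pairs are an assumption based on the diagram, with the dummy arc from node 2 to node 3):

```python
# Forward/backward pass sketch for the Widgetco AOA network (Example 6).
# Arcs as (tail, head, duration): A=(1,2,6), B=(1,3,9), dummy=(2,3,0),
# D=(3,4,7), C=(3,5,8), E=(4,5,10), F=(5,6,12).
arcs = [(1, 2, 6), (1, 3, 9), (2, 3, 0), (3, 4, 7),
        (3, 5, 8), (4, 5, 10), (5, 6, 12)]
nodes = sorted({n for i, j, _ in arcs for n in (i, j)})

# early event times: ET(i) = max over immediate predecessors of ET(pred) + duration
ET = {nodes[0]: 0}
for n in nodes[1:]:
    ET[n] = max(ET[i] + d for i, j, d in arcs if j == n)

# late event times: LT(i) = min over immediate successors of LT(succ) - duration
LT = {nodes[-1]: ET[nodes[-1]]}
for n in reversed(nodes[:-1]):
    LT[n] = min(LT[j] - d for i, j, d in arcs if i == n)

print(ET)
print(LT)
```

The loops rely on rule 3 from the construction rules (node numbers increase along every arc), so a single pass in node order suffices.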

328 Ex. 6 – Solution continued
To compute the LT(i)’s: from the graph we know that the late completion time for node 6 is 38. Since node 6 is the only immediate successor of node 5, LT(5) = LT(6) - 12 = 26. Similarly, LT(4) = LT(5) - 10 = 16. Nodes 4 and 5 are the immediate successors of node 3. Thus LT(3) = min(LT(4) - 7, LT(5) - 8) = min(9, 18) = 9. 328

329 The concept of the total float of an activity can be used as a measure of how important it is to keep each activity’s duration from greatly exceeding our estimate of its completion time. For an arc representing activity (i,j), the total float, represented by TF(i,j), is the amount by which the starting time of activity (i,j) could be delayed beyond its earliest possible starting time without delaying the completion of the project. 329

330 Ex. 6 - Solution continued
Widgetco’s ET(i)’s and LT(i)’s

Node   ET(i)   LT(i)
1      0       0
2      6       9
3      9       9
4      16      16
5      26      26
6      38      38
330

331 An activity with a total float of zero is a critical activity.
A path from node 1 to the finish node that consists entirely of critical activities is called a critical path. The Widgetco critical path is 1-3-4-5-6 (activities B, D, E, and F). Free float is the amount by which the starting time of the activity corresponding to arc (i,j) can be delayed without delaying the start of any later activity beyond its earliest possible starting time. 331
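The float computation and the critical-path test above can be sketched in a few lines, using the ET/LT values and the arc layout reconstructed from the Widgetco diagram (the node pairs are an assumption based on that diagram):

```python
# Total float and critical activities for the Widgetco network.
# TF(i,j) = LT(j) - ET(i) - duration(i,j); activities with TF = 0 are critical.
ET = {1: 0, 2: 6, 3: 9, 4: 16, 5: 26, 6: 38}
LT = {1: 0, 2: 9, 3: 9, 4: 16, 5: 26, 6: 38}
arcs = {"A": (1, 2, 6), "B": (1, 3, 9), "dummy": (2, 3, 0),
        "D": (3, 4, 7), "C": (3, 5, 8), "E": (4, 5, 10), "F": (5, 6, 12)}

TF = {name: LT[j] - ET[i] - d for name, (i, j, d) in arcs.items()}
critical = [name for name, tf in TF.items() if tf == 0]
print(TF)        # e.g. A can be delayed 3 days, B cannot be delayed at all
print(critical)  # the zero-float activities form the critical path
```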

332 Linear programming can be used to determine the length of the critical path.
Decision variable: xj = the time at which the event corresponding to node j occurs. Since the goal is to minimize the time required to complete the project, we use the objective function z = xF - x1. Note that for each activity (i,j), before event j occurs, event i must occur and activity (i,j) must be completed. A critical path for this project network consists of a path from the start of the project to the finish in which each arc in the path corresponds to a constraint having a dual price of -1. 332

333 This process is called crashing a project.
In many situations, the project manager must complete the project in a time that is less than the length of the critical path. Linear programming can often be used to determine the allocation of resources that minimizes the cost of meeting the project deadline. This process is called crashing a project. The critical path can be found using LINDO, but LINGO makes it very easy to communicate the necessary information to the computer. 333

334 PERT CPM assumes that the duration of each activity is known with certainty. For many projects, this is clearly not applicable. PERT is an attempt to correct this shortcoming of CPM by modeling the duration of each activity as a random variable. For each activity, PERT requires that the project manager estimate three quantities: a = estimate of the activity’s duration under the most favorable conditions b = estimate of the activity’s duration under the least favorable conditions m = most likely value for the activity’s duration 334

335 Let Tij be the duration of activity (i,j)
Let Tij be the duration of activity (i,j). PERT requires the assumption that Tij follows a beta distribution. According to this assumption, it can be shown that the mean and variance of Tij may be approximated by
E(Tij) = (a + 4m + b) / 6
var Tij = ((b - a) / 6)^2
335

336 : expected duration of activities on any path
PERT requires the assumption that the durations of all activities are independent. Let CP be the random variable denoting the total duration of the activities on a critical path found by CPM. PERT assumes that the critical path found by CPM contains enough activities to allow us to invoke the Central Limit Theorem and conclude that CP is normally distributed with
E(CP) = sum of E(Tij) over activities (i,j) on the path  (expected duration of activities on the path)
var CP = sum of var Tij over activities (i,j) on the path  (variance of duration of activities on the path)
336
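The PERT calculation can be sketched end to end with the standard approximations above; the (a, m, b) triples and the deadline below are hypothetical, not from the text:

```python
# PERT sketch: approximate each activity's mean and variance from the three
# estimates a, m, b, then use the normal approximation for the critical-path
# duration CP. The (a, m, b) data below are hypothetical.
import math

def pert_mean_var(a, m, b):
    mean = (a + 4 * m + b) / 6          # standard PERT approximation
    var = ((b - a) / 6) ** 2
    return mean, var

# hypothetical estimates for four critical-path activities
estimates = [(2, 4, 6), (5, 8, 11), (3, 3, 9), (6, 10, 14)]
means_vars = [pert_mean_var(a, m, b) for a, m, b in estimates]
mu = sum(m for m, _ in means_vars)      # expected duration of the path
var = sum(v for _, v in means_vars)     # variance (independence assumed)

# P(CP <= deadline) via the normal CDF, Phi(z) = (1 + erf(z/sqrt(2)))/2
deadline = 30
z = (deadline - mu) / math.sqrt(var)
prob = (1 + math.erf(z / math.sqrt(2))) / 2
print(mu, var, round(prob, 3))
```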

337 8.5 Minimum Cost Network Flow Problems
The transportation, assignment, transshipment, shortest-path, maximum-flow, and CPM problems are all special cases of the minimum cost network flow problem (MCNFP). Any MCNFP can be solved by a generalization of the transportation simplex called the network simplex. An MCNFP can be written as
min sum over all arcs of cij xij
s.t. (flow out of node i) - (flow into node i) = bi  (for each node i in the network)
Lij <= xij <= Uij  (for each arc in the network)
337

338 Consider the following transportation problem
Constraints stipulate that the net flow out of node i must equal bi; these are referred to as the flow balance equations. Consider the following transportation problem: supply point 1 (node 1, supply 4) and supply point 2 (node 2, supply 5) ship to demand point 1 (node 3, demand 6) and demand point 2 (node 4, demand 3), with shipping costs of 1, 2, 3, and 4 on arcs (1,3), (1,4), (2,3), and (2,4), respectively. 338

339 All Variables nonnegative
MCNFP representation of the problem:
min z = x13 + 2x14 + 3x23 + 4x24
s.t.  x13 + x14 =  4   (Node 1)
      x23 + x24 =  5   (Node 2)
     -x13 - x23 = -6   (Node 3)
     -x14 - x24 = -3   (Node 4)
All variables nonnegative 339

340 LINGO can easily be used to solve any MCNFP problem.
The flow balance equations in any MCNFP have this important property: Each variable xij has a coefficient of +1 in the node i flow balance equation, a coefficient of -1 in the node j flow balance equation, and a coefficient of 0 in all other flow balance equations. LINGO can easily be used to solve any MCNFP problem. 340
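The +1/-1 column structure just described can be checked mechanically for the small transportation MCNFP above:

```python
# Sketch verifying the stated property on the transportation MCNFP: in the
# flow-balance equations, each variable x_ij has coefficient +1 in node i's
# equation, -1 in node j's equation, and 0 elsewhere.
arcs = [(1, 3), (1, 4), (2, 3), (2, 4)]      # x13, x14, x23, x24
nodes = [1, 2, 3, 4]
b = {1: 4, 2: 5, 3: -6, 4: -3}               # supplies positive, demands negative

# build the column of each x_ij across the node flow-balance equations
columns = {(i, j): [(+1 if n == i else -1 if n == j else 0) for n in nodes]
           for i, j in arcs}
for arc, col in columns.items():
    # exactly one +1, exactly one -1, zeros elsewhere
    assert col.count(1) == 1 and col.count(-1) == 1 and sum(col) == 0
print(columns[(1, 3)])   # [1, 0, -1, 0]
```

Note that total supply equals total demand, so the bi sum to zero, which is what makes the balanced system consistent.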

341 8.6 Minimum Spanning Tree Problems
Suppose that each arc (i,j) in a network has a length associated with it and that arc (i,j) represents a way of connecting node i to node j. In many applications, we must determine the set of arcs in a network that connects all nodes while minimizing the sum of the arc lengths. Clearly, such a group of arcs contains no loops. For a network with n nodes, a spanning tree is a group of n-1 arcs that connects all nodes of the network and contains no loops. 341

342 The MST Algorithm may be used to find a minimum spanning tree.
A spanning tree of minimum length in a network is a minimum spanning tree (MST). The MST Algorithm may be used to find a minimum spanning tree. Begin at any node i, and join node i to the node in the network (call it node j) that is closest to node i. The two nodes i and j now form a connected set of nodes C = {i,j}, and arc (i,j) will be in the minimum spanning tree. The remaining nodes in the network (call them C′) are referred to as the unconnected set of nodes. 342

343 Now choose a member of C’ (call it n) that is closest to set C
Now choose a member of C’ (call it n) that is closest to set C. Let m represent the node in C that is closest to n. Then the arc(m,n) will be in the minimum spanning tree. Now update C and C’. Since n is now connected to {i,j}, C now equals {i,j,n} and we must eliminate node n from C’. Repeat this process until a minimum spanning tree is found. Ties for closest node and arc to be included in the minimum spanning tree may be broken arbitrarily. 343
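The grow-the-connected-set procedure above is Prim's algorithm, and it can be sketched compactly. The distance matrix below is hypothetical, not the State University data:

```python
# Sketch of the MST algorithm described above (Prim's algorithm): repeatedly
# add the shortest arc joining the connected set C to the unconnected set C'.
def mst(dist, start):
    """dist[(i, j)] = arc length (given once per pair); returns (tree, total)."""
    nodes = {n for arc in dist for n in arc}
    C, tree, total = {start}, [], 0
    while C != nodes:
        # shortest arc with exactly one endpoint in C (ties broken arbitrarily)
        m, n, d = min(((i, j, d) for (i, j), d in dist.items()
                       if (i in C) != (j in C)), key=lambda t: t[2])
        tree.append((m, n))
        total += d
        C |= {m, n}
    return tree, total

# hypothetical symmetric network on 4 nodes
dist = {(1, 2): 1, (1, 3): 4, (2, 3): 2, (2, 4): 5, (3, 4): 3}
tree, total = mst(dist, 1)
print(tree, total)   # picks arcs of length 1, 2, 3 -> total 6
```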

344 Example 8: MST Algorithm
The State University campus has five computers. The distances between each pair of computers are given. What is the minimum length of cable required to interconnect the computers? Note that if two computers are not connected, this is because of underground rock formations. 344

345 Example 8: Solution We want to find the minimum spanning tree.
Iteration 1: Following the MST algorithm discussed before, arbitrarily choose node 1 to begin. The closest node is node 2. Now C = {1,2}, C′ = {3,4,5}, and arc (1,2) will be in the minimum spanning tree. 345

346 Ex. 8 – Solution continued
Iteration 2: Node 5 is closest to C. Since node 5 is two blocks from node 1 and from node 2, we may include either arc (2,5) or arc (1,5) in the minimum spanning tree. We arbitrarily choose to include arc (2,5). Then C = {1,2,5} and C′ = {3,4}. 346

347 Ex. 8 – Solution continued
Iteration 3: Since node 3 is two blocks from node 5, we may include arc (5,3) in the minimum spanning tree. Now C = {1,2,5,3} and C′ = {4}. 347

348 Ex. 8 – Solution continued
Iteration 4: Node 5 is the closest node to node 4. Thus, we add arc (5,4) to the minimum spanning tree. The minimum spanning tree consists of arcs (1,2), (2,5), (5,3), and (5,4). The length of the minimum spanning tree is 9 blocks. 348

349 8.7 The Network Simplex Method
The simplex algorithm simplifies for MCNFPs. Summary of the network simplex method:
Step 1 Determine a starting bfs. The n-1 basic variables will correspond to a spanning tree. Indicate nonbasic variables at their upper bounds by dashed arcs.
Step 2 Compute y1, y2, …, yn (often called the simplex multipliers) by solving y1 = 0 and yi - yj = cij for all basic variables xij. For all nonbasic variables, determine the row 0 coefficient from c̄ij = yi - yj - cij. The current bfs is optimal if c̄ij ≤ 0 for all xij = Lij and c̄ij ≥ 0 for all xij = Uij. If the bfs is not optimal, choose the nonbasic variable that most violates the optimality conditions as the entering basic variable. 349

350 Identify the cycle (there will be exactly one
Step 3 Identify the cycle (there will be exactly one!) created by adding the arc corresponding to the entering variable to the spanning tree of the current bfs. Use conservation of flow to determine the new values of the variables in the cycle. The variable that exits the basis will be the variable that first hits its upper or lower bound as the value of the entering basic variable is changed.
Step 4 Find the new bfs by changing the flows of the arcs in the cycle found in Step 3. Now go to Step 2. 350

351 Operations Research: Applications and Algorithms
Chapter 9 Integer Programming to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 351

352 9.1 Introduction to Integer Programming
An IP in which all variables are required to be integers is called a pure integer programming problem. An IP in which only some of the variables are required to be integers is called a mixed integer programming problem. An integer programming problem in which all the variables must be 0 or 1 is called a 0-1 IP. The LP obtained by omitting all integer or 0-1 constraints on the variables is called the LP relaxation of the IP. 352

353 9.2 Formulating Integer Programming Problems
Many practical problems can be formulated as IPs. This section covers the basics of formulating an IP model. 353

354 Example 1: Capital Budgeting IP
Stockco is considering four investments. Each investment:
Yields a given NPV
Requires a certain cash outflow at the present time
Currently Stockco has $14,000 available for investment. Formulate an IP whose solution will tell Stockco how to maximize the NPV obtained from the four investments. 354

355 Example 1: Solution Begin by defining a variable for each decision that Stockco must make: xj = 1 if investment j is made, and xj = 0 otherwise (j = 1, 2, 3, 4). The NPV obtained by Stockco is
Total NPV obtained by Stockco = 16x1 + 22x2 + 12x3 + 8x4
Stockco’s objective function is max z = 16x1 + 22x2 + 12x3 + 8x4. Stockco faces the constraint that at most $14,000 can be invested. Stockco’s 0-1 IP is
max z = 16x1 + 22x2 + 12x3 + 8x4
s.t. 5x1 + 7x2 + 4x3 + 3x4 ≤ 14
xj = 0 or 1 (j = 1, 2, 3, 4)
355
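With only four 0-1 variables, the Stockco IP can be checked by enumerating all 2^4 investment choices. The NPVs follow the slide's objective; the cost of investment 1 is taken as 5 (in thousands), since that coefficient appears garbled in the slide:

```python
# Brute-force check of the Stockco 0-1 IP over all 2^4 choices.
# NPVs from the slide's objective; cost vector assumes investment 1 costs 5.
from itertools import product

npv = [16, 22, 12, 8]
cost = [5, 7, 4, 3]      # thousands of dollars (first entry assumed)
budget = 14

best = max((x for x in product((0, 1), repeat=4)
            if sum(c * xi for c, xi in zip(cost, x)) <= budget),
           key=lambda x: sum(v * xi for v, xi in zip(npv, x)))
z = sum(v * xi for v, xi in zip(npv, best))
print(best, z)   # under these costs, investments 2, 3, and 4 give z = 42
```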

356 Fixed-Charge Problems
Suppose activity i incurs a fixed charge if undertaken at any positive level. Let
xi = level of activity i
yi = 1 if activity i is undertaken at a positive level; yi = 0 if it is not
Then a constraint of the form xi ≤ Mi yi must be added to the formulation. Mi must be large enough to ensure that xi will be less than or equal to Mi. 356

357 In a set-covering problem, each member of one set (set 1) must be “covered” by an acceptable member of a second set (set 2). The objective of a set-covering problem is to minimize the number of elements in set 2 that are required to cover all the elements in set 1. Given two constraints, we can ensure that at least one of them is satisfied by adding an either-or constraint: to guarantee that f(x1, …, xn) ≤ 0 or g(x1, …, xn) ≤ 0 holds, add the constraints f(x1, …, xn) ≤ My and g(x1, …, xn) ≤ M(1 - y), where y is a 0-1 variable and M is sufficiently large. 357
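The standard big-M device behind an either-or constraint can be sanity-checked by enumerating the 0-1 variable y. The constraint values below are hypothetical:

```python
# Sketch of the big-M either-or device: with a 0-1 variable y and a
# sufficiently large M, the pair  f(x) <= M*y  and  g(x) <= M*(1-y)
# forces at least one of f(x) <= 0, g(x) <= 0 to hold.
M = 1000

def either_or_feasible(f_val, g_val):
    """True iff some y in {0, 1} satisfies f <= M*y and g <= M*(1-y)."""
    return any(f_val <= M * y and g_val <= M * (1 - y) for y in (0, 1))

print(either_or_feasible(-5, 40))   # f <= 0, so y = 0 works (g may be positive)
print(either_or_feasible(3, 7))     # both strictly positive: no y works
```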

358 This is called an if-then constraint.
M is a number chosen large enough to ensure that both constraints are satisfied for all values of x1, …, xn that satisfy the other constraints in the problem. Suppose we want to ensure that f(x1, …, xn) > 0 implies g(x1, …, xn) ≥ 0. Then we include the following constraints in the formulation:
-g(x1, …, xn) ≤ My
f(x1, …, xn) ≤ M(1 - y)
y = 0 or 1
Here, M is a large positive number, chosen large enough so that f < M and -g < M hold for all values of x1, …, xn that satisfy the other constraints in the problem. This is called an if-then constraint. 358

359 0-1 variables can be used to model optimization problems involving piecewise linear functions.
A piecewise linear function consists of several straight-line segments; its graph is made up of straight-line segments. The points where the slope of the piecewise linear function changes are called the break points of the function. A piecewise linear function is not a linear function, so linear programming cannot be used to solve the optimization problem. 359

360 Suppose the piecewise linear function f (x) has break points .
By using 0-1 variables, however, a piecewise linear function can be represented in linear form. Suppose the piecewise linear function f(x) has break points b1, b2, …, bn.
Step 1 Wherever f(x) occurs in the optimization problem, replace f(x) by z1 f(b1) + z2 f(b2) + ∙∙∙ + zn f(bn).
Step 2 Add the following constraints to the problem:
z1 + z2 + ∙∙∙ + zn = 1
x = z1 b1 + z2 b2 + ∙∙∙ + zn bn
together with 0-1 constraints ensuring that at most two of the zi are positive and that any two positive zi are adjacent. 360
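The breakpoint-weight representation in the two steps above can be checked numerically: when the weights z sum to 1 and only two adjacent weights are positive, the weighted sums recover x and f(x) exactly. The break points and function values below are hypothetical:

```python
# Sketch of the breakpoint-weight (lambda) representation of a piecewise
# linear function: x = sum z_i b_i and f(x) = sum z_i f(b_i), with the z_i
# summing to 1 and at most two adjacent z_i positive.
bpts = [0, 10, 30, 40]       # hypothetical break points b_i
fvals = [0, 25, 45, 50]      # hypothetical values f(b_i)

def f_from_weights(z):
    assert abs(sum(z) - 1) < 1e-9        # weights must form a convex combination
    x = sum(zi * b for zi, b in zip(z, bpts))
    fx = sum(zi * fv for zi, fv in zip(z, fvals))
    return x, fx

# x = 20 lies midway between break points 10 and 30: z = (0, 0.5, 0.5, 0)
x, fx = f_from_weights([0, 0.5, 0.5, 0])
print(x, fx)   # 20.0 and 35.0, the midpoint of that segment
```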

361 LINDO can be used to solve pure and mixed IPs.
If a piecewise linear function f(x) involved in a formulation has the property that the slope of f(x) becomes less favorable to the decision maker as x increases, then the tedious IP formulation is unnecessary. LINDO can be used to solve pure and mixed IPs. In addition to the optimal solution, the LINDO output also includes shadow prices and reduced costs. LINGO and the Excel Solver can also be used to solve IPs. 361

362 9.3 The Branch-and-Bound Method for Solving Pure Integer Programming Problems
In practice, most IPs are solved by some version of the branch-and-bound procedure. Branch-and-bound methods implicitly enumerate all possible solutions to an IP. By solving a single subproblem, many possible solutions may be eliminated from consideration. Subproblems are generated by branching on an appropriately chosen fractional-valued variable. 362

363 Key aspects of the branch-and-bound method for solving pure IPs
Suppose that in a given subproblem (call it the old subproblem), a variable xj assumes a fractional value between the integers i and i+1. Then the two newly generated subproblems are
New Subproblem 1: Old subproblem + constraint xj ≤ i
New Subproblem 2: Old subproblem + constraint xj ≥ i + 1
A key term in the branch-and-bound method for solving pure IPs: if it is unnecessary to branch on a subproblem, we say that it is fathomed. 363

364 A subproblem may be eliminated from consideration in these situations
These three situations (for a max problem) result in a subproblem being fathomed:
The subproblem is infeasible, so it cannot yield the optimal solution to the IP.
The subproblem yields an optimal solution in which all variables have integer values. If this optimal solution has a better z-value than any previously obtained solution that is feasible in the IP, then it becomes the candidate solution, and its z-value becomes the current lower bound (LB) on the optimal z-value for the IP.
The optimal z-value for the subproblem does not exceed (in a max problem) the current LB, so it may be eliminated from consideration.
A subproblem may be eliminated from consideration in these situations:
The subproblem is infeasible.
The LB is at least as large as the z-value for the subproblem. 364

365 Two general approaches are used to determine which subproblem should be solved next.
The most widely used approach is LIFO. LIFO leads us down one side of the branch-and-bound tree to quickly find a candidate solution; we then backtrack our way up toward the top of the other side. The LIFO approach is often called backtracking. The second commonly used approach is jumptracking. When branching on a node, the jumptracking method solves all the subproblems created by branching. 365

366 The setting is found under the Options.
When solving IP problems using Solver you can adjust a Solver tolerance setting. The setting is found under the Options. For example a tolerance value of .20 causes the Solver to stop when a feasible solution is found that has an objective function value within 20% of the optimal solution. 366

367 9.4 The Branch-and-Bound Method for Solving Mixed Integer Programming Problems
In a mixed IP, some variables are required to be integers and others are allowed to be either integer or noninteger. To solve a mixed IP by the branch-and-bound method, modify the method by branching only on variables that are required to be integers. For a solution to a subproblem to be a candidate solution, it need only assign integer values to those variables that are required to be integers. 367

368 9.5 Solving Knapsack Problems by the Branch-and-Bound Method
A knapsack problem is an IP with a single constraint. A knapsack problem in which each variable must equal 0 or 1 may be written as
max z = c1x1 + c2x2 + ∙∙∙ + cnxn
s.t. a1x1 + a2x2 + ∙∙∙ + anxn ≤ b
xi = 0 or 1 (i = 1, 2, …, n)
When knapsack problems are solved by the branch-and-bound method, two aspects of the method greatly simplify. Because each variable equals 0 or 1, branching on xi yields an xi = 0 branch and an xi = 1 branch. Also, the LP relaxation may be solved by inspection. 368
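"Solving the LP relaxation by inspection" means ordering items by benefit per unit of the single resource and filling greedily, taking a fraction of the first item that does not fit. A minimal sketch with hypothetical data:

```python
# Knapsack LP relaxation by inspection: sort by c_i / a_i, fill greedily,
# take a fractional amount of the first item that no longer fits entirely.
def knapsack_relaxation(c, a, b):
    """Return the optimal value of the 0-1 knapsack LP relaxation."""
    items = sorted(range(len(c)), key=lambda i: c[i] / a[i], reverse=True)
    z, remaining = 0.0, b
    for i in items:
        take = min(1.0, remaining / a[i])   # fractional amount of item i
        z += take * c[i]
        remaining -= take * a[i]
        if remaining <= 0:
            break
    return z

# hypothetical instance: max 40x1 + 80x2 + 10x3 s.t. 5x1 + 4x2 + 3x3 <= 6
print(knapsack_relaxation([40, 80, 10], [5, 4, 3], 6))   # x2 = 1, x1 = 0.4 -> 96
```

This relaxation value is exactly the upper bound used to fathom nodes in the knapsack branch-and-bound tree.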

369 Examples of combinatorial optimization problems
9.6 Solving Combinatorial Optimization Problems by the Branch-and-Bound Method A combinatorial optimization problem is any optimization problem that has a finite number of feasible solutions. A branch-and-bound approach is often the most efficient way to solve them. Examples of combinatorial optimization problems Ten jobs must be processed on a single machine. It is known how long it takes to complete each job and the time at which each job must be completed. What ordering of the jobs minimizes the total delay of the 10 jobs? 369

370 In each of these problems, many possible solutions must be considered.
A salesperson must visit each of 10 cities before returning home. What ordering of the cities minimizes the total distance the salesperson must travel before returning home? This problem is called the traveling salesperson problem (TSP). In each of these problems, many possible solutions must be considered. 370

371 Example 11: Traveling Salesperson Problem
Joe State lives in Gary, Indiana and owns insurance agencies in Gary, Fort Wayne, Evansville, Terre Haute and South Bend. Each December he visits each of his insurance agencies. The distance between each agency is known. What order of visiting his agencies will minimize the total distance traveled? 371

372 Example 11: Solution To begin, define xij = 1 if Joe goes from city i to city j, and xij = 0 otherwise.
Several branch-and-bound approaches have been developed for solving TSPs. The approach we will use is the one in which the subproblems reduce to assignment problems. 372

373 Ex. 11 – Solution continued
First solve the assignment problem in subproblem 1. This solution contains two subtours, so it cannot be the optimal solution. Now branch on subproblem 1 in a way that will prevent one of subproblem 1’s subtours from recurring in solutions to subsequent subproblems. Now arbitrarily choose subproblem 2 to solve, applying the Hungarian method to the cost matrix shown. This solution cannot be the optimal solution either. 373

374 Ex. 11 – Solution continued
Now branch on subproblem 2 in an effort to exclude its subtour. Thus we add two additional subproblems. Following the LIFO approach, next solve subproblem 4 or subproblem 5. By using the Hungarian method on subproblem 4, the optimal solution is z = 668, x15 = x24 = x31 = x43 = x52 = 1. This is a candidate solution, and any node that cannot yield a z-value < 668 may be eliminated from consideration. Following the LIFO rule, next solve subproblem 5. Its z = 704, so this subproblem is eliminated. 374

375 Ex. 11 – Solution continued
Only subproblem 3 remains. Its optimal solution is z = 652, so it is possible for this subproblem to yield a solution with no subtours that beats z = 668. Next branch on subproblem 3, creating subproblems 6 and 7. Both of these subproblems have a z-value larger than 668. Subproblem 4 thus yields the optimal solution. 375

376 When branch-and-bound methods are used to solve TSPs with many cities, large amounts of computer time are needed. Heuristic methods, or heuristics, can be used to find a good solution quickly. A heuristic is a method used to solve a problem by trial and error when an algorithmic approach is impractical. Two heuristic methods for the TSP are the nearest-neighbor method and the cheapest-insertion method. 376

377 Nearest Neighbor Method
Begin at any city and then “visit” the nearest city. Then go to the unvisited city closest to the city we have most recently visited. Continue in this fashion until a tour is obtained. After applying this procedure beginning at each city, take the best tour found.
Cheapest-Insertion Method (CIM)
Begin at any city and find its closest neighbor. Then create a subtour joining those two cities. Next, replace an arc in the subtour (say, arc (i, j)) by the combination of two arcs, (i, k) and (k, j), where k is not in the current subtour, that will increase the length of the subtour by the smallest (or cheapest) amount. Continue with this procedure until a tour is obtained. After applying this procedure beginning with each city, we take the best tour found. 377
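The nearest-neighbor method above can be sketched in a few lines. The 4-city distance matrix is hypothetical, not Joe's Indiana data:

```python
# Sketch of the nearest-neighbor TSP heuristic: from a start city, always go
# to the nearest unvisited city, then return to the start.
def nearest_neighbor(dist, start=0):
    n = len(dist)
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: dist[last][j])
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)                  # return to the home city
    length = sum(dist[i][j] for i, j in zip(tour, tour[1:]))
    return tour, length

# hypothetical symmetric 4-city instance
D = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 8],
     [10, 4, 8, 0]]
tour, length = nearest_neighbor(D)
print(tour, length)
```

As the slide notes, one would run this from every start city and keep the best of the resulting tours.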

378 Three methods to evaluate heuristics
Performance guarantees
Gives a worst-case bound on how far from optimality a tour constructed by the heuristic can be
Probabilistic analysis
A heuristic is evaluated by assuming that the location of cities follows some known probability distribution
Empirical analysis
Heuristics are compared to the optimal solution for a number of problems for which the optimal tour is known 378

379 LINGO can be used to solve the IP of a TSP. minimize: s.t.
An IP formulation can be used to solve a TSP but can become unwieldy and inefficient for large TSPs. LINGO can be used to solve the IP formulation of a TSP:
minimize the sum over all i, j of cij xij
s.t. each city is entered exactly once and left exactly once, and
ti - tj + N xij ≤ N - 1 for i, j = 2, 3, …, N with i ≠ j (subtour elimination)
where
cij = distance from city i to city j
xij = 1 if the tour visits city j immediately after city i, and 0 otherwise (binary)
ti = arbitrary real numbers we need to solve for
N = number of cities 379

380 9.7 Implicit Enumeration The method of implicit enumeration is often used to solve 0-1 IPs. Implicit enumeration uses the fact that each variable must equal 0 or 1 to simplify both the branching and bounding components of the branch-and-bound process and to determine efficiently when a node is infeasible. The tree used in the implicit enumeration method is similar to those used to solve knapsack problems. At some nodes, the values of certain variables are specified; these are called fixed variables. 380

381 Three main ideas used in implicit enumeration
All variables whose values are unspecified at a node are called free variables. For any node, a specification of the values of all the free variables is called a completion of the node. Three main ideas are used in implicit enumeration:
Suppose we are at any node with fixed variables. Is there an easy way to find a good completion of the node that is feasible in the original 0-1 IP?
Even if the best completion of a node is not feasible, the best completion gives us a bound on the best objective function value that can be obtained via a feasible completion of the node. This bound can be used to eliminate a node from consideration. 381

382 At any node, is there an easy way to determine if all completions of the node are infeasible?
In general, check whether a node has a feasible completion by looking at each constraint and assigning each free variable the best value for satisfying the constraint. 382
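The feasibility check described above can be sketched for a single ≤ constraint: give each free variable its most favorable value (0 when its coefficient is positive, 1 when negative) and see whether the constraint can still be satisfied. The constraint data below are hypothetical:

```python
# Sketch of the implicit-enumeration infeasibility test for one constraint
# sum(a_i x_i) <= rhs: assign each free variable its best value for the
# constraint (x_i = 0 if a_i > 0, x_i = 1 if a_i < 0).
def has_feasible_completion(a, rhs, fixed):
    """fixed: dict index -> 0/1 for the fixed variables; the rest are free."""
    total = sum(a[i] * v for i, v in fixed.items())
    total += sum(min(0, a[i]) for i in range(len(a)) if i not in fixed)
    return total <= rhs

# hypothetical constraint 3x1 + 4x2 - 2x3 <= 2, with x1 fixed to 1
print(has_feasible_completion([3, 4, -2], 2, {0: 1}))        # x2=0, x3=1 works
print(has_feasible_completion([3, 4, -2], 2, {0: 1, 1: 1}))  # no completion works
```

If the check fails for any constraint, every completion of the node is infeasible and the node can be fathomed.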

383 9.8 Cutting Plane Algorithm
An alternative to the branch-and-bound method is the cutting plane algorithm. Summary of the cutting plane algorithm:
Step 1 Find the optimal tableau for the IP’s linear programming relaxation. If all variables in the optimal solution assume integer values, we have found an optimal solution to the IP; otherwise, proceed to step 2.
Step 2 Pick a constraint in the LP relaxation optimal tableau whose right-hand side has the fractional part closest to 1/2. This constraint will be used to generate a cut.
Step 2a For the constraint identified in step 2, write its right-hand side and each variable’s coefficient in the form [x] + f, where 0 <= f < 1. 383

384 Step 2b Rewrite the constraint used to generate the cut as
All terms with integer coefficients = all terms with fractional coefficients
Then the cut is
All terms with fractional coefficients <= 0
Step 3 Use the simplex algorithm to find the optimal solution to the LP relaxation, with the cut as an additional constraint. If all variables assume integer values in the optimal solution, we have found an optimal solution to the IP. Otherwise, pick the constraint with the most fractional right-hand side and use it to generate another cut, which is added to the tableau. We continue this process until we obtain a solution in which all variables are integers. This will be an optimal solution to the IP. 384
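Step 2a's integer/fractional split can be sketched directly: each coefficient c is written as floor(c) + f with 0 ≤ f < 1, and the fractional parts define the cut. The tableau row below is hypothetical:

```python
# Sketch of step 2a: split each coefficient of the chosen source row into
# integer part [x] and fractional part f with 0 <= f < 1. The fractional
# parts then define the cut of step 2b.
import math

def fractional_parts(coeffs, rhs):
    """Return (fractional parts of coeffs, fractional part of rhs)."""
    f = [c - math.floor(c) for c in coeffs]     # 0 <= f < 1 by construction
    f0 = rhs - math.floor(rhs)
    return f, f0

# hypothetical optimal-tableau row: x2 + 1.25 s1 - 0.75 s2 = 2.25
f, f0 = fractional_parts([1.0, 1.25, -0.75], 2.25)
print(f, f0)   # [0.0, 0.25, 0.25] and 0.25
```

Note that the fractional part of a negative coefficient like -0.75 is 0.25 (since -0.75 = -1 + 0.25), which is what makes every f land in [0, 1).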

385 Operations Research: Applications and Algorithms
Chapter 10 Advanced Topics in Linear Programming to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 385

386 10.1 The Revised Simplex Algorithm
Section 6.2 demonstrated how to create an optimal tableau from an initial tableau, given an optimal set of the basic variables. The same results can be used to create a tableau corresponding to any set of basic variables. To create a tableau for any set of basic variables BV, view the notation shown on the following slide. Assume the LP has m constraints. 386

387 b = right-hand side vector of the original tableau’s constraints
BV = any set of basic variables (the first element of BV is the basic variable in the first constraint, the second element is the basic variable in the second constraint, and so on); BVj is the basic variable for constraint j in the desired tableau
b = right-hand side vector of the original tableau’s constraints
aj = column for xj in the constraints of the original problem
B = m x m matrix whose jth column is the column for BVj in the original constraints
cj = coefficient of xj in the objective function
cBV = 1 x m row vector whose jth element is the objective function coefficient for BVj
ui = m x 1 column vector with ith element 1 and all other elements equal to zero 387

388 Summarizing the formulas of Section 6.2:
B-1aj = column for xj in the BV tableau cBVB-1aj – cj = coefficient of xj in row 0 B-1b = right-hand side of constraints in BV tableau cBVB-1ui = coefficient of slack variable si, in BV row 0 cBVB-1(-ui) = coefficient of excess variable ei, in BV row 0 M + cBVB-1ui = coefficient of artificial variable ai in BV row 0 cBVB-1b = right-hand side of BV row 0 388

389 If we know BV, B-1, and the original tableau, the formulas on the previous slide enable us to compute any part of the simplex tableau for any set of basic variables BV. This means that if a computer is programmed to perform the simplex algorithm, all the computer needs to store on any pivot is the correct set of basic variables, B-1, and the initial tableau. This idea is the basis for the revised simplex algorithm. 389

390 A summary of the revised simplex method (max problem) is:
Step 0 Note the columns from which the current B-1 will be read. Initially, B-1 = I.
Step 1 For the current tableau, compute cBVB-1.
Step 2 Price out all nonbasic variables in the current tableau. If each nonbasic variable prices out nonnegative, the current basis is optimal. If the current basis is not optimal, enter into the basis the nonbasic variable with the most negative coefficient in row 0. Call this variable xk. 390

391 Step 3 To determine which row in which xk enters the basis, compute xk’s column in the current tableau (B-1ak) and compute the rhs of the current tableau (B-1b). Then use the ratio test to determine the row in which xk should enter the basis. We now know the set of basic variables (BV) for the new tableau. Step 4 Use the column for xk in the current tableau to determine the ERO’s needed to enter xk into the basis. Perform these ERO’s on the current B-1. This will yield the new B-1. Return to Step 1. 391

392 Most linear programming codes use some version of the revised simplex method to solve LPs.
Since knowing the current tableau’s B-1 and the initial tableau is all that is needed to obtain the next tableau, the computational effort required to solve an LP depends primarily upon the size of B-1. In some cases involving models with a large number of constraints (equations), solving the dual in lieu of the primal will be much easier computationally. 392

393 10.2 The Product Form of the Inverse
Much of the computation in the revised simplex algorithm is concerned with updating B-1 from one tableau to the next. A matrix that differs from the identity matrix in only one column (call it E) is called an elementary matrix. It can be shown that: B-1 for the new tableau = E(B-1 for the current tableau) 393

394 This equation is called the product form of the inverse
This equation is called the product form of the inverse. Most linear programming computer codes utilize the revised simplex method and compute successive B-1’s by using the product form of the inverse. 394
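The elementary matrix E can be built directly from the entering variable's column in the current tableau. A small sketch with made-up numbers (the pivot column and row are illustrative assumptions, not from the text):

```python
from fractions import Fraction as F

# Suppose x_k enters the basis in row 1 (0-indexed) and its column in the
# current tableau is B^-1 a_k = [2, 4, 2]^T (made-up numbers).
col, r, m = [F(2), F(4), F(2)], 1, 3

# E differs from the identity only in column r, which holds the ERO multipliers.
E = [[F(int(i == j)) for j in range(m)] for i in range(m)]
for i in range(m):
    E[i][r] = F(1) / col[r] if i == r else -col[i] / col[r]

# Left-multiplying by E applies the pivot EROs; in particular the new
# B^-1 = E * (current B^-1). Sanity check: E turns the entering column
# into the unit vector e_r.
new_col = [sum(E[i][k] * col[k] for k in range(m)) for i in range(m)]
```

Storing the LP's history as a product of such sparse E matrices (rather than a dense B-1) is exactly what the product form of the inverse refers to.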

395 10.3 Using Column Generation to Solve Large-Scale LPs
For LPs that have many variables, column generation can be used to increase the efficiency of the revised simplex algorithm. Column generation is also a very important component of the Dantzig-Wolfe decomposition algorithm. 395

396 Example 2: Odds and Evens
Woodco sells 3-ft, 5-ft, and 9-ft pieces of lumber. Woodco’s customers demand 25 3-ft boards, 20 5-ft boards, and 15 9-ft boards. Woodco, who must meet its demands by cutting up 17-ft boards, wants to minimize the waste incurred. Formulate an LP to help Woodco accomplish its goal, and solve the LP by column generation. 396

397 Example 2: Solution Each decision variable corresponds to a way in which a 17-ft board can be cut. Shown are the sensible ways to cut a 17-ft board:
Combination 1: 5 3-ft boards, 0 5-ft boards, 0 9-ft boards, waste 2 ft
Combination 2: 4 3-ft boards, 1 5-ft board, 0 9-ft boards, waste 0 ft
Combination 3: 2 3-ft boards, 2 5-ft boards, 0 9-ft boards, waste 1 ft
Combination 4: 2 3-ft boards, 0 5-ft boards, 1 9-ft board, waste 2 ft
Combination 5: 1 3-ft board, 1 5-ft board, 1 9-ft board, waste 0 ft
Combination 6: 0 3-ft boards, 3 5-ft boards, 0 9-ft boards, waste 2 ft
397

398 Ex. 2 – Solution continued
Woodco’s objective function is min z = x1 + x2 + x3 + x4 + x5 + x6 Constraints At least 25 3-ft boards must be cut (5x1 + 4x2 + 2x3 + 2x4 + x5 ≥ 25). At least 20 5-ft boards must be cut (x2 + 2x3 + x5 + 3x6 ≥ 20). At least 15 9-ft boards must be cut (x4 + x5 ≥ 15). Each xi should be required to assume integer values. After creating the LP, each nonbasic variable should be priced out to determine which variable should enter the basis. 398

399 Ex. 2 – Solution continued
However, in a large-scale cutting stock problem there may be thousands of variables, and this is where column generation comes into play. The idea of column generation is to search efficiently for a column that will price out favorably. Find the combination that prices out most favorably by solving the following knapsack problem: max z = (1/5)a3 + (1/3)a5 + a9 − 1 s.t. 3a3 + 5a5 + 9a9 ≤ 17 a3, a5, a9 ≥ 0; a3, a5, a9 integer 399
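This particular pricing problem is small enough to solve by plain enumeration instead of branch-and-bound. A sketch (loop bounds follow from the 17-ft knapsack constraint; since the a3 coefficient is positive, packing as many 3-ft boards as fit is always best once a5 and a9 are fixed):

```python
from fractions import Fraction as F

# Enumerate all feasible integer cutting patterns for a 17-ft board and
# keep the one with the largest pricing value (1/5)a3 + (1/3)a5 + a9 - 1.
best_val, best = None, None
for a9 in range(17 // 9 + 1):
    for a5 in range((17 - 9 * a9) // 5 + 1):
        a3 = (17 - 9 * a9 - 5 * a5) // 3      # fill the rest with 3-ft boards
        val = F(1, 5) * a3 + F(1, 3) * a5 + a9 - 1
        if best_val is None or val > best_val:
            best_val, best = val, (a3, a5, a9)
```

The winner is the pattern (a3, a5, a9) = (1, 1, 1) with value 8/15; since this is positive, that column prices out favorably and would be generated and entered into the basis.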

400 Ex. 2 – Solution continued
Because it is a knapsack problem (its single constraint has a knapsack structure), it can easily be solved by using the branch-and-bound procedure; the resulting branch-and-bound trees identify the entering column. 400

401 10.4 – The Dantzig-Wolfe Decomposition Algorithm
In many LPs, the constraints and variables may be decomposed in the following manner: Constraints in Set 1 only involve variables in Variable Set 1. Constraints in Set 2 only involve variables in Variable Set 2. … Constraints in Set k only involve variables in Variable Set k. Constraints in Set k+1 may involve any variable; the constraints in Set k+1 are referred to as the central constraints. LPs that can be decomposed in this fashion can be solved efficiently by the Dantzig-Wolfe decomposition algorithm. 401

402 Example 3: Decomposition SteelCo manufactures two types of steel (steel 1 and steel 2) at two locations (plant 1 and plant 2), requiring the resources shown (per ton of product: tons of iron, tons of coal, and hours of blast furnace time):
Steel 1 at Plant 1: 8 iron, 3 coal, 2 hours
Steel 2 at Plant 1: 6 iron, 1 coal, 1 hour
Steel 1 at Plant 2: 7 iron, 3 coal, 1 hour
Steel 2 at Plant 2: 5 iron, 2 coal, 1 hour
402

403 Ex. 3 - continued Each day the following resources are available:
Steel selling prices: $170 per ton of steel 1 and $160 per ton of steel 2. Steel shipping costs: $80 per ton shipped from plant 1 and $100 per ton shipped from plant 2. 403

404 Ex. 3 - continued Each plant has its own coal mine and coal cannot be shipped between plants. Each plant has a different type of furnace requiring different resources. Iron ore is mined from a single mine located midway between the plants. Formulate and solve an LP to maximize SteelCo’s Revenue. 404

405 Example 3: Solution Define:
x1 = tons of steel 1 produced daily at plant 1 x2 = tons of steel 2 produced daily at plant 1 x3 = tons of steel 1 produced daily at plant 2 x4 = tons of steel 2 produced daily at plant 2 SteelCo’s revenue is given by: 170(x1 + x3) + 160(x2 + x4) SteelCo’s shipping costs are given by: 80(x1 + x2) + 100(x3 + x4) 405

406 Ex. 3 – Solution continued
Therefore, SteelCo wants to maximize: z = (170 − 80)x1 + (160 − 80)x2 + (170 − 100)x3 + (160 − 100)x4 or: z = 90x1 + 80x2 + 70x3 + 60x4 SteelCo’s LP is max z = 90x1 + 80x2 + 70x3 + 60x4 s.t. 3x1 + x2 ≤ (plant 1 coal constraint) 2x1 + x2 ≤ 10 (plant 1 furnace constraint) 3x3 + 2x4 ≤ (plant 2 coal constraint) x3 + x4 ≤ (plant 2 furnace constraint) 8x1 + 6x2 + 7x3 + 5x4 ≤ (iron ore constraint) x1, x2, x3, x4 ≥ 0 406

407 Ex. 3 – Solution continued
Using the decomposition just described, the SteelCo LP may be decomposed into Variable Set 1: x1 and x2 (plant 1 variables) Variable Set 2: x3 and x4 (plant 2 variables) Constraint Set 1: plant 1 coal and furnace constraints Constraint Set 2: plant 2 coal and furnace constraints Constraint Set 3 (central): iron ore constraint 407

408 Ex. 3 – Solution continued
Constraint Set 1 and Variable Set 1 involve activities at Plant 1 and do not involve x3 and x4 (which represent plant 2 activities). Constraint Set 2 and Variable Set 2 involve activities at Plant 2 and do not involve x1 and x2 (which represent plant 1 activities). Constraint 3 may be thought of as a centralized constraint that interrelates the two sets of variables 408

409 Assuming each subproblem in the SteelCo LP has a bounded feasible region, the results depend upon the following theorem. Suppose the feasible region for an LP is bounded and the extreme points (or basic feasible solutions) of the LP’s feasible region are P1, P2, …, Pk. Then any point x in the LP’s feasible region may be written as a linear combination of P1, P2, …, Pk: x = μ1P1 + μ2P2 + … + μkPk Moreover, the weights μ1, μ2, …, μk may be chosen such that: μ1 + μ2 + … + μk = 1 and μi ≥ 0 for i = 1, 2, …, k 409
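The theorem is easy to illustrate numerically. Below is a toy sketch (the triangular feasible region and the chosen point are made up for illustration): a point inside a triangle is written as a convex combination of the three extreme points, and the combination is verified.

```python
# Made-up bounded feasible region: triangle with extreme points
# P1 = (0, 0), P2 = (4, 0), P3 = (0, 4).
P = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
x = (1.0, 2.0)                      # a point inside the triangle

# For this triangle the weights (barycentric coordinates) are simple:
mu2, mu3 = x[0] / 4.0, x[1] / 4.0
mu1 = 1.0 - mu2 - mu3
mu = [mu1, mu2, mu3]

# Reconstruct x = mu1*P1 + mu2*P2 + mu3*P3 to confirm the representation.
recon = (sum(m * p[0] for m, p in zip(mu, P)),
         sum(m * p[1] for m, p in zip(mu, P)))
```

The weights are nonnegative and sum to 1, so x is indeed a convex combination of the extreme points; Dantzig-Wolfe exploits exactly this representation for each subproblem's feasible region.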

410 Any linear combination of vectors whose weights satisfy μ1 + μ2 + … + μk = 1 and μi ≥ 0 for i = 1, 2, …, k is called a convex combination. To explain the basic ideas of the Dantzig-Wolfe decomposition algorithm, assume the set of variables has been decomposed into Set 1 and Set 2 as in the SteelCo LP. The steps shown can be generalized to cases where decomposition results in more than two sets of variables. 410

411 Step 1 Let the variables in Variable Set 1 be x1, x2, …, xn1. Express these variables as a convex combination of the extreme points of the feasible region for Constraint Set 1. If we let P1, P2, …, Pk be the extreme points of this feasible region, then any point (x1, x2, …, xn1) in it can be written in the form: (x1, x2, …, xn1) = μ1P1 + μ2P2 + … + μkPk where: μ1 + μ2 + … + μk = 1 and μi ≥ 0 for i = 1, 2, …, k 411

412 Step 2 Let the variables in Variable Set 2 be xn1+1, xn1+2, …, xn. Express these variables as a convex combination of the extreme points of the feasible region for Constraint Set 2. If we let Q1, Q2, …, Qm be the extreme points of this feasible region, then any point (xn1+1, …, xn) in Constraint Set 2’s feasible region can be written in the form: (xn1+1, …, xn) = λ1Q1 + λ2Q2 + … + λmQm where: λ1 + λ2 + … + λm = 1 and λi ≥ 0 for i = 1, 2, …, m 412

413 Step 3 Using the equations of Steps 1 and 2, express the LP’s objective function and central constraints in terms of the μi‘s and λi‘s. After adding the constraints (called convexity constraints) μ1 + μ2 + … + μk = 1 and λ1 + λ2 + … + λm = 1 and the sign restrictions, we obtain the LP referred to as the restricted master, shown on the next slide. 413

414 Max (or min) [objective function in terms of μi‘s and λi‘s] s.t. [central constraints in terms of μi‘s and λi‘s] μ1+ μ2 + … + μk = 1 λ1+ λ2 + … + λm = 1 μi ≥ 0 for i = 1, 2, …, k λi ≥ 0 for i = 1, 2, …, m Step 4 Assume that a basic feasible solution for the restricted master is readily available. Then use the column generation method to solve the restricted master. 414

415 Step 5 Substitute the optimal values of the μi‘s and λi‘s found in Step 4 into the equations (x1, …, xn1) = μ1P1 + μ2P2 + … + μkPk and (xn1+1, …, xn) = λ1Q1 + λ2Q2 + … + λmQm. This will yield the optimal values of x1, x2, …, xn. 415

416 10.5 The Simplex Method for Upper-Bounded Variables
Often LPs contain constraints of the form xi ≤ ui, which provide an upper bound on xi; such a constraint is called an upper-bound constraint. To efficiently solve an LP with upper-bound constraints, allow the variable xi to be nonbasic if xi = 0 or if xi = ui. Whenever xi needs to equal its upper bound ui, simply replace xi by ui − xi’. This is called an upper-bound substitution. 416

417 Three possible occurrences, or bottlenecks, can restrict the amount by which xi can be increased.
xi cannot exceed its upper bound of ui. xi increases to a point where it causes one of the current basic variables to become negative. The smallest value of xi that will cause one of the current basic variables to become negative may be found by expressing each basic variable in terms of xi. xi increases to a point where it causes one of the current basic variables to exceed its upper bound. As in bottleneck 2, the smallest value of xi for which this occurs can be found by expressing each basic variable in terms of xi. 417

418 10.6 Karmarkar’s Method for Solving LPs
Karmarkar’s method for solving LPs is a polynomial-time algorithm. Karmarkar’s method is applied to an LP in the following form: min z = cx s.t. Ax = 0 x1 + x2 + ∙∙∙ + xn = 1 x ≥ 0 The following three concepts play a key role in Karmarkar’s method: Projection of a vector onto the set of x satisfying Ax = 0 Karmarkar’s centering transformation Karmarkar’s potential function 418

419 Karmarkar’s centering transformation has the following properties.
The n-dimensional unit simplex S is the set of points [x1, x2, ∙∙∙, xn]T satisfying x1 + x2 + ∙∙∙ + xn = 1 and xj ≥ 0, j = 1, 2, ∙∙∙, n. Karmarkar’s centering transformation has the following properties. 419


421 Operations Research: Applications and Algorithms
Chapter 11 Nonlinear Programming to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 421

422 11.1 Review of Differential Calculus
The equation lim x→a f(x) = c means that as x gets closer to a (but not equal to a), the value of f(x) gets arbitrarily close to c. A function f(x) is continuous at a point a if lim x→a f(x) = f(a). If f(x) is not continuous at x = a, we say that f(x) is discontinuous (or has a discontinuity) at a. 422

423 N-th order Taylor series expansion
The derivative of a function f(x) at x = a (written f’(a)) is defined to be f’(a) = lim Δx→0 [f(a + Δx) − f(a)]/Δx. N-th order Taylor series expansion: f(a + h) = f(a) + f’(a)h + … + f(n)(a)hn/n! plus a remainder term (reviewed in Section 12.1). The partial derivative of f(x1, x2, …, xn) with respect to the variable xi is written ∂f/∂xi. 423

424 11.2 Introductory Concepts
A general nonlinear programming problem (NLP) can be expressed as follows: Find the values of decision variables x1, x2, …, xn that max (or min) z = f(x1, x2, …, xn) s.t. g1(x1, x2, …, xn) (≤, =, or ≥) b1 g2(x1, x2, …, xn) (≤, =, or ≥) b2 . . . gm(x1, x2, …, xn) (≤, =, or ≥) bm 424

425 An NLP with no constraints is an unconstrained NLP.
As in linear programming f(x1, x2,…,xn) is the NLP’s objective function, and g1(x1, x2,…,xn) (≤, =, or ≥)b1,…gm(x1, x2,…,xn) (≤, =, or ≥)bm are the NLP’s constraints. An NLP with no constraints is an unconstrained NLP. The feasible region for NLP above is the set of points (x1, x2,…,xn) that satisfy the m constraints in the NLP. A point in the feasible region is a feasible point, and a point that is not in the feasible region is an infeasible point. 425

426 NLPs can be solved with LINGO.
If the NLP is a maximization problem, then any point x̄ in the feasible region for which f(x̄) ≥ f(x) holds for all points x in the feasible region is an optimal solution to the NLP. NLPs can be solved with LINGO. Even if the feasible region for an NLP is a convex set, the optimal solution need not be an extreme point of the NLP’s feasible region. For any NLP (maximization), a feasible point x = (x1, x2, …, xn) is a local maximum if, for sufficiently small ε, any feasible point x’ = (x’1, x’2, …, x’n) having |xi − x’i| < ε (i = 1, 2, …, n) satisfies f(x) ≥ f(x’). 426

427 Example 11: Tire Production
Firerock produces rubber used for tires by combining three ingredients: rubber, oil, and carbon black. The costs for each are given. The rubber used in automobile tires must have a hardness of between 25 and 35, an elasticity of at least 16, and a tensile strength of at least 12. To manufacture a set of four automobile tires, pounds of product are needed. The rubber to make a set of tires must contain between 25 and 60 pounds of rubber and at least 50 pounds of carbon black. 427

428 Ex continued Define: R = pounds of rubber in mixture used to produce four tires O = pounds of oil in mixture used to produce four tires C = pounds of carbon black used to produce four tires Statistical analysis has shown that the hardness, elasticity, and tensile strength of a pound mixture of rubber, oil, and carbon black are Tensile strength = (O) (O)2 Elasticity = R .04(O) (O)2 Hardness = R + .06(O) − .3(C) (R)(O) + .005(O)2 + .001C2 Formulate the NLP whose solution will tell Firerock how to minimize the cost of producing the rubber product needed to manufacture a set of automobile tires. 428

429 Example 11: Solution After defining TS = Tensile Strength E = Elasticity H = Hardness of mixture the LINGO program gives the correct formulation. 429

430 It is easy to use the Excel Solver to solve NLPs.
The process is similar to a linear model. For NLPs having multiple local optimal solutions, the Solver may fail to find the optimal solution because it may pick a local extremum that is not a global extremum. 430

431 11.3 Convex and Concave Functions
A function f(x1, x2, …, xn) is a convex function on a convex set S if for any x’ ∈ S and x’’ ∈ S, f[cx’ + (1 − c)x’’] ≤ cf(x’) + (1 − c)f(x’’) holds for 0 ≤ c ≤ 1. A function f(x1, x2, …, xn) is a concave function on a convex set S if for any x’ ∈ S and x’’ ∈ S, f[cx’ + (1 − c)x’’] ≥ cf(x’) + (1 − c)f(x’’) holds for 0 ≤ c ≤ 1. 431

432 Theorems - Consider a general NLP.
Suppose the feasible region S for the NLP is a convex set. If f(x) is concave on S, then any local maximum for a maximization NLP is an optimal solution (similarly, if f(x) is convex on S, any local minimum solves a minimization NLP). Suppose f’’(x) exists for all x in a convex set S. Then f(x) is a convex (concave) function on S if and only if f’’(x) ≥ 0 [f’’(x) ≤ 0] for all x in S. Suppose f(x1, x2, …, xn) has continuous second-order partial derivatives for each point x = (x1, x2, …, xn) ∈ S. Then f(x1, x2, …, xn) is a convex function on S if and only if for each x ∈ S, all principal minors of its Hessian H are nonnegative. 432

433 The Hessian of f(x1, x2, …, xn) is the n x n matrix whose ijth entry is ∂2f/∂xi∂xj.
Suppose f(x1, x2, …, xn) has continuous second-order partial derivatives for each point x = (x1, x2, …, xn) ∈ S. Then f(x1, x2, …, xn) is a concave function on S if and only if for each x ∈ S and k = 1, 2, …, n, all nonzero principal minors of order k have the same sign as (−1)k. The Hessian of f(x1, x2, …, xn) is the n x n matrix whose ijth entry is ∂2f/∂xi∂xj. An ith principal minor of an n x n matrix is the determinant of any i x i matrix obtained by deleting n − i rows and the corresponding n − i columns of the matrix. 433

434 The kth leading principal minor of an n x n matrix is the determinant of the k x k matrix obtained by deleting the last n − k rows and columns of the matrix. 434

435 11.4 Solving NLPs with One Variable
Consider solving the NLP: max (or min) f(x) s.t. x ∈ [a, b]. To find the optimal solution for the NLP, find all the local maxima (or minima). A point that is a local maximum or a local minimum for the NLP is called a local extremum. The optimal solution is the local maximum (or minimum) having the largest (or smallest) value of f(x). 435

436 There are three types of points for which the NLP can have a local maximum or minimum (these points are often called extremum candidates). Points where a < x < b and f’(x) = 0 [called a stationary point of f(x)]. Points where f’(x) does not exist. Endpoints a and b of the interval [a, b]. 436

437 Example 21: Profit Maximization by Monopolist
It costs a monopolist $5/unit to produce a product. If he produces x units of the product, then each can be sold for 10 − x dollars. To maximize profit, how much should the monopolist produce? 437

438 Example 21: Solution The monopolist wants to solve the NLP
The monopolist wants to solve the NLP: max P(x) = (10 − x)x − 5x = 5x − x2 s.t. 0 ≤ x ≤ 10. The extremum candidates can be classified as follows: the Case 1 check (P’(x) = 5 − 2x = 0) tells us x = 2.5 is a local maximum yielding a profit P(2.5) = 6.25. P’(x) exists for all points in [0, 10], so there are no Case 2 candidates. a = 0 has P’(0) = 5 > 0, so a = 0 is a local minimum; b = 10 has P’(10) = −15 < 0, so b = 10 is a local minimum. 438
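The candidate check is easy to replicate numerically. A small sketch (the profit function P(x) = (10 − x)x − 5x = 5x − x² follows from the $5 unit cost and the 10 − x price in the problem statement):

```python
# Profit P(x) = (10 - x)x - 5x = 5x - x^2 on [0, 10]; P'(x) = 5 - 2x.
def P(x):
    return 5 * x - x ** 2

# Case 1: stationary points, from solving P'(x) = 5 - 2x = 0.
stationary = [5 / 2]
# P'(x) exists everywhere, so there are no Case 2 candidates.
# Case 3: the endpoints of [0, 10].
candidates = stationary + [0, 10]

best = max(candidates, key=P)       # pick the candidate with the largest profit
```

Comparing P over the candidate list reproduces the slide's answer: produce 2.5 units for a profit of 6.25.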

439 Demand is often modeled as a linear function of price.
Profit is then a concave function of price, and Solver should find the profit-maximizing price. 439

440 Solving one variable NLPs with LINGO
If you are maximizing a concave objective function f(x), then you can be certain that LINGO will find the optimal solution to the NLP. If you are minimizing a convex objective function, then you know that LINGO will find the optimal solution to the NLP. 440

441 When trying to minimize a nonconvex function or maximize a nonconcave function in a one-variable NLP, LINGO may find a local extremum that does not solve the NLP. 441

442 11.5 Golden Section Search The Golden Section Method can be used if the function is a unimodal function. A function f(x) is unimodal on [a, b] if for some point x̄ on [a, b], f(x) is strictly increasing on [a, x̄] and strictly decreasing on [x̄, b]. The optimal solution x̄ of the NLP is some point on the interval [a, b]. By evaluating f(x) at two points x1 and x2 on [a, b], we may reduce the size of the interval in which the solution to the NLP must lie. 442

443 After evaluating f(x1) and f(x2), one of these cases must occur
After evaluating f(x1) and f(x2) (with x1 < x2), one of the following cases must occur, and in each case the optimal solution x̄ will lie in a subset of [a, b]. Case 1 - f(x1) < f(x2): then x̄ ∈ (x1, b]. Case 2 - f(x1) = f(x2): then x̄ ∈ (x1, x2). Case 3 - f(x1) > f(x2): then x̄ ∈ [a, x2). The interval in which x̄ must lie - either [a, x2) or (x1, b] - is called the interval of uncertainty. 443

444 Spreadsheets can be used to conduct golden section search.
Many search algorithms use these ideas to reduce the interval of uncertainty. Most of these algorithms proceed as follows: Begin with the region of uncertainty for x being [a,b]. Evaluate f(x) at two judiciously chosen points x1 and x2. Determine which of cases 1-3 holds, and find a reduced interval of uncertainty. Evaluate f(x) at two new points (the algorithm specifies how the two new points are chosen). Return to step 2 unless the length of the interval of uncertainty is sufficiently small. Spreadsheets can be used to conduct golden section search. 444
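The steps above can be sketched directly. The following is a minimal golden section search for a maximum (the test function and tolerance are illustrative assumptions); the golden ratio spacing lets one function evaluation be reused each time the interval of uncertainty shrinks.

```python
import math

def golden_section_max(f, a, b, tol=1e-6):
    """Shrink the interval of uncertainty for a unimodal f on [a, b]."""
    r = (math.sqrt(5) - 1) / 2                 # golden ratio, ~0.618
    x1, x2 = b - r * (b - a), a + r * (b - a)  # two judiciously chosen points
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 < f2:                 # Case 1: the maximum lies in (x1, b]
            a, x1, f1 = x1, x2, f2  # reuse the old x2 evaluation
            x2 = a + r * (b - a)
            f2 = f(x2)
        else:                       # Case 3 (ties handled the same way)
            b, x2, f2 = x2, x1, f1  # reuse the old x1 evaluation
            x1 = b - r * (b - a)
            f1 = f(x1)
    return (a + b) / 2

# Toy unimodal function with its maximum at x = 2:
xstar = golden_section_max(lambda x: -(x - 2) ** 2 + 5, 0.0, 4.0)
```

Each iteration multiplies the interval of uncertainty by about 0.618, so the number of function evaluations grows only logarithmically in the desired accuracy.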

445 11.6 Unconstrained Maximization and Minimization with Several Variables
Consider the unconstrained NLP: max (or min) f(x1, x2, …, xn) over all (x1, x2, …, xn) ∈ Rn. These theorems provide the basics for unconstrained NLPs that may have two or more decision variables. If x̄ is a local extremum for the NLP, then ∇f(x̄) = 0. A point x̄ having ∂f(x̄)/∂xi = 0 for i = 1, 2, …, n is called a stationary point of f. If Hk(x̄) > 0 for k = 1, 2, …, n (where Hk(x̄) is the kth leading principal minor of the Hessian at x̄), then a stationary point x̄ is a local minimum for the unconstrained NLP. 445

446 If, for k = 1, 2, …, n, Hk(x̄) is nonzero and has the same sign as (−1)k, then a stationary point x̄ is a local maximum for the unconstrained NLP. If Hn(x̄) ≠ 0 and the conditions of the previous two theorems do not hold, then a stationary point x̄ is not a local extremum. LINGO will find the optimal solution if used to maximize a concave function or minimize a convex function. 446

447 11.7: The Method of Steepest Ascent
The method of steepest ascent can be used to approximate a function’s stationary point. Given a vector x = (x1, x2, …, xn) ∈ Rn, the length of x (written ||x||) is ||x|| = (x12 + x22 + … + xn2)1/2. For any vector x, the unit vector x/||x|| is called the normalized version of x. Consider a function f(x1, x2, …, xn), all of whose partial derivatives exist at every point. The gradient vector for f(x1, x2, …, xn), written ∇f(x), is given by ∇f(x) = (∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn). 447

448 Suppose we are at a point v and we move from v a small distance δ in a direction d. Then for a given δ, the maximal increase in the value of f(x1, x2, …, xn) will occur if we choose d = ∇f(v)/||∇f(v)||. If we move a small distance away from v and we want f(x1, x2, …, xn) to increase as quickly as possible, then we should move in the direction of ∇f(v). 448
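A minimal gradient-ascent sketch on a toy concave function (all data are illustrative; a small fixed step is used in place of an exact one-dimensional line search):

```python
def grad_f(x, y):
    # Gradient of the toy concave function f(x, y) = -(x - 1)^2 - (y - 2)^2,
    # whose unique maximizer (and stationary point) is (1, 2).
    return (-2 * (x - 1), -2 * (y - 2))

x, y, t = 0.0, 0.0, 0.1        # starting point and fixed step size
for _ in range(200):
    gx, gy = grad_f(x, y)
    x, y = x + t * gx, y + t * gy   # move in the direction of the gradient
```

For this quadratic, each step shrinks the distance to the maximizer by a constant factor, so the iterates converge to the stationary point (1, 2).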

449 11.8 Lagrange Multipliers Lagrange multipliers can be used to solve NLPs in which all the constraints are equality constraints. Consider NLPs of the following type: max (or min) z = f(x1, x2, …, xn) s.t. g1(x1, x2, …, xn) = b1, …, gm(x1, x2, …, xn) = bm. To solve the NLP, associate a multiplier λi with the ith constraint in the NLP and form the Lagrangian L(x1, …, xn, λ1, …, λm) = f(x1, …, xn) + Σi λi[bi − gi(x1, …, xn)] 449

450 Then attempt to find the points (x1, …, xn, λ1, …, λm) for which all partial derivatives of the Lagrangian are zero: ∂L/∂xj = 0 for j = 1, …, n and ∂L/∂λi = 0 for i = 1, …, m. The Lagrange multipliers λi can be used in sensitivity analysis. 450
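A worked example (my own, not the book's): max f(x, y) = xy s.t. x + y = 10. The Lagrangian is L = xy + λ(10 − x − y), so stationarity gives y − λ = 0, x − λ = 0, and the constraint gives x + y = 10, hence x = y = λ = 5.

```python
def f(x, y):
    return x * y

# Solving dL/dx = y - lam = 0, dL/dy = x - lam = 0, x + y = 10:
lam = 10 / 2          # x = lam and y = lam, so 2*lam = 10
x, y = lam, lam

# Spot check: along the constraint (t, 10 - t), f peaks at t = 5.
vals = [f(t, 10 - t) for t in [4.0, 4.5, 5.0, 5.5, 6.0]]
```

Here λ = 5 also has the sensitivity interpretation mentioned above: increasing the right-hand side 10 by a small amount Δ raises the optimal value by about 5Δ.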

451 11.9 The Kuhn-Tucker Conditions
The Kuhn-Tucker conditions are used to solve NLPs of the following type: max (or min) z = f(x1, …, xn) s.t. g1(x1, …, xn) ≤ b1, …, gm(x1, …, xn) ≤ bm. The Kuhn-Tucker conditions are necessary for a point to solve the NLP. 451

452 Suppose the NLP is a maximization problem
Suppose the NLP is a maximization problem. If x̄ = (x̄1, …, x̄n) is an optimal solution to the NLP, then x̄ must satisfy the m constraints in the NLP, and there must exist multipliers λ̄1, λ̄2, …, λ̄m satisfying ∂f(x̄)/∂xj − Σi λ̄i ∂gi(x̄)/∂xj = 0 for j = 1, …, n λ̄i[bi − gi(x̄)] = 0 for i = 1, …, m λ̄i ≥ 0 for i = 1, …, m 452
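The conditions can be checked mechanically at a candidate point. A toy example of my own: max f(x) = −(x − 3)² s.t. g(x) = x ≤ 2, whose optimum is at x̄ = 2 (the unconstrained peak x = 3 is infeasible).

```python
# max f(x) = -(x - 3)^2  s.t.  g(x) = x <= 2; candidate optimum xbar = 2.
def df(x):
    return -2 * (x - 3)     # f'(x)

dg = 1.0                     # g'(x) = 1
xbar, b = 2.0, 2.0

lam = df(xbar) / dg                  # stationarity: f'(xbar) - lam*g'(xbar) = 0
stationarity = df(xbar) - lam * dg   # should be exactly 0
comp_slack = lam * (b - xbar)        # complementary slackness: lam*(b - g(xbar))
```

All three requirements hold at x̄ = 2: stationarity is satisfied with λ̄ = 2, complementary slackness is 0 because the constraint is binding, and λ̄ ≥ 0.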

453 Suppose the NLP is a minimization problem
Suppose the NLP is a minimization problem. If x̄ = (x̄1, …, x̄n) is an optimal solution to the NLP, then x̄ must satisfy the m constraints in the NLP, and there must exist multipliers λ̄1, λ̄2, …, λ̄m satisfying ∂f(x̄)/∂xj + Σi λ̄i ∂gi(x̄)/∂xj = 0 for j = 1, …, n λ̄i[bi − gi(x̄)] = 0 for i = 1, …, m λ̄i ≥ 0 for i = 1, …, m 453

454 Unless a constraint qualification or regularity condition is satisfied at an optimal point x̄, the Kuhn-Tucker conditions may fail to hold at x̄. LINGO can be used to solve NLPs with inequality (and possibly equality) constraints. If LINGO displays the message DUAL CONDITIONS: SATISFIED, then you know it has found a point satisfying the Kuhn-Tucker conditions. 454

455 11.10:Quadratic Programming
A quadratic programming problem (QPP) is an NLP in which each term in the objective function is of degree 2,1, or 0 and all constraints are linear. LINGO, Excel and Wolfe’s method (a modified version of the two-phase simplex) may be used to solve QPP problems. In practice, the method of complementary pivoting is most often used to solve QPPs. 455

456 11.11: Separable Programming.
Many NLPs are of the following form: max (or min) z = Σj fj(xj) s.t. Σj gij(xj) ≤ bi for i = 1, 2, …, m. They are called separable programming problems and are often solved by approximating each fj(xj) and gij(xj) by a piecewise linear function. 456

457 To approximate the optimal solution to a separable programming problem, solve the following approximation problem. For the approximation problem to yield a good approximation to the functions fj and gij, we must add the following adjacency assumption: for j = 1, 2, …, n, at most two λj,k’s can be positive, and if two are positive they must correspond to adjacent breakpoints. 457
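The piecewise linear idea can be sketched on a single function (my own toy example: f(x) = x² on the grid 0, 1, 2, 3). Each x is written as a convex combination of the two adjacent breakpoints, which is exactly what the adjacency assumption enforces.

```python
# Approximate f(x) = x^2 by a piecewise linear function on the grid 0..3,
# using weights on at most two adjacent breakpoints (adjacency assumption).
breaks = [0.0, 1.0, 2.0, 3.0]

def approx(x):
    for k in range(len(breaks) - 1):
        if breaks[k] <= x <= breaks[k + 1]:
            lam = (breaks[k + 1] - x) / (breaks[k + 1] - breaks[k])
            # f is evaluated only at the breakpoints; x gets the blended value.
            return lam * breaks[k] ** 2 + (1 - lam) * breaks[k + 1] ** 2
    raise ValueError("x outside grid")

val = approx(1.5)      # blended estimate of f(1.5); the true value is 2.25
```

At x = 1.5 the approximation gives 0.5·1 + 0.5·4 = 2.5 versus the true 2.25; a finer grid shrinks this gap, at the cost of more λj,k variables in the approximation problem.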

458 11.12 The Method of Feasible Directions
This method extends the steepest ascent method of Section 11.7 to the case where the NLP has linear constraints. To solve, begin with a feasible solution x0. Let d0 be a solution to the LP max z = ∇f(x0)·d, where d ranges over the feasible region. 458

459 Choose the new point x1 to be x1 = x0 + t0(d0 − x0), where t0 solves max f[x0 + t0(d0 − x0)] s.t. 0 ≤ t0 ≤ 1. Let d1 be a solution to the LP max z = ∇f(x1)·d over the feasible region. Then choose the new point x2 to be x2 = x1 + t1(d1 − x1), where t1 solves max f[x1 + t1(d1 − x1)] s.t. 0 ≤ t1 ≤ 1. 459

460 Continue generating points x3,…,xk in this fashion until xk=xk-1 or successive points are sufficiently close together. 460

461 11.13 Pareto Optimality and Trade-Off Curves
A solution (call it A) to a multiple-objective problem is Pareto optimal if no other feasible solution is at least as good as A with respect to every objective and strictly better than A with respect to at least one objective. A feasible solution B dominates a feasible solution A to a multiple-objective problem if B is at least as good as A with respect to every objective and is strictly better than A with respect to at least one objective. The graph of the “scores” of all Pareto optimal solutions is often called an efficient frontier or a trade-off curve. 461

462 The procedure used to construct a trade-off curve between two objectives may be summarized as follows: Step 1: Choose an objective and determine its best attainable value v1. For the solution attaining v1, find the value of objective 2, v2. Then (v1, v2) is a point on the trade-off curve. Step 2: For values v of objective 2 that are better than v2, solve the optimization problem in Step 1 with the additional constraint that the value of objective 2 is at least as good as v. Varying v will give you other points on the trade-off curve. Step 1 obtained one endpoint of the trade-off curve. If we determine the best value of objective 2 that can be attained, the other endpoint of the curve is obtained. 462
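With a finite set of candidate solutions, the dominance definition above gives a direct way to extract the Pareto optimal "scores". A sketch on made-up data (both objectives are maximized):

```python
# Feasible "scores" (objective 1, objective 2), both maximized; made-up data.
points = [(1, 9), (2, 8), (3, 9), (4, 5), (5, 6), (6, 2), (5, 1)]

def dominates(b, a):
    # b dominates a: at least as good in both objectives and not identical.
    return b[0] >= a[0] and b[1] >= a[1] and b != a

# A point is Pareto optimal iff no other point dominates it.
pareto = [p for p in points if not any(dominates(q, p) for q in points)]
```

Plotting the surviving points, sorted by one objective, traces out the trade-off curve (efficient frontier) for this discrete problem.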

463 Operations Research: Applications and Algorithms
Chapter 12 Review of Calculus and Probability to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 463

464 Description A review of some basic topics in calculus and probability, which will be useful in later chapters. 464

465 12.1 Review of Differential Calculus
The limit is one of the most basic ideas in calculus. Definition: The equation lim x→a f(x) = c means that as x gets closer to a (but not equal to a), the value of f(x) gets arbitrarily close to c. It is also possible that lim x→a f(x) may not exist. 465

466 Example 1 Show that lim x→2 (x2 − 2x) = 0 and show that lim x→0 1/x does not exist. Solution To verify the first result, evaluate x2 − 2x for values of x close to, but not equal to, 2. To verify the second result, observe that as x gets near 0, 1/x becomes either a very large positive number or a very large negative number. Thus, as x approaches 0, 1/x will not approach any single number. 466

467 Definition: A function f(x) is continuous at a point a if lim x→a f(x) = f(a).
If f(x) is not continuous at x=a, we say that f(x) is discontinuous (or has a discontinuity) at a. 467

468 Example 2 Bakeco orders sugar from Sugarco. The per-pound purchase price of the sugar depends on the size of the order (see Table 1). Let x = number of pounds of sugar purchased by Bakeco f(x) = cost of ordering x pounds of sugar Then f(x) = 25x for 0 ≤ x < 100; f(x) = 20x for 100 ≤ x ≤ 200; f(x) = 15x for x > 200. For all values of x, determine whether f(x) is continuous or discontinuous. 468

469 Example 2 – cont’d Solution It is clear that lim x→100 f(x) and lim x→200 f(x) do not exist. Thus, f(x) is discontinuous at x = 100 and x = 200 and is continuous for all other values of x satisfying x ≥ 0. 469

470 Taylor Series Expansion
Higher Derivatives We define f(2)(a) = f’’(a) to be the derivative of the function f’(x) at x = a. Similarly, we can define (if it exists) f(n)(a) to be the derivative of f(n−1)(x) at x = a. Thus, for Example 3, f’’(p) = 3000e−p(−1) − 3000e−p(1 − p). Taylor Series Expansion In our study of queuing theory (Chapter 8), we will need the Taylor series expansion of a function f(x). 470

471 Given that f(n+1)(x) exists for every point on the interval [a, b], Taylor’s theorem allows us to write, for any h satisfying 0 ≤ h ≤ b − a, f(a + h) = f(a) + f’(a)h + f’’(a)h2/2! + … + f(n)(a)hn/n! + f(n+1)(p)hn+1/(n + 1)! (1) where (1) will hold for some number p between a and a + h. Equation (1) is the nth-order Taylor series expansion of f(x) about a. 471

472 Example 4 Find the first-order Taylor series expansion of e−x about x = 0. Solution Since f’(x) = −e−x and f’’(x) = e−x, we know that (1) will hold on any interval [0, b]. Also, f(0) = 1, f’(0) = −1, and f’’(p) = e−p. Then (1) yields the following first-order Taylor series expansion for e−x about x = 0: e−h = 1 − h + (h2/2)e−p. This equation holds for some p between 0 and h. 472
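The remainder bound in Example 4 can be checked numerically: since e−p ≤ 1 for p in [0, h], the error of the first-order approximation 1 − h is at most h²/2. A quick sketch (h = 0.1 is an arbitrary test value):

```python
import math

# First-order expansion of e^{-x} about 0: e^{-h} = 1 - h + (h^2/2) e^{-p}
# for some p in [0, h], so the remainder lies between (h^2/2)e^{-h} and h^2/2.
h = 0.1
approx = 1 - h
remainder = math.exp(-h) - approx
bound = h ** 2 / 2
```

The observed remainder falls inside the predicted band, and it shrinks quadratically as h decreases, which is exactly what the first-order expansion promises.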

473 Partial Derivatives We now consider a function f of n > 1 variables (x1, x2, …, xn), using the notation f(x1, x2, …, xn) to denote such a function. Definition: The partial derivative of f(x1, x2, …, xn) with respect to the variable xi is written ∂f/∂xi, where ∂f/∂xi = lim Δxi→0 [f(x1, …, xi + Δxi, …, xn) − f(x1, …, xi, …, xn)]/Δxi. 473

474 Suppose that for each i, we increase xi by a small amount Δxi
Suppose that for each i, we increase xi by a small amount Δxi. Then the value of f will increase by approximately Σi (∂f/∂xi)Δxi. We will also use second-order partial derivatives extensively. We use the notation ∂2f/∂xi∂xj to denote a second-order partial derivative. 474

475 Review of Integral Calculus
A knowledge of the basics of integral calculus is valuable when studying random variables. Consider two functions: f(x) and F(x). If F’(x) = f(x), we say that F(x) is the indefinite integral of f(x). The fact that F(x) is the indefinite integral of f(x) is written ∫ f(x)dx = F(x). The definite integral of f(x) from x = a to x = b is written ∫ from a to b of f(x)dx. 475

476 The Fundamental Theorem of Calculus states that if f(x) is continuous for all x satisfying a ≤ x ≤ b, then ∫ from a to b of f(x)dx = F(b) − F(a), where F(x) is any indefinite integral of f(x). 476
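The theorem can be illustrated numerically: for f(x) = x² with indefinite integral F(x) = x³/3, a Riemann sum over [0, 1] should approach F(1) − F(0) = 1/3. A quick sketch (the function and grid size are my own choices):

```python
# Left Riemann sum of f(x) = x^2 on [0, 1] versus F(1) - F(0) = 1/3.
n = 100000
riemann = sum((i / n) ** 2 for i in range(n)) / n
exact = 1 / 3
```

With 100,000 subintervals the sum agrees with the Fundamental Theorem's value to roughly one part in 10⁵ (the left-sum error for this f is about 1/(2n)).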

477 12.2 Differentiation of Integrals
To differentiate a function whose value depends on an integral, let f(x, y) be a function of variables x and y, and let g(y) and h(y) be functions of y. Then ∫ from g(y) to h(y) of f(x, y)dx is a function only of y. Leibniz’s rule for differentiating an integral states that d/dy [∫ from g(y) to h(y) of f(x, y)dx] = f(h(y), y)h’(y) − f(g(y), y)g’(y) + ∫ from g(y) to h(y) of ∂f(x, y)/∂y dx. 477

478 12.3 Basic Rules of Probability
Definition: Any situation where the outcome is uncertain is called an experiment. Definition: For any experiment, the sample space S of the experiment consists of all possible outcomes for the experiment. Definition: An event E consists of any collection of points (set of outcomes) in the sample space. Definition: A collection of events E1, E2,…,En is said to be a mutually exclusive collection of events if for i ≠ j (i=1,2,…,n and j=1,2,…n), Ei and Ej have no points in common. 478

479 With each event E, we associate an event Ē
With each event E, we associate an event Ē. Ē consists of the points in the sample space that are not in E. With each event E, we also associate a number P(E), which is the probability that event E will occur when we perform the experiment. The probabilities of events must satisfy the following rules of probability: 479

480 Rule 1 For any event E, P(E) ≥ 0.
Rule 2 If E = S (that is, if E contains all points in the sample space), then P(E) = 1. Rule 3 If E1, E2, …, En is a mutually exclusive collection of events, then P(E1 ∪ E2 ∪ … ∪ En) = P(E1) + P(E2) + … + P(En). Rule 4 P(Ē) = 1 – P(E) 480

481 Definition: Suppose events E1 and E2 both occur with positive probability. Events E1 and E2 are independent if and only if P(E2|E1)=P(E2)(or equivalently, P(E1|E2) = P(E1)). 481

482 12.4 Bayes’ Rule Generally speaking, n mutually exclusive states of the world (S1, S2,…, Sn) may occur. The states of the world are collectively exhaustive: S1, S2,…, Sn include all possibilities. Suppose a decision maker assigns a probability P(Si) to Si. P(Si) is the prior probability of Si. Suppose that for each possible outcome Oj and each possible state of the world Si, the decision maker knows P(Oj|Si), the likelihood of the outcome Oj given the state of the world Si. 482

483 Bayes’ rule combines prior probabilities and likelihoods with the experimental outcomes to determine a post-experimental probability, or posterior probability, for each state of the world. Bayes’ rule: P(Si | Oj) = P(Oj | Si)P(Si) / [P(Oj | S1)P(S1) + P(Oj | S2)P(S2) + … + P(Oj | Sn)P(Sn)] 483
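A numerical sketch of the rule (two states of the world with made-up priors and likelihoods for one observed outcome O):

```python
from fractions import Fraction as F

# Made-up priors P(Si) and likelihoods P(O | Si) for an observed outcome O.
prior = {"S1": F(3, 10), "S2": F(7, 10)}
likelihood = {"S1": F(9, 10), "S2": F(2, 10)}

# Denominator of Bayes' rule: P(O) = sum_i P(O | Si) P(Si).
total = sum(likelihood[s] * prior[s] for s in prior)

# Posterior probability of each state given O.
posterior = {s: likelihood[s] * prior[s] / total for s in prior}
```

The outcome O is much more likely under S1, so the posterior shifts mass toward S1 (from 3/10 up to 27/41) even though S2 had the larger prior; the posteriors still sum to 1.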

484 12.5 Random Variables, Mean, Variance, and Covariance
Definition: A random variable is a function that associates a number with each point in an experiment’s sample space. We denote random variables by boldface capital letters (usually X, Y, or Z). Definition: A random variable is discrete if it can assume only discrete values x1, x2,…. A discrete random variable X is characterized by the fact that we know the probability that X = xi (written P(X=xi)). 484

485 P(X=xi) is the probability mass function (pmf) for the random variable X.
Definition: The cumulative distribution function F(x) = P(X ≤ x). For a discrete random variable X, F(x) = Σ over all xi ≤ x of P(X = xi). 485

486 Continuous Random Variables
If, for some interval, the random variable X can assume all values on the interval, then X is a continuous random variable. Probability statements about a continuous random variable X require knowing X’s probability density function (pdf). The probability density function f(x) for a random variable X may be interpreted as follows: For Δ small, P(x ≤ X ≤ x + Δ) is approximately Δf(x). 486

487 Mean and Variance of a Random Variable
The mean (or expected value) and variance are two important measures that are often used to summarize information contained in a random variable’s probability distribution. The mean of a random variable X (written E(X)) is a measure of central location for the random variable. Mean of a discrete random variable X: E(X) = Σ xi P(X = xi). 487

488 Mean of a continuous random variable: E(X) = ∫ x f(x) dx.
For a function h(X) of a random variable X (such as X² and e^X), E[h(X)] may be computed as follows: If X is a discrete random variable, E[h(X)] = Σ h(xi) P(X = xi). If X is a continuous random variable, E[h(X)] = ∫ h(x) f(x) dx. 488

489 Variance of the discrete random variable X: var X = Σ (xi – E(X))² P(X = xi).
Variance of a continuous random variable X: var X = ∫ (x – E(X))² f(x) dx. Also, var X may be found from the relation var X = E(X²) – E(X)². For any random variable X, (var X)½ is the standard deviation of X (written σx). 489
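The discrete mean and variance formulas above can be sketched directly; the pmf below (a fair die) is a made-up illustration, not from the text:

```python
# Minimal sketch (made-up example, not from the text): mean and variance
# of a discrete random variable, computed directly from its pmf.
pmf = {1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6}   # a fair die

mean = sum(x * p for x, p in pmf.items())                # E(X) = sum of x_i P(X = x_i)
var = sum((x - mean) ** 2 * p for x, p in pmf.items())   # var X = E[(X - E(X))^2]
# Equivalent shortcut from the text: var X = E(X^2) - E(X)^2
var_alt = sum(x * x * p for x, p in pmf.items()) - mean ** 2
```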

490 Independent Random Variables
Definition: Two random variables X and Y are independent if and only if for any two sets A and B, P(X ∈ A and Y ∈ B) = P(X ∈ A)P(Y ∈ B). The definition of independence generalizes to situations where more than two random variables are of interest. Loosely speaking, a group of n random variables is independent if knowledge of the values of any subset of the random variables does not change our view of the distribution of any of the other random variables. 490

491 For two random variables X and Y, the covariance of X and Y (written cov(X,Y)) is defined by cov(X,Y) = E[(X – E(X))(Y – E(Y))].
491

492 Mean, Variance, and Covariance for Sums of Random Variables
From given random variables X1 and X2, we often create new random variables (c is a constant): cX1, X1+c, X1+X2. The following rules can be used to express the mean, variance, and covariance of these random variables in terms of E(X1), E(X2), varX1, varX2, and cov(X1,X2). E(cX1)=cE(X1) E(X1+c)=E(X1)+c E(X1+X2)=E(X1)+E(X2) 492

493 If X1 and X2 are independent variables,
var cX1 = c² var X1 var(X1+c) = var X1 If X1 and X2 are independent random variables, var(X1+X2) = var X1 + var X2 In general, var(X1+X2) = var X1 + var X2 + 2 cov(X1,X2) For random variables X1, X2,…, Xn, var(X1 + X2 + … + Xn) = Σ var Xi + 2 Σi<j cov(Xi,Xj). Finally, for constants a and b, cov(aX1, bX2) = ab cov(X1,X2) 493
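The identity var(X1+X2) = var X1 + var X2 + 2 cov(X1,X2) can be verified exactly on a small joint distribution; the joint pmf below is a made-up illustration, not from the text:

```python
# Sketch: exact check of var(X1 + X2) = var X1 + var X2 + 2 cov(X1, X2)
# on a small made-up joint pmf over (x1, x2) pairs.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def E(f):
    # Expectation of f(X1, X2) under the joint pmf.
    return sum(f(x1, x2) * p for (x1, x2), p in joint.items())

m1, m2 = E(lambda a, b: a), E(lambda a, b: b)
var1 = E(lambda a, b: (a - m1) ** 2)
var2 = E(lambda a, b: (b - m2) ** 2)
cov12 = E(lambda a, b: (a - m1) * (b - m2))

m_sum = E(lambda a, b: a + b)
var_sum = E(lambda a, b: (a + b - m_sum) ** 2)
# The two sides of the identity agree:
lhs, rhs = var_sum, var1 + var2 + 2 * cov12
```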

494 12.6 The Normal Distribution
Definition: A continuous random variable X has a normal distribution if for some µ and σ > 0, the random variable has the following density function: f(x) = (1/(σ√(2π))) e^(-(x-µ)²/(2σ²)) 494

495 Useful Properties of Normal Distributions
Property 1: If X is N(µ,σ2), then cX is N(cµ,c2 σ2). Property 2: If X is N(µ,σ2), then X + c (for any constant c) is N(µ+c, σ2). Property 3: If X1 is N(µ1,σ12), X2 is N(µ2,σ22), and X1 and X2 are independent, then X1+X2 is N(µ1+µ2,σ12+ σ22). 495

496 Finding Normal Probabilities via Standardization
If Z is a random variable that is N(0,1), then Z is said to be a standardized normal random variable. If X is N(µ,σ²), then (X-µ)/σ is N(0,1). Suppose X is N(µ,σ²) and we want to find P(a ≤ X ≤ b). We use the following relations (this procedure is called standardization): P(a ≤ X ≤ b) = P((a-µ)/σ ≤ Z ≤ (b-µ)/σ) = F((b-µ)/σ) – F((a-µ)/σ), where F is the standard normal cumulative distribution function. 496
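Standardization can be sketched using the standard normal cdf, which is available via the error function; the numbers below are made-up illustrations, not from the text:

```python
from math import erf, sqrt

def phi(z):
    # Cdf of the standard normal N(0,1), expressed via the error function.
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_prob(a, b, mu, sigma):
    # P(a <= X <= b) for X ~ N(mu, sigma^2), by standardization.
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

# Made-up example: X ~ N(100, 15^2); P(85 <= X <= 115) = P(-1 <= Z <= 1)
p = normal_prob(85, 115, 100, 15)
```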

497 This result is known as the Central Limit Theorem.
If X1, X2,…, Xn are independent random variables, then for n sufficiently large (usually n>30), the random variable X = X1 + X2 + … + Xn may be closely approximated by a normal random variable X′ that has E(X′) = E(X1) + E(X2) + … + E(Xn) and var X′ = var X1 + var X2 + … + var Xn. This result is known as the Central Limit Theorem. When we say that X′ closely approximates X, we mean that P(a ≤ X ≤ b) is close to P(a ≤ X′ ≤ b). 497

498 Finding Normal Probabilities with Excel
Probabilities involving a standard normal random variable can be determined with the EXCEL =NORMSDIST function. The S in NORMSDIST stands for standardized normal. For example, P(Z≤-1) can be found by entering the formula =NORMSDIST(-1). The EXCEL =NORMDIST function can be used to determine a normal probability for any normal (not just a standard normal) random variable. 498

499 If X is N(µ,σ²), then entering the formula =NORMDIST(a,µ,σ,1) will return P(X≤a).
The 1 ensures that EXCEL returns the cumulative normal probability. Changing the last argument to “0” causes EXCEL to return the height of the normal density function for X=a. 499

500 12.7 Z-Transforms Consider a discrete random variable X whose only possible values are nonnegative integers. For n=0,1,2,… let P(X=n)=an. We define (for |z|≤1) the z-transform of X (call it pxT(z)) to be pxT(z) = a0 + a1z + a2z² + … = Σn≥0 anz^n 500
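The z-transform is just the series Σ an z^n, so it can be sketched by truncating the sum; the geometric pmf below is a made-up illustration, not from the text:

```python
# Sketch: z-transform of a nonnegative-integer random variable, evaluated
# by truncating the series sum of a_n z^n.  Made-up pmf: X geometric with
# P(X = n) = (1-q) q^n for n = 0, 1, 2, ...
q = 0.5

def z_transform(z, terms=200):
    # Truncated series; q^200 is negligible here.
    return sum((1 - q) * q ** n * z ** n for n in range(terms))

# Sanity check: at z = 1 the series sums the whole pmf, so it equals 1.
total = z_transform(1.0)
```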

501 Operations Research: Applications and Algorithms
Chapter 13 Decision Making Under Uncertainty to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 501

502 Description We have all had to make important decisions where we were uncertain about factors that were relevant to the decisions. In this chapter, we study situations in which decisions are made in an uncertain environment. The chapter presents the basic theory of decision making under uncertainty: the widely used Von Neumann-Morgenstern utility model and the use of decision trees for making decisions at different points in time. We close by looking at decision making with multiple objectives. 502

503 13.1 Decision Criteria Dominated Actions The Maximin Criterion
Definition: An action ai is dominated by an action ai′ if for all sj ∈ S, rij ≤ ri′j, and for some state sj′, rij′ < ri′j′. The Maximin Criterion For each action, determine the worst outcome (smallest reward). The maximin criterion chooses the action with the “best” worst outcome. 503

504 Definition: The maximin criterion chooses the action ai with the largest value of minj∈S rij.
The Maximax Criterion For each action, determine the best outcome (largest reward). The maximax criterion chooses the action with the “best” best outcome. Definition: The maximax criterion chooses the action ai with the largest value of maxj∈S rij. 504

505 The Expected Value Criterion
Minimax Regret The minimax regret criterion (developed by L. J. Savage) uses the concept of opportunity cost to arrive at a decision. For each possible state of the world sj, find an action i*(j) that maximizes rij. i*(j) is the best possible action to choose if the state of the world is actually sj. For any action ai and state sj, the opportunity loss or regret for ai in sj is ri*(j),j – rij. The Expected Value Criterion Chooses the action that yields the largest expected reward. 505
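The four criteria can be sketched side by side; the reward matrix and state probabilities below are made-up illustrations, not from the text:

```python
# Sketch: maximin, maximax, minimax regret, and expected value criteria
# on a made-up reward matrix.  rewards[i][j] = reward of action i in state j.
rewards = [[30, 10], [20, 20], [40, -10]]
probs = [0.5, 0.5]                       # made-up state probabilities

maximin = max(range(3), key=lambda i: min(rewards[i]))   # best worst outcome
maximax = max(range(3), key=lambda i: max(rewards[i]))   # best best outcome

# Minimax regret: regret = (best reward in that state) - (reward obtained).
best_in_state = [max(rewards[i][j] for i in range(3)) for j in range(2)]
regret = [[best_in_state[j] - rewards[i][j] for j in range(2)] for i in range(3)]
minimax_regret = min(range(3), key=lambda i: max(regret[i]))

# Expected value criterion: largest expected reward.
expected_value = max(range(3),
                     key=lambda i: sum(p * r for p, r in zip(probs, rewards[i])))
```

Note that the four criteria can pick different actions on the same matrix, which is the point of comparing them.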

506 13.2 Utility Theory We now show how the Von Neumann-Morgenstern concept of a utility function can be used as an aid to decision making under uncertainty. Consider a situation in which a person will receive, for i = 1,2,…,n, a reward ri with probability pi. This is denoted as the lottery (p1, r1; p2, r2; …; pn, rn) 506

507 A lottery is often represented by a tree in which each branch stands for a possible outcome of the lottery. The number on each branch represents the probability that the outcome will occur. Our goal is to determine a method that a person can use to choose between lotteries. 507

508 Suppose he or she must choose to play L1 or L2 but not both.
We write L1pL2 if the person prefers L1. We write L1iL2 if he or she is indifferent between choosing L1 and L2. If L1iL2, we say that L1 and L2 are equivalent lotteries. More formally, a lottery L is a compound lottery if for some i, there is a probability pi that the decision maker’s reward is to play another lottery L′. 508

509 If a lottery is not a compound lottery, it is a simple lottery.
The utility of the reward ri, written u(ri), is the number qi such that the decision maker is indifferent between receiving the reward ri with certainty and a lottery that yields the most favorable outcome with probability qi and the least favorable outcome with probability 1 – qi. 509

510 The specification of u(ri) for all rewards ri is called the decision maker’s utility function.
For a given lottery L = (p1, r1; p2, r2;…;pn,rn), define the expected utility of the lottery L, written E(U for L), by E(U for L) = p1u(r1) + p2u(r2) + … + pnu(rn) 510
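Computing E(U for L) is a one-line sum; the lottery and the concave utility u(r) = √r below are made-up illustrations, not from the text:

```python
from math import sqrt

# Sketch: expected utility of a lottery L = (p1, r1; ...; pn, rn) under an
# illustrative (made-up) risk-averse utility u(r) = sqrt(r).
lottery = [(0.5, 0), (0.5, 100)]   # 50-50 chance at $0 or $100

def u(r):
    return sqrt(r)

expected_utility = sum(p * u(r) for p, r in lottery)  # E(U for L) = sum p_i u(r_i)
```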

511 Von Neumann-Morgenstern Axioms
Axiom 1: Complete Ordering Axiom For any two rewards r1 and r2, one of the following must be true: The decision maker (1) prefers r1 to r2, (2) prefers r2 to r1, or (3) is indifferent between r1 and r2. Also, if the person prefers r1 to r2 and r2 to r3, then he or she must prefer r1 to r3 (transitivity of preferences). Axiom 2: Continuity Axiom If the decision maker prefers r1 to r2 and r2 to r3, then for some c (0<c<1), L1iL2, where L1 yields the reward r2 with certainty, and L2 yields r1 with probability c and r3 with probability 1 – c. 511

512 Axiom 3: Independence Axiom
Suppose the decision maker is indifferent between rewards r1 and r2. Let r3 be any other reward. Then for any c (0 < c < 1), L1iL2, where L1 and L2 differ only in that L1 has a probability c of yielding the reward r1, whereas L2 has the probability c of yielding the reward r2; each yields r3 with probability 1 – c. 512

513 Axiom 4: Unequal Probability Axiom
Thus the Independence Axiom implies that the decision maker views a chance c at r1 and a chance c at r2 to be of identical value, and this view holds for all values of c and r3. Axiom 4: Unequal Probability Axiom Suppose the decision maker prefers reward r1 to reward r2. If two lotteries have only r1 and r2 as their possible outcomes, he or she will prefer the lottery with the higher probability of obtaining r1. Axiom 5: Compound Lottery Axiom Suppose that when all possible outcomes are considered, a compound lottery L yields a probability pi of receiving a reward ri. Then the decision maker is indifferent between L and the simple lottery (p1, r1; p2, r2;…; pn, rn). 513

514 Why We May Assume u (Worst Outcome)=0 and u (Best Outcome)=1
Up to now, we have assumed that u (least favorable outcome)=0 and u (most favorable outcome)=1. Even if a decision maker’s utility function does not have these values, we can transform his or her utility function into a utility function having them. Lemma 1 – Given a utility function u(x), define for any a>0 and any b the function v(x) = au(x)+b. Given any two lotteries L1 and L2 it will be the case that 514

515 A decision maker using u(x) as his or her utility function will have L1pL2 if and only if a decision maker using v(x) as his or her utility function will have L1pL2. A decision maker using u(x) as his or her utility function will have L1iL2 if and only if a decision maker using v(x) as his or her utility function will have L1iL2. 515

516 Using Lemma 1, we can show that without changing how an individual ranks lotteries, we can transform the decision maker’s utility function into one having u (least favorable outcome)=0 and u (most favorable outcome)=1. 516

517 Estimating an Individual’s Utility Function
How might we estimate an individual's utility function? We begin by assuming that the least favorable outcome (say - $10,000) has a utility of 0 and that the most favorable outcome (say, $30,000) has a utility of 1. Next we define a number x1/2 having u(x1/2) = ½. Eventually the utility function can be approximated by drawing a curve (smooth, we hope) joining the points. 517

518 Unfortunately, if a decision maker’s preferences violate any of the preceding axioms (such as transitivity), this procedure may not yield a smooth curve. If it does not yield a relatively smooth curve, more sophisticated procedures for assessing utility functions must be used. 518

519 Relation Between an Individual’s Utility Function and His or Her Attitude Toward Risk
Definition: The certainty equivalent of a lottery L, written CE(L), is the number CE(L) such that the decision maker is indifferent between the lottery L and receiving a certain payoff of CE(L). Definition: The risk premium of a lottery L, written RP(L), is given by RP(L) = EV(L)-CE(L), where EV(L) is the expected value of the lottery’s outcomes. 519

520 With respect to attitude toward risk, a decision maker is
Let a nondegenerate lottery be any lottery in which more than one outcome can occur. With respect to attitude toward risk, a decision maker is 1 Risk-averse if and only if for any nondegenerate lottery L, RP(L) > 0 2 Risk-neutral if and only if for any nondegenerate lottery L, RP(L) = 0 3 Risk-seeking if and only if for any nondegenerate lottery L, RP(L) < 0 520
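The definitions of CE(L) and RP(L) can be sketched for a concrete lottery; the 50-50 lottery and the concave utility u(x) = √x below are made-up illustrations, not from the text:

```python
from math import sqrt

# Sketch: certainty equivalent and risk premium for a made-up 50-50 lottery,
# under the illustrative concave (risk-averse) utility u(x) = sqrt(x).
lottery = [(0.5, 0.0), (0.5, 100.0)]

ev = sum(p * x for p, x in lottery)         # EV(L): expected value of outcomes
eu = sum(p * sqrt(x) for p, x in lottery)   # expected utility of the lottery
ce = eu ** 2                                # u(CE) = eu, so CE = eu^2 here
rp = ev - ce                                # RP(L) = EV(L) - CE(L)
# rp > 0, as expected for a risk-averse (strictly concave) utility.
```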

521 Definition: A function u(x) is said to be strictly concave (or strictly convex) if for any two points on the curve y=u(x), the line segment joining those two points lies entirely (with the exception of its endpoints) below (or above) the curve y=u(x). In reality, many people exhibit both risk-seeking and risk-averse behavior. A person whose utility function contains both convex and concave segments may exhibit both risk-averse and risk-seeking behavior. 521

522 Exponential Utility One important class is called exponential utility and has been used in many financial investment analyses. An exponential utility function has only one adjustable numerical parameter, and there are straightforward ways to discover the most appropriate value of this parameter for a particular individual or company. An exponential utility function has the following form: U(x) = 1 – e^(–x/R) 522

523 Here x is a monetary value (a payoff if positive, a cost if negative), U(x) is the utility of this value, and R>0 is an adjustable parameter called the risk tolerance. To assess a person’s (or company’s) exponential utility function, we need only to assess the value of R. It’s been shown that the risk tolerance is approximately equal to the dollar amount R such that the decision maker is indifferent between the following two options. 523

524 Option 1 – Obtain no payoff at all.
Option 2 – Obtain a payoff of R dollars or a loss of R/2 dollars, depending on the flip of a fair coin. A second tip for finding R is based on empirical evidence found by Ronald Howard, a prominent decision analyst. He discovered tentative relationships between risk tolerance and several financial variables – net sales, net income, and equity. 524
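The coin-flip characterization of R can be checked numerically: under U(x) = 1 – e^(–x/R), the gamble in Option 2 has expected utility close to U(0) = 0 (which is why the indifference is only approximate). The value of R below is a made-up illustration:

```python
from math import exp

# Sketch: with exponential utility u(x) = 1 - exp(-x/R), a 50-50 gamble at
# winning R or losing R/2 has expected utility near u(0) = 0, for any R > 0.
def u(x, R):
    return 1 - exp(-x / R)

R = 1000.0   # made-up risk tolerance
eu_gamble = 0.5 * u(R, R) + 0.5 * u(-R / 2, R)
# eu_gamble is small but not exactly 0, matching the "approximately
# indifferent" wording in the text.
```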

525 13.3 Flaws in Expected Utility Maximization: Prospect Theory and Framing Effects
The axioms underlying expected maximization of utility (EMU) seem reasonable, but in practice people’s decisions often deviate from the predictions of EMU. Psychologists Kahneman and Tversky (1981) developed prospect theory and framing effects to try to explain why people deviate from the predictions of EMU. 525

526 The shape of the π(p) function in the figure implies that individuals are more sensitive to changes in probability when the probability of an event is small (near 0) or large (near 1). 526

527 Framing Kahneman and Tversky’s idea of framing is based on the fact that people often set their utility functions from the standpoint of a frame or status quo from which they view the current situation. Most people’s utility functions treat a loss of a given value as being more serious than a gain of an identical value. 527

528 13.4 Decision Trees Often, people must make a series of decisions at different points in time. Then decision trees can often be used to determine optimal decisions. A decision tree enables a decision maker to decompose a large complex decision problem into several smaller problems. A decision fork represents a point in time when Colaco (the company in Example 3 in the book) has to make a decision. 528

529 An event fork is drawn when outside forces determine which of several random events will occur.
Each branch of an event fork represents a possible outcome, and the number on each branch represents the probability that the event will occur. A branch of a decision tree is a terminal branch if no forks emanate from the branch. 529

530 Incorporating Risk Aversion into Decision Tree Analysis
In the Colaco example, the optimal strategy yields a .45 chance that the company will end up with a relatively small final asset position of $50,000. On the other hand, the strategy of test marketing and acting optimally on the results of the test market study yields only a chance that Colaco’s asset position will be below $100,000. Thus if Colaco is a risk-averse decision maker, the strategy of immediately marketing nationally may not reflect the company’s preference. 530

531 Expected Value of Sample Information
Decision trees can be used to measure the value of sample or test market information. We begin by determining Colaco’s expected final asset position if the company acts optimally and the test market study is costless. We call this expected final asset position Colaco’s expected value with sample information (EVWSI). 531

532 We call this the expected value with original information (EVWOI).
We next determine the largest expected final asset position that Colaco would obtain if the test market study were not available. We call this the expected value with original information (EVWOI). Now the expected value of the test market information, referred to as expected value of sample information (EVSI), is defined to be EVSI = EVWSI – EVWOI. 532

533 Expected Value of Perfect Information
We can modify the analysis used to determine EVSI to find the value of perfect information. By perfect information we mean that all uncertain events that can affect Colaco’s final asset position still occur with the given probabilities. Expected value with perfect information (EVWPI) is found by drawing a decision tree in which the decision maker has perfect information about which state has occurred before making a decision. 533

534 Then the expected value of perfect information (EVPI) is given by EVPI=EVWPI – EVWOI.
534
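The EVWOI/EVWPI/EVPI definitions can be sketched on a small example; the two-state payoffs and priors below are made-up illustrations (the Colaco numbers are not reproduced here):

```python
# Sketch of EVWOI, EVWPI, and EVPI on a made-up two-state decision problem.
priors = [0.6, 0.4]                      # P(state 0), P(state 1) -- made up
payoff = {"national": [300, -100],       # payoff[action][state] -- made up
          "local":    [100, 50]}

# Without further information: pick the action with the best expected payoff.
evwoi = max(sum(p * r for p, r in zip(priors, row)) for row in payoff.values())

# With perfect information: learn the state first, then pick the best action.
evwpi = sum(p * max(row[s] for row in payoff.values())
            for s, p in enumerate(priors))

evpi = evwpi - evwoi                     # EVPI = EVWPI - EVWOI
```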

535 13.5 Bayes’ Rule and Decision Trees
We are also given estimates of the probabilities of each state of the world. These are called prior probabilities. In different states of the world, different decisions may be optimal. It may be desirable to purchase information that gives the decision maker more foreknowledge about the state of the world. This may enable the decision maker to make better decisions. 535

536 The probabilities p(si|oj) are called posterior probabilities.
Given knowledge of the outcome of the experiment, these probabilities give new values for the probability of each state of the world. The probabilities p(si|oj) are called posterior probabilities. In many situations, however, we may be given the prior probabilities p(si) for each state of the world, and instead of being given the posterior probabilities p(si|oj), we might be given the likelihoods. 536

537 With the help of Bayes’ rule, we can use the prior probabilities and likelihoods to determine the needed posterior probabilities. To begin the computation of the posterior probabilities, we need to determine the joint probabilities of each state of the world and experimental outcome. We obtain the joint probabilities by using the definition of conditional probabilities. 537

538 Next we compute the probability of each possible experimental outcome, p(LS) and p(LF).
Now Bayes’ rule can be applied to obtain the desired posterior probabilities. 538

539 In summary, to find posterior probabilities, we go through the following three step process:
Step 1 – Determine the joint probabilities of the form p(si∩oj) by multiplying the prior probability p(si) times the likelihood p(oj|si). Step 2 – Determine the probabilities of each experimental outcome p(oj) by summing up all joint probabilities of the form p(sk∩oj). Step 3 – Determine each posterior probability (p(si|oj)) by dividing the joint probability (p(si∩oj)) by the probability of the experimental outcome oj(p(oj)). 539
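The three-step process above can be sketched directly; the two states, two outcomes, priors, and likelihoods below are made-up illustrations, not the book’s numbers:

```python
# Sketch of the three-step posterior computation with made-up numbers:
# two states s1, s2 and two experimental outcomes o1, o2.
prior = {"s1": 0.7, "s2": 0.3}
likelihood = {("o1", "s1"): 0.9, ("o2", "s1"): 0.1,   # p(o_j | s_i)
              ("o1", "s2"): 0.2, ("o2", "s2"): 0.8}

# Step 1: joint probabilities p(s_i and o_j) = p(s_i) * p(o_j | s_i).
joint = {(o, s): prior[s] * likelihood[(o, s)] for (o, s) in likelihood}

# Step 2: outcome probabilities p(o_j) = sum of the joints over states.
p_outcome = {o: sum(v for (oo, s), v in joint.items() if oo == o)
             for o in ("o1", "o2")}

# Step 3: posteriors p(s_i | o_j) = p(s_i and o_j) / p(o_j)  (Bayes' rule).
posterior = {(s, o): joint[(o, s)] / p_outcome[o] for (o, s) in joint}
```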

540 Using LINGO to Compute Posterior Probabilities
LINGO can be used to compute posterior probabilities. An example program can be found in the book. 540

541 13.6 Decision Making with Multiple Objectives
Suppose a woman believes that there are n attributes that will determine her decision. Let xi(a) be the value of the ith attribute associated with an alternative a. She associates a value v(x1(a), x2(a),…, xn(a)) with the alternative a. The function v(x1, x2,…, xn) is the decision maker’s value function. Alternatively, the decision maker can associate a cost c(x1(a), x2(a),…, xn(a)) with the alternative a. 541

542 The function c(x1, x2,…, xn) is her cost function.
Definition: A value function v(x1, x2,…, xn) is an additive value function if there exist n functions v1(x1), v2(x2),…, v2(xn) satisfying Definition: A cost function c(x1, x2,…, xn) is an additive cost function if there exist n functions c1(x1), c2(x2),…, c2(xn) satisfying 542

543 Definition: An attribute (call it attribute 1) is preferentially independent (pi) of another attribute (attribute 2) if preferences for values of attribute 1 do not depend on the value of attribute 2. Definition: If attribute 1 is pi of attribute 2, and attribute 2 is pi of attribute 1, then attribute 1 is mutually preferentially independent (mpi) of attribute 2. 543

544 Definition: A set of attributes S is mutually preferentially independent (mpi) of a set of attributes S′ if (1) the values of the attributes in S′ do not affect preferences for the values of attributes in S, and (2) the values of attributes in S do not affect preferences for the values of attributes in S′. Definition: A set of attributes 1,2,…,n is mutually preferentially independent (mpi) if for all subsets S of {1,2,…,n}, S is mpi of its complement. 544

545 Theorem 1: If the set of attributes 1,2,…,n is mpi, the decision maker’s preferences can be represented by an additive value (or cost) function. When more than one attribute affects a decision maker’s preferences, the person’s utility function is called a multiattribute utility function. We restrict ourselves here to explaining how to assess and use multiattribute utility functions when only two attributes are operative. 545

546 Properties of Multiattribute Utility Functions
Definition: Attribute 1 is utility independent (ui) of attribute 2 if preferences for lotteries involving different levels of attribute 1 do not depend on the level of attribute 2. Definition: If attribute 1 is ui of attribute 2 and attribute 2 is ui of attribute 1, then attributes 1 and 2 are mutually utility independent (mui). If attributes 1 and 2 are mui, it can be shown that the decision maker’s utility function u(x1,x2) must be of the following form: u(x1,x2) = k1u1(x1) + k2u2(x2) + k3u1(x1)u2(x2) 546

547 This equation is often called the multilinear utility function.
Theorem 2: Attributes 1 and 2 are mui if and only if the decision maker’s utility function u(x1,x2) is a multilinear function of the form u(x1,x2) = k1u1(x1) + k2u2(x2) + k3u1(x1)u2(x2). The determination of a decision maker's utility function u(x1, x2) can be further simplified if it exhibits additive independence. 547

548 Definition: A decision maker’s utility function exhibits additive independence if the decision maker is indifferent between a lottery giving a 50-50 chance at the outcomes (x1(best), x2(best)) and (x1(worst), x2(worst)) and a lottery giving a 50-50 chance at the outcomes (x1(best), x2(worst)) and (x1(worst), x2(best)). 548

549 Essentially, additive independence of attributes 1 and 2 implies that preferences over lotteries involving only attribute 1 depend only on the marginal distribution for possible values of attribute 1 and do not depend on the joint distribution of the possible values of attributes 1 and 2. 549

550 Assessment of Multiattribute Utility Functions
If attributes 1 and 2 are mui, how can we determine u1(x1), u2(x2), k1, k2, and k3? The procedure to be used in assessing a multiattribute utility function may be summarized as follows: Step 1: Check whether attributes 1 and 2 are mui. If they are, go to Step 2. If the attributes are not mui, the assessment of the multiattribute utility function is beyond the scope of our discussion. Step 2: Check for additive independence. Step 3: Assess u1(x1) and u2(x2). 550

551 Step 4: Determine k1, k2, and (if there is no additive independence) k3.
Step 5: Check to see whether the assessed utility function is really consistent with the decision maker’s preferences. To do this, set up several lotteries and use the expected utility of each to rank the lotteries from most to least favorable. 551

552 Use of Multiattribute Utility Functions
To illustrate how a multiattribute utility function might be used, suppose that a company must determine whether to mount a small or large advertising campaign during the coming year. To find the best option the company must determine which of the lotteries has a larger expected utility. 552

553 13.7 The Analytic Hierarchy Process
We have discussed situations in which a decision maker chooses between alternatives on the basis of how well the alternatives meet various objectives. When multiple objectives are important to a decision maker, it may be difficult to choose between alternatives. Thomas Saaty’s analytic hierarchy process (AHP) provides a powerful tool that can be used to make decisions in situations involving multiple objectives. 553

554 Obtaining Weights for Each Objective
Suppose there are n objectives. We begin by writing down an n x n matrix (known as the pairwise comparison matrix) A. The entry in row i and column j of A indicates how much more important objective i is than objective j. 554

555 Checking for Consistency
We can now use the following four-step procedure to check for consistency of the decision maker’s comparisons. (w denotes our estimate of the decision maker’s weights) Step 1: Compute AwT. Step 2: Compute the average over i of (ith entry of AwT)/(ith entry of wT); call this average λ. Step 3: Compute the consistency index (CI) as CI = (λ – n)/(n – 1). Step 4: Compare CI to the random index (RI) for the appropriate value of n. 555

556 For a perfectly consistent decision maker, the ith entry in AwT equals n times the ith entry of wT.
If CI is sufficiently small, the decision maker’s comparisons are probably consistent enough to give useful estimates of the weights for his or her objective function. 556
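The four-step consistency check can be sketched in code. The 3×3 pairwise comparison matrix below is made up, and the weights are estimated by the common normalized-column-average approximation (one of several standard ways to estimate w, not necessarily the book’s):

```python
# Sketch: AHP consistency index for a made-up 3x3 pairwise comparison matrix.
n = 3
A = [[1,   3,   5],
     [1/3, 1,   2],
     [1/5, 1/2, 1]]

# Estimate weights w: normalize each column, then average across columns.
col_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]
w = [sum(A[i][j] / col_sums[j] for j in range(n)) / n for i in range(n)]

# Step 1: compute A w^T.  Step 2: average the ratios (A w^T)_i / w_i.
Aw = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
lam = sum(Aw[i] / w[i] for i in range(n)) / n

# Step 3: CI = (lambda - n) / (n - 1).  Step 4 would compare CI to RI.
CI = (lam - n) / (n - 1)
# For this nearly consistent matrix, CI is close to 0.
```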

557 Finding the Score of an Alternative for an Objective
We now determine how well each job “satisfies” or “scores” on each objective. To determine these scores, we construct for each objective a pairwise comparison matrix in which the rows and columns are possible decisions. As described earlier, we can now “synthesize” the objective weights with the scores of each job on each objective to obtain an overall score for each alternative. 557

558 AHP has been applied by decision makers in countless areas, including accounting, finance, marketing, energy resource planning, microcomputer selection, sociology, architecture, and political science. 558

559 Implementing AHP on a Spreadsheet
Figure 5 in the book illustrates how easy it is to implement AHP on a spreadsheet. The work is completed in the AHP.xls file. To compute the consistency index for a pairwise comparison matrix for objectives, the EXCEL matrix multiplication function MMULT is used, computing AwT. The MMULT function is easily used to multiply matrices. 559

560 Operations Research: Applications and Algorithms
Chapter 14 Game Theory to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 560

561 14.1 Two-Person Zero-Sum and Constant-Sum Games: Saddle Points
Characteristics of Two-Person Zero-Sum Games There are two players (called the row player and column player) The row player must choose 1 of m strategies. Simultaneously, the column player must choose 1 of n strategies. If the row player chooses her ith strategy and the column player chooses her jth strategy, then the row player receives a reward of aij and the column player loses an amount aij. Thus, we may think of the row player’s reward of aij as coming from the column player. Such a game is called a two-person zero-sum game, which can be represented by a reward matrix. 561

562 A game in which the payoffs to the players always add up to zero is called a zero-sum game. A two-person zero-sum game is played according to the following basic assumption: Each player chooses a strategy that enables him/her to do the best he/she can, given that his/her opponent knows the strategy he/she is following. A two-person zero-sum game has a saddle point if and only if max over all rows (row minimum) = min over all columns (column maximum) (1) 562

563 If a two-person zero-sum game has a saddle point
The row player should choose any strategy (row) attaining the maximum on the left side of (1). The column player should choose any strategy (column) attaining the minimum on the right side of (1). An easy way to spot a saddle point is to observe that the reward for a saddle point must be the smallest number in its row and the largest number in its column. A saddle point can also be thought of as an equilibrium point in that neither player can benefit from a unilateral change in strategy. 563
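The saddle-point condition can be sketched as a direct check; the reward matrix below is a made-up illustration, not from the text:

```python
# Sketch: checking the saddle-point condition
#   max over rows (row minimum) = min over columns (column maximum)
# on a made-up reward matrix (rewards to the row player).
A = [[4, 5, 10],
     [3, 1,  2],
     [4, 4,  6]]

row_mins = [min(row) for row in A]
col_maxs = [max(A[i][j] for i in range(len(A))) for j in range(len(A[0]))]

floor = max(row_mins)      # best guaranteed outcome for the row player
ceiling = min(col_maxs)    # best guaranteed outcome for the column player
has_saddle = (floor == ceiling)
# When has_saddle is True, the common value is the value of the game.
```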

564 A two-person constant-sum game is a two- player game in which, for any choice of both player’s strategies, the row player’s reward and the column player’s reward add up to a constant value c. 564

565 14.2 Two-Person Zero-Sum Games: Randomized Strategies, Domination and Graphical Solution
It has been shown that not all two-person zero-sum games have saddle points. How can the value and optimal strategies for a two-person zero-sum game that does not have a saddle point be found? 565

566 Example 2: Odds and Evens
Two players (called Odd and Even) simultaneously choose the number of fingers (1 or 2) to put out. If the sum of the fingers put out by both players is odd, then Odd wins $1 from Even. If the sum of the fingers put out by both players is even, then Even wins $1 from Odd. Consider the row player to be Odd and the column player to be Even. Determine whether this game has a saddle point. 566

567 Example 2: Solution This is a zero-sum game with the reward matrix shown (rewards to Odd): if both players put out 1 finger, –1; if one puts out 1 finger and the other 2, +1; if both put out 2 fingers, –1. Each row minimum is –1 and each column maximum is +1, so the game has no saddle point. Observe that for any choice of strategies by both players, there is a player who can benefit by unilaterally changing his or her strategy. Thus, no choice of strategies by the players is stable. 567

568 To further the analysis of the Example, the set of allowable strategies for each player must be expanded to include randomized strategies. Any mixed strategy (x1, x2, … ,xn) for the row player is a pure strategy if any of the xi equals 1. Any mixed strategy may be written as (x1, 1 - x1), and it suffices to determine the value of x1. As functions of x1, the expected rewards can be drawn as line segments on a graph. 568

569 The common value of the floor and ceiling is called the value of the game to the row player.
Any mixed strategy for the row player that guarantees that the row player gets an expected reward at least equal to the value of the game is an optimal strategy for the row player. 569
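For a 2×2 zero-sum game with no saddle point, the graphical solution above has a standard closed form (a well-known result, though not stated explicitly in these slides): x1 = (a22 − a21)/((a11 + a22) − (a12 + a21)) and value v = (a11a22 − a12a21)/((a11 + a22) − (a12 + a21)). Applied to the odds-and-evens matrix from the example:

```python
# Sketch: closed-form mixed strategy and value for a 2x2 zero-sum game
# with no saddle point, using the odds-and-evens reward matrix (to Odd).
a11, a12 = -1, 1
a21, a22 = 1, -1

den = (a11 + a22) - (a12 + a21)
x1 = (a22 - a21) / den                      # probability Odd puts out 1 finger
value = (a11 * a22 - a12 * a21) / den       # value of the game to the row player
# Each player should mix 50-50, and the game is fair (value 0).
```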

570 14.3 Linear Programming and Zero-Sum Games
Linear programming can be used to find the value and optimal strategies for any two- person zero-sum game. 570

571 Example 4: Stone, Paper, Scissors
Two players must simultaneously utter one of the three words stone, paper, or scissors and show corresponding hand signs. If both players utter the same word, then the game is a draw. Otherwise one player wins $1 from the other player according to these rules: Scissors defeats (cuts) paper Paper defeats (covers) stone Stone defeats (breaks) scissors Find the optimal strategies for this two-person zero-sum game. 571

572 Example 4: Solution The reward matrix is Define:
x1 = probability that row player chooses stone x2 = probability that row player chooses paper x3 = probability that row player chooses scissors y1 = probability that column player chooses stone y2 = probability that column player chooses paper y3 = probability that column player chooses scissors The rewards to the row player: 0 when both players choose the same word, +1 when the row player’s choice defeats the column player’s, and –1 otherwise. 572

573 If the row player chooses a mixed strategy (x1, x2, x3), then the optimal strategy can be found by solving the LP max z = v s.t. v ≤ x2 – x3 (Stone constraint) v ≤ -x1 + x3 (Paper constraint) v ≤ x1 – x2 (Scissors constraint) x1 + x2 + x3 = 1 x1, x2, x3 ≥ 0; v urs If the column player chooses a mixed strategy (y1, y2, y3), then the optimal strategy can be found by solving the LP min z = w s.t. w ≥ -y2 + y3 (Stone constraint) w ≥ y1 – y3 (Paper constraint) w ≥ -y1 + y2 (Scissors constraint) y1 + y2 + y3 = 1 y1, y2, y3 ≥ 0; w urs 573
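As a rough stand-in for an LP solver, the row player's problem can be approximated by a grid search over mixed strategies, maximizing the worst-case expected reward (this is only a sketch of the LP's objective, not the simplex method the book would use):

```python
# Sketch: approximate the row player's LP for Stone, Paper, Scissors by a
# grid search over mixed strategies (x1, x2, x3) on the probability simplex,
# maximizing the worst case min(x2 - x3, -x1 + x3, x1 - x2).
steps = 300
best_v, best_x = float("-inf"), None
for i in range(steps + 1):
    for j in range(steps + 1 - i):
        x1, x2 = i / steps, j / steps
        x3 = 1 - x1 - x2
        v = min(x2 - x3, -x1 + x3, x1 - x2)   # worst-case expected reward
        if v > best_v:
            best_v, best_x = v, (x1, x2, x3)
# The search recovers (1/3, 1/3, 1/3) with value 0, matching the LP solution.
```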

574 The column player's LP is the dual of the row player's LP.
Thus the row player's floor equals the column player's ceiling. This result is often known as the Minimax Theorem. The common value of v and w is the value of the game to the row player. 574

575 Ex. 4 – Solution continued
The most negative element in the Stone, Paper, Scissors reward matrix is -1. Therefore, we add |-1| = 1 to each element of the reward matrix. This yields a constant-sum game, and the LPs for each player are modified accordingly. Stone, Paper, Scissors appears to be a fair game, so we conjecture that each player's optimal strategy is (1/3, 1/3, 1/3). This solution is primal feasible and dual feasible, so a pair of primal feasible and dual feasible solutions has been found. 575

576 Ex. 4 – Solution continued
The value of Stone, Paper, Scissors is v′ - 1 = 1 - 1 = 0. The optimal strategy for the row player is (1/3, 1/3, 1/3). The optimal strategy for the column player is (1/3, 1/3, 1/3). 576

577 Simply type in either the row or column player’s problem.
LINDO or LINGO can be used to solve for the value and the optimal strategies in a two- person zero-sum game. Simply type in either the row or column player’s problem. 577

578 14.4 Two-Person Nonconstant-Sum Games
Most game-theoretic models of business situations are not constant-sum games, because it is unusual for business competitors to be in total conflict. As in a two-person zero-sum game, a choice of strategy by each player is an equilibrium point if neither player can benefit from a unilateral change in strategy. 578

579 Example 7: Prisoner's Dilemma
Two prisoners who escaped and participated in a robbery have been recaptured and are awaiting trial for their new crime. Although they are both guilty, the district attorney is not sure he has enough evidence to convict them. To entice them to testify against each other, the district attorney tells each prisoner: "If only one of you confesses and testifies against your partner, the person who confesses will go free, while the person who does not confess will surely be convicted and given a 20-year jail sentence. If both of you confess, then you will both be convicted and sent to prison for 5 years. Finally, if neither of you confesses, I can convict you both of a misdemeanor and you will each get 1 year in prison." What should each prisoner do? 579

580 Example 7: Solution Assuming that the prisoners cannot communicate with each other, the strategies and rewards for each (years in prison, expressed as negative rewards) are shown below. This is not a constant-sum two-player game.

                           Prisoner 2
Prisoner 1        Confess        Don't Confess
Confess          (-5, -5)        (0, -20)
Don't Confess    (-20, 0)        (-1, -1)

For each prisoner the "confess" strategy dominates the "don't confess" strategy. Each prisoner seeks to eliminate any dominated strategies from consideration. 580

581 Ex. 7 – Solution continued
On the other hand, if each prisoner chooses the dominated "don't confess" strategy, then each prisoner will spend only 1 year in prison. Thus, if each prisoner chooses his dominated strategy, both are better off than if each prisoner chooses his undominated strategy. 581

582 A general prisoner's dilemma reward matrix where:
NC = noncooperative action
C = cooperative action
P = punishment for not cooperating
S = payoff to person who is double-crossed
R = reward for cooperating if both players cooperate
T = temptation for double-crossing opponent

                    Player 2
Player 1        NC           C
NC            (P, P)      (T, S)
C             (S, T)      (R, R) 582

583 14.5 Introduction to n-Person Game Theory
In many competitive situations, there are more than two competitors. Any game with n players is an n-person game and is specified by the game's characteristic function. For each subset S of N, the characteristic function v of a game gives the amount v(S) that the members of S can be sure of receiving if they act together and form a coalition. v(S) can be determined by calculating the amount that members of S can get without help from players who are not in S. 583

584 Example 11: The Drug Game Joe Willie has invented a new drug.
Joe cannot manufacture the drug himself but can sell the drug's formula to company 2 or company 3. The lucky company will split a $1 million profit with Joe Willie. Find the characteristic function for this game. 584

585 Example 11: Solution Letting Joe Willie be player 1, company 2 be player 2, and company 3 be player 3, the characteristic function for this game is
v({ }) = v({1}) = v({2}) = v({3}) = v({2,3}) = $0
v({1,2}) = v({1,3}) = v({1,2,3}) = $1,000,000 585

586 The characteristic function must satisfy the following inequality.
Consider any two subsets A and B such that A and B have no players in common (A ∩ B = ∅). The characteristic function must satisfy v(A ∪ B) ≥ v(A) + v(B). This property is called superadditivity. There are many solution concepts for n-person games. A solution concept should indicate the reward that each player will receive. 586
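Superadditivity, v(A ∪ B) ≥ v(A) + v(B) for disjoint A and B, is easy to check mechanically once the characteristic function is stored as a table. The sketch below (names and representation are my own) verifies it for the drug game of Example 11.

```python
# Check superadditivity of a characteristic function stored as a dict
# keyed by frozenset. Players: 1 = Joe Willie, 2 and 3 = the companies.
from itertools import combinations

N = {1, 2, 3}
v = {frozenset(): 0, frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
     frozenset({2, 3}): 0,
     frozenset({1, 2}): 1_000_000, frozenset({1, 3}): 1_000_000,
     frozenset({1, 2, 3}): 1_000_000}

def subsets(s):
    """All subsets of s, as frozensets."""
    for r in range(len(s) + 1):
        yield from (frozenset(c) for c in combinations(s, r))

def superadditive(v, N):
    """v(A | B) >= v(A) + v(B) for every pair of disjoint subsets of N."""
    for A in subsets(N):
        for B in subsets(N - A):       # B ranges over subsets disjoint from A
            if v[A | B] < v[A] + v[B]:
                return False
    return True

print(superadditive(v, N))  # True for the drug game
```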

587 If x satisfies both, it is said that x is an imputation.
Let x = (x1, x2, …, xn) be a vector such that player i receives a reward xi. This is called a reward vector. A reward vector x = (x1, x2, …, xn) is not a reasonable candidate for a solution unless x satisfies
x1 + x2 + … + xn = v(N) (Group rationality)
xi ≥ v({i}) for each player i (Individual rationality)
If x satisfies both, it is said that x is an imputation. 587

588 14.6 The Core of an n-Person Game
An important solution concept for an n-person game is the core. The concept of domination: given imputations x and y, y dominates x through a coalition S (written y >S x) if ∑i∈S yi ≤ v(S) and yi > xi for all i in S. The core of an n-person game is the set of all undominated imputations. 588

589 Theorem 1 An imputation x = (x1, x2, …, xn) is in the core of an n-person game if and only if x1 + x2 + … + xn = v(N) and, for each subset S of N,
∑i∈S xi ≥ v(S) 589

590 Ex. 11 – Solution continued
Find the core of the drug game. For this game, Theorem 1 shows that x = (x1, x2, x3) will be in the core if and only if x1, x2, x3 satisfy
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0
x1 + x2 ≥ $1,000,000
x1 + x3 ≥ $1,000,000
x2 + x3 ≥ 0
x1 + x2 + x3 = $1,000,000
The only point satisfying all of these conditions is ($1,000,000, $0, $0), so the core of the game is ($1,000,000, $0, $0); it emphasizes the importance of player 1. 590
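Theorem 1 translates directly into code: a reward vector is in the core iff it sums to v(N) and gives every coalition at least its characteristic-function value. The sketch below (my own function names) tests two candidate vectors for the drug game.

```python
# Core membership test via Theorem 1: enumerate every coalition S of N.
from itertools import combinations

def in_core(x, v, N):
    """x maps player -> reward; v maps frozenset -> value; N is the player set."""
    players = sorted(N)
    if sum(x[i] for i in players) != v[frozenset(players)]:
        return False
    return all(
        sum(x[i] for i in S) >= v[frozenset(S)]
        for r in range(1, len(players) + 1)
        for S in combinations(players, r)
    )

# Drug game: coalitions containing player 1 with at least one company are worth $1M.
v = {frozenset(s): (1_000_000 if 1 in s and len(s) >= 2 else 0)
     for r in range(4) for s in combinations((1, 2, 3), r)}

print(in_core({1: 1_000_000, 2: 0, 3: 0}, v, {1, 2, 3}))            # True
print(in_core({1: 500_000, 2: 250_000, 3: 250_000}, v, {1, 2, 3}))  # False
```

The second vector fails because coalition {1, 2} receives only $750,000 < v({1,2}) = $1,000,000.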

591 14.7 The Shapley Value An alternative solution concept for n-person games is the Shapley value, which in general gives more equitable solutions than the core does. Theorem 2 Given any n-person game with characteristic function v, there is a unique reward vector x = (x1, x2, …, xn) satisfying the axioms. The reward of the ith player (xi) is given by
xi = ∑S: i∉S pn(S)[v(S ∪ {i}) - v(S)], where pn(S) = |S|!(n - |S| - 1)!/n! 591

592 Ex. 11 – Solution continued
Find the Shapley value for the drug game. To compute x1, the reward that player 1 should receive, list all coalitions S for which player 1 is not a member. For each of these coalitions, compute v(S ∪ {1}) – v(S) and p3(S). 592
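The pn(S)-weighted sum is equivalent to averaging each player's marginal contribution over all n! orderings of the players, which is easy to code. The sketch below (my own function, using exact fractions) computes the Shapley value of the drug game: player 1 gets 2/3 of the $1,000,000 and each company gets 1/6.

```python
# Shapley value as the average marginal contribution over all player orderings.
from itertools import combinations, permutations
from fractions import Fraction
from math import factorial

def shapley(v, players):
    x = {i: Fraction(0) for i in players}
    for order in permutations(players):
        S = frozenset()
        for i in order:
            x[i] += v[S | {i}] - v[S]  # marginal contribution of i to S
            S |= {i}
    n_fact = factorial(len(players))
    return {i: x[i] / n_fact for i in players}

# Drug game characteristic function (Example 11)
v = {frozenset(s): (1_000_000 if 1 in s and len(s) >= 2 else 0)
     for r in range(4) for s in combinations((1, 2, 3), r)}

print(shapley(v, (1, 2, 3)))
```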

593 Operations Research: Applications and Algorithms
Chapter 15 Deterministic EOQ Inventory Models to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 593

594 Description We now begin our formal study of inventory modeling.
We begin by discussing some important concepts used to describe inventory models and then develop versions of the famous economic order quantity (EOQ) model that can be used to make optimal inventory decisions when demand is deterministic. 594

595 15.1 Introduction to Basic Inventory Models
The purpose of inventory theory is to determine rules that management can use to minimize the costs associated with maintaining inventory and meeting customer demand. Inventory models answer the following questions When should an order be placed for a product? How large should each order be? 595

596 Costs Involved in Inventory Models
The inventory models that we will discuss involve some or all of the following costs: Ordering and Setup Cost These costs do not depend on the size of the order. They typically include things like paperwork, billing or machine setup time if the product is made internally. Unit Purchasing Cost This cost is simply the variable cost associated with purchasing a single unit. Typically, the unit purchasing cost includes the variable labor cost, variable overhead cost, and raw material cost. 596

597 Holding or Carrying Cost
This is the cost of carrying one unit of inventory for one time period. The holding cost usually includes storage cost, insurance cost, taxes on inventory, and others. Usually, however, the most significant component of holding cost is the opportunity cost incurred by tying up capital in inventory. Stockout or Shortage Cost When a customer demands a product and the demand is not met on time, a stockout, or shortage, is said to occur. If customers will accept delivery at a later date, we say the demands are back-ordered. This case is often referred to as the backlogged demand case. If they will not accept late delivery, we are in the lost sales case. These costs are often harder to measure than other costs. 597

598 Assumptions of EOQ Models
Repetitive Ordering The ordering decision is repetitive in the sense that it is repeated in a regular fashion. Constant Demand Demand is assumed to occur at a known, constant rate. Constant Lead Time The lead time for each order is a known constant, L. By the lead time we mean the length of time between the instant when an order is placed and the instant at which the order arrives. 598

599 Continuous Ordering An order may be placed at any time. Inventory models that allow this are called continuous review models. If the amount of on-hand inventory is reviewed periodically and orders may be placed only periodically, we are dealing with a periodic review model. 599

600 15.2 Basic Economic Order Quantity Model
For the basic EOQ model to hold, certain assumptions are required: Demand is deterministic and occurs at a constant rate. If an order of any size (say, q units) is placed, an ordering and setup cost K is incurred. The lead time for each order is zero. No shortages are allowed. The cost per unit-year of holding inventory is h. 600

601 Assumptions of the Basic EOQ Model
We define D to be the number of units demanded per year. The setup cost K of assumption 2 is in addition to a cost pq of purchasing or producing the q units ordered. We assume p (the purchasing cost) does not depend on the size of the order. Assumption 3 implies that each order arrives on time. Assumption 4 implies that all demand is met on time. 601

602 Assumption 5 implies that if I units are held for T years, a holding cost of ITh is incurred.
Given these assumptions, the EOQ model determines an ordering policy that minimizes the yearly sum of ordering cost, purchasing cost, and holding cost. 602

603 Derivation of Basic EOQ Model
We begin by making some simple observations. We should never place an order when I, the inventory level, is greater than zero; if we did, we would incur an unnecessary holding cost. If I = 0, we must place an order to prevent a shortage. Each time an order is placed (when I = 0), we should order the same quantity, q. Then
TC(q) = annual cost of placing orders + annual purchasing cost + annual holding cost
To determine the annual holding cost, we need to examine the behavior of I over time. 603

604 A key concept in the study of EOQ models is the idea of a cycle.
Definition: Any interval of time that begins with the arrival of an order and ends the instant before the next order is received is called a cycle. In the figure the average inventory during any cycle is simply half of the maximum inventory level attained during the cycle. This result will hold in any model for which demand occurs at a constant rate. 604

605 The economic order quantity, or EOQ, minimizes TC(q).
Setting the derivative of TC(q) to zero yields
q* = √(2KD/h)
Thus, q* does indeed minimize total annual cost. The figure on the next page confirms the fact that at q*, the annual holding and ordering costs are the same. 605

606 Example 1 Braneast Airlines uses 500 taillights per year.
Each time an order for taillights is placed, an ordering cost of $5 is incurred. Each light costs 40¢, and the holding cost is 8¢/light/year. Assume that demand occurs at a constant rate and shortages are not allowed. What is the EOQ? How many orders will be placed each year? How much time will elapse between the placement of orders? 606

607 Example 1 Solution We are given that K = $5, h = $0.08/light/year, and D = 500 lights/year. The EOQ is q* = √(2(5)(500)/0.08) = 250 lights. Hence, the airline should place an order for 250 taillights each time that inventory reaches zero. Then 500/250 = 2 orders will be placed each year. 607

608 The time between placement (or arrival) of orders is simply the length of a cycle.
Since the length of each cycle is q*/D, the time between orders will be 250/500 = 0.5 year. 608
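Example 1's numbers can be reproduced with a few lines of Python (a sketch; the function name is my own):

```python
# Example 1: K = $5/order, D = 500 lights/year, h = $0.08/light/year.
from math import sqrt

def eoq(K, D, h):
    """Basic economic order quantity q* = sqrt(2KD/h)."""
    return sqrt(2 * K * D / h)

q_star = eoq(5, 500, 0.08)       # 250 lights per order
orders_per_year = 500 / q_star   # 2 orders per year
cycle_length = q_star / 500      # 0.5 year between orders
print(q_star, orders_per_year, cycle_length)
```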

609 Sensitivity of Total Cost to Small Variations in the Order Quantity
In most situations, a slight deviation from the EOQ will result in only a slight increase in costs. For Example 1, let's see how deviations from the EOQ change the total annual cost. We focus our attention on how the annual holding and ordering costs are affected by changes in the order quantity. Let
HC(q) = annual holding cost if the order quantity is q
OC(q) = annual ordering cost if the order quantity is q 609

610 We find that HC(q) = 0.04q and OC(q) = 2500/q.
The figure shows that HC(q) + OC(q) is very flat near q*. The flatness of the HC(q) + OC(q) curve is very important, because it is often difficult to estimate h and K. Inaccurate estimation of h and K may result in a value of q that differs slightly from the actual EOQ. The flatness of the curve indicates that even a moderate error in the determination of the EOQ will increase costs by only a slight amount. 610
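The flatness described above can be seen numerically. Using Example 1's HC(q) = 0.04q and OC(q) = 2500/q, a 20% error in the order quantity raises cost by under 2%:

```python
# Holding-plus-ordering cost for Example 1; q* = 250 gives the minimum.
def cost(q):
    return 0.04 * q + 2500 / q

print(cost(250))  # ≈ $20.00 at the EOQ
print(cost(300))  # ≈ $20.33: q is 20% off, cost only ~1.7% higher
```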

611 Determination of EOQ When Holding Cost is Expressed in Terms of Dollar Value of Inventory
Often the annual holding cost is expressed in terms of the cost of holding one dollar's worth of inventory for one year. Suppose that hd = cost of holding one dollar in inventory for one year. Then the cost of holding one unit of inventory for one year will be phd, and the EOQ may be written as q* = √(2KD/(phd)). 611

612 The Effect of Nonzero Lead Time
We now allow lead time L to be greater than zero. The introduction of a nonzero lead time leaves the annual holding and ordering costs unchanged. Hence, the EOQ still minimizes total costs. To prevent shortages from occurring and to minimize holding cost, each order must be placed at an inventory level that ensures that when each order arrives, the inventory level will equal zero. Definition: The inventory level at which an order should be placed is the reorder point. 612

613 To determine the reorder point for the basic EOQ model, two cases must be considered:
Case 1: Demand during the lead time does not exceed the EOQ. In this case the reorder point occurs when the inventory level equals LD. The order will arrive L time units later, and upon arrival of the order, the inventory level will equal LD - LD=0. Case 2: Demand during the lead time exceeds EOQ. In this case the reorder point does not equal LD. In general it can be shown that the reorder point equals the remainder when LD is divided by the EOQ. 613
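Both cases above reduce to the same rule: the reorder point is the remainder when lead-time demand LD is divided by the EOQ (the rule the spreadsheet template later implements with MOD). A short sketch, with illustrative numbers:

```python
# Reorder point for the basic EOQ model with lead time L.
def reorder_point(L, D, q_star):
    """Remainder when lead-time demand L*D is divided by the EOQ."""
    return (L * D) % q_star

# Case 1: lead-time demand below the EOQ (L = 0.1 yr, D = 500/yr, q* = 250)
print(reorder_point(0.1, 500, 250))   # 50 units
# Case 2: lead-time demand exceeds the EOQ (L = 0.6 yr, so LD = 300)
print(reorder_point(0.6, 500, 250))   # also 50 units (300 mod 250)
```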

614 The determination of the reorder point becomes extremely important when demand is random and stockouts can occur. 614

615 Spreadsheet Template for the Basic EOQ Model
The user inputs the values of K, h, lead time (L), and D. In A8 the EOQ is determined by the formula (2*K*D/H)^.5. In B8 we compute annual holding cost by the formula .5*A8*H. In D5 we compute orders per year for the EOQ by the formula D/A8. In C8 we compute annual ordering cost for the EOQ from the formula K*D5. 615

616 In D8 we compute total annual cost for the EOQ by the formula B8+C8.
In B11 we compute the reorder point by the formula =MOD(L*D,A8). This yields the remainder obtained when L*D is divided by the EOQ. 616

617 Power of Two Ordering Policies
Suppose a company orders three products, and the EOQs for each product yield times between orders of 3.5 days, 5.6 days, and 9.2 days. It would rarely be the case that the orders all arrive on the same day. If we could synchronize our reorder intervals so that orders for different products often arrived on the same day, we could often greatly reduce our fixed costs. 617

618 Roundy (1986) devised an elegant and simple method, called Power of Two Ordering Policies, to ensure that the placement of orders for multiple products is well synchronized. Let q* = EOQ. Then the optimal reorder interval for a product is t* = q*/D. The virtue of a Power of Two policy is that different products will frequently arrive at the same time. This will greatly reduce fixed costs. 618
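One simple way to sketch Roundy's idea (a simplified illustration, not his full analysis) is to round each product's reorder interval t* = q*/D to the nearest power of two times a base period, so order arrivals land on common days:

```python
# Round each reorder interval to the nearest power of 2 (in the log domain).
from math import log2

def power_of_two_interval(t_star, base=1.0):
    k = round(log2(t_star / base))
    return base * 2 ** k

# The three products from the slide: 3.5, 5.6, and 9.2 days between orders
print([power_of_two_interval(t) for t in (3.5, 5.6, 9.2)])  # [4.0, 4.0, 8.0]
```

With intervals of 4, 4, and 8 days, every order for the third product coincides with orders for the first two.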

619 15.3 Computing the Optimal Order Quantity When Quantity Discounts Are Allowed
Up to now, we have assumed that the annual purchase cost does not depend on the order size. In real life, however, suppliers often reduce the unit purchasing price for large orders. Such price reductions are referred to as quantity discounts. The approach used previously is no longer valid and we need a new approach to find the optimal quantity. 619

620 If we let q be the quantity ordered each time an order is placed, the general quantity discount model analyzed in this section may be described as follows: If q < b1, each item costs p1 dollars. If b1 ≤ q < b2, each item costs p2 dollars. … If bk-2 ≤ q < bk-1, each item costs pk-1 dollars. If bk-1 ≤ q < bk = ∞, each item costs pk dollars. Since b1, b2, …, bk-1 are points where a price change occurs, we refer to them as price break points. 620

621 Example 4 A local accounting firm in Smalltown orders boxes of floppy disks from a store in Megalopolis. The per-box price charged by the store depends on the number of boxes purchased. The accounting firm uses 10,000 disks per year. The cost of placing an order is assumed to be $100. 621

622 Example 4 The only holding cost is the opportunity cost of capital, which is assumed to be 20% per year. For this example,
b1 = 100, b2 = 300
p1 = $50.00, p2 = $49.00, p3 = $48.50 622

623 These definitions are illustrated on the next slide.
Before explaining how to find the order quantity minimizing total annual costs, we need the following definitions: TCi(q) = total annual cost if each order is for q units at a price pi. EOQi = quantity that minimizes total annual cost if, for any order quantity, the purchasing cost of the item is pi. EOQi is admissible if bi-1 ≤ EOQi < bi. TC(q) = actual annual cost if q items are ordered each time an order is placed. These definitions are illustrated on the next slide. Our goal is to find a value of q minimizing TC(q). 623

624 In general, the value of q minimizing TC(q) can be either a break point (shown in the second figure) or some EOQi (shown in the first figure). The following observations are helpful in determining the point that minimizes TC(q). For any value of q, TCk(q) < TCk-1(q) < … < TC2(q) < TC1(q). If EOQi is admissible, then the minimum cost for bi-1 ≤ q < bi occurs for q = EOQi (see the first figure on the next slide). If EOQi < bi-1, the minimum cost for bi-1 ≤ q < bi occurs for q = bi-1 (see the second figure on the next slide). 624

625 If EOQi is admissible, then TC(q) cannot be minimized at an order quantity for which the purchasing price per item exceeds pi. Thus if EOQi is admissible, the optimal order quantity must occur for price pi, pi+1, …, or pk. These observations allow us to use the following method to determine the optimal order quantity when quantity discounts are allowed. Beginning with the lowest price, determine for each price the order quantity that minimizes total annual cost for bi-1 ≤ q < bi. 625

626 Continue determining q*k, q*k-1, …, until one of the q*i's is admissible.
Continue determining q*k, q*k-1, …, until one of the q*i's is admissible; from observation 2, this will mean that q*i = EOQi. The optimal order quantity will be the member of {q*k, q*k-1, …, q*i} with the smallest value of TC(q). 626
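The procedure above — sweep downward from the lowest price, clamp each EOQi up to its interval's lower break point, and stop at the first admissible EOQi — can be sketched in a few lines. Function and parameter names below are my own; the test data is Example 4's (K = $100, D = 1000 boxes, holding cost 20% of unit price).

```python
# Quantity-discount EOQ: breaks = [b0, b1, ..., bk], prices = [p1, ..., pk].
from math import sqrt, inf

def discount_eoq(K, D, hd, breaks, prices):
    """Sweep prices from lowest (last) to highest; stop at first admissible EOQ."""
    best_q, best_cost = None, inf
    for i in range(len(prices) - 1, -1, -1):
        lo, hi = breaks[i], breaks[i + 1]
        h = hd * prices[i]                 # holding cost at this price
        eoq_i = sqrt(2 * K * D / h)
        q = max(eoq_i, lo)                 # observation 3: clamp up to the break point
        cost = K * D / q + prices[i] * D + h * q / 2
        if cost < best_cost:
            best_q, best_cost = q, cost
        if lo <= eoq_i < hi:               # observation 2: admissible, so stop
            break
    return best_q, best_cost

q, tc = discount_eoq(100, 1000, 0.2, [0, 100, 300, inf], [50.0, 49.0, 48.5])
print(q, tc)  # 300 boxes, ≈ $50,288.33 per year
```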

627 Example 4 continued Each time an order is placed for disks, how many boxes of disks should be ordered? How many orders will be placed annually? What is the total annual cost of meeting the accounting firm’s disk needs? 627

628 Example 4 Solution Note that K = $100 and D = 1000 boxes per year.
We first determine the best order quantity for p3 = $48.50 and 300 ≤ q. The unit holding cost is h3 = 0.2(48.50) = $9.70, so EOQ3 = √(2(100)(1000)/9.70) ≈ 143.59. Since EOQ3 < 300, EOQ3 is not admissible. Therefore, for q ≥ 300, TC3(q) is minimized by q*3 = 300. 628

629 Example 4 Solution We next consider p2 = $49.00 and 100 ≤ q < 300. The unit holding cost is h2 = 0.2(49.00) = $9.80, so EOQ2 = √(2(100)(1000)/9.80) ≈ 142.86. Since 100 ≤ EOQ2 < 300, EOQ2 is admissible, and for a price p2 = $49.00, the best we can do is to choose q*2 = 142.86. Since q*2 is admissible, p1 = $50.00 and 0 ≤ q < 100 cannot yield the order quantity minimizing TC(q). Thus, either q*2 or q*3 will minimize TC(q). 629

630 Example 4 Solution To determine which of these order quantities minimizes TC(q), we must find the smaller of TC3(300) and TC2(142.86). For q*3 the annual unit holding cost is $9.70. Thus, for q*3,
Annual ordering cost = 100(1000/300) = $333.33
Annual purchasing cost = 1000(48.50) = $48,500
Annual holding cost = (½)(300)(9.7) = $1455
TC3(300) = $50,288.33 630

631 Example 4 Solution For q*2 the annual unit holding cost is $9.80. Thus, for q*2,
Annual ordering cost = 100(1000/142.86) = $699.99
Annual purchasing cost = 1000(49) = $49,000
Annual holding cost = (½)(142.86)(9.8) = $700.01
TC2(142.86) = $50,400
Thus q*3 = 300 will minimize TC(q). Our analysis shows that each time an order is placed, 300 boxes of disks should be ordered. Then 1000/300 = 3.33 orders are placed each year. 631

632 Spreadsheet Template for Quantity Discounts
The table illustrates how inventory problems with a quantity discount can be solved on a spreadsheet. In cell C2 we enter D, the annual demand. In cell D2(HD) we enter the annual cost of holding $1 of goods in inventory for one year. In the cell range A6:C8 we enter the left-hand endpoint, right-hand endpoint, and price for each interval. 632

633 Spreadsheet Template for Quantity Discounts
In D6 we compute the EOQ for the interval b0 = 0 ≤ order quantity < 100 = b1 by entering the formula (2*$K*$D/($HD*C6))^.5. This computes the order quantity that would be optimal at the first interval's price. In E6 we enter the formula =IF(AND(D6>=A6,D6<B6),D6,IF(D6<A6,A6,B6-1)), which keeps the EOQ if it is admissible and otherwise clamps it into the interval. 633

634 15.4 The Continuous Rate EOQ Model
Many goods are produced internally rather than purchased from an outside supplier. In this situation, the EOQ assumption that each order arrives at the same instant seems unrealistic; it isn’t possible to produce, say, 10,000 cars at the drop of a hard hat. If a company meets demand by making its own products, the continuous rate EOQ model will be more realistic than the traditional EOQ model. 634

635 The variation of inventory over time is shown.
The continuous rate EOQ model assumes that a firm can produce a good at a rate of r units per time period. This means that during any time period of length t, the firm can produce rt units. We define
q = number of units produced during each production run
K = cost of setting up a production run
h = cost of holding one unit in inventory for one year
D = annual demand for the product
The variation of inventory over time is shown. 635

636 As r increases, production occurs at a more rapid rate
As r increases, production occurs at a more rapid rate. Hence, for large r, the rate model should approach the instantaneous delivery situation of the EOQ model. 636

637 Spreadsheet Template for the Continuous Rate EOQ Model
The table in the book illustrates a template for the continuous rate EOQ model. In cell A6 the user inputs K; in B6, h; in C6, D; and in D6, the production rate r. In A8 the formula (2*K*D/H)^.5*(R/(R-D))^.5 computes the optimal run size. In B8 the formula D/Q computes the number of runs per year. In C8 we compute the annual cost with the formula (H*Q*(R-D)/(2*R))+K*D/Q. 637
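The template's run-size formula, q* = √(2KD/h)·√(r/(r-D)), also illustrates the earlier remark that the continuous rate model approaches the basic EOQ as r grows. A sketch with illustrative data (K = $5, D = 500/yr, h = $0.08/unit/yr, my own numbers):

```python
# Continuous rate EOQ run size; reduces to sqrt(2KD/h) as r -> infinity.
from math import sqrt

def continuous_rate_q(K, D, h, r):
    return sqrt(2 * K * D / h) * sqrt(r / (r - D))

print(continuous_rate_q(5, 500, 0.08, 1000))   # finite rate: larger runs (~353.55)
print(continuous_rate_q(5, 500, 0.08, 10**9))  # ≈ the plain EOQ of 250
```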

638 15.5 & 15.6 The EOQ Model with Back Orders Allowed & When to Use EOQ Models
In many real-life situations, demand is not met on time, and shortages occur. When a shortage occurs, costs are incurred. Let s be the cost of being short one unit for one year. The variables K, D, and h have their usual meanings. In most situations s is very difficult to measure. We assume all demand is backlogged and no sales are lost. 638

639 To determine the ordering policy that minimizes annual costs, we define
q = order quantity
q - M = maximum shortage that occurs under an ordering policy
Equivalently, the firm will be q – M units short each time an order is placed. We assume the lead time for each order is zero. Since each arriving order of q units first covers the q - M back orders, the firm's maximum inventory level will be -(q - M) + q = M. 639

640 TC(q,M) is minimized by q* and M*:
q* = √(2KD(h+s)/(hs))
M* = √(2KDs/(h(h+s)))
Maximum shortage = q* - M*
As s approaches infinity, q* and M* both approach the EOQ, and the maximum shortage approaches zero. 640
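The back-order formulas q* = √(2KD(h+s)/(hs)) and M* = √(2KDs/(h(h+s))) (the same formulas the spreadsheet template on the next slide uses) can be sketched directly, and the limiting behavior as s grows checked numerically:

```python
# EOQ with back orders: order size q*, max inventory M*, max shortage q* - M*.
from math import sqrt

def eoq_backorders(K, D, h, s):
    q = sqrt(2 * K * D * (h + s) / (h * s))
    M = sqrt(2 * K * D * s / (h * (h + s)))
    return q, M, q - M

# With a huge shortage cost s, the policy collapses to the basic EOQ (250
# for K=5, D=500, h=0.08) with essentially no planned shortage.
q, M, short = eoq_backorders(5, 500, 0.08, 1e9)
print(q, M, short)
```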

641 Spreadsheet Template for the EOQ Model with Back Orders
A spreadsheet template for the EOQ model with back orders. In cells A6, B6, C6, and D6 we enter the values of K, D, h, and s, respectively. In A8 we compute the optimal order quantity with the formula (2*K*D*(H+S)/(H*S))^.5. In B8 we compute the optimal value of M with the formula (2*K*D*S/(H*(H+S)))^.5. In C8 we compute the maximum shortage with the formula =Q-M. 641

642 In D8 we compute the annual total cost TC(q,M).
In D8 we compute the annual total cost TC(q,M) with the formula ((M^2*H)/(2*Q))+(((Q-M)^2*S)/(2*Q))+(K*D/Q). 642

643 15.7 Multiple Product Economic Order Quantity Models
Suppose a company orders several products. Each time an order is received, shipments of some of the products arrive. Each time an order arrives there is a fixed cost associated with the order, and there is another fixed cost associated with each product included in the order. How can we minimize the sum of annual holding and fixed costs? 643

644 To begin, we find the product that is most frequently ordered.
Chopra and Meindl (2001) devised a method to find a near-optimal solution to this type of problem. To begin, we find the product that is most frequently ordered. We assume this product will be included in each order. 644

645 We then set up a Solver model that determines the following Changing Cells:
The number of orders received per year. For all products other than the chosen product, the number of orders that must be received before an order of the product is received. We can easily determine the total fixed cost and the total holding cost for each product. The sum of these costs will be our target cell for Solver. Our model is highly nonlinear, and it is necessary to use the Evolutionary Solver to find the optimal solution. 645

646 Operations Research: Applications and Algorithms
Chapter 16 Probabilistic Inventory Models to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 646

647 Description Now we consider inventory models in which demand over a given time period is uncertain, or random: single-period inventory models, where a problem is ended once a single ordering decision has been made; single-period bidding models; versions of the EOQ model for uncertain demand that incorporate the important concepts of safety stock and service level; the periodic review (R,S) model; and the ABC inventory classification system. 647

648 16.1 Single Period Decision Models
In many situations, a decision maker is faced with the problem of determining the value q for a variable. After q has been determined, the value d assumed by a random variable D is observed. Depending on the values of d and q, the decision maker incurs a cost c(d,q). We assume that the person is risk-neutral and wants to choose q to minimize her expected cost. Since the decision is made only once, we call a model of this type a single-period decision model. 648

649 16.2 The Concept of Marginal Analysis
For the single-period model, we now assume that D is an integer-valued discrete random variable with P(D = d) = p(d). Let E(q) be the decision maker's expected cost if q is chosen. Then E(q) = ∑d p(d)c(d,q). In most practical applications, E(q) is a convex function of q. Let q* be the value of q that minimizes E(q). 649

650 If E(q) is not a convex function, this argument may not work.
From the graph we see that q* is the smallest value of q for which E(q*+1) – E(q*) ≥ 0. Thus if E(q) is a convex function of q, we can find the value of q minimizing expected cost by finding the smallest value of q that satisfies inequality. If E(q) is not a convex function, this argument may not work. Our approach determines q* by repeatedly computing the effect of adding a marginal unit to the value of q. It is called marginal analysis. 650

651 16.3 The News Vendor Problem: Discrete Demand
Organizations often face inventory problems where the following sequence of events occurs: The organization decides how many units to order (q). With probability p(d), a demand of d units occurs. Depending on d and q, a cost c(d,q) is incurred. Problems that follow this sequence are often called news vendor problems. We show how marginal analysis can be used to solve news vendor problems when demand is a discrete random variable and c(d,q) has the following form. 651

652 c(d,q) = coq + (terms not involving q) (d ≤ q)
c(d,q) = -cuq + (terms not involving q) (d ≥ q+1)
Here co is the overstocking cost and cu is the understocking cost. To derive the optimal order quantity via marginal analysis, let E(q) be the expected cost if an order is placed for q units. Since marginal analysis is applicable, we have just shown that E(q) will be minimized by the smallest value of q satisfying P(D ≤ q) ≥ cu/(co + cu). 652
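The standard marginal-analysis rule — order the smallest q with P(D ≤ q) ≥ cu/(co + cu) — is a one-pass scan over the demand distribution. A sketch with an illustrative distribution (the pmf and names are my own):

```python
# Discrete news vendor: smallest q whose cumulative probability reaches
# the critical ratio cu / (co + cu).
def news_vendor_q(pmf, co, cu):
    """pmf: dict mapping demand d -> P(D = d)."""
    critical = cu / (co + cu)
    cum = 0.0
    for d in sorted(pmf):
        cum += pmf[d]
        if cum >= critical:
            return d
    return max(pmf)

pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.25, 4: 0.15}
print(news_vendor_q(pmf, co=1.0, cu=3.0))  # critical ratio 0.75 -> order 3
```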

653 16.4 The News Vendor Problem: Continuous Demand
We now consider the news vendor problem when demand D is a continuous random variable having density function f(d). By modifying our marginal analysis argument, it can be shown that the decision maker's expected cost is minimized by ordering q* units, where q* is the smallest number satisfying P(D ≤ q*) = cu/(co + cu). 653

654 The optimal order quantity can be determined by finding the value of q* satisfying
F(q*) = P(D ≤ q*) = cu/(co + cu), where F is the cumulative distribution function of demand. 654
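When demand is normally distributed, the condition F(q*) = cu/(co + cu) is just a quantile lookup. A sketch with illustrative data (demand ~ N(100, 20²); the numbers and names are my own):

```python
# Continuous news vendor with normal demand: q* is the critical-ratio quantile.
from statistics import NormalDist

def news_vendor_normal(mu, sigma, co, cu):
    return NormalDist(mu, sigma).inv_cdf(cu / (co + cu))

q = news_vendor_normal(100, 20, co=1.0, cu=3.0)
print(q)  # the 75th percentile of N(100, 20), ≈ 113.49
```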

655 16.5 Other One-Period Models
Many single-period models in operations research cannot be easily handled by marginal analysis. In such situations we express the decision maker's objective function as a function f(q) of the decision variable q. Then we find a maximum or minimum of f(q) by setting f′(q) = 0. We illustrate this idea by discussing a bidding model. 655

656 Example 4 Condo Construction Company is bidding on an important construction job. The job will cost $2 million to complete. One other company is bidding for the job. Condo believes that the opponent’s bid is equally likely to be an amount between $2 million and $4 million. If Condo wants to maximize expected profit, what should its bid be? 656

657 Example 4 Solution Let
B = random variable representing the bid of Condo's opponent
b = actual bid of Condo's opponent
q = Condo's bid
If b > q, Condo outbids the opponent and earns a profit of q – 2,000,000. If b < q, Condo is outbid by the opponent and earns nothing. 657

658 Example 4 Solution Let E(q) be Condo's expected profit if it bids q. Then E(q) = (q - 2,000,000)P(B > q). Since f(b) = 1/2,000,000 for 2,000,000 ≤ b ≤ 4,000,000, we obtain E(q) = (q - 2,000,000)(4,000,000 - q)/2,000,000. 658

659 Example 4 Solution To find the value of q maximizing E(q), we find E′(q).
E′(q) = (6,000,000 - 2q)/2,000,000. Hence, E′(q) = 0 for q = 3,000,000. We know that E(q) is a concave function of q, so 3,000,000 does indeed maximize E(q). Hence, Condo should bid $3 million. Condo's expected profit will be E(3,000,000) = $500,000. 659
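The calculus answer can be confirmed by evaluating the expected-profit function on a grid of bids (a quick numerical check, with my own function name):

```python
# Example 4's expected profit: E(q) = (q - 2,000,000)(4,000,000 - q)/2,000,000.
def expected_profit(q, cost=2_000_000, lo=2_000_000, hi=4_000_000):
    return (q - cost) * (hi - q) / (hi - lo)

# Search bids in $1000 steps over the opponent's bid range.
best_q = max(range(2_000_000, 4_000_001, 1000), key=expected_profit)
print(best_q, expected_profit(best_q))  # 3,000,000 and 500,000.0
```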

660 16.6 The EOQ with Uncertain Demand: The (r,q) and (s,S) Models
This section discusses a modification of the EOQ that is used when lead time is nonzero and the demand during each lead time is random. Assume all demand can be backlogged and a continuous review model, so that orders may be placed at any time, and define
K = ordering cost
h = holding cost/unit/year
L = lead time for each order
q = quantity ordered each time an order is placed 660

661 We also require these definitions:
D = random variable representing annual demand, with mean E(D), variance var D, and standard deviation σD
cB = cost incurred for each unit short, which does not depend on how long it takes to make up the stockout
OHI(t) = on-hand inventory at time t
X = random variable representing demand during lead time
We assume that X is a continuous random variable having density function f(x), mean E(X), variance var X, and standard deviation σL. 661

662 We want to choose q and r to minimize the annual expected total cost.
We assume that if D is normally distributed, then X will also be normally distributed. If the length of the lead time is independent of the demand per unit time during the lead time, then E(X) = E(L)E(D) and var X = E(L)(var D) + E(D)²(var L). We want to choose q and r to minimize the annual expected total cost. A cycle is defined to be the time interval between any two instants at which an order is received. 662
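The two moment formulas can be wrapped in a small helper; a sketch with hypothetical numbers (a known one-week lead time and made-up demand moments):

```python
def lead_time_demand_moments(E_L, var_L, E_D, var_D):
    """Mean and variance of lead-time demand X when the lead time L is
    independent of the demand rate: E(X) = E(L)E(D),
    var X = E(L)(var D) + E(D)^2 (var L)."""
    E_X = E_L * E_D
    var_X = E_L * var_D + (E_D ** 2) * var_L
    return E_X, var_X

# Hypothetical data: a deterministic one-week lead time (1/52 year, variance 0),
# annual demand with mean 1,000 units and variance 2,500.
E_X, var_X = lead_time_demand_moments(1 / 52, 0.0, 1000.0, 2500.0)
```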

663 Determination of Reorder Point: Back-Ordered Case
The situation in which all demand must eventually be met and no sales are lost is called the back-ordered case, for which we show how to determine the reorder point and order quantity that minimize annual expected cost. Define TC(q,r) = expected annual cost incurred if each order is for q units and is placed when the reorder point is r. Then TC(q,r) = (expected annual holding cost) +(expected annual ordering cost)+(expected annual cost due to shortages) 663

664 Then I(t) = OHI(t) – B(t) yields expected value of I(t) ≈ expected value of OHI(t), because back orders are outstanding during only a small fraction of each cycle.
Expected value of I(t) during a cycle = ½ [(expected value of I(t) at beginning of cycle) + (expected value of I(t) at end of cycle)] To determine the expected annual cost due to stockouts or back orders, we must define Br = random variable representing the number of stockouts or back orders during a cycle if the reorder point is r. 664

665 We obtain TC(q, r) = KE(D)/q + h[q/2 + r – E(X)] + cBE(Br)E(D)/q. We could find the values of q and r that minimize this expression by determining the values q* and r* of q and r satisfying ∂TC/∂q = 0 and ∂TC/∂r = 0. 665

666 We can use LINGO to determine the values of q and r that exactly satisfy these equations.
In most cases, however, the value of q* satisfying them is very close to the EOQ of √(2KE(D)/h). Then, for the back-ordered case, we have the order quantity q* = √(2KE(D)/h) and the reorder point r* satisfying P(X ≥ r*) = hq*/(cBE(D)). 666
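Putting the EOQ approximation together with normally distributed lead-time demand gives a short recipe for q* and r*. This sketch assumes the back-ordered-case condition P(X ≥ r*) = hq*/(cBE(D)) and uses made-up cost data; statistics.NormalDist is the standard-library normal distribution:

```python
import math
from statistics import NormalDist

# Hypothetical data: K = $50/order, h = $10/unit/year, E(D) = 1,000 units/yr,
# cB = $20/unit short, lead-time demand X ~ Normal(mean 25, sd 5).
K, h, E_D, c_B = 50.0, 10.0, 1000.0, 20.0
E_X, sigma_X = 25.0, 5.0

q_star = math.sqrt(2 * K * E_D / h)     # EOQ approximation to q*
p_short = h * q_star / (c_B * E_D)      # target stockout probability P(X >= r*)
r_star = NormalDist(E_X, sigma_X).inv_cdf(1 - p_short)
```

Here q* = 100 and r* ≈ 33.2, i.e. a safety stock of about 8.2 units above the mean lead-time demand.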

667 Determination of Reorder Point: Lost Sales Case
The optimal order quantity q* and reorder point r* for the lost sales case are q* = √(2KE(D)/h) and the r* satisfying P(X ≥ r*) = hq*/(hq* + cLSE(D)), where cLS is the cost per lost sale. The key to the derivation of these equations is to realize that expected inventory in the lost sales case = (expected inventory in the back-ordered case) + (expected number of shortages per cycle). 667

668 Continuous Review (r, q) Policies
A continuous review inventory policy, in which we order a quantity q whenever our inventory level reaches a reorder level r, is often called an (r, q) policy. An (r, q) policy is also called a two-bin policy because it can easily be implemented by using two bins to store an item. 668

669 Continuous Review (s, S) Policies
Suppose that a demand for more than one unit can arrive at a particular time. Then an order may be triggered when the inventory level is less than r, and our computation of expected inventory at the end and beginning of a cycle is then incorrect. In such situations, it has been shown that an (s, S) policy is optimal. To implement an (s, S) policy, we place an order whenever the inventory level is less than or equal to s. The size of the order is sufficient to raise the inventory level to S (assuming zero lead time). 669

670 16.7 The EOQ with Uncertain Demand: The Service Level Approach to Determining Safety Stock Level
It is usually very difficult to determine accurately the cost of being one unit short. Managers often decide to control shortages by meeting a specified service level. There are two measures of service level: Service Level Measure 1 – SLM1, the expected fraction of all demand that is met on time; Service Level Measure 2 – SLM2, the expected number of cycles per year during which a shortage occurs. 670

671 Determination of Reorder Point and Safety Stock Level for SLM1
For given values of q and r, we have SLM1 = 1 – E(Br)/q, since on average E(Br) of the q units demanded during a cycle are not met on time. If X is normally distributed, the determination of E(Br) requires knowledge of the normal loss function. 671

672 The normal loss function, NL(y), is defined by the fact that σXNL(y) is the expected number of shortages that will occur during a lead time if (1) lead time demand is normally distributed with mean E(X) and standard deviation σX and (2) the reorder point is E(X) + yσX. A function in LINGO may be used to compute values of the normal loss function. 672
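For readers without LINGO, the standard normal loss function has the closed form NL(y) = φ(y) − y[1 − Φ(y)], which is easy to code with the standard library; a sketch:

```python
from statistics import NormalDist

_std = NormalDist()  # standard normal

def normal_loss(y):
    """NL(y) = phi(y) - y * (1 - Phi(y)) for the standard normal."""
    return _std.pdf(y) - y * (1 - _std.cdf(y))

def expected_shortage(r, E_X, sigma_X):
    """Expected units short per lead time, sigma_X * NL(y), when the reorder
    point is r = E_X + y * sigma_X and X is Normal(E_X, sigma_X**2)."""
    return sigma_X * normal_loss((r - E_X) / sigma_X)
```

As a sanity check, NL(0) = φ(0) ≈ 0.3989, and the expected shortage shrinks as the reorder point rises.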

673 Using LINGO to Compute the Reorder Point Level for SLM1
Using this function in LINGO, it is a simple matter to compute the reorder point level for SLM1. 673

674 Determination of Reorder Point and Safety Stock Level for SLM2
Suppose that a manager wants to hold sufficient safety stock to ensure that an average of s0 cycles per year result in a stockout. Since an average of E(D)/q cycles occur each year, the reorder point r for SLM2 (for continuous or discrete lead time demand) is found by choosing the smallest value of r satisfying (E(D)/q)P(X ≥ r) ≤ s0. 674
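A sketch of the SLM2 computation for normally distributed lead-time demand, assuming the stockout condition takes the form (E(D)/q)·P(X ≥ r) ≤ s0 (there are about E(D)/q cycles per year); all numbers are hypothetical:

```python
from statistics import NormalDist

# Hypothetical data: q = 100 units/order and E(D) = 1,000 units/year give
# about 10 cycles per year; allow an average of s0 = 0.5 stockout cycles/year.
q, E_D, s0 = 100.0, 1000.0, 0.5
E_X, sigma_X = 25.0, 5.0

p_allowed = s0 * q / E_D                      # allowed P(X >= r) per cycle
r = NormalDist(E_X, sigma_X).inv_cdf(1 - p_allowed)
```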

675 16.8 (R, S) Periodic Review Policy
In this section we will discuss a widely used periodic review policy: the (R, S) policy. We need to define the concept of the on-order inventory level, which is simply the sum of on-hand inventory and inventory on order. Every R units of time, we review the on-order inventory level and place an order to bring it up to S. 675

676 We define R = time (in years) between reviews
D = demand (random) during a one-year period E(D) = mean demand during a one-year period K = cost of placing an order J = cost of reviewing inventory level h = cost of holding one item in inventory for one year cB = cost per-unit short in the backlogged case (assumed to be independent of the length of time until the order is filled) L = lead time for each order (assumed constant) 676

677 DL+R = demand (random) during a time interval of length L + R
E(DL+R) = mean of DL+R; σDL+R = standard deviation of DL+R. Marginal analysis can be used to determine (for a given R) the value of S minimizing the sum of annual expected holding and shortage costs; the minimum occurs for the value of S satisfying 677

678 Suppose that all shortages result in lost sales, and a cost of cLS (including shortage cost plus lost profit) is incurred for each lost sale. Then the value of S minimizing the sum of annual expected holding and shortage costs is given by 678

679 Determination of R Often the review interval R is set equal to q/E(D), where q is the EOQ computed with order cost K + J.
This makes the number of orders placed per year equal the number recommended if a simple EOQ model were used to determine the size of orders. Since each order is accompanied by a review, however, we must set the cost per order to K + J. This yields R = √(2(K + J)/(hE(D))). 679
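The review-interval formula follows from R = q/E(D) with the EOQ computed at combined order-plus-review cost K + J; a one-function sketch with hypothetical costs:

```python
import math

def review_interval(K, J, h, E_D):
    """R = q / E(D), where q is the EOQ with order-plus-review cost K + J;
    algebraically R = sqrt(2(K + J) / (h * E(D)))."""
    return math.sqrt(2 * (K + J) / (h * E_D))

# Hypothetical data: K = $50, J = $30, h = $10/unit/year, E(D) = 1,000/year.
R = review_interval(50.0, 30.0, 10.0, 1000.0)   # about 0.126 year (6.6 weeks)
```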

680 Implementation of an (R, S) System
Retail stores often find an (R, S) policy easy to implement, because the quantity ordered equals the number of sales occurring during the period between reviews. 680

681 16.9 The ABC Inventory Classification System
Many companies develop inventory policies for thousands of items. In these cases they cannot spend the time to focus on the "optimal" inventory policy for each item. The ABC classification, devised at General Electric during the 1950s, helps a company identify a small percentage of its items that account for a large percentage of the dollar value of annual sales. These items are called Type A items. 681

682 Since most of our inventory investment is in Type A items, high service levels for these items would result in huge investments in safety stock. Hax and Candea recommend that SLM1 be set at only 80–85% for Type A items. Tight management control of ordering procedures is essential for Type A items. Also, every effort should be made to reduce the lead time needed to one week. 682

683 For Type B items, Hax and Candea recommend that SLM1 be set at 95%.
Inventory policies for Type B items can generally be controlled by computer. For Type C items, a simple two-bin system is usually adequate. Parameters are reviewed twice a year. Demand for Type C items may be forecasted by simple extrapolation methods. 683

684 16.10 Exchange Curves In many situations, it is difficult to estimate holding and shortage costs accurately. Exchange curves can be used in such situations to identify "reasonable" inventory policies. An exchange curve enables us to display graphically the trade-off between annual ordering cost and average inventory investment. We define ci = cost of purchasing each unit of product i; h = cost of holding $1 worth of either product in inventory for one year. 684

685 Ki = order cost of product i
qi = EOQ for product i; Di = annual demand for product i. Two measures of effectiveness for this (or any other) ordering policy are AII = average dollar value of inventory and AOC = annual ordering cost. A plot of the points (AOC, AII) associated with each value of h is known as an exchange curve. 685

686 Exchange Curves for Stockouts
Exchange curves can also be used to assess the trade-offs between average inventory investment (AII) and the expected number of lead times per year resulting in stockouts. Using more sophisticated techniques, an exchange surface involving three or more quantities can be derived. An exchange surface makes it easy to identify the trade-offs among improving service, increasing inventory investment, and increasing workload (orders per year). 686

687 Operations Research: Applications and Algorithms
Chapter 17 Markov Chains to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 687

688 Description Sometimes we are interested in how a random variable changes over time. The study of how a random variable evolves over time is the study of stochastic processes. This chapter explains stochastic processes – in particular, a type of stochastic process known as a Markov chain. We begin by defining the concept of a stochastic process. 688

689 17.1 What is a Stochastic Process?
Suppose we observe some characteristic of a system at discrete points in time. Let Xt be the value of the system characteristic at time t. In most situations, Xt is not known with certainty before time t and may be viewed as a random variable. A discrete-time stochastic process is simply a description of the relation between the random variables X0, X1, X2 ….. 689

690 A continuous-time stochastic process is simply a stochastic process in which the state of the system can be viewed at any time, not just at discrete instants in time. For example, the number of people in a supermarket t minutes after the store opens for business may be viewed as a continuous-time stochastic process. 690

691 17.2 What is a Markov Chain? One special type of discrete-time stochastic process is called a Markov chain. Definition: A discrete-time stochastic process is a Markov chain if, for t = 0, 1, 2, … and all states, P(Xt+1 = it+1|Xt = it, Xt-1 = it-1, …, X1 = i1, X0 = i0) = P(Xt+1 = it+1|Xt = it). Essentially, this says that the probability distribution of the state at time t+1 depends only on the state at time t (it) and does not depend on the states the chain passed through on the way to it at time t. 691

692 In our study of Markov chains, we make the further assumption that for all states i and j and all t, P(Xt+1 = j|Xt = i) is independent of t. This assumption allows us to write P(Xt+1 = j|Xt = i) = pij, where pij is the probability that, given the system is in state i at time t, it will be in state j at time t+1. If the system moves from state i during one period to state j during the next period, we say that a transition from i to j has occurred. 692

693 The pij’s are often referred to as the transition probabilities for the Markov chain.
This equation implies that the probability law relating the next period's state to the current state does not change over time. It is often called the Stationary Assumption, and any Markov chain that satisfies it is called a stationary Markov chain. We also must define qi to be the probability that the chain is in state i at time 0; in other words, P(X0 = i) = qi. 693

694 We call the vector q= [q1, q2,…qs] the initial probability distribution for the Markov chain.
In most applications, the transition probabilities are displayed as an s x s transition probability matrix P. The transition probability matrix P may be written as 694

695 We also know that each entry in the P matrix must be nonnegative.
For each i, pi1 + pi2 + ⋯ + pis = 1. We also know that each entry in the P matrix must be nonnegative. Hence, all entries in the transition probability matrix are nonnegative, and the entries in each row must sum to 1. 695

696 17.3 n-Step Transition Probabilities
A question of interest when studying a Markov chain is: If a Markov chain is in state i at time m, what is the probability that n periods later the Markov chain will be in state j? This probability will be independent of m, so we may write P(Xm+n = j|Xm = i) = P(Xn = j|X0 = i) = Pij(n), where Pij(n) is called the n-step probability of a transition from state i to state j. For n > 1, Pij(n) = ijth element of Pn. 696

697 The Cola Example Suppose the entire cola industry produces only two colas. Given that a person last purchased cola 1, there is a 90% chance that her next purchase will be cola 1. Given that a person last purchased cola 2, there is an 80% chance that her next purchase will be cola 2. If a person is currently a cola 2 purchaser, what is the probability that she will purchase cola 1 two purchases from now? If a person is currently a cola 1 purchaser, what is the probability that she will purchase cola 1 three purchases from now? 697

698 The Cola Example We view each person's purchases as a Markov chain with the state at any given time being the type of cola the person last purchased. Hence, each person's cola purchases may be represented by a two-state Markov chain, where State 1 = person has last purchased cola 1 and State 2 = person has last purchased cola 2. If we define Xn to be the type of cola purchased by a person on her nth future cola purchase, then X0, X1, … may be described as the Markov chain with the following transition matrix:

P = | .90  .10 |
    | .20  .80 |
698

699 The Cola Example We can now answer questions 1 and 2.
We seek P(X2 = 1|X0 = 2) = P21(2) = element 21 of P2:

P2 = | .83  .17 |
     | .34  .66 |
699

700 The Cola Example We seek P11(3) = element 11 of P3:
Hence, P21(2) = .34. This means that the probability is .34 that two purchases in the future a cola 2 drinker will purchase cola 1. By using basic probability theory, we may obtain this answer in a different way. We seek P11(3) = element 11 of P3:

P3 = P(P2) = | .781  .219 |
             | .438  .562 |

Therefore, P11(3) = .781. 700
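Both answers can be verified by squaring and cubing the transition matrix; a self-contained sketch (the matrix is the one implied by the 90%/80% loyalty figures):

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# State 1 = last purchased cola 1, state 2 = last purchased cola 2.
P = [[0.9, 0.1],
     [0.2, 0.8]]

P2 = mat_mul(P, P)    # two-step transition probabilities
P3 = mat_mul(P2, P)   # three-step transition probabilities

print(round(P2[1][0], 3))  # P21(2) = 0.34
print(round(P3[0][0], 3))  # P11(3) = 0.781
```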

701 Probability of being in state j at time n
Many times we do not know the state of the Markov chain at time 0. Then we can determine the probability that the system is in state j at time n by the following reasoning: probability of being in state j at time n = q1P1j(n) + q2P2j(n) + ⋯ + qsPsj(n) = jth element of qPn, where q = [q1, q2, …, qs]. 701

702 To illustrate the behavior of the n-step transition probabilities for large values of n, we have computed several of the n-step transition probabilities for the Cola example. This means that for large n, no matter what the initial state, there is a .67 chance that a person will be a cola 1 purchaser. We can easily multiply matrices on a spreadsheet using the MMULT command. 702

703 17.4 Classification of States in a Markov Chain
To understand the n-step transition probabilities in more detail, we need to study how mathematicians classify the states of a Markov chain. The following transition matrix illustrates most of the following definitions. A graphical representation is shown in the book. 703

704 Definition: Given two states i and j, a path from i to j is a sequence of transitions that begins in i and ends in j, such that each transition in the sequence has a positive probability of occurring. Definition: A state j is reachable from state i if there is a path leading from i to j. Definition: Two states i and j are said to communicate if j is reachable from i, and i is reachable from j. Definition: A set of states S in a Markov chain is a closed set if no state outside of S is reachable from any state in S. 704

705 Definition: A state i is an absorbing state if pii = 1.
Definition: A state i is a transient state if there exists a state j that is reachable from i, but the state i is not reachable from state j. The importance of these concepts will become clear after the next two sections. 705

706 17.5 Steady-State Probabilities and Mean First Passage Times
Steady-state probabilities are used to describe the long-run behavior of a Markov chain. Theorem 1: Let P be the transition matrix for an s-state ergodic chain. Then there exists a vector π = [π1 π2 … πs] such that, for any initial state i, Pij(n) approaches πj as n grows large. 706

707 Theorem 1 tells us that for any initial state i, Pij(n) approaches πj as n grows large.
The vector π = [π1 π2 … πs] is often called the steady-state distribution, or equilibrium distribution, for the Markov chain. 707

708 Transient Analysis & Intuitive Interpretation
The behavior of a Markov chain before the steady state is reached is often called transient (or short-run) behavior. An intuitive interpretation can be given to the steady-state probability equations: in the steady state, the "flow" of probability into each state must equal the flow of probability out of each state. 708

709 Use of Steady-State Probabilities in Decision Making
In the Cola Example, suppose that each customer makes one purchase of cola during any week. Suppose there are 100 million cola customers. One selling unit of cola costs the company $1 to produce and is sold for $2. For $500 million/year, an advertising firm guarantees to decrease from 10% to 5% the fraction of cola 1 customers who switch to cola 2 after a purchase. Should the company that makes cola 1 hire the firm? 709

710 At present, a fraction π1 = ⅔ of all purchases are cola 1 purchases.
Each purchase of cola 1 earns the company a $1 profit. We can calculate the annual profit as (⅔)(52)(100,000,000)($1) = $3,466,666,667. The advertising firm is offering to change the P matrix to

P1 = | .95  .05 |
     | .20  .80 |
710

711 Now the cola 1 company’s annual profit will be $3,660,000,000.
For P1, the steady-state equations become π1 = .95π1+.20π2 π2 = .05π1+.80π2 Replacing the second equation by π1 + π2 = 1 and solving, we obtain π1=.8 and π2 = .2. Now the cola 1 company’s annual profit will be $3,660,000,000. Hence, the cola 1 company should hire the ad agency. 711
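The two steady states and the profit comparison can be checked in a few lines; for a two-state chain the steady-state equations solve in closed form (the revenue arithmetic assumes 52 weekly purchases per year, as in the example):

```python
def two_state_steady(p11, p22):
    """Steady state of P = [[p11, 1-p11], [1-p22, p22]]: solving
    pi1 = pi1*p11 + pi2*(1 - p22) together with pi1 + pi2 = 1 gives
    pi1 = (1 - p22) / ((1 - p11) + (1 - p22))."""
    pi1 = (1 - p22) / ((1 - p11) + (1 - p22))
    return pi1, 1 - pi1

pi1_now, _ = two_state_steady(0.90, 0.80)   # current matrix: pi1 = 2/3
pi1_ad, _ = two_state_steady(0.95, 0.80)    # with advertising: pi1 = 0.8

# 100 million customers, 52 purchases/year, $1 profit per cola 1 sale.
profit_now = pi1_now * 52 * 100_000_000                 # ~$3,466,666,667
profit_ad = pi1_ad * 52 * 100_000_000 - 500_000_000     # $3,660,000,000
```

Since profit_ad exceeds profit_now, hiring the ad agency pays off.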

712 Mean First Passage Times
For an ergodic chain, let mij = expected number of transitions before we first reach state j, given that we are currently in state i; mij is called the mean first passage time from state i to state j. In the example, we assume we are currently in state i. Then with probability pij, it will take one transition to go from state i to state j. For k ≠ j, we next go with probability pik to state k. In this case, it will take an average of 1 + mkj transitions to go from i and j. 712

713 This reasoning implies mij = 1 + Σ(k≠j) pikmkj.
By solving the linear equations given by this relation, we find all the mean first passage times. It can be shown that mii = 1/πi. 713
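For the two-state cola chain, the first-passage equations collapse to one equation each, which makes a quick sanity check easy:

```python
# m12 = 1 + p11*m12 (from state 1 we either jump to state 2 or stay and try
# again), so m12 = 1/(1 - p11); symmetrically m21 = 1/(1 - p22).
p11, p22 = 0.9, 0.8

m12 = 1 / (1 - p11)   # expected purchases before a cola 1 buyer tries cola 2
m21 = 1 / (1 - p22)   # expected purchases before a cola 2 buyer tries cola 1

pi1 = (1 - p22) / ((1 - p11) + (1 - p22))
m11 = 1 / pi1         # mean recurrence time of state 1, using m_ii = 1/pi_i

print(round(m12, 6), round(m21, 6), round(m11, 6))  # 10.0 5.0 1.5
```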

714 Solving for Steady-State Probabilities and Mean First Passage Times on the Computer
Since we solve steady-state probabilities and mean first passage times by solving a system of linear equations, we may use LINDO to determine them. Simply type in an objective function of 0, and type the equations you need to solve as your constraints. Alternatively, you may use the LINGO model in the file Markov.lng to determine steady-state probabilities and mean first passage times for an ergodic chain. 714

715 17.6 Absorbing Chains Many interesting applications of Markov chains involve chains in which some of the states are absorbing and the rest are transient states. This type of chain is called an absorbing chain. To see why we are interested in absorbing chains we consider the following accounts receivable example. 715

716 Accounts Receivable Example
The accounts receivable situation of a firm is often modeled as an absorbing Markov chain. Suppose a firm assumes that an account is uncollected if the account is more than three months overdue. Then at the beginning of each month, each account may be classified into one of the following states: State 1 New account State 2 Payment on account is one month overdue State 3 Payment on account is two months overdue State 4 Payment on account is three months overdue State 5 Account has been paid State 6 Account is written off as bad debt 716

717 Suppose that past data indicate that the following Markov chain describes how the status of an account changes from one month to the next month:

            New  1 month  2 months  3 months  Paid  Bad Debt
New          0     .6        0         0       .4      0
1 month      0     0         .5        0       .5      0
2 months     0     0         0         .4      .6      0
3 months     0     0         0         0       .7      .3
Paid         0     0         0         0        1      0
Bad Debt     0     0         0         0        0      1
717

718 To simplify our example, we assume that after three months, a debt is either collected or written off as a bad debt. Once a debt is paid up or written off as a bad debt, the account is closed, and no further transitions occur. Hence, Paid and Bad Debt are absorbing states. Since every account will eventually be paid or written off as a bad debt, New, 1 month, 2 months, and 3 months are transient states. 718

719 A typical new account will be absorbed as either a collected debt or a bad debt.
What is the probability that a new account will eventually be collected? To answer this question, we must write the transition matrix with the s – m transient states listed first and the m absorbing states listed last. The transition matrix then has the form

P = | Q  R |
    | 0  I |

where Q is an (s – m) × (s – m) matrix, R is (s – m) × m, 0 is an m × (s – m) matrix of zeros, and I is the m × m identity matrix. 719

720 The transition matrix for this example is
Then s = 6, m = 2, and Q and R are as shown (rows in the order New, 1 month, 2 months, 3 months):

Q = | 0  .6  0   0  |       R = | .4  0  |
    | 0  0   .5  0  |           | .5  0  |
    | 0  0   0   .4 |           | .6  0  |
    | 0  0   0   0  |           | .7  .3 |

(columns of R: Paid, Bad Debt). 720

721 What is the probability that a new account will eventually be collected?
What is the probability that a one-month overdue account will eventually become a bad debt? If the firm’s sales average $100,000 per month, how much money per year will go uncollected? 721

722 By using the Gauss-Jordan method of Chapter 2, we find that
           t1   t2   t3   t4
(I-Q)-1 = | 1   .6   .3   .12 |
          | 0    1   .5   .20 |
          | 0    0    1   .40 |
          | 0    0    0    1  |
722

723 To answer questions 1-3, we need to compute
            a1     a2
(I-Q)-1R = | .964  .036 |
           | .940  .060 |
           | .880  .120 |
           | .700  .300 |
723

724 Then t1 = New, a1 = Paid. Thus, the probability that a new account is eventually collected is element 11 of (I – Q)-1R = .964. t2 = 1 month, a2 = Bad Debt. Thus, the probability that a one-month overdue account turns into a bad debt is element 22 of (I – Q)-1R = .06. From answer 1, only 3.6% of all debts are uncollected. Since yearly accounts receivable average $1,200,000, (0.036)(1,200,000) = $43,200 per year will go uncollected. 724
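These absorption probabilities can be reproduced numerically. In the sketch below, the Q and R entries are reconstructed to be consistent with the probabilities quoted in the text (.964 and .06); because Q is strictly upper triangular, Q⁴ = 0 and (I − Q)⁻¹ = I + Q + Q² + Q³:

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Transient states: New, 1 month, 2 months, 3 months.
Q = [[0.0, 0.6, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.0],
     [0.0, 0.0, 0.0, 0.4],
     [0.0, 0.0, 0.0, 0.0]]
# Absorbing states: Paid, Bad Debt.
R = [[0.4, 0.0],
     [0.5, 0.0],
     [0.6, 0.0],
     [0.7, 0.3]]

# Q is nilpotent (Q^4 = 0), so (I - Q)^-1 = I + Q + Q^2 + Q^3.
I = [[float(i == j) for j in range(4)] for i in range(4)]
Q2 = mat_mul(Q, Q)
Q3 = mat_mul(Q2, Q)
inv = mat_add(mat_add(I, Q), mat_add(Q2, Q3))

absorb = mat_mul(inv, R)           # absorption probabilities
print(round(absorb[0][0], 3))      # P(new account is eventually paid) = 0.964
print(round(absorb[1][1], 3))      # P(1-month account goes bad) = 0.06
```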

725 17.7 Work-Force Planning Models
Many organizations employ several categories of workers. For long-term planning purposes, it is often useful to be able to predict the number of employees of each type who will be available in the steady state. Such predictions can be made via an analysis similar to that used to find the steady-state probabilities for Markov chains. Consider an organization whose members are classified at any point in time into one of s groups. 725

726 Let P be the s x (s+1) matrix whose ijth entry is pij.
During every time period, a fraction pij of those who begin a time period in group i begin the next time period in group j. Also, during every time period, a fraction pi,s+1 of all group i members leave the organization. Let P be the s × (s+1) matrix whose ijth entry is pij. At the beginning of each time period, the organization hires Hi group i members. Let Ni(t) be the number of group i members at the beginning of period t. 726

727 The equations used to compute the steady-state census
A question of natural interest is whether Ni(t) approaches a limit as t grows large (call the limit, if it exists, Ni). If each Ni(t) approaches a limit, we call N = (N1, N2, …, Ns) the steady-state census of the organization. The steady-state census satisfies Nj = Hj + N1p1j + N2p2j + ⋯ + Nspsj (j = 1, 2, …, s). 727

728 Note that the identity pi1 + pi2 + ⋯ + pi,s+1 = 1 for each group i can be used to simplify the above equations.
If a steady-state census does not exist, then the equations will have no solution. 728

729 Using LINGO to Solve for the Steady-State Census
The LINGO model found in file Census.lng can be used to determine the steady-state census for a work-force planning problem. 729

730 Operations Research: Applications and Algorithms
Chapter 18 Deterministic Dynamic Programming to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 730

731 Description Dynamic programming is a technique that can be used to solve many optimization problems. In most applications, dynamic programming obtains solutions by working backward from the end of the problem toward the beginning, thus breaking up a large, unwieldy problem into a series of smaller, more tractable problems. 731

732 18.1 Two Puzzles Example We show how working backward can make a seemingly difficult problem almost trivial to solve. Suppose there are 30 matches on a table. I begin by picking up 1, 2, or 3 matches. Then my opponent must pick up 1, 2, or 3 matches. We continue in this fashion until the last match is picked up. The player who picks up the last match is the loser. How can I (the first player) be sure of winning the game? 732

733 Thus I cannot lose if I pick up 1 match on my first turn.
If I can ensure that it will be opponent’s turn when 1 match remains, I will certainly win. Working backward one step, if I can ensure that it will be my opponent's turn when 5 matches remain, I will win. If I can force my opponent to play when 5, 9, 13, 17, 21, 25, or 29 matches remain, I am sure of victory. Thus I cannot lose if I pick up 1 match on my first turn. 733
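The working-backward argument is exactly backward induction, and it can be automated; a minimal sketch that recovers the losing positions:

```python
def losing_positions(n_matches):
    """A position n (matches left, it is your move) is losing iff every legal
    move (take 1, 2, or 3 matches) leaves the opponent in a winning position."""
    losing = {1}                              # forced to take the last match
    for n in range(2, n_matches + 1):
        if all(m not in losing for m in (n - 1, n - 2, n - 3) if m >= 1):
            losing.add(n)
    return losing

print(sorted(losing_positions(30)))  # [1, 5, 9, 13, 17, 21, 25, 29]
```

Starting from 30 matches, taking 1 match leaves the opponent at the losing position 29.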

734 18.2 A Network Problem Many applications of dynamic programming reduce to finding the shortest (or longest) path that joins two points in a given network. For larger networks dynamic programming is much more efficient for determining a shortest path than the explicit enumeration of all paths. 734

735 Characteristics of Dynamic Programming Applications
Characteristic 1 The problem can be divided into stages, with a decision required at each stage. Characteristic 2 Each stage has a number of states associated with it. By a state, we mean the information that is needed at any stage to make an optimal decision. Characteristic 3 The decision chosen at any stage describes how the state at the current stage is transformed into the state at the next stage. 735

736 Characteristic 4 Characteristic 5
Given the current state, the optimal decision for each of the remaining stages must not depend on previously reached states or previously chosen decisions. This idea is known as the principle of optimality. Characteristic 5 If the states for the problem have been classified into one of T stages, there must be a recursion that relates the cost or reward earned during stages t, t+1, …, T to the cost or reward earned from stages t+1, t+2, …, T. 736

737 18.3 An Inventory Problem Dynamic programming can be used to solve an inventory problem with the following characteristics: Time is broken up into periods, the present period being period 1, the next period 2, and the final period T. At the beginning of period 1, the demand during each period is known. At the beginning of each period, the firm must determine how many units should be produced. Production capacity during each period is limited. 737

Each period’s demand must be met on time from inventory or current production. During any period in which production takes place, a fixed cost of production as well as a variable per-unit cost is incurred. The firm has limited storage capacity. This is reflected by a limit on end-of-period inventory. A per-unit holding cost is incurred on each period’s ending inventory. The firm’s goal is to minimize the total cost of meeting on time the demands for periods 1, 2, …, T. 738

739 Such a model is called a periodic review model.
In this model, the firm’s inventory position is reviewed at the end of each period, and then the production decision is made. Such a model is called a periodic review model. This model is in contrast to the continuous review model in which the firm knows its inventory position at all times and may place an order or begin production at any time. 739

740 18.4 Resource-Allocation Problems
Resource-allocation problems, in which limited resources must be allocated among several activities, are often solved by dynamic programming. To use linear programming to do resource allocation, three assumptions must be made: Assumption 1: The amount of a resource assigned to an activity may be any nonnegative number. Assumption 2: The benefit obtained from each activity is proportional to the amount of the resource assigned to the activity. 740

741 Assumption 3: The benefit obtained from more than one activity is the sum of the benefits obtained from the individual activities. Even if assumptions 1 and 2 do not hold, dynamic programming can be used to solve resource-allocation problems efficiently when assumption 3 is valid and when the amount of the resource allocated to each activity is a member of a finite set. 741

742 Generalized Resource Allocation Problem
The problem of determining the allocation of resources that maximizes total benefit subject to the limited resource availability may be written as: maximize the total benefit from activities 1, 2, …, T subject to g1(x1) + g2(x2) + ⋯ + gT(xT) ≤ (amount of the resource available), where xt is the amount of the resource allocated to activity t, gt(xt) is the amount of the resource used when xt units are allocated to activity t, and each xt must be a member of {0, 1, 2, …}. 742

743 We may generalize the recursions to this situation by writing
To solve this by dynamic programming, define ft(d) to be the maximum benefit that can be obtained from activities t, t+1, …, T if d units of the resource may be allocated to activities t, t+1, …, T. We may generalize the recursions to this situation by writing fT+1(d) = 0 for all d, and ft(d) = max{(benefit from allocating xt units to activity t) + ft+1(d – gt(xt))}, where xt must be a non-negative integer satisfying gt(xt) ≤ d. 743
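The recursion translates directly into memoized code; the benefit and resource-usage tables below are hypothetical (three activities, each allowed x = 0, 1, or 2 units):

```python
from functools import lru_cache

# Hypothetical data: benefit[t][x] and usage[t][x] are the benefit and
# resource consumption of allocating x units to activity t.
benefit = [[0, 3, 5], [0, 4, 6], [0, 2, 7]]
usage = [[0, 1, 3], [0, 2, 4], [0, 1, 2]]
T = len(benefit)

@lru_cache(maxsize=None)
def f(t, d):
    """Maximum benefit from activities t, ..., T-1 with d resource units left."""
    if t == T:
        return 0                     # f_{T+1}(d) = 0 for all d
    return max(benefit[t][x] + f(t + 1, d - usage[t][x])
               for x in range(3) if usage[t][x] <= d)

print(f(0, 5))  # 14: x = (1, 1, 2) uses 1 + 2 + 2 = 5 units for 3 + 4 + 7
```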

744 A Turnpike Theorem Turnpike results abound in the dynamic programming literature. Why are the results referred to as a turnpike theorem? Think about taking an automobile trip in which our goal is to minimize the time needed to complete the trip. For a long trip it may be advantageous to go slightly out of our way so that most of the trip will be spent on a turnpike, on which we can travel at the greatest speed. 744

745 18.5 Equipment Replacement Problems
Many companies and customers face the problem of determining how long a machine should be utilized before it should be traded in for a new one. Problems of this type are called equipment-replacement problems and can be solved by dynamic programming. An equipment-replacement model was actually used by Phillips Petroleum to reduce costs associated with maintaining the company’s stock of trucks. 745

746 18.6 Formulating Dynamic Programming Recursions
In many dynamic programming problems, a given stage simply consists of all the possible states that the system can occupy at that stage. If this is the case, then the dynamic programming recursion can often be written in the following form: ft(i) = min{(cost during stage t) + ft+1(new state at stage t+1)}, where the minimum is over all decisions that are allowable, or feasible, when the state at stage t is i. 746

747 Not all recursions are of the form shown before.
Correct formulation of a recursion of this form requires that we identify three important aspects of the problem: Aspect 1: The set of decisions that is allowable, or feasible, for the given state and stage. Aspect 2: We must specify how the cost during the current time period (stage t) depends on the value of t, the current state, and the decision chosen at stage t. Aspect 3: We must specify how the state at stage t+1 depends on the value of t, the state at stage t, and the decision chosen at stage t. Not all recursions are of the form shown before. 747

748 A Fishery Example The owner of a lake must decide how many bass to catch and sell each year. If she sells x bass during year t, then a revenue r(x) is earned. The cost of catching x bass during a year is a function c(x, b) of the number of bass caught during the year and of b, the number of bass in the lake at the beginning of the year. Of course, bass do reproduce. 748

749 A Fishery Example To model this, we assume that the number of bass in the lake at the beginning of a year is 20% more than the number of bass left in the lake at the end of the previous year. Assume that there are 10,000 bass in the lake at the beginning of the first year. Develop a dynamic programming recursion that can be used to maximize the owner’s net profit over a T-year horizon. 749

750 A Fishery Example In problems where decisions must be made at several points in time, there is often a trade-off of current benefits against future benefits. At the beginning of year T, the owner of the lake need not worry about the effect that the capture of bass will have on the future population of the lake. So the year T problem is relatively easy to solve. For this reason, we let time be the stage. At each stage, the owner of the lake must decide how many bass to catch. 750

751 A Fishery Example We define xt to be the number of bass caught during year t. To determine an optimal value of xt, the owner of the lake need only know the number of bass (call it bt) in the lake at the beginning of year t. Therefore, the state at the beginning of year t is bt. We define ft (bt) to be the maximum net profit that can be earned from bass caught during years t, t+1, …T given that bt bass are in the lake at the beginning of year t. 751

752 A Fishery Example We may now dispose of aspects 1-3 of the recursion.
Aspect 1: What are the allowable decisions? During any year we can’t catch more bass than there are in the lake. Thus, in each state and for all t, 0 ≤ xt ≤ bt must hold. Aspect 2: What is the net profit earned during year t? If xt bass are caught during a year that begins with bt bass in the lake, then the net profit is r(xt) - c(xt, bt). Aspect 3: What will be the state during year t+1? The year t+1 state will be 1.2(bt - xt). 752

753 A Fishery Example After year T, there are no future profits to consider, so fT+1(bT+1) = 0. For t ≤ T, ft(bt) = max{r(xt) - c(xt, bt) + ft+1[1.2(bt - xt)]}, where 0 ≤ xt ≤ bt. We use this equation to work backward until f1(10,000) has been computed. Then to determine the optimal fishing policy, we begin by choosing x1 to be any value attaining the maximum in the equation for f1(10,000). Then year 2 will begin with 1.2(10,000 - x1) bass in the lake. 753

754 A Fishery Example This means that x2 should be chosen to be any value attaining the maximum in the equation for f2(1.2(10,000-x1)). Continue in this fashion until optimal values of x3, x4,…xT have been determined. 754
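The recursion above can be sketched in code. The revenue and cost functions below are hypothetical, since the slides leave r and c abstract; bass are counted in units of 1,000 to keep the state space small, and the 20% growth is rounded down to keep states integral.

```python
from functools import lru_cache

T = 3                        # planning horizon in years (assumption)

def r(x):
    return 3.0 * x           # hypothetical revenue from selling x units

def c(x, b):
    # hypothetical cost: catching is cheaper when the lake is well stocked
    return 2.0 * x * x / b if b else 0.0

@lru_cache(maxsize=None)
def f(t, b):
    """Max net profit from years t..T with b (thousand) bass in the lake."""
    if t > T:                # f_{T+1} = 0: no profit after the horizon
        return 0.0
    best = float("-inf")
    for x in range(b + 1):   # feasible catches: 0 <= x_t <= b_t
        nxt = int(1.2 * (b - x))          # next year's starting stock
        best = max(best, r(x) - c(x, b) + f(t + 1, nxt))
    return best

print(round(f(1, 10), 2))    # start year 1 with 10 (thousand) bass
```

Working backward from the horizon, the memoized recursion visits each (year, stock) state once, exactly as the slides describe.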

755 Incorporating the Time Value of Money into Dynamic Programming Formulations
A weakness of the current formulation is that profits received during the later years are weighted the same as profits received during the earlier years. Suppose that for some β < 1, $1 received at the beginning of year t+1 is equivalent to β dollars received at the beginning of year t. 755

756 We can incorporate this idea into the dynamic programming recursion by replacing the previous equation with ft(bt) = max{r(xt) - c(xt, bt) + βft+1[1.2(bt - xt)]}, where 0 ≤ xt ≤ bt. Then we redefine ft(bt) to be the maximum net profit (in year t dollars) that can be earned during years t, t+1, …, T. This approach can be used to account for the time value of money in any dynamic programming formulation. 756

757 Computational Difficulties in Using Dynamic Programming
There is a problem that limits the practical application of dynamic programming. In many problems, the state space becomes so large that excessive computational time is required to solve the problem by dynamic programming. 757

758 18.7 The Wagner-Whitin Algorithm and the Silver-Meal Heuristic
The Inventory Example in this chapter is a special case of the dynamic lot-size model. Description of the Dynamic Lot-Size Model: Demand dt during period t (t = 1, 2, …, T) is known at the beginning of period 1. Demand for period t must be met on time from inventory or from period t production. The cost c(x) of producing x units during any period is given by c(0) = 0, and for x > 0, c(x) = K + cx, where K is a fixed cost for setting up production during a period and c is a variable per-unit cost of production. 758

759 At the end of period t, the inventory level it is observed, and a holding cost hit (h per unit held) is incurred. We let i0 denote the inventory level before period 1 production occurs. The goal is to determine a production level xt for each period t that minimizes the total cost of meeting (on time) the demands for periods 1, 2, …, T. There is a limit ct placed on period t’s ending inventory. There is a limit rt placed on period t’s production. Wagner and Whitin have developed a method that greatly simplifies the computation of optimal production schedules for dynamic lot-size models. 759

760 Lemmas 1 and 2 are necessary for the development of the Wagner-Whitin algorithm.
Lemma 1: Suppose it is optimal to produce a positive quantity during a period t. Then for some j=0, 1, …, T-t, the amount produced during period t must be such that after period t’s production, a quantity dt + dt+1 + … + dt+j will be in stock. In other words, if production occurs during period t, we must (for some j) produce an amount that exactly suffices to meet the demands for periods t, t+1,…, t+j. 760

761 Lemma 2: If it is optimal to produce anything during period t, then it-1 < dt. In other words, production cannot occur during period t unless there is insufficient stock to meet period t demand. With the possible exception of the first period, production will occur only during periods in which beginning inventory is zero, and during each period in which beginning inventory is zero (and dt ≠ 0), production must occur. 761

762 Using this insight, Wagner and Whitin developed a recursion that can be used to determine an optimal production policy. 762
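A minimal backward-recursion sketch of the Wagner-Whitin idea, assuming zero starting inventory and no production or inventory limits (the demand data and cost parameters below are invented). By Lemma 1, a period that produces covers the demand of a block of consecutive periods, so f(t), the minimum cost of meeting demands t, …, T, is found by trying every block length.

```python
def wagner_whitin(d, K, c, h):
    """Min cost of meeting demands d[0..T-1]; zero starting inventory assumed."""
    T = len(d)
    f = [0.0] * (T + 1)                  # f[T] = 0: no demand left
    for t in range(T - 1, -1, -1):
        best = float("inf")
        for j in range(t, T):            # produce d[t..j] in period t
            produce = sum(d[t:j + 1])
            # the unit demanded in period k sits in stock for k - t periods
            hold = h * sum((k - t) * d[k] for k in range(t + 1, j + 1))
            best = min(best, K + c * produce + hold + f[j + 1])
        f[t] = best
    return f[0]

# Hypothetical 4-period instance
print(wagner_whitin([1, 3, 2, 4], K=5, c=1, h=1))
```

Because only block-covering lots need be considered, the recursion examines O(T^2) alternatives instead of all 2^(T-1) setup patterns.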

763 The Silver-Meal Heuristic
The Silver-Meal (S-M) heuristic involves less work than the Wagner-Whitin algorithm and can be used to find a near-optimal production schedule. The S-M heuristic is based on the idea of minimizing the average cost per period covered by each production lot. In extensive testing, the S-M heuristic usually yielded a production schedule costing less than 1% above the optimal policy obtained by the Wagner-Whitin algorithm. 763
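A sketch of the heuristic under the same zero-starting-inventory assumption (invented data): after each setup, keep adding the next period's demand to the lot as long as the average cost per period covered keeps falling.

```python
def silver_meal(d, K, h):
    """Return a list of (setup period, lot size) pairs for demands d."""
    T = len(d)
    t, lots = 0, []
    while t < T:
        hold = 0.0
        prev_avg = K                     # cost per period if lot covers only t
        j = t
        while j + 1 < T:
            # extending to period j+1 holds d[j+1] for (j+1) - t periods
            hold_next = hold + h * (j + 1 - t) * d[j + 1]
            avg = (K + hold_next) / (j + 2 - t)
            if avg > prev_avg:           # average cost rose: stop extending
                break
            hold, prev_avg, j = hold_next, avg, j + 1
        lots.append((t, sum(d[t:j + 1])))  # produce for periods t..j
        t = j + 1
    return lots

print(silver_meal([1, 1, 100], K=5, h=1))
```

Unlike Wagner-Whitin, the heuristic is myopic: it commits to each lot as soon as the per-period average turns upward, which is what keeps the work small.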

764 18.8 Using Excel to Solve Dynamic Programming Problems
Excel can often be used to solve DP problems. The examples in the book demonstrate how spreadsheets can be used to solve these problems. Refer to these Excel files for more information: Dpknap.xls, Dpresour.xls, Dpinv.xls. 764

765 Operations Research: Applications and Algorithms
Chapter 19 Probabilistic Dynamic Programming to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 765

766 Description In deterministic dynamic programming, a specification of the current state and current decision was enough to tell us with certainty the new state and the costs during the current stage. In many practical problems, these factors may not be known with certainty, even if the current state and decision are known. This chapter explains how to use dynamic programming to solve problems in which the current period’s cost or the next period’s state is random. We call these problems probabilistic dynamic programming problems (or PDPs). 766

767 19.1 When Current Stage Costs Are Uncertain, but the Next Period’s State is Certain
For problems in this section, the next period’s state is known with certainty, but the reward earned during the current stage is not known with certainty. For the price of $1/gallon, the Safeco Supermarket chain has purchased 6 gallons of milk from a local dairy. Each gallon of milk is sold in the chain’s three stores for $2/gallon. The dairy must buy back, for 50¢/gallon, any milk that is left at the end of the day. 767

768 Unfortunately for Safeco, demand for each of the chain’s three stores is uncertain.
Past data indicate that the daily demand at each store is shown in the book. Safeco wants to allocate the 6 gallons of milk to the three stores so as to maximize the expected net daily profit earned from milk. Use dynamic programming to determine how Safeco should allocate the 6 gallons of milk among the three stores. 768

769 With the exception of the fact that the demand is uncertain, this problem is very similar to the resource allocation problem. Observe that since Safeco’s daily purchase costs are always $6, we may concentrate our attention on the problem of allocating the milk to maximize daily revenue earned from the 6 gallons. Define rt(gt) = expected revenue earned from gt gallons assigned to store t ft(x) = maximum expected revenue earned from x gallons assigned to stores t, t+1, …, 3 769

770 For t = 1, 2, we may write ft(x) = max{rt(gt) + ft+1(x - gt)}, where gt must be a member of {0, 1, …, x}, and f3(x) = r3(x).
We begin computing the rt(gt)’s. Complete computations can be found in the book. 770
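The recursion can be sketched as follows. The store demand distributions are placeholders, since the actual table is in the book; the structure matches the slides, with f3(x) = r3(x).

```python
from functools import lru_cache

PRICE, BUYBACK = 2.0, 0.5
# demand[t][d] = P(daily demand at store t equals d) -- assumed numbers
demand = {1: {1: 0.5, 2: 0.5},
          2: {1: 0.3, 2: 0.7},
          3: {2: 0.6, 3: 0.4}}

def r(t, g):
    """Expected revenue from assigning g gallons to store t."""
    return sum(p * (PRICE * min(g, d) + BUYBACK * max(g - d, 0))
               for d, p in demand[t].items())

@lru_cache(maxsize=None)
def f(t, x):
    """Max expected revenue from x gallons allocated to stores t, ..., 3."""
    if t == 3:
        return r(3, x)                   # last store gets whatever remains
    return max(r(t, g) + f(t + 1, x - g) for g in range(x + 1))

print(round(f(1, 6), 3))
```

Only the stage reward is random here; given the allocation gt, the next state x - gt is certain, which is exactly the situation Section 19.1 describes.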

771 19.2 A Probabilistic Inventory Model
In this section, we modify the inventory model from Chapter 6 to allow for uncertain demand. This will illustrate the difficulties involved in solving a PDP for which the state during the next period is uncertain. 771

772 Each period’s demand is equally likely to be 1 or 2 units.
After meeting the current period’s demand out of current production and inventory, the firm’s end-of-period inventory is evaluated, and a holding cost of $1 per unit is assessed. Because of limited capacity, the inventory at the end of each period cannot exceed 3 units. It is required that all demand be met on time. Any inventory on hand at the end of period 3 can be sold at $2 per unit. 772

773 At the beginning of period 1, the firm has 1 unit of inventory. Use dynamic programming to determine a production policy that minimizes the expected net cost incurred during the three periods. 773

774 Solution Define ft(i) to be the minimum expected net cost incurred during the periods t, t+1, …, 3 when the inventory at the beginning of period t is i units. Then f3(i) = min{c(x) + (½)(i + x - 1) + (½)(i + x - 2) - 2[(½)(i + x - 1) + (½)(i + x - 2)]}, where x must be a member of {0, 1, 2, 3, 4} and x must satisfy (2 - i) ≤ x ≤ (4 - i). 774

775 For t = 1, 2, we can derive the recursive relation for ft(i) by noting that for any period t production level x, the expected costs incurred during periods t, t+1, …, 3 are the sum of the expected costs incurred during period t and the expected costs incurred during periods t+1, t+2, …, 3. As before, if x units are produced during period t, the expected cost during period t will be c(x) + (½)(i + x - 1) + (½)(i + x - 2). If x units are produced during period t, the expected cost during periods t+1, t+2, …, 3 is computed as follows. 775

776 Half of the time, the demand during period t will be 1 unit, and the inventory at the beginning of t+1 will be i + x – 1. In this situation, the expected costs incurred during periods t+1, t+2, …,3 is ft+1(i+x-1). Similarly, there is a ½ chance that the inventory at the beginning of period t+1 will be i + x – 2. In this case, the expected cost incurred during periods t+1, t+2, …,3 will be ft+1(i+x-2). In summary, the expected cost during the periods t+1, t+2, …,3 will be (½) ft+1(i+x-1) + (½) ft+1(i+x-2). 776

777 With this in mind, we may write, for t = 1, 2, ft(i) = min{c(x) + (½)(i + x - 1) + (½)(i + x - 2) + (½)ft+1(i + x - 1) + (½)ft+1(i + x - 2)}, where x must be a member of {0, 1, 2, 3, 4} and x must satisfy (2 - i) ≤ x ≤ (4 - i). Generalizing the reasoning that led to this equation yields the following important observation concerning the formulation of PDPs. 777

778 Suppose the possible states during period t+1 are s1, s2, …, sn and the probability that the period t+1 state will be si is pi. Then the minimum expected cost incurred during periods t+1, t+2, …, end of the problem is Σi pi ft+1(si), where ft+1(si) is the minimum expected cost incurred from period t+1 to the end of the problem, given that the state during period t+1 is si. 778

779 We define xt(i) to be a period t production level attaining the minimum in the recursion for ft(i). We now work backward until f1(1) is determined. The relevant computations are summarized in the tables in the book. Since each period’s ending inventory must be nonnegative and cannot exceed 3 units, the state during each period must be 0, 1, 2, or 3. 779
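A sketch of the three-period recursion. The production cost function c(x) below is an assumption (the book specifies its own); the demand probabilities, $1 holding cost, capacity limit, and $2 salvage value follow the slides.

```python
from functools import lru_cache

def c(x):
    return 0 if x == 0 else 3 + 2 * x    # assumed setup + per-unit cost

@lru_cache(maxsize=None)
def f(t, i):
    """Min expected net cost for periods t..3 with i units on hand."""
    best = float("inf")
    for x in range(5):                   # produce 0..4 units
        if not (2 - i <= x <= 4 - i):    # meet demand of 2; capacity of 3
            continue
        # expected holding cost on end-of-period inventory (demand 1 or 2)
        stage = c(x) + 0.5 * (i + x - 1) + 0.5 * (i + x - 2)
        if t == 3:
            # salvage ending inventory at $2 per unit after period 3
            future = -2 * (0.5 * (i + x - 1) + 0.5 * (i + x - 2))
        else:
            future = 0.5 * f(t + 1, i + x - 1) + 0.5 * f(t + 1, i + x - 2)
        best = min(best, stage + future)
    return best

print(f(1, 1))
```

The expectation over the two equally likely demands is exactly the Σ pi ft+1(si) term from the observation above.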

780 (s, S) Policies Consider the following modification of the dynamic lot-size model from Chapter 6, for which there exists an optimal production policy called an (s, S) inventory policy: The cost of producing x > 0 units during a period consists of a fixed cost K and a per-unit variable production cost c. With probability p(x), the demand during a given period will be x. A holding cost of h per unit is assessed on each period’s ending inventory. If we are short, a per-unit shortage cost of d is incurred. 780

781 The goal is to minimize the total expected cost incurred during periods 1, 2, …, T. All demands must be met by the end of period T. For such an inventory problem, Scarf used dynamic programming to prove that there exists an optimal production policy of the following form: For each t (t = 1, 2, …, T) there exists a pair of numbers (st, St) such that if it-1, the entering inventory for period t, is less than st, then an amount St - it-1 is produced; if it-1 ≥ st, then it is optimal not to produce during period t. Such a policy is called an (s, S) policy. 781

782 19.3 How to Maximize the Probability of a Favorable Event Occurring
There are many occasions on which the decision maker’s goal is to maximize the probability of a favorable event occurring. To solve such a problem, we assign a reward of 1 if the favorable event occurs and a reward of 0 if it does not occur. Then the maximization of expected reward will be equivalent to maximizing the probability that the favorable event will occur. 782

783 Martina’s optimal strategy depends on the values of the parameters defining the problem.
This is a kind of sensitivity analysis reminiscent of the sensitivity analysis that we applied to linear programming problems. 783

784 19.4 Further Examples of Probabilistic Dynamic Programming Formulations
Many probabilistic dynamic programming problems can be solved using recursions of the following form (for max problems): ft(i) = max{(expected reward during stage t | i, a) + Σj p(j | i, a, t)ft+1(j)}, where the maximum is over all actions a that are feasible when the stage t state is i, and p(j | i, a, t) is the probability that the stage t+1 state is j, given that the stage t state is i and action a is chosen. 784

785 Probabilistic Dynamic Programming Example
When Sally Mutton arrives at the bank, 30 minutes remain in her lunch break. If Sally makes it to the head of the line and enters service before the end of her lunch break, she earns reward r. However, Sally does not enjoy waiting in lines, so to reflect her dislike for waiting in line, she incurs a cost of c for each minute she waits. During a minute in which n people are ahead of Sally, there is a probability p(x|n) that x people will complete their transactions. 785

786 Suppose that when Sally arrives, 20 people are ahead of her in line.
Use dynamic programming to determine a strategy for Sally that will maximize her expected net revenue (reward-waiting costs). 786

787 Solution When Sally arrives at the bank, she must decide whether to join the line or give up and leave. At any later time, she may also decide to leave if it is unlikely that she will be served by the end of her lunch break. We can work backward to solve the problem. We define ft(n) to be the maximum expected net reward that Sally can receive from time t to the end of her lunch break if at time t, n people are ahead of her. 787

788 We let t = 0 be the present and t = 30 be the end of the problem.
Since t = 29 is the beginning of the last minute of the problem, we write f29(n) = max{0, rp(n|n) - c}. This follows because if Sally chooses to leave at time 29, she earns no reward and incurs no more costs. 788

789 On the other hand, if she stays at time 29, she will incur a waiting cost of c (a revenue of -c) and with probability p(n|n) will enter service and receive a reward r. Thus, if Sally stays, her expected net reward is rp(n|n) - c. For t < 29, we write ft(n) = max{0, rp(n|n) - c + Σk<n p(k|n)ft+1(n - k)}. 789

790 The last recursion follows, because if Sally stays, she will earn an expected reward (as in the t = 29 case) of rp(n|n) - c during the current minute, and with probability p(k|n), there will be n - k people ahead of her; in this case, her expected net reward from time t+1 to time 30 will be ft+1(n - k). If Sally stays, her overall expected reward received from times t+1, t+2, …, 30 will be Σk<n p(k|n)ft+1(n - k). 790

791 Of course, if n people complete their transactions during the current minute, the problem ends, and Sally’s future net revenue will be zero. To determine Sally’s optimal waiting policy, we work backward until f0(20) is computed. Sally stays until the optimal action is “leave” or she begins to be served. Problems in which the decision maker can terminate the problem by choosing a particular action are known as stopping rule problems; they often have a special structure that simplifies the determination of optimal policies. 791
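A sketch of the stopping-rule recursion under a toy completion model: we assume that at most one of the people ahead finishes in any minute (with probability q), and the reward r, waiting cost c, and q values are made up; the slides leave p(x|n) abstract.

```python
from functools import lru_cache

R, C, Q = 50.0, 1.0, 0.5      # reward, per-minute cost, completion prob (assumed)

def p(x, n):
    """Toy p(x|n): at most one of n people completes per minute."""
    if n == 0:
        return 1.0 if x == 0 else 0.0
    return {0: 1 - Q, 1: Q}.get(x, 0.0)

@lru_cache(maxsize=None)
def f(t, n):
    """Max expected net reward from minute t to 30 with n people ahead."""
    if t >= 30:
        return 0.0                       # lunch break is over
    stay = R * p(n, n) - C               # enter service this minute, minus c
    for k in range(n):                   # k < n people finish: n - k remain
        stay += p(k, n) * f(t + 1, n - k)
    return max(0.0, stay)                # leave (worth 0) vs. stay

print(round(f(0, 20), 2))
```

Because "leave" yields exactly 0, the optimal rule is a stopping rule: stay whenever the bracketed stay-value is positive, and quit the first minute it is not.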

792 19.5 Markov Decision Processes
To use dynamic programming in a problem for which the stage is represented by time, one must determine the value of T, the number of time periods over which expected revenue or expected profit is maximized. T is referred to as the horizon length. Suppose a decision maker’s goal is to maximize the expected reward earned over an infinite horizon. In many situations, the expected reward earned over an infinite horizon may be unbounded. 792

793 Two approaches are commonly used to resolve the problem of unbounded expected rewards over an infinite horizon: We can discount rewards (or costs) by assuming that a $1 reward received during the next period will have the same value as a reward of β dollars (0 < β < 1) received during the current period. This is equivalent to assuming that the decision maker wants to maximize expected discounted reward. Then if M is the maximum one-period reward, the maximum expected discounted reward that can be received over an infinite horizon is at most M + βM + β²M + ⋯ = M/(1 - β). Thus, discounting rewards (or costs) resolves the problem of an infinite expected reward. 793

794 The decision maker can choose to maximize the expected reward earned per period. Then he or she would choose a decision during each period in an attempt to maximize the average reward per period as given by limT→∞ E(reward earned during periods 1, 2, …, T)/T. Thus if a $3 reward were earned each period, the total reward during an infinite number of periods would be unbounded, but the average reward per period would equal $3. 794

795 Infinite horizon probabilistic dynamic programming problems are called Markov decision processes (or MDPs). 795

796 Description of MDP An MDP is described by four types of information:
State space At the beginning of each period, the MDP is in some state i, where i is a member of S = {1, 2, …, N}. S is referred to as the MDP’s state space. Decision set For each state i, there is a finite set of allowable decisions, D(i). Transition probabilities Suppose a period begins in state i, and a decision d ∈ D(i) is chosen. Then, with probability p(j|i, d), the next period’s state will be j. The next period’s state depends only on the current period’s state and on the decision chosen during the current period. This is why we use the term Markov decision process. 796

797 Expected rewards During a period in which the state is i and a decision d ∈ D(i) is chosen, an expected reward of rid is received. In an MDP, what criterion should be used to determine the correct decision? Answering this question requires that we discuss the idea of an optimal policy for an MDP. Definition: A policy is a rule that specifies how each period’s decision is chosen. 797

798 Definition: A policy δ is a stationary policy if whenever the state is i, the policy δ chooses (independently of the period) the same decision (call this decision δ(i)). Definition: If a policy δ* has the property that, for all i ∈ S, V(i) = Vδ*(i), then δ* is an optimal policy. 798

799 We now consider three methods that can be used to determine optimal stationary policy:
Policy iteration Linear programming Value iteration, or successive approximations 799

800 Policy Iteration Value Determination Equations
Before we explain the policy iteration method, we need a system of linear equations that can be used to find Vδ(i) for i ∈ S and any stationary policy δ. The value determination equations are Vδ(i) = riδ(i) + β Σj p(j|i, δ(i))Vδ(j) (i = 1, 2, …, N). Howard’s Policy Iteration Method We now describe Howard’s policy iteration method for finding an optimal stationary policy for an MDP (max problem). 800

801 Step 1: Policy evaluation – Choose a stationary policy δ and use the value determination equations to find Vδ(i) (i = 1, 2, …, N). Step 2: Policy improvement – For all states i = 1, 2, …, N, compute Tδ(i) = max{rid + β Σj p(j|i, d)Vδ(j)}, where the maximum is over all d ∈ D(i). If Tδ(i) = Vδ(i) for all i, then δ is optimal; otherwise, the improved policy chooses, in each state, a decision attaining Tδ(i). In a minimization problem, we replace max by min. The policy iteration method is guaranteed to find an optimal policy for the machine replacement example after evaluating a finite number of policies. 801
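The two steps can be sketched on a toy two-state MDP. All rewards and transition probabilities below are invented for illustration; policy evaluation solves the 2x2 value determination equations exactly (Cramer's rule), and policy improvement takes the greedy decision against those values.

```python
BETA = 0.9
r = [[1.0, 2.0], [0.0, 3.0]]              # r[state][decision] (assumed)
p = [[[0.8, 0.2], [0.2, 0.8]],            # p[state][decision][next state]
     [[0.5, 0.5], [0.1, 0.9]]]

def evaluate(policy):
    """Solve (I - beta * P_delta) V = r_delta for the two-state system."""
    a = [[(1.0 if i == j else 0.0) - BETA * p[i][policy[i]][j]
          for j in range(2)] for i in range(2)]
    b = [r[i][policy[i]] for i in range(2)]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]

policy = [0, 0]
while True:
    V = evaluate(policy)                  # Step 1: policy evaluation
    new = [max(range(2), key=lambda d: r[i][d] + BETA *
               sum(p[i][d][j] * V[j] for j in range(2)))
           for i in range(2)]             # Step 2: policy improvement
    if new == policy:
        break                             # no improvement: policy is optimal
    policy = new
print(policy, [round(v, 2) for v in V])
```

Each pass strictly improves the value of the policy, so with finitely many stationary policies the loop must terminate.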

802 Linear Programming It can be shown that an optimal stationary policy for a maximization problem can be found by solving the following LP: For a minimization problem, we solve the following LP: 802

803 The optimal solution to these LPs will have Vi = V(i). Also, if the constraint for state i and decision d is binding, then decision d is optimal in state i. 803

804 Value Iteration There are several versions of value iteration.
We discuss, for a maximization problem, the simplest value iteration scheme, also known as successive approximation. Let Vt(i) be the maximum expected discounted reward that can be earned during t periods if the state at the beginning of the current period is i. Then V0(i) = 0 and Vt(i) = max{rid + β Σj p(j|i, d)Vt-1(j)}, where the maximum is over all d ∈ D(i). 804
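Successive approximation can be sketched on an assumed two-state MDP (all data invented): starting from V0 = 0, repeatedly apply the recursion until the values stop changing appreciably, then read off a greedy policy.

```python
BETA = 0.9
r = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 0.0, (1, 1): 3.0}   # r[state, decision]
p = {(0, 0): [0.8, 0.2], (0, 1): [0.2, 0.8],               # transition rows
     (1, 0): [0.5, 0.5], (1, 1): [0.1, 0.9]}

V = [0.0, 0.0]                           # V_0(i) = 0
for _ in range(500):                     # V_t(i) = max_d [r_id + beta*sum_j p*V]
    V = [max(r[i, d] + BETA * sum(p[i, d][j] * V[j] for j in range(2))
             for d in range(2))
         for i in range(2)]

policy = [max(range(2), key=lambda d: r[i, d] + BETA *
              sum(p[i, d][j] * V[j] for j in range(2)))
          for i in range(2)]
print(policy, [round(v, 2) for v in V])
```

Because β < 1, the iterates form a contraction and converge geometrically to the infinite-horizon values V(i).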

805 Maximizing Average Reward per Period
We now briefly discuss how linear programming can be used to find a stationary policy that maximizes the expected per-period reward earned over an infinite horizon. Consider a decision rule or policy Q that chooses decision d ∈ D(i) with probability qi(d) during a period in which the state is i. A policy Q will be a stationary policy if each qi(d) equals 0 or 1. 805

806 To find a policy that maximizes expected reward per period over an infinite horizon, let πid be the fraction of all periods in which the state is i and the decision d ∈ D(i) is chosen. Then the expected reward per period may be written as Σi Σd∈D(i) πid rid. 806

807 Putting together our objective function and all the constraints yields the following LP: max z = Σi Σd∈D(i) πid rid s.t. Σd∈D(j) πjd = Σi Σd∈D(i) p(j|i, d)πid (for each state j), Σi Σd∈D(i) πid = 1, and πid ≥ 0. 807

808 Operations Research: Applications and Algorithms
Chapter 20 Queuing Theory to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 808

809 Description Each of us has spent a great deal of time waiting in lines. In this chapter, we develop mathematical models for waiting lines, or queues. 809

810 20.1 Some Queuing Terminology
To describe a queuing system, an input process and an output process must be specified. Examples of input and output processes are:

Situation      Input Process                                Output Process
Bank           Customers arrive at the bank                 Tellers serve the customers
Pizza parlor   Requests for pizza delivery are received     Pizza parlor sends out trucks to deliver pizzas

810

811 The Input or Arrival Process
The input process is usually called the arrival process. Arrivals are called customers. We assume that no more than one arrival can occur at a given instant. If more than one arrival can occur at a given instant, we say that bulk arrivals are allowed. Models in which arrivals are drawn from a small population are called finite source models. If a customer arrives but fails to enter the system, we say that the customer has balked. 811

812 The Output or Service Process
To describe the output process of a queuing system, we usually specify a probability distribution – the service time distribution – which governs a customer’s service time. We study two arrangements of servers: servers in parallel and servers in series. Servers are in parallel if all servers provide the same type of service and a customer need only pass through one server to complete service. Servers are in series if a customer must pass through several servers before completing service. 812

813 Queue Discipline The queue discipline describes the method used to determine the order in which customers are served. The most common queue discipline is the FCFS discipline (first come, first served), in which customers are served in the order of their arrival. Under the LCFS discipline (last come, first served), the most recent arrivals are the first to enter service. If the next customer to enter service is randomly chosen from those customers waiting for service, the discipline is referred to as the SIRO discipline (service in random order). 813

814 Finally we consider priority queuing disciplines.
A priority discipline classifies each arrival into one of several categories. Each category is then given a priority level, and within each priority level, customers enter service on an FCFS basis. Another factor that has an important effect on the behavior of a queuing system is the method that customers use to determine which line to join. 814

815 20.2 Modeling Arrival and Service Processes
We define ti to be the time at which the ith customer arrives. In modeling the arrival process, we assume that the interarrival times are independent, continuous random variables described by the random variable A. The assumption that each interarrival time is governed by the same random variable implies that the distribution of arrivals is independent of the time of day or the day of the week. This is the assumption of stationary interarrival times. 815

816 The assumption of stationary interarrival times is often unrealistic, but we may often approximate reality by breaking the time of day into segments. A negative interarrival time is impossible. This allows us to write P(A ≤ c) = ∫0c a(t) dt and P(A > c) = ∫c∞ a(t) dt, where a(t) is the density function of A. We define 1/λ to be the mean or average interarrival time. 816

817 We define λ to be the arrival rate, which will have units of arrivals per hour. An important question is how to choose A to reflect reality and still be computationally tractable. The most common choice for A is the exponential distribution. An exponential distribution with parameter λ has density a(t) = λe-λt. We can show that the average or mean interarrival time is given by E(A) = 1/λ. 817

818 Using the fact that var A = E(A2) – E(A)2, we can show that var A = 1/λ2. Lemma 1: If A has an exponential distribution, then for all nonnegative values of t and h, P(A > t + h | A ≥ t) = P(A > h). 818

819 For reasons that will become apparent, a density that satisfies the equation in Lemma 1 is said to have the no-memory property. The no-memory property of the exponential distribution is important because it implies that if we want to know the probability distribution of the time until the next arrival, then it does not matter how long it has been since the last arrival. 819
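The no-memory property can be checked numerically from the exponential tail P(A > t) = e^(-λt); the rate and time values below are arbitrary illustrative choices.

```python
import math

lam = 2.0                                   # arrival rate (illustrative)

def tail(t):
    """P(A > t) = e^(-lambda * t) for an exponential interarrival time."""
    return math.exp(-lam * t)

t, h = 0.7, 0.4
lhs = tail(t + h) / tail(t)                 # P(A > t + h | A >= t)
rhs = tail(h)                               # P(A > h)
print(abs(lhs - rhs) < 1e-12)               # memorylessness holds exactly
```

Algebraically the check is immediate: e^(-λ(t+h)) / e^(-λt) = e^(-λh), which is why the time already waited drops out.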

820 Relations between Poisson Distribution and Exponential Distribution
If interarrival times are exponential, the probability distribution of the number of arrivals occurring in any time interval of length t is given by the following important theorem. Theorem 1: Interarrival times are exponential with parameter λ if and only if the number of arrivals to occur in an interval of length t follows the Poisson distribution with parameter λt. 820

821 A discrete random variable N has a Poisson distribution with parameter λ if, for n=0,1,2,…,
What assumptions are required for interarrival times to be exponential? Consider the following two assumptions: Arrivals defined on nonoverlapping time intervals are independent. For small Δt, the probability of one arrival occurring between times t and t +Δt is λΔt+o(Δt) refers to any quantity satisfying 821

822 Theorem 2: If assumptions 1 and 2 hold, then Nt follows a Poisson distribution with parameter λt, and interarrival times are exponential with parameter λ; that is, a(t) = λe-λt. Theorem 2 states that if the arrival rate is stationary, if bulk arrivals cannot occur, and if past arrivals do not affect future arrivals, then interarrival times will follow an exponential distribution with parameter λ, and the number of arrivals in any interval of length t is Poisson with parameter λt. 822

823 The Erlang Distribution
If interarrival times do not appear to be exponential, they are often modeled by an Erlang distribution. An Erlang distribution is a continuous random variable (call it T) whose density function f(t) is specified by two parameters: a rate parameter R and a shape parameter k (k must be a positive integer). Given values of R and k, the Erlang density has the following probability density function: f(t) = R(Rt)k-1e-Rt/(k - 1)!. 823

824 Using integration by parts, we can show that if T is an Erlang distribution with rate parameter R and shape parameter k, then E(T) = k/R and var T = k/R2. 824

825 Using EXCEL to Compute Poisson and Exponential Probabilities
EXCEL contains functions that facilitate the computation of probabilities concerning the Poisson and exponential random variables. The syntax of the Poisson EXCEL function is as follows: =POISSON(x,Mean,True) gives the probability that a Poisson random variable with mean = Mean is less than or equal to x. =POISSON(x,Mean,False) gives the probability that a Poisson random variable with mean = Mean is equal to x. 825

826 The syntax of the EXCEL EXPONDIST function is as follows:
=EXPONDIST(x,Lambda,TRUE) gives the probability that an exponential random variable with parameter Lambda assumes a value less than or equal to x. =EXPONDIST(x,Lambda,FALSE) gives the value of the density function λe-λx of an exponential random variable with parameter Lambda at x. 826
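For readers working outside Excel, the same probabilities follow directly from the definitions; this Python sketch mirrors the spreadsheet functions above.

```python
import math

def poisson_pmf(x, mean):
    """P(N = x) for Poisson N, like =POISSON(x, mean, FALSE)."""
    return math.exp(-mean) * mean ** x / math.factorial(x)

def poisson_cdf(x, mean):
    """P(N <= x), like =POISSON(x, mean, TRUE)."""
    return sum(poisson_pmf(k, mean) for k in range(x + 1))

def expon_cdf(x, lam):
    """P(A <= x) for exponential A, like =EXPONDIST(x, lam, TRUE)."""
    return 1 - math.exp(-lam * x)

print(round(poisson_pmf(2, 3.0), 4))   # P(N = 2) with mean 3
print(round(poisson_cdf(2, 3.0), 4))   # P(N <= 2) with mean 3
print(round(expon_cdf(0.5, 2.0), 4))   # P(A <= 0.5) with rate 2
```

The mean of the Poisson here plays the role of λt from Theorem 1, so the same functions answer "how many arrivals in an interval of length t" questions.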

827 Modeling the Service Process
We assume that the service times of different customers are independent random variables and that each customer’s service time is governed by a random variable S having a density function s(t). We let 1/µ be the mean service time for a customer. The variable 1/µ will have units of hours per customer, so µ has units of customers per hour. For this reason, we call µ the service rate. Unfortunately, actual service times may not be consistent with the no-memory property. 827

828 For this reason, we often assume that s(t) is an Erlang distribution with shape parameter k and rate parameter kµ. In certain situations, interarrival or service times may be modeled as having zero variance; in this case, interarrival or service times are considered to be deterministic. For example, if interarrival times are deterministic, then each interarrival time will be exactly 1/λ, and if service times are deterministic, each customer’s service time is exactly 1/µ. 828

829 The Kendall-Lee Notation for Queuing Systems
Standard notation is used to describe many queuing systems. The notation describes a queuing system in which all arrivals wait in a single line until one of s identical parallel servers is free. Then the first customer in line enters service, and so on. To describe such a queuing system, Kendall devised the following notation. Each queuing system is described by six characteristics: 1/2/3/4/5/6 829

830 The first characteristic specifies the nature of the arrival process. The following standard abbreviations are used: M = Interarrival times are independent, identically distributed (iid) random variables having an exponential distribution D = Interarrival times are iid and deterministic Ek = Interarrival times are iid Erlangs with shape parameter k GI = Interarrival times are iid and governed by some general distribution 830

831 The second characteristic specifies the nature of the service times:
M = Service times are iid and exponentially distributed D = Service times are iid and deterministic Ek = Service times are iid Erlangs with shape parameter k. G = Service times are iid and governed by some general distribution 831

832 The third characteristic is the number of parallel servers. The fourth characteristic describes the queue discipline: FCFS = First come, first served LCFS = Last come, first served SIRO = Service in random order GD = General queue discipline The fifth characteristic specifies the maximum allowable number of customers in the system. The sixth characteristic gives the size of the population from which customers are drawn. 832

833 In many important models 4/5/6 is GD/∞/∞. If this is the case, then 4/5/6 is often omitted. M/E2/8/FCFS/10/∞ might represent a health clinic with 8 doctors, exponential interarrival times, two-phase Erlang service times, an FCFS queue discipline, and a total capacity of 10 patients. 833

834 The Waiting Time Paradox
Suppose the time between the arrival of buses at the student center is exponentially distributed with a mean of 60 minutes. If we arrive at the student center at a randomly chosen instant, what is the average amount of time that we will have to wait for a bus? The no-memory property of the exponential distribution implies that no matter how long it has been since the last bus arrived, we would still expect to wait an average of 60 minutes until the next bus arrived. 834
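The no-memory property behind the paradox can be checked directly from the exponential survival function P(T > t) = e^(-t/mean). The sketch below (the function name is ours, not from the text) compares the probability of waiting more than 30 minutes from scratch with the same probability conditioned on having already waited 45 minutes:

```python
import math

def excess_wait_survival(mean, elapsed, extra):
    """P(T > elapsed + extra | T > elapsed) for an exponential
    waiting time T with the given mean (rate = 1/mean)."""
    rate = 1.0 / mean
    return math.exp(-rate * (elapsed + extra)) / math.exp(-rate * elapsed)

# No matter how long we have already waited, the conditional
# distribution of the remaining wait is the original exponential:
fresh = math.exp(-(1 / 60) * 30)          # P(T > 30) from scratch
stale = excess_wait_survival(60, 45, 30)  # P(T > 75 | T > 45)
```

The two probabilities agree exactly, which is the paradox: the elapsed wait carries no information about the remaining wait.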

835 20.3 Birth-Death Processes
We subsequently use birth-death processes to answer questions about several different types of queuing systems. We define the number of people present in any queuing system at time t to be the state of the queuing system at time t. We call πj the steady-state, or equilibrium, probability of state j. The behavior of Pij(t) before the steady state is reached is called the transient behavior of the queuing system. 835

836 A birth-death process is a continuous-time stochastic process for which the system’s state at any time is a nonnegative integer. 836

837 Laws of Motion for Birth-Death
Law 1 With probability λjΔt+o(Δt), a birth occurs between time t and time t+Δt. A birth increases the system state by 1, to j+1. The variable λj is called the birth rate in state j. In most queuing systems, a birth is simply an arrival. Law 2 With probability µjΔt+o(Δt), a death occurs between time t and time t + Δt. A death decreases the system state by 1, to j-1. The variable µj is the death rate in state j. In most queuing systems, a death is a service completion. Note that µ0 = 0 must hold, or a negative state could occur. Law 3 Births and deaths are independent of each other. 837

838 Relation of Exponential Distribution to Birth-Death Processes
Most queuing systems with exponential interarrival times and exponential service times may be modeled as birth-death processes. More complicated queuing systems with exponential interarrival times and exponential service times may often be modeled as birth-death processes by adding the service rates for occupied servers and adding the arrival rates for different arrival streams. 838

839 Derivation of Steady-State Probabilities for Birth-Death Processes
We now show how the πj’s may be determined for an arbitrary birth-death process. The key is to relate (for small Δt) Pij(t+Δt) to Pij(t). The resulting equations are often called the flow balance equations, or conservation of flow equations, for a birth-death process. 839

840 We obtain the flow balance equations for a birth-death process:
We obtain the flow balance equations for a birth-death process: π0λ0 = π1µ1 (for state 0) and πj(λj + µj) = πj-1λj-1 + πj+1µj+1 (for j = 1, 2, …). 840

841 Solution of Birth-Death Flow Balance Equations
If the sum c1 + c2 + c3 + ··· is finite, where cj = (λ0λ1···λj-1)/(µ1µ2···µj), we can solve for π0: the flow balance equations give πj = cjπ0, so π0 = 1/(1 + c1 + c2 + ···). It can be shown that if the sum is infinite, then no steady-state distribution exists. The most common reason for a steady state failing to exist is that the arrival rate is at least as large as the maximum rate at which customers can be served. 841

842 20.4 The M/M/1/GD/∞/∞ Queuing System and the Queuing Formula L=λW
We define p = λ/µ. We call p the traffic intensity of the queuing system. We now assume that 0 ≤ p < 1; the flow balance equations then yield πj = p^j(1 - p). If p ≥ 1, however, the infinite sum “blows up”. Thus, if p ≥ 1, no steady-state distribution exists. 842

843 Derivation of L Throughout the rest of this section, we assume that p < 1, ensuring that a steady-state probability distribution does exist. Once the steady state has been reached, the average number of customers in the queuing system (call it L) is given by L = 0π0 + 1π1 + 2π2 + ··· = p/(1 - p). 843

844 Derivation of Lq In some circumstances, we are interested in the expected number of people waiting in line (or in the queue). We denote this number by Lq. For the M/M/1/GD/∞/∞ system, Lq = p^2/(1 - p). 844

845 Derivation of Ls Also of interest is Ls, the expected number of customers in service. Since Ls = L - Lq, we find Ls = p. 845

846 The Queuing Formula L=λW
We define W as the expected time a customer spends in the queuing system, including time in line plus time in service, and Wq as the expected time a customer spends waiting in line. By using a powerful result known as Little’s queuing formula, W and Wq may be easily computed from L and Lq. We first define the following quantities: λ = average number of arrivals entering the system per unit time 846

847 L = average number of customers present in the queuing system
Lq = average number of customers waiting in line Ls = average number of customers in service W = average time a customer spends in the system Wq = average time a customer spends in line Ws = average time a customer spends in service Theorem 3 – For any queuing system in which a steady-state distribution exists, the following relations hold: L = λW Lq = λWq Ls = λWs 847
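The M/M/1 results above, together with Little's formula, can be collected in a short sketch (the function name and dictionary layout are ours, not from the text):

```python
def mm1_metrics(lam, mu):
    """Steady-state quantities for the M/M/1/GD/inf/inf queue
    (requires traffic intensity p = lam/mu < 1)."""
    p = lam / mu
    if p >= 1:
        raise ValueError("no steady state: p >= 1")
    L = p / (1 - p)          # expected number in system
    Lq = p * p / (1 - p)     # expected number in line
    Ls = p                   # expected number in service
    # Little's formula: divide each L-type quantity by lam
    W, Wq, Ws = L / lam, Lq / lam, Ls / lam
    return {"p": p, "L": L, "Lq": Lq, "Ls": Ls, "W": W, "Wq": Wq, "Ws": Ws}

m = mm1_metrics(7.5, 15)   # e.g. 7.5 arrivals/hour, 15 services/hour
```

Note that L = Lq + Ls and W = Wq + Ws hold automatically, consistent with Theorem 3.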

848 Example 4 Suppose that all car owners fill up when their tanks are exactly half full. At the present time, an average of 7.5 customers per hour arrive at a single-pump gas station. It takes an average of 4 minutes to service a car. Assume that interarrival and service times are both exponential. For the present situation, compute L and W. 848

849 Suppose that a gas shortage occurs and panic buying takes place.
To model the phenomenon, suppose that all car owners now purchase gas when their tanks are exactly three-quarters full. Since each car owner is now putting less gas into the tank during each visit to the station, we assume that the average service time has been reduced to 3 1/3 minutes. How has panic buying affected L and W? 849

850 Solutions We have an M/M/1/GD/∞/∞ system with λ = 7.5 cars per hour and µ = 15 cars per hour. Thus p = 7.5/15 = .50. L = .50/(1 - .50) = 1, and W = L/λ = 1/7.5 = 0.133 hour. Hence, in this situation, everything is under control, and long lines appear to be unlikely. We now have an M/M/1/GD/∞/∞ system with λ = 2(7.5) = 15 cars per hour. Now µ = 60/3.333 = 18 cars per hour, and p = 15/18 = 5/6. Then L = (5/6)/(1 - 5/6) = 5 cars, and W = L/λ = 5/15 = 0.33 hour. Thus, panic buying has caused long lines. 850

851 Problems in which a decision maker must choose between alternative queuing systems are called queuing optimization problems. 851

852 More on L = λW The queuing formula L = λW is very general and can be applied to many situations that do not seem to be queuing problems. L = average amount of quantity present. λ = Rate at which quantity arrives at system. W = average time a unit of quantity spends in system. Then L = λW or W = L/λ 852

853 A Simple Example Example Solution
Our local McDonald’s uses an average of 10,000 pounds of potatoes per week. The average number of pounds of potatoes on hand is 5000 pounds. On the average, how long do potatoes stay in the restaurant before being used? Solution We are given that L = 5000 pounds and λ = 10,000 pounds/week. Therefore W = 5000 pounds/(10,000 pounds/week) = 0.5 week. 853

854 A Spreadsheet for the M/M/1/GD/∞/∞ Queuing System
Figure 12 in the book provides a template that can be used to compute important quantities for the M/M/1/GD/∞/∞ queuing system. 854

855 20.5 The M/M/1/GD/c/∞ Queuing System
We now analyze the M/M/1/GD/c/∞ queuing system. It is identical to the M/M/1/GD/∞/∞ system except that when c customers are present, all arrivals are turned away and are forever lost to the system. The rate diagram for the queuing system can be found in Figure 13 in the book. 855

856 For the M/M/1/GD/c/∞ system, a steady state will exist even if λ ≥ µ.
This is because, even if λ ≥ µ, the finite capacity of the system prevents the number of people in the system from “blowing up”. 856
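A minimal sketch of the M/M/1/GD/c/∞ steady-state probabilities (πj = p^j·π0 truncated at capacity c, with π0 chosen so the probabilities sum to 1; the function name is ours):

```python
def mm1c_probs(lam, mu, c):
    """Steady-state probabilities pi_0..pi_c for M/M/1/GD/c/inf.
    A steady state exists even when lam >= mu, because the finite
    capacity c keeps the number in the system from blowing up."""
    p = lam / mu
    if p == 1:
        return [1.0 / (c + 1)] * (c + 1)   # all states equally likely
    pi0 = (1 - p) / (1 - p ** (c + 1))
    return [pi0 * p ** j for j in range(c + 1)]

# Even with lam twice mu, the probabilities are well defined:
probs = mm1c_probs(2.0, 1.0, 4)
```

With p > 1 the distribution simply piles up near the capacity c instead of diverging.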

857 A Spreadsheet for the M/M/1/GD/c/∞ Queuing System
Figure 14 in the book provides a template that can be used to compute important quantities for the M/M/1/GD/c/∞ queuing system. 857

858 20.6 The M/M/s/GD/∞/∞ Queuing System
We now consider the M/M/s/GD/∞/∞ system. We assume that interarrival times are exponential (with rate λ), service times are exponential (with rate µ), and there is a single line of customers waiting to be served at one of the s parallel servers. If j ≤ s customers are present, then all j customers are in service; if j > s customers are present, then all s servers are occupied, and j – s customers are waiting in line. 858

859 Summarizing, we find that the M/M/s/GD/∞/∞ system can be modeled as a birth-death process with parameters λj = λ for all j, µj = jµ for j = 0, 1, …, s, and µj = sµ for j > s. We define p = λ/(sµ). For p < 1, the steady-state probabilities may be obtained from the birth-death flow balance equations. 859
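As a sketch (function names are ours, not from the text), the M/M/s steady-state computation gives π0 and the probability that an arrival finds all servers busy, i.e. the Erlang C formula:

```python
from math import factorial

def mms_pi0(lam, mu, s):
    """pi_0 for the M/M/s/GD/inf/inf queue; requires p = lam/(s*mu) < 1."""
    a = lam / mu                       # offered load, equal to s*p
    p = a / s
    if p >= 1:
        raise ValueError("no steady state: p >= 1")
    tail = a ** s / (factorial(s) * (1 - p))
    return 1.0 / (sum(a ** j / factorial(j) for j in range(s)) + tail)

def mms_prob_wait(lam, mu, s):
    """P(j >= s): probability an arrival finds all s servers busy
    (the Erlang C formula)."""
    a = lam / mu
    p = a / s
    return a ** s / (factorial(s) * (1 - p)) * mms_pi0(lam, mu, s)
```

For s = 1 these reduce to the M/M/1 results π0 = 1 − p and P(wait) = p.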

860 A Spreadsheet for the M/M/s/GD/∞/∞ Queuing System
Figure 16 in the book provides a template that can be used to compute important quantities for the M/M/s/GD/∞/∞ queuing system. Having a spreadsheet to compute quantities of interest for the M/M/s system enables us to use spreadsheet techniques such as data tables and goal seek to answer questions of interest. 860

861 This is easily done with a one-way data table.
To determine the number of servers that minimizes expected cost per minute, we would like to vary the number of servers (starting with 5) and compute expected cost per minute for different numbers of servers. This is easily done with a one-way data table. 861

862 20.7 The M/M/∞/GD/∞/∞ and GI/G/∞/GD/∞/∞ Models
There are many examples of systems in which a customer never has to wait for service to begin. In such a system, the customer’s entire stay in the system may be thought of as his or her service time. Since a customer never has to wait for service, there is, in essence, a server available for each arrival, and we may think of such a system as an infinite-server (or self-service) system. 862

863 Such a system operates as follows:
Using Kendall-Lee notation, an infinite-server system in which interarrival and service times may follow arbitrary probability distributions may be written as a GI/G/∞/GD/∞/∞ queuing system. Such a system operates as follows: Interarrival times are iid with common distribution A. Define E(A) = 1/λ. Thus λ is the arrival rate. When a customer arrives, he or she immediately enters service. Each customer’s time in the system is governed by a distribution S having E(S)= 1/µ. 863

864 Let L be the expected number of customers in the system in the steady state, and W be the expected time that a customer spends in the system. Since each customer’s time in the system is simply his or her service time, W = 1/µ, and L = λW then yields L = λ/µ. 864

865 20.8 The M/G/1/GD/∞/∞ Queuing System
Next we consider a single-server queuing system in which interarrival times are exponential, but the service time distribution (S) need not be exponential. Let λ be the arrival rate (assumed to be measured in arrivals per hour). Also define 1/µ = E(S) and σ² = var S. In Kendall’s notation, such a queuing system is described as an M/G/1/GD/∞/∞ queuing system. 865

866 An M/G/1/GD/∞/∞ queuing system is not a birth-death process, because the probability that a service completion occurs between t and t+Δt when the state of the system at time t is j depends on the length of time since the last service completion. Determination of the steady-state probabilities for M/G/1/GD/∞/∞ queuing system is a difficult matter. Fortunately, however, utilizing the results of Pollaczek and Khinchin, we may determine Lq, L, Ls, Wq, W, Ws. 866

867 The result is similar to the one for the M/M/1/GD/∞/∞ system.
Pollaczek and Khinchin showed that for the M/G/1/GD/∞/∞ queuing system, Lq = (λ²σ² + p²)/(2(1 - p)). It can also be shown that π0, the fraction of the time that the server is idle, is 1 - p. The result is similar to the one for the M/M/1/GD/∞/∞ system. 867
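A small sketch of the Pollaczek-Khinchin computation (the function name is ours); the other quantities follow from Lq via Ls = p and Little's formula:

```python
def mg1_metrics(lam, mean_s, var_s):
    """Pollaczek-Khinchin results for M/G/1/GD/inf/inf.
    mean_s = E(S) = 1/mu, var_s = var S = sigma^2."""
    p = lam * mean_s
    if p >= 1:
        raise ValueError("no steady state: p >= 1")
    Lq = (lam ** 2 * var_s + p ** 2) / (2 * (1 - p))
    L = Lq + p            # Ls = p, as in the M/M/1 case
    Wq = Lq / lam         # Little's formula
    W = Wq + mean_s
    return L, Lq, W, Wq
```

With exponential service (var_s = mean_s²) the formula collapses to the M/M/1 result Lq = p²/(1 − p); with deterministic service (var_s = 0), Lq is exactly halved.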

868 20.9 Finite Source Models: The Machine Repair Model
With the exception of the M/M/1/GD/c/∞ model, all the models we have studied have displayed arrival rates that were independent of the state of the system. There are two situations where the assumption of a state-independent arrival rate may be invalid: If customers do not want to buck long lines, the arrival rate may be a decreasing function of the number of people present in the queuing system. If arrivals to a system are drawn from a small population, the arrival rate may greatly depend on the state of the system. 868

869 Models in which arrivals are drawn from a small population are called finite source models.
In the machine repair problem, the system consists of K machines and R repair people. At any instant in time, a particular machine is in either good or bad condition. The length of time that a machine remains in good condition follows an exponential distribution with rate λ. Whenever a machine breaks down the machine is sent to a repair center consisting of R repair people. 869

870 The repair center services the broken machines as if they were arriving at an M/M/R/GD/∞/∞ system.
Thus, if j ≤ R machines are in bad condition, a machine that has just broken will immediately be assigned for repair; if j > R machines are broken, j – R machines will be waiting in a single line for a repair worker to become idle. The time it takes to complete repairs on a broken machine is assumed exponential with rate µ. Once a machine is repaired, it returns to good condition and is again susceptible to breakdown. 870

871 When the state is j, there are K-j machines in good condition.
The machine repair model may be modeled as a birth-death process, where the state j at any time is the number of machines in bad condition. Note that a birth corresponds to a machine breaking down and a death corresponds to a machine having just been repaired. When the state is j, there are K-j machines in good condition; since each breaks down at rate λ, the birth rate is λj = (K-j)λ. When the state is j, min(j,R) repair people will be busy. 871

872 Since each occupied repair worker completes repairs at rate µ, the death rate µj is given by µj = min(j,R)µ.
If we define p = λ/µ, an application of the flow balance equations yields the steady-state probability distribution π0, π1, …, πK. 872

873 Using the steady-state probabilities shown on the previous slide, we can determine the following quantities of interest: L = expected number of broken machines Lq = expected number of machines waiting for service W = average time a machine spends broken (down time) Wq = average time a machine spends waiting for service Unfortunately, there are no simple formulas for L, Lq, W, Wq. The best we can do is express these quantities in terms of the πj’s: L = Σ jπj, Lq = Σ over j > R of (j - R)πj, and W = L/λ̄ and Wq = Lq/λ̄, where λ̄ = λ(K - L) is the average breakdown (arrival) rate. 873
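Since these quantities must be computed numerically from the πj's, a sketch such as the following (names are ours) mirrors what the spreadsheet template does, building the birth-death probabilities from birth rates (K-j)λ and death rates min(j,R)µ:

```python
def machine_repair_probs(K, R, lam, mu):
    """Steady-state probabilities for the machine repair model:
    K machines, R repair workers, breakdown rate lam, repair rate mu."""
    c = [1.0]                                # c_0 = 1
    for j in range(1, K + 1):
        birth = (K - j + 1) * lam            # lam_{j-1} = (K-(j-1))*lam
        death = min(j, R) * mu               # mu_j = min(j,R)*mu
        c.append(c[-1] * birth / death)      # c_j = c_{j-1}*lam_{j-1}/mu_j
    total = sum(c)
    return [x / total for x in c]            # normalize so probs sum to 1

def machine_repair_L(K, R, lam, mu):
    """Expected number of broken machines, L = sum of j*pi_j."""
    pi = machine_repair_probs(K, R, lam, mu)
    return sum(j * pi[j] for j in range(K + 1))
```

Lq, W, and Wq follow from the same πj's using the expressions on the slide above.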

874 Figure 19 (Machrep.wk1) gives a spreadsheet template for the machine repair model.
874

875 20.10 Exponential Queues in Series and Open Queuing Networks
In the queuing models that we have studied so far, a customer’s entire service time is spent with a single server. In many situations the customer’s service is not complete until the customer has been served by more than one server. A system like the one shown in Figure 19 in the book is called a k-stage series queuing system. 875

876 Theorem 4 – If (1) interarrival times for a series queuing system are exponential with rate λ, (2) service times for each stage i server are exponential, and (3) each stage has an infinite-capacity waiting room, then interarrival times for arrivals to each stage of the queuing system are exponential with rate λ. For this result to be valid, each stage must have sufficient capacity to service a stream of arrivals that arrives at rate λ; otherwise, the queue will “blow up” at the stage with insufficient capacity. 876

877 Open Queuing Networks Open queuing networks are a generalization of queues in series. Assume that station j consists of sj exponential servers, each operating at rate µj. Customers are assumed to arrive at station j from outside the queuing system at rate rj. These interarrival times are assumed to be exponentially distributed. Once completing service at station i, a customer joins the queue at station j with probability pij and leaves the system with probability 1 - (pi1 + pi2 + ··· + pik). 877

878 Define λj, the rate at which customers arrive at station j.
λ1, λ2, …, λk can be found by solving the following system of linear equations: λj = rj + Σ over i of λi pij (j = 1, 2, …, k). This follows, because a fraction pij of the λi arrivals to station i will next go to station j. Suppose that sjµj > λj holds for all stations. 878

879 Then it can be shown that the probability distribution of the number of customers present at station j and the expected number of customers present at station j can be found by treating station j as an M/M/sj/GD/∞/∞ system with arrival rate λj and service rate µj. If for some j, sj µj≤ λj, then no steady-state distribution of customers exists. Remarkably, the number of customers present at each station are independent random variables. 879
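A sketch of the traffic-equation solution (names are ours): we solve λj = rj + Σi λi pij by fixed-point iteration, which converges when the routing matrix is substochastic (every customer eventually leaves the system):

```python
def station_arrival_rates(r, P, iters=10_000):
    """Solve lam_j = r_j + sum_i lam_i * P[i][j] for an open
    queuing network by fixed-point iteration. r[j] is the external
    arrival rate at station j; P[i][j] is the routing probability."""
    k = len(r)
    lam = list(r)
    for _ in range(iters):
        lam = [r[j] + sum(lam[i] * P[i][j] for i in range(k))
               for j in range(k)]
    return lam

# Two stations: all external arrivals enter station 1, everyone then
# visits station 2, and half the station-2 completions are reworked
# at station 1 (hypothetical numbers for illustration).
rates = station_arrival_rates([1.0, 0.0], [[0.0, 1.0], [0.5, 0.0]])
```

Each station j can then be treated as an M/M/sj/GD/∞/∞ system with arrival rate λj, as stated on the slide above.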

880 That is, knowledge of the number of people at all stations other than station j tells us nothing about the distribution of the number of people at station j! This result does not hold, however, if either interarrival or service times are not exponential. To find L, the expected number of customers in the queuing system, simply add up the expected number of customers present at each station. To find W, the average time a customer spends in the system, simply apply the formula L=λW to the entire system. 880

881 Network Models of Data Communication Networks
Queuing networks are commonly used to model data communication networks. The queuing models enable us to determine the typical delay faced by transmitted data and also to design the network. See the file Compnetwork.xls . We are interested, of course, in the expected delay for a packet. Also, if total network capacity is limited, a natural question is to determine the capacity on each arc that will minimize the expected delay for a packet. 881

882 The usual way to treat this problem is to treat each arc as if it is an independent M/M/1 queue and determine the expected time spent by each packet transmitted through that arc by the M/M/1 formula W = 1/(µc - λ), where c is the capacity of the arc, λ is the packet arrival rate on the arc, and 1/µ is the mean packet length. We are assuming a static routing in which arrival rates to each node do not vary with the state of the network. In reality, many sophisticated dynamic routing schemes have been developed. 882

883 A dynamic routing scheme would realize, for example, if arc AB is congested and arc AD is relatively free we should directly send messages from A to D instead of sending them via route A-B-D. 883

884 20.11 The M/G/s/GD/s/∞ System (Blocked Customers Cleared)
In many queuing systems, an arrival who finds all servers occupied is, for all practical purposes, lost to the system. If arrivals who find all servers occupied leave the system, we call the system a blocked customers cleared, or BCC, system. Assuming that interarrival times are exponential, such a system may be modeled as an M/G/s/GD/s/∞ system. 884

885 In most BCC systems, primary interest is focused on the fraction of all arrivals who are turned away. Since arrivals are turned away only when s customers are present, a fraction πs of all arrivals will be turned away. Hence, an average of λπs arrivals per unit time will be lost to the system. Since an average of λ(1-πs) arrivals per unit time will actually enter the system, we may conclude that W = 1/µ and L = λ(1-πs)/µ. 885

886 This fact is known as Erlang’s loss formula.
For an M/G/s/GD/s/∞ system, it can be shown that πs depends on the service time distribution only through its mean (1/µ). This fact is known as Erlang’s loss formula. In other words, any M/G/s/GD/s/∞ system with an arrival rate λ and mean service time of 1/µ will have the same value of πs. 886
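The loss probability πs can be computed with the standard numerically stable recursion for the Erlang loss formula (the function name is ours):

```python
def erlang_b(a, s):
    """Erlang loss probability pi_s for an M/G/s/GD/s/inf system with
    offered load a = lam/mu. As the slide notes, pi_s depends on the
    service distribution only through its mean 1/mu.
    Stable recursion: B(0) = 1, B(k) = a*B(k-1) / (k + a*B(k-1))."""
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return b
```

For s = 1 this gives the familiar a/(1 + a); the recursion avoids the overflow that direct evaluation of a^s/s! can cause for large s.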

887 A Spreadsheet for the BCC Model
Figure 22 (file Bcc.xls) gives a spreadsheet template for the M/G/s/GD/s/∞ queuing system. 887

888 Using LINGO for BCC Computations
The LINGO @PEL function will yield πs. It may be used to solve a problem where we seek the number of servers minimizing expected cost per unit time when cost is the sum of service cost and cost due to lost business. 888

889 20.12 How to Tell Whether Interarrival Times and Service Times are Exponential
How can we determine whether the actual data are consistent with the assumption of exponential interarrival times and service times? Suppose, for example, that interarrival times of t1, t2, …, tn have been observed. It can be shown that a reasonable estimate of the arrival rate λ is given by λ̂ = n/(t1 + t2 + ··· + tn). 889

890 20.13 Closed Queuing Networks
For manufacturing units attempting to implement just-in-time manufacturing, it makes sense to maintain a constant level of work in progress. For a busy computer network it may be convenient to assume that as soon as a job leaves the system another job arrives to replace it. Systems where there is a constant number of jobs present may be modeled as closed queuing networks. Since the number of jobs is always constant, the distribution of jobs at different servers cannot be independent. 890

891 Let λj equal the arrival rate to server j.
Buzen’s Algorithm can be used to determine steady-state probabilities for closed queuing networks. Let λj equal the arrival rate to server j. Since there are no external arrivals we may set all rj = 0 and obtain the values of the λj from the equation used in the open network situation. That is, λj = Σ over i of λi pij. 891

892 Since jobs never leave the system, for each i, pi1 + pi2 + ··· + pik = 1
This fact causes the equation on the previous slide to have no unique solution. Fortunately, it turns out that we can use any solution of the equation to help us get steady-state probabilities. If we define ρj = λj/µj, then we determine for any state n = (n1, n2, …, nk) its steady-state probability πN(n) from the following equation: πN(n) = (ρ1^n1 ρ2^n2 ··· ρk^nk)/G(N). 892

893 Here G(N) is the sum of ρ1^n1 ρ2^n2 ··· ρk^nk over all states with n1 + n2 + ··· + nk = N. Buzen’s algorithm gives us an efficient way to determine (in a spreadsheet) G(N). Once we have the steady-state probability distribution we can easily determine other measures of effectiveness such as the expected queue length at each server, the expected time a job spends during each visit to a server, the fraction of time a server is busy, and the throughput for each server. 893
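A sketch of Buzen's convolution for single-server stations (names are ours): folding in one station at a time gives G(0), …, G(N) without enumerating the states.

```python
def buzen_G(rho, N):
    """Buzen's convolution algorithm for a closed network of
    single-server stations. rho[j] = lam_j/mu_j may use any solution
    of the traffic equations; returns the list [G(0), ..., G(N)]."""
    g = [1.0] + [0.0] * N        # network with zero stations: G(0)=1
    for r in rho:                # fold in one station at a time
        for n in range(1, N + 1):
            # g[n-1] was already updated this pass, so this is the
            # recursion g_m(n) = g_{m-1}(n) + rho_m * g_m(n-1)
            g[n] = g[n] + r * g[n - 1]
    return g

# Two stations with rho = 1 each and N = 2 jobs: the three states
# (2,0), (1,1), (0,2) each contribute 1, so G(2) = 3.
G = buzen_G([1.0, 1.0], 2)
```

Each steady-state probability is then the state's product of ρj^nj divided by G(N), as in the equation on the previous slide.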

894 20.14 An Approximation for the G/G/m Queuing System
In most situations interarrival times will follow an exponential random variable. Often, however, service times will not follow an exponential distribution. When interarrival times and service times may each follow a nonexponential random variable, we call the queuing system a G/G/m system. The first G indicates that interarrival times always follow the same random variable, while the second G indicates that service times always follow the same random variable. 894

895 The user need only input the following information:
The Allen-Cunneen approximation often gives a good approximation to L, W, Lq, and Wq for G/G/m systems. The file ggm.xls contains a spreadsheet implementation of the Allen-Cunneen approximation. The user need only input the following information: The average number of arrivals per unit time (lambda) in cell B3. The average rate at which customers can be serviced (mu) in cell B4. 895

896 The number of servers (s) in cell B5.
The squared coefficient of variation ((variance of interarrival times)/(mean interarrival time)²) of interarrival times in cell B6. The squared coefficient of variation ((variance of service times)/(mean service time)²) of service times in cell B7. The Allen-Cunneen approximation is exact if interarrival times and service times are exponential. 896
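A sketch of the Allen-Cunneen approximation for Lq (names are ours), built on the Erlang C probability of waiting; it reduces to the exact M/M/s result when both squared coefficients of variation equal 1:

```python
from math import factorial

def erlang_c(a, s):
    """P(wait) for an M/M/s queue with offered load a = lam/mu < s."""
    p = a / s
    pi0 = 1.0 / (sum(a ** j / factorial(j) for j in range(s))
                 + a ** s / (factorial(s) * (1 - p)))
    return a ** s / (factorial(s) * (1 - p)) * pi0

def allen_cunneen_Lq(lam, mu, s, ca2, cs2):
    """Allen-Cunneen approximation to Lq for a G/G/s system.
    ca2, cs2 = squared coefficients of variation of interarrival
    and service times; exact when ca2 = cs2 = 1 (M/M/s)."""
    a = lam / mu
    p = a / s
    if p >= 1:
        raise ValueError("no steady state: p >= 1")
    Lq_mms = erlang_c(a, s) * p / (1 - p)     # exact M/M/s Lq
    return Lq_mms * (ca2 + cs2) / 2.0         # variability correction
```

With s = 1, ca2 = 1, and cs2 = 0 (deterministic service) the result also matches the exact Pollaczek-Khinchin value p²/(2(1 − p)).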

897 20.15 Priority Queuing Models
There are many situations in which customers are not served on a first come, first served (FCFS) basis. Let WFCFS, WSIRO, and WLCFS be the random variables representing a customer’s waiting time in queuing systems under the disciplines FCFS, SIRO, LCFS, respectively. It can be shown that E(WFCFS) = E(WSIRO) = E(WLCFS) Thus, the average time (steady-state) that a customer spends in the system does not depend on which of these three queue disciplines is chosen. 897

898 It can also be shown that var(WFCFS) < var(WSIRO) < var(WLCFS)
Since a large variance is usually associated with a random variable that has a relatively large chance of assuming extreme values, the above relation indicates that relatively large waiting times are most likely to occur with an LCFS discipline and least likely to occur with an FCFS discipline. 898

899 In many organizations, the order in which customers are served depends on the customer’s “type”.
For example, hospital emergency rooms usually serve seriously ill patients before they serve nonemergency patients. Models in which a customer’s type determines the order in which customers undergo service are called priority queuing models. The interarrival times of type i customers are exponentially distributed with rate λi. 899

900 Interarrival times of different customer types are assumed to be independent.
The service time of a type i customer is described by a random variable Si. 900

901 Nonpreemptive Priority Models
In a nonpreemptive model, a customer’s service cannot be interrupted. After each service completion, the next customer to enter service is chosen by giving priority to lower-numbered customer types. In the Kendall-Lee notation, a nonpreemptive priority model is indicated by labeling the fourth characteristic as NPRP. 901

902 Wk = expected steady-state time in the system spent by a type k customer
Wqk = expected steady-state waiting time in line spent by a type k customer Wk = expected steady-state time in the system spent by a type k customer Lqk = expected steady-state number of type k customers waiting in line Lk = expected steady-state number of type k customers in the system 902

903 Mi/Gi/1/NPRP/∞/∞ Model
For the Mi/Gi/1/NPRP/∞/∞ model, it can be shown that Wqk = (Σ over i of λiE(Si²)/2)/((1 - ak-1)(1 - ak)), where ak = ρ1 + ρ2 + ··· + ρk, ρi = λi/µi, and a0 = 0. 903

904 The Mi/Gi/1/NPRP/∞/∞ Model with Customer-Dependent Waiting Costs
Consider a single-server nonpreemptive priority system in which a cost ck is charged for each unit of time that a type k customer spends in the system. If we want to minimize the expected cost incurred per unit time, what priority ordering should be placed on the customer types? Suppose the n customer types are numbered such that c1µ1 ≥ c2µ2 ≥ ··· ≥ cnµn. 904

905 Then expected cost is minimized by giving the highest priority to type 1 customers, the second highest priority to type 2 customers , and so forth and the lowest priority to type n customers. Thus, cost can be minimized by giving the highest priority to customer types with the largest values of ckµk. 905

906 Mi/M/s/NPRP/∞/∞ Model
To obtain tractable analytic results for multiserver priority systems, we must assume that each customer type has exponentially distributed service times with a mean of 1/µ, and that type i customers have interarrival times that are exponentially distributed with rate λi. Such a system with s servers is denoted by the notation Mi/M/s/NPRP/∞/∞. 906

907 For this model, 907

908 Preemptive Priorities
We close our discussion of priority queuing systems by discussing a single-server preemptive queuing system. In a preemptive queuing system, a lower-priority customer can be bumped from service whenever a higher-priority customer arrives. Once no higher-priority customers are present, the bumped type i customer reenters service. In a preemptive resume model, a customer’s service continues from the point at which it was interrupted. 908

909 In a preemptive repeat model, a customer begins service anew each time he or she reenters service.
Of course, if service times are exponentially distributed, the resume and repeat disciplines are identical. In the Kendall-Lee notation, we denote a preemptive queuing system by labeling the fourth characteristic PRP. 909

910 We now consider a single-server Mi/M/1/PRP/∞/∞ system in which the service time of each customer is exponential with mean 1/µ and the interarrival times for the ith customer type are exponentially distributed with rate λi. Then Wk = (1/µ)/((1 - ak-1)(1 - ak)), where a0 = 0 and ak = (λ1 + λ2 + ··· + λk)/µ. 910

911 For obvious reasons, preemptive disciplines are rarely used if the customers are people.
911

912 20.16 Transient Behavior of Queuing Systems
We have assumed that the arrival rate, service rate, and number of servers have stayed constant over time. This allows us to talk reasonably about the existence of a steady state. In many situations the arrival rate, service rate, and number of servers may vary over time. An example is a fast food restaurant. It is likely to experience a much larger arrival rate during the time noon-1:30 pm than during other hours of the day. Also the number of servers will vary during the day, with more servers available during the busier periods. 912

913 When the parameters defining the queuing system vary over time, we say the queuing system is nonstationary. Consider the fast food restaurant. For such a system we are interested in the probability distribution of the number of customers present at each time t. We call these probability distributions transient probabilities. We now assume that at time t interarrival times are exponential with rate λ(t) and that s(t) servers are available at time t, with service times being exponential with rate µ(t). We assume the maximum number of customers present at any time is given by N. 913

914 Assume k customers are currently present at time t. Then we assume
To determine transient probabilities we choose a small length of time Δt and assume at most one event can occur during an interval of length Δt. Assume k customers are currently present at time t. Then we assume The probability of an arrival during an interval Δt is λ(t)*Δt. The probability of more than one arrival during a time interval of length Δt is o(Δt). 914

915 Arrivals during different intervals are independent.
The probability of a service completion during an interval of length Δt is given by min(s(t),k)*µ(t) Δt . The probability of more than one service completion during a time interval of length Δt is o(Δt ). When arrivals are governed by the first three assumptions we say that arrivals follow a Nonhomogeneous Poisson Process. We now define Pi(t) to be the probability that i customers are present at time t. 915

916 Then given knowledge of Pi(t) we may compute Pi(t+Δt) as follows:
We will assume that the system is initially empty, so that P0(0) = 1 and Pi(0) = 0 for i > 0. Then given knowledge of Pi(t) we may compute Pi(t+Δt) as follows: Pi(t+Δt) = Pi(t)(1 - λ(t)Δt - min(s(t),i)µ(t)Δt) + Pi-1(t)λ(t)Δt + Pi+1(t)min(s(t),i+1)µ(t)Δt. These equations are based on the assumption that if the state at time t is i, then during the next Δt, the probability of an arrival is λ(t)Δt and the probability of a service completion is min(s(t),i)µ(t)Δt. 916
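The recurrence can be marched forward in small steps of Δt, as in this sketch (names are ours; λ, µ, and s are passed as functions of t, and arrivals are blocked at the capacity N):

```python
def transient_probs(lam, mu, s, N, t_end, dt=0.001):
    """March the transient probabilities P_i(t) forward from an empty
    system (P_0(0) = 1) for a nonstationary queue with capacity N.
    lam, mu, s are functions of t giving the arrival rate, service
    rate, and number of servers at time t."""
    P = [1.0] + [0.0] * N
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        newP = P[:]
        for i in range(N + 1):
            arr = lam(t) * dt if i < N else 0.0      # no arrivals past capacity
            dep = min(s(t), i) * mu(t) * dt
            newP[i] -= (arr + dep) * P[i]            # flow out of state i
            if i > 0:
                newP[i] += lam(t) * dt * P[i - 1]    # arrival from state i-1
            if i < N:
                newP[i] += min(s(t), i + 1) * mu(t) * dt * P[i + 1]  # service from i+1
        P = newP
    return P
```

The update conserves total probability, so the P_i(t) remain a valid distribution at every step.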

917 Chapter 21 Simulation to accompany
Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 917

918 Description Simulation is a very powerful and widely used management science technique for the analysis and study of complex systems. Simulation may be defined as a technique that imitates the operation of a real-world system as it evolves over time. This is normally done by developing a simulation model. A simulation model usually takes the form of a set of assumptions about the operation of the system, expressed as mathematical or logical relations between the objects of interest in the system. Simulation has its advantages and disadvantages. We will focus our attention on simulation models and the simulation technique. 918

919 21.1 Basic Terminology In most simulation studies, we are concerned with the simulation of some system. Thus, in order to model a system, we must understand the concept of a system. Definition: A system is a collection of entities that act and interact toward the accomplishment of some logical end. Systems generally tend to be dynamic – their status changes over time. To describe this status, we use the concept of the state of a system. 919

920 Systems may be classified as discrete or continuous.
Definition: The state of a system is the collection of variables necessary to describe the status of the system at any given time. Customers arrive and depart, the status of the system changes. To describe these changes in status, we require a set of variables called the state variables. In a system, an object of interest is called an entity, and any properties of an entity are called attributes. Systems may be classified as discrete or continuous. 920

921 There are two types of simulation models, static and dynamic.
Definition: A discrete system is one in which the state variables change only at a discrete set of points in time. A bank is an example of a discrete system. Definition: A continuous system is one in which the state variables change continuously over time. There are two types of simulation models, static and dynamic. Definition: A static simulation model is a representation of a system at a particular point in time. 921

922 We usually refer to a static simulation as a Monte Carlo simulation.
Definition: A dynamic simulation is a representation of a system as it evolves over time. Within these two classifications, a simulation may be deterministic or stochastic. A deterministic simulation model is one that contains no random variables; a stochastic simulation model contains one or more random variables. 922

923 21.2 An Example of a Discrete-Event Simulation
To simulate a queuing system, we first have to describe it. We assume arrivals are drawn from an infinite calling population. There is unlimited waiting room capacity, and customers will be served in the order of their arrival (FCFS). Arrivals occur one at a time in a random fashion. All arrivals are eventually served, with the distribution of service times as shown in the book. 923

924 Service times are also assumed to be random
Service times are also assumed to be random. After service, all customers return to the calling population. For this example, we use the following variables to define the state of the system: (1) the number of customers in the system; (2) the status of the server – that is, whether the server is busy or idle; and (3) the time of the next arrival. An event is defined as a situation that causes the state of the system to change instantaneously. 924

925 Next we schedule the departure time of the first customer.
All the information about them is maintained in a list called the event list. Time in a simulation is maintained using a variable called the clock time. We begin this simulation with an empty system and arbitrarily assume that our first event, an arrival, takes place at clock time 0. Next we schedule the departure time of the first customer. Departure time = clock time now + generated service time 925

Also, we now schedule the next arrival into the system by randomly generating an interarrival time from the interarrival time distribution and setting the arrival time as Arrival time = clock time now + generated interarrival time Both these events and their scheduled times are maintained on the event list. This approach to simulation is called the next-event time-advance mechanism, because of the way the clock time is updated. We advance the simulation clock to the time of the most imminent event. 926

927 As we move from event to event, we carry out the appropriate actions for each event, including any scheduling of future events. The jump to the next event in the next-event mechanism may be a large one or a small one; that is, the jumps in this method are variable in size. We contrast this approach with the fixed- increment time-advance method. With this method, we advance the simulation clock in increments of Δt time units, where Δt is some appropriate time unit, usually 1 time unit. 927

928 For most models, however, the next-event mechanism tends to be more efficient computationally.
Consequently, we use only the next-event approach for the development of the models for the rest of the chapter. To demonstrate the simulation model, we need to define several variables: TM = clock time of the simulation AT = scheduled time of the next arrival 928

929 DT = scheduled time of the next departure
SS = status of the server (1=busy, 0=idle) WL = length of the waiting line MX = length (in time units) of a simulation run We now begin the simulation by initializing all the variables. Details of the example are found in the book. This simple example illustrates some of the basic concepts in simulation and the way in which simulation can be used to analyze a particular problem. 929

930 21.3 Random Numbers and Monte Carlo Simulation
The procedure of generating these times from the given probability distributions is known as sampling from probability distributions, or random variate generation, or Monte Carlo sampling. We will discuss several different methods of sampling from discrete distributions. The principle of sampling from discrete distributions is based on the frequency interpretation of probability. 930

931 In addition to obtaining the right frequencies, the sampling procedure should be independent; that is, each generated service time should be independent of the service times that precede it and follow it. This procedure of segmentation and using a roulette wheel is equivalent to generating integer random numbers between 00 and 99. This follows from the fact that each random number in a sequence has an equal probability of showing up, and each random number is independent of the numbers that precede and follow it. 931

932 A random number, Ri, is defined as an independent random sample drawn from a continuous uniform distribution whose probability density function (pdf) is given by f(x) = 1 for 0 ≤ x ≤ 1, and f(x) = 0 otherwise. 932

933 Random Number Generators
Since our interest in random numbers is for use within simulations, we need to be able to generate them on a computer. This is done by using mathematical functions called random number generators. Most random number generators use some form of congruential relationship. Examples of such generators include the linear congruential generator, the multiplicative generator, and the mixed generator. The linear congruential generator is by far the most widely used. 933
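As a rough sketch (not the book's code), a linear congruential generator can be written in a few lines of Python. The parameters a, c, and m below are the common Numerical Recipes choices and are used here purely for illustration.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Linear congruential generator: x_{n+1} = (a*x_n + c) mod m.
    Yields pseudorandom U(0,1) numbers by dividing each x by m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m

gen = lcg(seed=12345)
sample = [next(gen) for _ in range(5)]
# Every value lies in [0, 1), and the same seed reproduces the same sequence
# (replicability, one of the characteristics listed on the next slide).
```

Because the sequence is fully determined by the seed, such numbers are pseudorandom rather than truly random.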

934 Random number generators must have these important characteristics:
Each random number generated using these methods will be a decimal number between 0 and 1. Random numbers generated using congruential methods are called pseudorandom numbers. Random number generators must have these important characteristics: The routine must be fast; The routine should not require a lot of core storage; The random numbers should be replicable; and The routine should have a sufficiently long cycle. 934

935 Most programming languages have built-in library functions that provide random (or pseudorandom) numbers directly. 935

936 Computer Generation of Random Numbers
We now take the method of Monte Carlo sampling a stage further and develop a procedure using random numbers generated on a computer. The idea is to transform the U(0,1) random numbers into integer random numbers between 00 and 99 and then to use these integer random numbers to achieve the segmentation by numbers. We now formalize this procedure and use it to generate random variates for a discrete random variable. 936

937 The procedure consists of two steps:
We develop the cumulative probability distribution (cdf) for the given random variable, and We use the cdf to allocate the integer random numbers directly to the various values of the random variable. 937
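The two-step procedure can be sketched in Python. The values and probabilities below are illustrative, not the book's data; the idea is only that a U(0,1) number is mapped to a value through the cumulative distribution.

```python
import random

# Hypothetical discrete distribution (values and probabilities are made up).
values = [1, 2, 3, 4]
probs  = [0.10, 0.30, 0.40, 0.20]

# Step 1: build the cumulative probability distribution (cdf).
cdf = []
total = 0.0
for p in probs:
    total += p
    cdf.append(total)          # [0.10, 0.40, 0.80, 1.00]

# Step 2: allocate a U(0,1) random number to a value via the cdf.
def sample():
    r = random.random()
    for v, c in zip(values, cdf):
        if r <= c:
            return v
    return values[-1]          # guard against floating-point rounding at 1.0
```

Over many draws, each value is returned with approximately its assigned probability, which is the frequency interpretation of probability mentioned earlier.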

938 21.4 An Example of Monte Carlo Simulation
The book uses a Monte Carlo simulation to simulate a news vendor problem. The procedure in this simulation is different from the queuing simulation, in that the present simulation does not evolve over time in the same way. Here, every day is an independent simulation. Such simulations are commonly referred to as Monte Carlo simulations. 938
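The structure of such a Monte Carlo simulation can be sketched as follows. The demand distribution, cost, and price below are invented for illustration (the book's news vendor data are not reproduced here); the point is that each simulated day is an independent replication.

```python
import random

random.seed(0)

# Illustrative parameters (not the book's data): buy at $1.50, sell at $2.50,
# unsold papers are worthless; daily demand is equally likely to be 40..60.
COST, PRICE, ORDER = 1.50, 2.50, 50

def one_day():
    demand = random.randint(40, 60)
    sold = min(demand, ORDER)
    return PRICE * sold - COST * ORDER   # profit for one independent day

profits = [one_day() for _ in range(10000)]
avg_profit = sum(profits) / len(profits)
# Because the days are independent, the average over many simulated days
# estimates the expected daily profit for this order quantity.
```

Repeating this for several order quantities and comparing the averages is the Monte Carlo approach to choosing the best order.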

939 21.5 Simulations with Continuous Random Variables
In many simulations, it is more realistic and practical to use continuous random variables. We present and discuss several procedures for generating random variates from continuous distributions. The basic principle is similar to the discrete case. We first generate a U(0,1) random number and then transform it into a random variate from the specified distribution. 939

940 The selection of a particular algorithm will depend on the distribution from which we want to generate, taking into account such factors as the exactness of the random variates, the computational and storage efficiencies, and the complexity of the algorithm. The two most commonly used algorithms are the inverse transformation method (ITM) and the acceptance-rejection method (ARM). 940

941 Inverse Transformation Method
The inverse transformation method is generally used for distributions whose cumulative distribution function can be obtained in closed form. Examples include the exponential, the uniform, the triangular, and the Weibull distributions. For distributions whose cdf does not exist in closed form, it may be possible to use some numerical method, such as a power-series expansion, within the algorithm to evaluate the cdf. 941

942 The ITM is relatively easy to describe and execute.
It consists of the following steps: Step 1: Given the probability density function f(x) for a random variable X, obtain the cumulative distribution function F(x) as F(x) = ∫ f(t) dt, integrated from the lower limit of the distribution up to x. Step 2: Generate a random number r. Step 3: Set F(x) = r and solve for x. 942

943 We consider the distribution given by the function f(x) = x/2 for 0 ≤ x ≤ 2, and f(x) = 0 otherwise.
A function of this type is called a ramp function. To obtain random variates from the distribution using the inverse transformation method, we first compute the cdf as F(x) = x²/4 for 0 ≤ x ≤ 2. 943

944 In Step 2, we generate a random number r.
Finally, in Step 3, we set F(x) = r and solve for x. Since service times are defined only for positive values of x, we take the positive root x = 2√r as the solution for x. This equation is called a random variate generator or a process generator. 944

945 Thus, to obtain a service time, we first generate a random number and then transform it using the preceding equation. As this example shows, the major advantage of the inverse transformation method is its simplicity and ease of application. 945
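A sketch of this process generator in Python, assuming the ramp density f(x) = x/2 on [0, 2] (so that F(x) = x²/4 and setting F(x) = r gives x = 2√r):

```python
import math, random

random.seed(42)

def ramp_variate():
    """ITM generator for the ramp density f(x) = x/2 on [0, 2]:
    F(x) = x**2 / 4, so setting F(x) = r yields x = 2*sqrt(r)."""
    r = random.random()          # Step 2: generate a random number
    return 2 * math.sqrt(r)      # Step 3: solve F(x) = r for x

xs = [ramp_variate() for _ in range(100000)]
mean = sum(xs) / len(xs)
# The true mean of this density is the integral of x * (x/2) over [0, 2],
# which equals 4/3, so the sample mean should land close to 1.333.
```

Each call uses one random number and one closed-form transformation, which is why the ITM is so simple when the cdf can be inverted.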

946 Acceptance-Rejection Method
There are several important distributions, including the Erlang (used in queuing models) and the beta (used in PERT), whose cumulative distribution functions do not exist in closed form. For these distributions, we must resort to other methods of generating random variates, one of which is the acceptance-rejection method (ARM). This method is generally used for distributions whose domains are defined over finite intervals. 946

947 Given a distribution whose pdf, f(x), is defined over the interval a ≤ x ≤ b, the algorithm consists of the following steps: Step 1: Select a constant M such that M is the largest value of f(x) over the interval [a, b]. Step 2: Generate two random numbers, r1 and r2. Step 3: Compute x* = a + (b – a)r1. (This ensures that each member of [a, b] has an equal chance to be chosen as x*.) Step 4: Evaluate the function f(x) at the point x*. Let this be f(x*). 947

948 Step 5: If r2 ≤ f(x*)/M, deliver x* as a random variate from the distribution whose pdf is f(x). Otherwise, reject x* and go back to Step 2. Note that the algorithm continues looping back to Step 2 until a random variate is accepted. This may take several iterations. For this reason, the algorithm can be relatively inefficient. The efficiency, however, is highly dependent on the shape of the distribution. 948

949 There are several ways by which the method can be made more efficient.
One of these is to use a function in Step 1 instead of a constant. We now give an intuitive justification of the validity of the ARM. In particular, we want to show that the ARM does generate observations from the given random variable X. 949

950 Direct and Convolution Methods for the Normal Distribution
Both the inverse transformation method and the acceptance-rejection method are inappropriate for the normal distribution, because (1) the cdf does not exist in closed form and (2) the distribution is not defined over a finite interval. Other methods are therefore used, such as an algorithm based on convolution techniques and a direct transformation algorithm that produces two standard normal variates with mean 0 and variance 1. 950

951 The Convolution Algorithm
In the convolution algorithm, we make direct use of the Central Limit Theorem. The Central Limit Theorem states that the sum Y of n independent and identically distributed random variables (say Y1, Y2, …, Yn, each with mean µ and finite variance σ2) is approximately normally distributed with mean nµ and variance nσ2. If we want to generate a normal variate X with mean µ and variance σ2, we first generate Z using this process generator and then transform it using the relation X = µ + σZ. This approach is unique to the normal distribution. 951
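A sketch of the convolution generator, using the common choice of n = 12 U(0,1) numbers (so the sum has mean 6 and variance exactly 1). The parameters µ = 5, σ = 2 in the usage example are arbitrary.

```python
import random

random.seed(3)

def normal_convolution(mu=0.0, sigma=1.0, n=12):
    """CLT-based generator: the sum of n U(0,1) numbers has mean n/2 and
    variance n/12, so standardizing it gives an approximate N(0,1) variate.
    With n = 12 the denominator is 1 and Z reduces to (sum - 6)."""
    s = sum(random.random() for _ in range(n))
    z = (s - n / 2) / (n / 12) ** 0.5
    return mu + sigma * z                 # transform via X = mu + sigma * Z

xs = [normal_convolution(mu=5.0, sigma=2.0) for _ in range(50000)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
# The sample mean and variance should land near 5 and 4 respectively.
```

The approximation truncates the tails (Z can never exceed ±6 with n = 12), which is one reason direct methods such as Box-Muller are often preferred.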

952 The Direct Method The direct method for the normal distribution was developed by Box and Muller (1958). Although it is not as efficient as some of the newer techniques, it is easy to apply and execute. The algorithm generates two U(0,1) random numbers, r1 and r2, and then transforms them into two normal variates, each with mean 0 and variance 1, using the direct transformation. 952

953 It is easy to transform these standardized normal variates into normal variates X1 and X2 from the distribution with mean µ and variance σ2, using the equations X1 = µ + σZ1 and X2 = µ + σZ2. 953
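The Box-Muller transformation, including the rescaling step, can be sketched as follows. The values µ = 10, σ = 3 in the usage example are arbitrary illustrations.

```python
import math, random

random.seed(11)

def box_muller(mu=0.0, sigma=1.0):
    """Box-Muller (1958): transform two U(0,1) numbers r1, r2 into two
    independent standard normal variates Z1, Z2, then rescale each one
    with X = mu + sigma * Z."""
    r1, r2 = random.random(), random.random()
    z1 = math.sqrt(-2 * math.log(r1)) * math.cos(2 * math.pi * r2)
    z2 = math.sqrt(-2 * math.log(r1)) * math.sin(2 * math.pi * r2)
    return mu + sigma * z1, mu + sigma * z2

pairs = [box_muller(mu=10.0, sigma=3.0) for _ in range(25000)]
xs = [x for pair in pairs for x in pair]   # 50,000 normal variates
mean = sum(xs) / len(xs)
```

Unlike the convolution approach, this transformation is exact: the pair (Z1, Z2) has exactly the standard normal distribution.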

954 21.6 An Example of a Stochastic Simulation
Cabot Inc. is a large mail order firm in Chicago. Orders arrive into the warehouse by telephone. At present, Cabot maintains 10 operators on-line 24 hours a day. The operators take the orders and feed them directly into a central computer, using terminals. Each operator has one terminal. At present, the company has a total of 11 terminals. That is, if all terminals are working, there will be 1 spare terminal. 954

955 Cabot managers believe that the terminal system needs evaluation, because the downtime of operators due to broken terminals has been excessive. They feel that the problem can be solved by the purchase of some additional terminals for the spares pool. It has been determined that a new terminal will cost a total of $75 per week. It has also been determined that the cost of terminal downtime, in terms of delays, lost orders, and so on is $1000 per week. 955

956 This model is a version of the machine repair problem.
Given this information, the Cabot managers would like to determine how many additional terminals they should purchase. This model is a version of the machine repair problem. It is easy to find an analytical solution to the problem using birth-death processes. However, in analyzing the historical data for the terminals, it has been determined that although the breakdown times can be represented by the exponential distribution, the repair times can be adequately represented only by the triangular distribution. 956

957 This implies that analytical methods cannot be used and that we must use simulation.
To simulate this system, we first require the parameters of both distributions. The data show that the breakdown rate is exponential and equal to 1 per week per terminal. In other words, the time between breakdowns for a terminal is exponential with a mean equal to 1 week. Analysis of the repair times shows that this distribution can be represented by the triangular distribution, which has a mean of 1/13.33 ≈ 0.075 week. 957

958 The repair staff on average can repair 13.33 terminals per week.
To find the optimal number of terminals, we must balance the cost of the additional terminals against the increased revenues generated as a result of the increase in the number of terminals. In this simulation we increase the number of terminals in the system, n, from the present total of 11 in increments of 1. 958

959 For this fixed value of n, we then run our simulation model to estimate the net revenue.
Net revenue here is defined as the difference between the increase in revenues due to the additional terminals and the cost of these additional terminals. We keep on adding terminals until the net revenue position reaches a peak. To calculate the net revenue, we first compute the average number of on-line terminals, ELn, for a fixed number of terminals in the system, n. 959

960 Once we have a value of ELn, we can compute the expected weekly downtime cost, given by 1000(10 − ELn). Then the increase in revenue as a result of increasing the number of terminals from 11 to n is 1000(ELn − EL11). Mathematically, we compute ELn = (A1 + A2 + … + Am)/T 960

961 T = length of simulation
where T = length of simulation N(t) = number of terminals on-line at time t (0≤t≤T) Ai = area of rectangle under N(t) between ei-1 and ei (where ei is the time of the ith event) m = number of events that occur in the interval [0,T] Between time 0 and time e1, the time of the first event, the total on-line time for all the terminals is given by 10e1, since each terminal is on-line for a period of e1 time units. 961

962 If we now run this simulation over T time units and sum up the areas A1, A2, A3, …, we can get an estimate for ELn by dividing this sum by T. This statistic is called a time-average statistic. We would like to set up the process in such a way that it will be possible to collect the statistics to compute the areas A1, A2, A3, …. That is, as we move from event to event, we would like to keep track of at least the number of terminals on-line between the events and the time between events. 962
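The time-average computation can be sketched in a few lines. Because N(t) is piecewise constant between events, each event interval contributes a rectangle Ai = N × (ei − ei−1); the event times and on-line counts below are made-up illustrative data, not simulation output.

```python
# Illustrative event times e_0 .. e_m (with e_m = T) and the number of
# terminals on-line during each interval [e_{i-1}, e_i).
events = [0.0, 1.5, 2.0, 3.5, 4.0]
n_online = [10, 9, 10, 9]

T = events[-1] - events[0]
# Sum the rectangle areas A_i = N * (e_i - e_{i-1}) under N(t).
area = sum(n * (t1 - t0)
           for n, t0, t1 in zip(n_online, events, events[1:]))
time_avg = area / T        # time-average statistic: estimate of EL_n
# For this data: area = 15 + 4.5 + 15 + 4.5 = 39 and T = 4, so time_avg = 9.75.
```

In a real run, the counters would be updated incrementally as the simulation moves from event to event, rather than stored in full lists.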

963 We first define the state of the system as the number of terminals in the repair facility.
The only time the state of the system will change is when there is either a breakdown or a completion of a repair. Therefore, there are two events in this simulation: breakdown and completion of repairs. To set up the simulation, our first task is to determine the process generators for both the breakdown and the repair times. 963

964 We use the ITM to develop the process generators.
For the exponential distribution the process generator is simply x = −log r. In the case of the repair times, applying the ITM to the triangular distribution gives the two piecewise process generators shown in the book. 964

965 For each n, we start the simulation in the state where there are no terminals in the repair facility. In this state, all 10 operators are on-line and any remaining terminals are in the spares pool. Our first action in the simulation is to schedule the first series of events, the breakdown times for the terminals presently on-line. Having scheduled these events, we next determine the first event, the first breakdown, by searching through the current event list. 965

966 To process a breakdown, we take two separate series of actions
We then move the simulation clock to the time of this event and process this breakdown. To process a breakdown, we take two separate series of actions: Determine whether a spare is available. Determine whether the repair staff is idle. These actions are summarized in the system flow diagram shown in the book in Figure 17. If the next event is not a breakdown, we process a completion of a repair. 966

967 To process the completion of a repair, we also undertake two series of actions.
At the completion of a repair, we have an additional working terminal, so we determine whether the terminal goes directly to an operator or to the spares pool. We check the repair queue to see whether any terminals are waiting to be repaired. We proceed with the simulation by moving from event to event until the termination time T. At this time, we calculate all the relevant measures of performance from the statistical counters. 967

968 Our key measure is the net revenue for the current value of n.
If this revenue is greater than the revenue for a system with n − 1 terminals, we increase the value of n by 1 and repeat the simulation with n + 1 terminals in the system. Otherwise, the net revenue has reached a peak. The simulation outlined in this example can be used to analyze other policy options that management may have. 968

969 The simulation model provides a very flexible mechanism for evaluating alternative policies.
969

970 21.7 Statistical Analysis of Simulations
As previously mentioned, output data from simulation always exhibit random variability, since random variables are input to the simulation model. We must utilize statistical methods to analyze output from simulations. The overall measure of variability is generally stated in the form of a confidence interval at a given level of confidence. Thus, the purpose of the statistical analysis is to estimate this confidence interval. 970

971 Determination of the confidence intervals in simulation is complicated by the fact that output data are rarely, if ever, independent. This means that the classical methods of statistics, which assume independence, are not directly applicable to the analysis of simulation output data. We must modify the statistical methods to make proper inferences from simulation data. 971

972 In addition to the problem of autocorrelation, we face a second problem, in that the specification of the initial conditions of the system at time 0 may influence the output data. This initial period of time before a simulation reaches steady state is called the transient period or warmup period. There are two ways of overcoming the problems associated with the transient period. The first approach is to use a set of initial conditions that are representative of the system in a steady state. 972

973 The alternative approach is to let the simulation run for a while and discard the initial part of the simulation. Since each simulation model is different, it is up to the analyst to determine when the transient period ends. Although this is a difficult process, there are some general guidelines one can use. 973

974 Simulation Types For the purpose of analyzing output data, we generally categorize simulations into one of two types: terminating simulations and steady- state simulations. A terminating simulation is one that runs for a duration of time TE, where E is a specified event (or events) that stops the simulation. A steady-state simulation is one that runs over a long period of time; that is, the length of the simulation goes to “infinity”. 974

975 The analysis for steady-state simulations is much more involved.
Often, the type of model determines which type of output analysis is appropriate for a particular simulation. However, the system or model may not always be the best indicator of which simulation would be the most appropriate. It is quite possible to use the terminating simulation approach for systems more suited to steady-state simulations, and vice versa. The analysis for steady-state simulations is much more involved. 975

976 The terminating event, E, is a fixed time.
To illustrate the terminating simulation approach, we use an example from the Cabot case. We assume there are 11 terminals in the system and we perform only 10 independent terminating runs of the simulation model. The terminating event, E, is a fixed time. That is, all 10 simulations are run for the same length of time. The overall average for these 10 runs for the expected number of terminals on-line turns out to be 976

977 If we now calculate the sample variance, we find S2 = 0.018.
Since t(.025,9) = 2.26, we obtain the 95% confidence interval for this sample. We could have also computed t(.025,9) in EXCEL with the formula =TINV(0.05,9). This yields 2.26, which is consistent with the t table. 977
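The confidence-interval arithmetic can be sketched in Python. The 10 run averages below are invented for illustration (only t(.025,9) = 2.26 is taken from the text); the half-width formula t × √(S²/n) is the standard one for independent replications.

```python
import math

# Made-up averages from 10 hypothetical independent terminating runs.
runs = [9.10, 9.05, 9.22, 8.98, 9.15, 9.08, 9.30, 9.02, 9.12, 9.18]

n = len(runs)
xbar = sum(runs) / n
s2 = sum((x - xbar) ** 2 for x in runs) / (n - 1)   # sample variance S**2
t = 2.262                                           # t(.025, 9)
half_width = t * math.sqrt(s2 / n)
ci = (xbar - half_width, xbar + half_width)         # 95% CI for the mean
```

Note this relies on the 10 run averages being independent and approximately normal, which is why independent terminating runs are used rather than one long correlated run.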

978 As we saw in this example, the terminating simulation approach offers a relatively easy method for analyzing simulation data and may be more efficient for a given problem. 978

979 21.8 Simulation Languages One of the most important aspects of a simulation study is the computer programming. Several special-purpose computer simulation languages have been developed to simplify programming. The best-known and most readily available simulation languages include GPSS, GASP IV, and SLAM. Most simulation languages use one of two different modeling approaches or orientations: event scheduling or process interaction. 979

980 GPSS uses the process-interaction approach.
SLAM allows the modeler to use either approach or even a mixture of the two, whichever is the most appropriate for the model being analyzed. Of the general-purpose languages, FORTRAN is the most commonly used in simulation. In fact, several simulation languages, including GASP IV and SLAM, use a FORTRAN base. 980

981 For the rest of the program, we use the GASP routines.
To use GASP IV we must provide a main program, an initialization routine, and the event routines. For the rest of the program, we use the GASP routines. Because of these prewritten routines, GASP IV provides a great deal of programming flexibility. GPSS, in contrast to GASP, is a highly structured special-purpose language. GPSS does not require writing a program in the usual sense. 981

982 Building a GPSS model then consists of combining these sets of blocks into a flow diagram so that it represents the path an entity takes as it passes through the system. SLAM was developed by Pritsker and Pegden (1979). It allows us to develop simulation models as network models, discrete-event models, continuous models, or any combination of these. In Chapter 10, we will show you how to use the user-friendly, powerful Process Model package, which is included on the book's CD-ROM. 982

983 The simulation languages offer several advantages.
The decision of which language to use is one of the most important that a modeler or an analyst must make in performing a simulation study. The simulation languages offer several advantages. The most important of these is that the special-purpose languages provide a natural framework for simulation modeling and most of the features needed in programming a simulation model. 983

984 21.9 The Simulation Process
We now discuss the process for a complete simulation study and present a systematic approach of carrying out a simulation. A simulation study normally consists of several distinct stages. (See Figure 21 in the book) However, not all simulation studies consist of all these stages or follow the order stated here. On the other hand, there may even be considerable overlap between some of these stages. 984

985 The initial stage of any scientific study, including a simulation project, requires an explicit statement of the objectives of the study. The next stage is the development of the model and the collection of data. We next put it into a form in which it can be analyzed on the computer. This usually involves developing a computer program for the model. Once the program has been developed and debugged, we determine whether the program is working properly. 985

986 We now move to the validation stage.
In other words, is the program doing what it is supposed to do? This process is called the verification step and is usually difficult, since for most simulations, we will not have any results with which to compare the computer output. We now move to the validation stage. In this step we validate the model to determine whether it realistically represents the system being analyzed and whether the results from the model will be reliable. 986

987 If we are satisfied at this stage with the performance of the model, we can use the model to conduct the experiments to answer the questions at hand. The data generated by the simulation experiments must be collected, processed, and analyzed. The results are analyzed not only as the solution to the model but also in terms of statistical reliability and validity. Finally, after the results are processed and analyzed, a decision must be made whether to perform any additional experiments. 987

988 Operations Research: Applications and Algorithms
Chapter 22 Simulation with Process Model to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 988

989 Description In Chapter 9 we learned how to build simulation models of many different situations. In this chapter we will explain how the powerful, user-friendly simulation package Process Model can be used to simulate queuing systems. 989

990 22.1 Simulating an M/M/1 Queuing System
After installing Process Model you can start it by selecting Start – Programs – Process Model. You will see the Process Model screen appear. The book contains a labeled diagram showing key icons. Process Model is easy-to-use software that allows you to simulate queuing systems. We use it to simulate an M/M/1 queuing system having λ=10 arrivals/hour and µ=15 customers/hour. (See the file MM1.igx on the CD from the book.) 990

991 The book walks you through the steps of creating the model.
During the simulation an on-screen scoreboard tracks the following quantities: Quantity Processed - total number of units to leave the system; Cycle Time - average time a unit spends in the system; Value Added Time - time a unit spends in service; Cost Per Unit - if costs are associated with the resources, the cost incurred per unit serviced is computed. After completing the simulation you are asked if you want to view the output. Sample output can be seen in the book. 991
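As a rough cross-check on the Process Model output (this is our own sketch, not the package's code), the same M/M/1 system with λ = 10 and µ = 15 can be simulated in a few lines of Python using Lindley's recursion for the wait in queue:

```python
import random

random.seed(2024)

# M/M/1, lambda = 10/hr, mu = 15/hr, FCFS. Lindley's recursion gives the
# queue wait of the next customer: Wq' = max(0, Wq + S - A), where S is
# this customer's service time and A the next interarrival time.
lam, mu = 10.0, 15.0
n_customers = 200000

wq, total_system_time = 0.0, 0.0
for _ in range(n_customers):
    service = random.expovariate(mu)
    total_system_time += wq + service        # cycle time for this customer
    interarrival = random.expovariate(lam)
    wq = max(0.0, wq + service - interarrival)

avg_cycle_time = total_system_time / n_customers
# Queuing theory gives W = 1/(mu - lambda) = 0.2 hours for these rates,
# so the simulated Cycle Time should land near that value.
```

Comparing a simulated statistic with a known analytical result like this is a useful validation step for any simulation tool.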

992 22.2 Simulating a M/M/2 System
Let us suppose that we had two telephone operators who can handle calls. To modify the previous example we simply need to change the number of operators to 2 and ensure that up to 2 operators can be working on calls at the same time. See the file MM2.igx. The book contains the output of running this file for 1000 hours. 992

993 22.3 A Series System This section uses Process Model to simulate a series queuing system. The auto assembly line example used previously will be used. 993

994 Example The last two things that are done to a car before its manufacture is complete are installing the engine and putting on the tires. An average of 54 cars per hour arrive requiring these two tasks. One worker is available to install the engine and can service an average of 60 cars per hour. After the engine is installed, the car goes to the tire station and waits for its tires to be attached. Three workers serve at the tire station. 994

995 Assume interarrival times and service times are exponential.
Each works on one car at a time and can put tires on a car in an average of 3 minutes. Assume interarrival times and service times are exponential. Simulate this system for 400 hours. 995

996 Solution See the file Carassembly.igx.
The key to creating a queuing network with Process Model is to build the diagram one service center at a time. Begin by creating arrivals in Example 1 and then create the engine production center and the tire center. The tire center will need to have the number of servers changed. The system is flexible and can be modeled to accommodate changes in Move Time and other factors. 996

997 The Effect of a Finite Buffer
The output from this model can be found in the book. Suppose we only have enough space for two cars to wait for tire installation. This is called a buffer of size 2. To model this change, the Input Capacity in the Tire Activity dialog box needs to be changed to 2. Running the simulation shows the average wait is 6 hours. Clearly we need more storage space. 997

998 22.4 Simulating Open Queuing Networks - Example
An open queuing network consists of two servers: Server 1 and Server 2. An average of 8 customers per hour arrive from outside at Server 1. An average of 17 customers per hour arrive from outside at Server 2. Interarrival times are exponential. Server 1 can serve at an exponential rate of 20 customers per hour and Server 2 can serve at an exponential rate of 30 customers per hour. 998

999 Simulate this system for 400 hours.
After completing service at Server 1, half the customers leave the system and half go to Server 2. After completing service at Server 2, 75% of the customers complete service and 25% return to Server 1. Simulate this system for 400 hours. 999

1000 Solution See the file Open.igx.
To begin we create two arrival entities: one arrival entity representing external arrivals to Server 1 and one arrival entity representing external arrivals to Server 2. After creating the servers, we use the connector tool to create a link from Server 1 to Server 2, a link from Server 2 to Server 1, and links from Servers 1 and 2 to the system exit. 1000

1001 The project diagram can be seen in the book.
We input the arrival and service rates and move times, along with the routing information, into the software. The project diagram can be seen in the book. Note that as the simulation runs, some customers move between the servers and some exit the system! Seeing this movement really makes the concept of an open queuing network come alive! 1001

1002 22.5 Simulating Erlang Service Times
Service times often do not follow the exponential distribution. Usually the Erlang distribution is used to model nonexponential service times. An Erlang distribution can be defined by a mean and a shape parameter k. The shape parameter must be an integer. It can be shown that the standard deviation of the Erlang equals its mean divided by √k. 1002

1003 Therefore, if we know the mean and the standard deviation of the service times, we may determine an appropriate value of k. The syntax for generating Erlang service times in Process Model is ER(Mean, k). 1003
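The shape-parameter calculation, and a way to sample Erlang service times without Process Model, can be sketched in Python. The mean of 4 and standard deviation of 2 below are illustrative numbers; the construction of an Erlang variate as a sum of k exponentials is standard.

```python
import random

random.seed(5)

def shape_from_moments(mean, sd):
    """Since sd = mean / sqrt(k), solve for k = (mean/sd)**2 and round it
    to an integer (the Erlang shape parameter must be an integer)."""
    return max(1, round((mean / sd) ** 2))

def erlang_variate(mean, k):
    """Sample Erlang(mean, shape k) as the sum of k independent
    exponentials, each with mean mean/k."""
    rate = k / mean
    return sum(random.expovariate(rate) for _ in range(k))

k = shape_from_moments(mean=4.0, sd=2.0)          # (4/2)**2 = 4, so k = 4
xs = [erlang_variate(4.0, k) for _ in range(50000)]
sample_mean = sum(xs) / len(xs)                   # should land near 4.0
```

This mirrors the ER(Mean, k) idea: the mean and shape together pin down the whole distribution.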

1004 22.6 What Else Can Process Model Do?
A brief description of other modeling features included in Process Model follows: Bulk Arrivals and Services Often at a restaurant people arrive in groups. This arrival pattern is called bulk arrivals. Reneging Perhaps people hang up when calling an 800 number if they are put on hold for more than 5 minutes. Process Model can model such balking or reneging behavior. Variation in Arrival Pattern At a restaurant or bank the arrival rate varies substantially over the course of the day. Variable arrival rate patterns can easily be modeled with Process Model. 1004

1005 Variation of Number of Servers
During the day workers take breaks and go to lunch. Also many companies vary the number of servers during the day. Process Model can easily model variation in server capacity. Priorities In an emergency room more seriously ill patients are given priority over earlier-arriving less ill patients. Process Model can handle complex priority mechanisms. 1005

1006 Operations Research: Applications and Algorithms
Chapter 23 Simulation with the Excel Add-in @RISK to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 1006

1007 Description Many simulations, particularly those involving financial applications, can easily be performed with the Excel add-in @RISK. @RISK provides a complete statistical and graphical summary of the results. In this chapter we will see how @RISK can be used to simulate a wide variety of situations ranging from the NPV of a new project to the probability of winning at craps. 1007

1008 23.1 Introduction to @RISK: The Newspaper Problem
@RISK is used to model situations where we make decisions under uncertainty. We need to determine how many Year 2000 nature calendars to order in August. It costs $2.00 to order each calendar and we sell each calendar for $4.50. After January 1, 2000 leftover calendars are returned for $0.75. Our best guess is that the number of calendars demanded is governed by the following probabilities. 1008

1009 How many calendars should we order?
Demand  Probability
100     .30
150     .20
200     .30
250     .15
300     .05
How many calendars should we order? Our final result is in file newsdiscrete.xls. 1009

1010 We proceed as follows: Step 1: Enter parameter values in C3:C5.
Step 2: It can be shown that ordering an amount equal to one of the possible demands for calendars always maximizes expected profit. Step 3: In cell C2 use @RISK to generate demand according to the above probabilities. In particular, our simulated means, variances, and other statistics will be much more accurate than if we used the Monte Carlo simulation option. Step 4: In cell B7 compute full-price revenue with the formula =C3*MIN(C1,C2). 1010

1011 Step 6: In B9 computer ordering costs with formula =C1*C5.
Step 5: In B8 compute salvage revenue with the formula =C4*IF(C1>C2,(C1-C2),0). Step 6: In B9 compute ordering costs with the formula =C1*C5. Step 7: In cell B10 compute profit with the formula =B7+B8-B9. Step 8: In cell C1 enter the possible order quantities with the formula =RISKSIMTABLE({100,150,200,250,300}). If we obtain the arguments of the function by pointing to a different cell range then we omit the { and } brackets. 1011

1012 Step 9: Select B10 as an output cell by selecting the single arrow icon.
Step 10: Select the Simulation Settings icon and from the Iterations tab select 1000 iterations and 5 simulations. From the Sampling tab choose LATIN HYPERCUBE from the SAMPLING option. This will cause @RISK to recalculate demand and profit 1000 times for each of the five order quantities. Step 11: To run the simulation select the start simulation icon. The summary results are shown in the book. To obtain detailed statistics select INSERT DETAILED STATISTICS. 1012
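The spreadsheet logic in Steps 4–8 can be sketched outside Excel as well. The following is a minimal Python analogue of the calendar model, assuming the demand probability for 200 calendars is .30 (so the distribution sums to 1); it does not reproduce @RISK's Latin Hypercube sampling, only plain Monte Carlo:

```python
import random

price, salvage, cost = 4.50, 0.75, 2.00
demands = [100, 150, 200, 250, 300]
probs   = [0.30, 0.20, 0.30, 0.15, 0.05]

def profit(order, demand):
    full_price = price * min(order, demand)      # cell B7
    salv = salvage * max(order - demand, 0)      # cell B8
    return full_price + salv - cost * order      # B7 + B8 - B9 (cell B10)

def expected_profit(order, iters=10_000, seed=0):
    """Monte Carlo estimate of expected profit for one order quantity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(iters):
        d = rng.choices(demands, probs)[0]       # discrete demand draw
        total += profit(order, d)
    return total / iters

# Mimic RISKSIMTABLE: evaluate each candidate order quantity.
best = max(demands, key=expected_profit)
```

With these parameters, ordering 200 calendars maximizes estimated expected profit.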

1013 Interpretation of Statistical Output
From Figures 2 and 3 in the book we find the average profit over the 1000 trials for each order quantity. The RISKSIMTABLE function uses the same set of random numbers to generate demand for each simulation. You can return to the results at any time by selecting the RESULTS icon. You can return to your worksheet from RESULTS by selecting WINDOW SHOW EXCEL WINDOW. 1013

1014 Finding Confidence Interval for Expected Profit
If we ran another 1000 iterations, @RISK would generate a different set of profits, so we would get a different estimate of average profit. Thus we do not learn the average profit exactly from any single simulation. How accurate is our estimate of average profit? From Section 9.9 we know we can be 95% sure that the average or expected profit for a given order quantity is between (Mean Profit) ± t(.025, n−1)×(Mean Standard Error), where n is the number of iterations. 1014
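The confidence-interval computation above can be sketched directly; with 1000 iterations the t critical value is essentially the normal value 1.96, which is used here as a default:

```python
import math
import statistics

def mean_ci(profits, t_crit=1.96):
    """95% CI for expected profit: mean +/- t * (sample std / sqrt(n)).
    The sample std / sqrt(n) term is the Mean Standard Error."""
    n = len(profits)
    m = statistics.mean(profits)
    se = statistics.stdev(profits) / math.sqrt(n)
    return m - t_crit * se, m + t_crit * se
```

Applied to the 1000 simulated profits for one order quantity, this reproduces the interval (Mean Profit) ± t×(Mean Standard Error).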

1015 Modeling Normal Demand with the =RISKNORMAL FUNCTION
The assumption of discrete demand is unrealistic. To model normal demand simply change cell C2’s formula to =RISKNORMAL(200,30). @RISK performs the generation of a normal random variable by using the Inverse Transformation Method. With normal demand any order quantity is reasonable, because demand may assume any value. 1015

1016 Finding Targets and Percentiles
At the bottom of the Detailed Statistics output we may enter targets as values or percentiles. Entering a value tells you the fraction of iterations for which the output cell was less than or equal to the target. 1016

1017 Creating Graphs with @RISK
To create a histogram of possible profits go to the RESULTS menu and right-click on our output cell (Profit) in the Explorer-style list. Then choose histogram and the third simulation and you will obtain a histogram. By moving the sliders at the bottom of the graph you may zero in on the probability of any range of values. By right-clicking on a histogram or cumulative ascending graph and selecting FORMAT we can obtain a cumulative descending graph. 1017

1018 In a cumulative descending graph the y-coordinate is the probability that profit exceeds the x-coordinate.
1018

1019 Using the Report Settings Option
You may also directly create your graphs and statistical reports with the Report Settings Option. By choosing the Report Settings Icon you can choose any output to be sent directly to the current workbook or a new workbook. 1019

1020 Using the @RISK Statistics
Instead of generating huge reports you may just want the mean and standard deviation of your output cells placed in your spreadsheet. @RISK 4.0 contains statistical functions that accomplish this goal. 1020

1021 23.2 Modeling Cash Flows from a New Product
Managers often want to do a best-case, worst-case, and most likely analysis before making a decision. They often fail to realize that any value between the best and worst case may occur. The triangular random variable makes it easy to model this. We will model Year 1 market share with a triangular random variable. @RISK will generate Year 1 market share by making the likelihood of a given market share proportional to the height of the “triangle.” 1021

1022 The Lilly Model To model this form of the product life cycle we must model the following sources of uncertainty: Number of years for which unit sales increase. Average annual percentage increase in sales during the sales-increase portion of the sales period. Average annual percentage decrease in sales during the sales-decrease portion of the sales period. 1022

1023 Part (c) We now use a tornado graph to determine the key drivers of NPV.
To obtain a tornado graph you must have selected the Collect All Outputs box from the Simulation Settings Sampling dialog box. Each bar of the Correlation Tornado Graph gives the correlation of an @RISK random variable with NPV. 1023

In short, the uncertainty about Year 1 Unit Sales is very important for determining NPV, but the other random variables could probably be replaced by their means without changing the distribution of NPV by much. For each random variable the Regression tornado graph computes the “standardized regression coefficient” for the @RISK random variable when we try to predict NPV from the random variables in the spreadsheet. 1024

A standardized regression coefficient tells us the number of standard deviations by which NPV changes when the random variable changes by 1 standard deviation. For example: A one-standard-deviation change in Year 1 Unit Sales changes NPV by .98 standard deviations. A one-standard-deviation change in the annual growth rate will increase NPV by .15 standard deviations. 1025

1026 23.3 Project Scheduling Models
We used linear programming to determine the length of time needed to complete a project. We learned how to identify critical activities, where an activity is critical if increasing its activity time by a small amount increases the length of time needed to complete the project by the same amount. We have assumed that all activity times are known with certainty. In reality, these times are usually uncertain. Of course, this implies that the length of time needed to complete the project is also uncertain. 1026

1027 A complete example (repeated from Chapter 5) is included.
It also implies that for each activity, there is a probability that the activity is critical. A complete example (repeated from Chapter 5) is included. The first step in the solution is to choose distributions for the uncertain activity times. We illustrate a distribution that has become popular in project scheduling, called the Pert distribution. 1027

1028 23.4 Reliability and Warranty Modeling
In today’s high-tech world, it is very important to be able to compute the probability that a system made up of machines will work for a desired amount of time. The subject of estimating the distribution of machine failure times and the distribution of the time to failure of a system is known as reliability theory. 1028

1029 Distribution of Machine Life
We assume the length of time until failure of a machine is a continuous random variable having a distribution function F(t) = P(X≤t) and a density function f(t). Thus for small Δt the probability that a machine will fail between time t and t + Δt is approximately f(t)Δt. The failure rate of a machine at time t (call it r(t)) is defined to be (1/Δt) times the probability that the machine will fail between time t and t + Δt, given that the machine has not failed by time t. 1029

Thus r(t) = f(t)/(1 − F(t)). If r(t) is an increasing function of t the machine is said to have an Increasing Failure Rate (IFR). If r(t) is a decreasing function of t the machine is said to have a Decreasing Failure Rate (DFR). Thus a machine whose lifetime follows an exponential random variable has a constant failure rate. 1030

The random variable that is most frequently used to model the time till failure of a machine is the Weibull random variable. In one common parameterization (with scale α and shape β), the Weibull random variable has density function f(t) = (β/α)(t/α)^(β−1)e^(−(t/α)^β) and distribution function F(t) = 1 − e^(−(t/α)^β) for t ≥ 0. 1031

1032 Common Types of Machine Combinations
Three common types of combinations of machines are as follows: A series system functions only as long as every machine functions. A parallel system functions as long as at least one machine functions. A k-out-of-n system consists of n machines and is considered working as long as at least k machines are working. Of course, by combining these types of combinations a very complex system may be modeled. 1032
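The three system types can be checked by Monte Carlo simulation. A minimal sketch, assuming machines fail independently and each is up with the same probability at the time of interest (a simplification of the time-to-failure setting above):

```python
import random

def works(comp_up, kind, k=None):
    """comp_up: list of booleans, one per machine."""
    if kind == "series":
        return all(comp_up)          # every machine must work
    if kind == "parallel":
        return any(comp_up)          # at least one machine must work
    if kind == "k_of_n":
        return sum(comp_up) >= k     # at least k of n must work
    raise ValueError(kind)

def system_reliability(p_each, n, kind, k=None, iters=100_000, seed=0):
    """Monte Carlo estimate of P(system works) with i.i.d. machines."""
    rng = random.Random(seed)
    hits = sum(
        works([rng.random() < p_each for _ in range(n)], kind, k)
        for _ in range(iters)
    )
    return hits / iters
```

For a two-machine series system with p = .9 per machine, the estimate should be near .9² = .81; for the parallel version, near 1 − .1² = .99.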

1033 Estimating Warranty Expenses
If we know the distribution of the time till failure of a purchased product, it is a simple matter to estimate the distribution of the warranty costs associated with the product. 1033

1034 23.5 The RISKGENERAL Function
What if a continuous random variable such as market share does not appear to follow a normal or a triangular distribution? We can model it with the =RISKGENERAL function. A sample RISKGENERAL analysis can be found in the riskgeneral.xls file. 1034

1035 23.6 The RISKCUMULATIVE Random Variable
Recall that with the RISKGENERAL function we estimated the relative likelihood of a random variable taking on various values. With the RISKCUMULATIVE function we estimate the cumulative probability that the random variable is less than or equal to several given values. With the RISKCUMULATIVE function we may approximate the cumulative distribution function for any continuous random variable. 1035

1036 Example A large auto company believes net income for North American Operations (NAO) for the next year may be between 0 and 10 billion dollars. The auto company estimates there is a 10% chance that net income will be less than or equal to 1 billion dollars, a 70% chance that net income will be less than or equal to 5 billion dollars, and a 90% chance that net income will be less than or equal to 9 billion dollars. Use @RISK to simulate NAO’s net income for the next year. 1036

1037 Solution Our work is in the file Cumulative.xls.
The RISKCUMULATIVE function takes as inputs the following quantities: The smallest value assumed by the random variable. The largest value assumed by the random variable. Intermediate values assumed by the random variable. For each intermediate value of the random variable, the cumulative probability that the random variable is less than or equal to the intermediate value. 1037

1038 @RISK will now ensure that
For net income x between 0 and 1 billion the cumulative probability that net income is less than or equal to x rises with a slope equal to .10. For net income x between 1 and 5 billion the cumulative probability that net income is less than or equal to x rises with a slope equal to .15. For net income x between 5 and 9 billion the cumulative probability that net income is less than or equal to x rises with a slope equal to .05. For net income x greater than 9 billion the cumulative probability that net income is less than or equal to x rises with a slope equal to .10. 1038
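A piecewise-linear CDF like this can be sampled by inverse transformation: draw a uniform u and interpolate between the two CDF points that bracket u. A sketch of the idea (this is an illustration of the mechanism, not @RISK's internal code):

```python
import bisect
import random

def cumulative_sample(xs, ps, rng=random):
    """Inverse-transform sampling from a piecewise-linear CDF.
    xs: values including the min and max; ps: cumulative probs from 0.0 to 1.0."""
    u = rng.random()
    i = max(bisect.bisect_left(ps, u), 1)      # index of the upper bracket point
    x0, x1, p0, p1 = xs[i - 1], xs[i], ps[i - 1], ps[i]
    # linear interpolation between the bracketing CDF points
    return x0 + (u - p0) * (x1 - x0) / (p1 - p0)

# NAO net income (billions): P(<=1)=.1, P(<=5)=.7, P(<=9)=.9
xs = [0, 1, 5, 9, 10]
ps = [0.0, 0.1, 0.7, 0.9, 1.0]
```

For example, a uniform draw of u = .8 lands in the 5-to-9-billion segment and maps to a net income of 7 billion.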

1039 23.7 The RISKTRIGEN Random Variable
When we use the RISKTRIANG function we are assuming we know the absolute worst and absolute best case that can occur. Many companies prefer to use a triangular random variable in which the worst case and best case are defined by percentiles of the random variable. 1039

1040 23.8 Creating a Distribution Based on a Point Forecast
Each day we are constantly inundated by forecasts. While the forecasts may be the best available, they are almost sure to be incorrect. In short, any single-valued forecast implies a distribution for the quantity being forecasted. How can we take a point forecast and find a random variable which correctly models the uncertainty inherent in the point forecast? 1040

The key to putting a distribution around a point forecast is to have some historical data about the accuracy of past forecasts of the quantity of interest. We begin by seeing if past forecasts exhibit any bias. For each past forecast we determine the ratio Actual value/Forecasted value. Once we have eliminated forecast bias we look at the standard deviation of the percentage errors of the unbiased forecasts and use the following random variable to model the quantity being forecasted. 1041

RISKNORMAL((Unbiased Forecast), (Percentage Standard Deviation of Unbiased Forecasts)×(Unbiased Forecast)). 1042
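The debiasing step and the RISKNORMAL parameters can be sketched as follows; the helper name and the use of the mean Actual/Forecast ratio as the bias factor are illustrative assumptions consistent with the description above:

```python
import statistics

def forecast_distribution(past_forecasts, past_actuals, point_forecast):
    """Debias a point forecast and estimate the spread of the errors.
    Returns (unbiased forecast, std dev in original units)."""
    ratios = [a / f for a, f in zip(past_actuals, past_forecasts)]
    bias = statistics.mean(ratios)             # >1 means forecasts run low
    unbiased = point_forecast * bias
    # percentage errors of the debiased past forecasts
    pct_err = [a / (f * bias) - 1 for a, f in zip(past_actuals, past_forecasts)]
    sigma_pct = statistics.stdev(pct_err)
    return unbiased, sigma_pct * unbiased
```

The two returned values are exactly the mean and standard deviation that would be passed to RISKNORMAL.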

1043 23.9 Forecasting Income of a Major Corporation
In many large corporations different parts of the company make forecasts for quarterly net income, and an analyst in the CEO’s office pulls together the individual forecasts and creates a forecast for the entire company’s net income. In this section we show an easy way to pool forecasts from different portions of a company and create a probabilistic forecast for the entire company. 1043

1044 Recall that the correlation between two random variables must lie between -1 and +1 with
Correlation near +1 implies a strong positive linear relationship. Correlation near -1 implies a strong negative linear relationship. Correlation near +.5 implies a moderate positive linear relationship. Correlation near -.5 implies a moderate negative linear relationship. Correlation near 0 implies a weak linear relationship. 1044

1045 23.10 Using Data to Obtain Inputs for New Product Simulations
Many companies use subjective estimates to obtain inputs to a new product simulation. In many situations, however, past data may be used to obtain estimates of key variables such as share, price, volume and cost uncertainty. The utility of each of our models will depend on the type of data you have available. 1045

1046 The Scenario Approach to Modeling Volume Uncertainty
When trying to model volume of sales for a new product in the auto and drug industries it is common to look for similar products that were sold in the past. For past products we often have knowledge of the following: Accuracy of forecasts for Year 1 sales volume Data on how sales change after the first year 1046

1047 Modeling Statistical Relationships with One Independent Variable
Suppose we want to model the dependence of a variable (Y) on a single independent variable (X). We proceed as follows: Step 1: We try to find the straight line, power curve, and exponential curve that best fit the data. A straight line is a curve of the form Y = a + bX. A power function is of the form Y = aX^b. An exponential function is of the form Y = ae^(bX). Step 2: For each curve and each data point compute the percentage error. 1047

1048 Step 4: Choose the curve that yields the lowest MAPE as the best fit.
Step 3: For each curve compute the Mean Absolute Percentage Error (MAPE) by averaging the absolute percentage errors. Step 4: Choose the curve that yields the lowest MAPE as the best fit. Step 5: Assuming that at least one of the three curves appears to have some predictive value, model the uncertainty associated with the relationship between X and Y as follows: If the straight line is the best fit then model Y as =RISKNORMAL(Prediction, Standard Deviation of Actual Errors). If the power or exponential curve is the best fit then model Y as =RISKNORMAL(Prediction, Standard Deviation of Percentage Errors). 1048

1049 23.11 Simulation and Bidding
In situations in which you must bid against competitors, simulation can often be used to determine an appropriate bid. In this chapter we show how to use simulation to determine a bid that maximizes your expected profit. 1049

1050 Uniform Random Variables
A random variable is said to be uniformly distributed on the closed interval [a, b] if the random variable is equally likely to assume any value between a and b inclusive. To generate samples from a U(a,b) random variable enter the formula =RISKUNIFORM(a,b) into a cell. 1050

1051 23.12 Playing Craps with @RISK
Craps is a very complex game, yet it is easy to estimate the probability of winning at craps by simulation. In the game of craps a player tosses two dice. If on the first toss a 2, 3, or 12 results the player loses. If on the first toss the player rolls a 7 or 11 the player wins. Otherwise the player continues tossing the dice until she either matches the number thrown on the first roll or tosses a 7. 1051

1052 If you roll your point before rolling a 7 you win.
If you roll a 7 before you roll your point you lose. By complex calculations it can be shown that the player wins at craps 49.3% of the time. We use @RISK to verify this. 1052

1053 Solution The key observation is to note that we do not know how many rolls the game will take. After each dice roll we keep track of the game status: 0 = Game lost. 1 = Game won. 2 = Game still going. Our output cell will keep track of the status of the game after the 50th toss. A “1” will indicate a win and a “0” will indicate a loss. Our work is in the file Craps.xls. 1053
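The same game logic can be sketched directly. Unlike the spreadsheet version, plain code does not need a fixed 50-toss horizon, since a loop can run until the game resolves:

```python
import random

def play_craps(rng):
    """Return 1 for a win, 0 for a loss."""
    roll = rng.randint(1, 6) + rng.randint(1, 6)
    if roll in (7, 11):
        return 1                      # natural: win on the first toss
    if roll in (2, 3, 12):
        return 0                      # craps: lose on the first toss
    point = roll
    while True:                       # keep rolling until point or 7
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        if roll == point:
            return 1
        if roll == 7:
            return 0

def win_probability(games=100_000, seed=0):
    rng = random.Random(seed)
    return sum(play_craps(rng) for _ in range(games)) / games
```

With enough games the estimate settles near the exact value 244/495 ≈ .493 quoted above.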

1054 23.13 Simulating the NBA Finals
Our Indiana Pacers came within two plays of winning the 2000 NBA championship. Before the series, what was the probability that the Lakers would win the series? From the Sagarin ratings we found that the Lakers were around 4 points better than the Pacers. The home team has a 3-point edge, and games play out according to a normal distribution with mean equal to our prediction and a standard deviation of 12 points. 1054

1055 Past history shows the Sagarin forecasts exhibit no bias.
In file Finals.xls we simulate the NBA Finals. 1055
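A minimal sketch of the series simulation, assuming the 2-3-2 finals format with games 1, 2, 6, and 7 on the Lakers' floor (an assumption about the schedule, not stated above):

```python
import random

def simulate_series(n_sims=100_000, seed=0):
    """Best-of-7: Lakers 4 points better, 3-point home edge,
    game margins ~ Normal(mean, 12)."""
    rng = random.Random(seed)
    lakers_home = [True, True, False, False, False, True, True]
    series_wins = 0
    for _ in range(n_sims):
        lak = pac = 0
        for g in range(7):
            mean = 4 + (3 if lakers_home[g] else -3)   # Lakers' expected margin
            if rng.gauss(mean, 12) > 0:
                lak += 1
            else:
                pac += 1
            if lak == 4 or pac == 4:                   # series decided
                break
        if lak == 4:
            series_wins += 1
    return series_wins / n_sims
```

Under these assumptions the Lakers win roughly three quarters to four fifths of the simulated series.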

1056 @RISK Crib Sheet Once you are familiar with the function of the icons, @RISK is easy to learn. Icons allow you to open and run simulations. The Simulation Settings icon allows you to control the settings for the simulation. The Simulation Settings dialog box features four tabs: Iterations Tab #Iterations is how many times you want @RISK to recalculate the spreadsheet. #Simulations – Leave this at 1 unless you have a =RISKSIMTABLE function in your spreadsheet. 1056

1057 Pause on Errors box causes @RISK to pause if an error occurs in any cell during the simulation.
Checking the Update Display box causes @RISK to show the results of each iteration on the screen. Sampling Tab Sampling Type allows you to select Latin Hypercube sampling or Monte Carlo sampling. Latin Hypercube is slower but more accurate. Standard Recalc – If you choose Expected Value, each @RISK function returns the expected value of the random variable unless the random variable is discrete. Collecting Distribution Samples – Check All if you want to get tornado graphs, a scenario analysis, or extract data. Also check this box if you want statistics on cells generated by @RISK functions. 1057

1058 Random Number Generator Seed – When the seed is set to 0, each time you run a simulation you will obtain different results. Other possible seed values are integers between 1 and 32,767. Whenever a nonzero seed is chosen the same values for the input cells and output cells will occur. Macro Tab This tab allows @RISK to run a macro before or after each iteration of a simulation. Choosing After Each Iteration's Recalc and entering Macro1 would result in the following sequence of events: sample @RISK functions and calculate output cells; run Macro1; sample @RISK functions and calculate output cells. The Select Output Cells icon enables you to select an output cell for which @RISK will create statistics. 1058

1059 List Input and Output Cells icons list all input and output cells.
Run Simulation icon starts the simulation. Show Results icon allows you to see results. There are two windows: the Summary Results window containing the minimum, mean, and maximum for all input and output cells, and the Simulation Statistics window containing more detailed statistics. Define Distribution icon allows you to see the mass function or density function for any random variable. 1059

1060 Graphing - To obtain a graph, right-click on the cell in the Explorer interface, then choose the type of graph desired. A histogram gives the fraction of iterations assuming different values. For a cumulative ascending graph the y-axis gives the fraction of iterations yielding a value <= the value on the x-axis. For a cumulative descending graph the y-axis gives the fraction of iterations yielding a value >= the value on the x-axis. 1060

1061 At the bottom of simulation statistics windows is a TARGET option
At the bottom of the simulation statistics window is a TARGET option. You may enter a VALUE or a PERCENTILE, and @RISK fills in the one you left out. The Extracting Data icon allows you to see the values of @RISK functions and output cells that @RISK created on the iterations of the run. The Sensitivity icon provides a Tornado graph: right-click on an output cell and select Tornado graph. 1061

1062 @RISK Functions Riskdiscrete Function Riskduniform Function
This generates a discrete random variable that takes on a finite number of values with known probabilities. Riskduniform Function Use Riskduniform when a random variable assumes several equally likely values. Binomial Distribution Use =RISKBINOMIAL when you have repeated independent trials each having the same probability of success. 1062

1063 RISKTRIANG RISKTRIGEN Function RISKUNIFORM Function
Enables us to model a nonsymmetrical continuous random variable. Generalizes the well-known idea of best case, worst case, and most likely scenario. RISKTRIGEN Function Sometimes we want to use a triangular random variable. But we are not sure of the absolute best and worst possibility. RISKUNIFORM Function Suppose a competitor’s bid is equally likely to be anywhere between 10 and 30 thousand dollars. This can be modeled by a uniform random variable with the formula =RISKUNIFORM(10,30) 1063

1064 RISKGENERAL Function What if a continuous random variable does not appear to follow a normal or triangular distribution? We can model it with the =RISKGENERAL function. The syntax of RISKGENERAL is as follows: Begin with the smallest and largest possible values. Then enclose in {} the numbers for which you feel you can compare relative likelihoods. Finally enclose in {} the relative likelihoods of the numbers you have previously listed. 1064

1065 Modeling Correlations
Suppose we have three normal random variables, each having mean 0 and standard deviation 1, which are correlated as follows: Variable 1 and Variable 2 have .7 correlation; Variable 1 and Variable 3 have .8 correlation; Variable 2 and Variable 3 have .75 correlation. 1065

1066 Truncating Random Variables
If the random variable assumes a value between 0 (the lower truncation value) and 1 (the upper truncation value) that value is retained. Otherwise, another value is generated. The truncation values must be within 5 standard deviations of the mean. RISKPERT Function The =RISKPERT function is similar to the =RISKTRIANG function. The RISKPERT function is used to model duration of projects. 1066

1067 While the =RISKTRIANG function has a piecewise linear density function, the =RISKPERT density has no linear segments. It is a special case of a Beta random variable. Common Error Message An “invalid number of arguments” error message means an incorrect syntax was used with an @RISK function. 1067

1068 Operations Research: Applications and Algorithms
Chapter 24 Forecasting Models to accompany Operations Research: Applications and Algorithms 4th edition by Wayne L. Winston Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. 1068

1069 Description We discuss two important types of forecasting methods: extrapolation methods and causal forecasting methods. Extrapolation methods (unlike the causal forecasting methods) don’t take into account what “caused” past data; they simply assume that past trends and patterns will continue in the future. Causal forecasting methods attempt to forecast future values of a variable (called the dependent variable) by using past data to estimate the relationship between the dependent variable and one or more independent variables. 1069

1070 24.1 Moving-Average Forecasting Methods
Let x1, x2,…,xt,… be observed values of a time series, where xt is the value of the time series observed during period t. One of the most commonly used forecasting methods is the moving-average method. We define ft,1 to be the forecast for period t+1 made after observing xt. For the moving-average method, ft,1 = average of the last N observations = average of xt, xt-1, xt-2,…,xt-N+1, where N is a given parameter. 1070

1071 Choice of N We will use the mean absolute deviation (MAD) as our measure of forecast accuracy. Before defining the MAD, we need to define the concept of a forecast error. Given a forecast for xt, we define et, the error in our forecast for xt, by et = xt − (forecast for xt). The MAD is simply the average of the absolute values of all the et’s. We begin with an explanation of the Excel OFFSET function. 1071
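Before turning to the Excel implementation, the moving-average forecast and its MAD can be sketched directly (0-indexed lists instead of spreadsheet cells):

```python
def moving_average_forecasts(xs, N):
    """One-step-ahead forecasts f_{t,1} = average of the last N observations.
    The first forecast is available after period N."""
    return [sum(xs[t - N:t]) / N for t in range(N, len(xs))]

def mad(xs, N):
    """Mean absolute deviation of the one-step-ahead forecast errors."""
    fs = moving_average_forecasts(xs, N)
    errors = [abs(x - f) for x, f in zip(xs[N:], fs)]
    return sum(errors) / len(errors)
```

Evaluating mad(xs, N) for several values of N is one way to choose N: pick the value yielding the smallest MAD.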

1072 This function lets you pick out a cell range relative to a given location in the spreadsheet.
The syntax of the OFFSET function is as follows: OFFSET (reference, rows, columns, height, width) Reference is the cell from which you base the row and column references. Rows helps locate the upper left-hand corner of the OFFSET range. Rows is measured by number of rows up or down from the cell reference. Columns helps locate the upper left-hand corner of the OFFSET range. Columns is measured by the number of columns left or right from the cell references. 1072

1073 Height is the number of rows in the selected range.
Width is the number of columns in the selected range. The file Offsetexample.xls contains some examples of how the OFFSET function works. The nice thing about the OFFSET function is that it can be copied like any formula. Moving-average forecasts perform well for a time series that fluctuates about a constant base level. Air conditioner sales are a good example of a series that does not: they exhibit seasonality, with the peaks and valleys of the series repeating at regular 12-month intervals. 1073

1074 24.2 Simple Exponential Smoothing
If a time series fluctuates about a base level, simple exponential smoothing may be used to obtain good forecasts for future values of the series. To describe simple exponential smoothing let At = smoothed average of a time series after observing xt. After observing xt, At is the forecast for the value of the time series during any future period. The key equation in simple exponential smoothing is At = αxt + (1- α)At-1. 1074

1075 In the above equation, α is a smoothing constant that satisfies 0< α<1.
As with moving-average forecasts, we let ft,k be the forecast for xt+k made at the end of period t. Then ft,k = At. Assuming that we are trying to forecast one period ahead, our error for predicting xt is given by et = xt − ft-1,1 = xt − At-1. Thus, our new forecast At = ft,1 is equal to our old forecast (At-1) plus a fraction α of our period t error (et). 1075

1076 This implies that if we “overpredict” xt, we lower our forecast, and if we “underpredict” xt, we raise our forecast. For larger values of the smoothing constant α, more weight is given to the most recent observation. 1076
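The update equation At = αxt + (1 − α)At-1 can be sketched in a few lines:

```python
def simple_exponential_smoothing(xs, alpha, a0):
    """A_t = alpha*x_t + (1-alpha)*A_{t-1}.
    After observing x_t, A_t is the forecast for every future period."""
    a = a0
    history = []
    for x in xs:
        a = alpha * x + (1 - alpha) * a
        history.append(a)
    return history
```

A larger alpha makes the smoothed average track the latest observations more closely, which is the weighting behavior described above.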

1077 24.3 Holt’s Method: Exponential Smoothing with Trend
If we believe that a time series exhibits a linear trend, Holt’s method often yields good forecasts. At the end of the tth period, Holt’s method yields an estimate of the base level (Lt) and the per-period trend (Tt) of the series. To compute Lt, we take a weighted average of the following two quantities: xt, which is an estimate of the period t base level from the current period; Lt-1 + Tt-1, which is an estimate of the period t base level based on previous data. 1077

1078 To computer Tt, we take a weighted average of the following two quantities:
An estimate of the trend from the current period, given by the increase in the smoothed base level from period t-1 to period t; Tt-1, which is the previous estimate of the trend. As before, we define ft,k to be the forecast for xt+k made at the end of period t. Then ft,k = Lt + kTt. To initialize Holt’s method, we need an initial estimate (call it L0) of the base and an initial estimate (call it T0) of the trend. 1078
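The two weighted averages above can be sketched as code; here alpha and beta are the smoothing constants for the level and trend respectively:

```python
def holt(xs, alpha, beta, L0, T0):
    """Holt's method: update the level L and trend T each period."""
    L, T = L0, T0
    for x in xs:
        L_new = alpha * x + (1 - alpha) * (L + T)   # weighted avg of base estimates
        T = beta * (L_new - L) + (1 - beta) * T     # weighted avg of trend estimates
        L = L_new
    return L, T

def forecast(L, T, k):
    """f_{t,k} = L_t + k*T_t."""
    return L + k * T
```

After processing the series, forecast(L, T, k) gives the k-periods-ahead forecast made at the end of the last observed period.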

1079 A multiplicative version of Holt’s method can be used to generate good forecasts for a series of the form xt = ab^t εt. Here, the value of b represents the percentage growth in the base level of the series during each period. 1079

1080 A Spreadsheet Interpretation of the Holt Method
The file Holt.xls contains an implementation of the Holt method. We can use an Excel one-way data table to determine values of α and β that yields a small MAD. 1080

1081 24.4 Winter’s Method: Exponential Smoothing with Seasonality
The appropriately named Winter’s method is used to forecast time series for which trend and seasonality are present. To describe Winter’s method, we require two definitions. Let c = the number of periods in the length of the seasonal pattern. Let st be an estimate of the seasonal multiplicative factor for month t obtained after observing xt. 1081

1082 In what follows Lt and Tt have the same meaning as they did in Holt’s method.
Each period Lt, Tt, and st are updated by using the following equations: Lt = α(xt/st-c) + (1−α)(Lt-1 + Tt-1); Tt = β(Lt − Lt-1) + (1−β)Tt-1; st = γ(xt/Lt) + (1−γ)st-c. Again, α, β, and γ are smoothing constants, each of which is between 0 and 1. 1082

1083 The first equation above updates the estimate of the series base by taking a weighted average of the following two quantities: Lt-1 + Tt-1, which is our base level estimate before observing xt; the deseasonalized observation xt/st-c, which is an estimate of the base obtained from the current period. The second equation above is identical to the Tt equation used to update trend in the Holt method. 1083

1084 The third equation above updates the estimate of month t’s seasonality by taking a weighted average of the following two quantities: our most recent estimate of month t’s seasonality (st-c), and xt/Lt, which is an estimate of month t’s seasonality obtained from the current month. 1084
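One period of the standard Winter's updates (level, trend, seasonal factor) can be sketched as follows; s_old denotes st-c, the seasonal factor from one full season ago:

```python
def winters_update(x, L_prev, T_prev, s_old, alpha, beta, gamma):
    """One period of Winter's method with multiplicative seasonality."""
    L = alpha * (x / s_old) + (1 - alpha) * (L_prev + T_prev)  # deseasonalized base
    T = beta * (L - L_prev) + (1 - beta) * T_prev              # same as Holt's trend
    s = gamma * (x / L) + (1 - gamma) * s_old                  # seasonal factor
    return L, T, s
```

The k-periods-ahead forecast then multiplies the Holt-style projection Lt + kTt by the seasonal factor for the target period.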

1085 Initialization of Winter’s Method
To obtain good forecasts with Winter’s method, we must obtain good initial estimates of base, trend, and all seasonal factors. 1085

1086 Forecasting Accuracy For any forecasting model in which forecast errors are normally distributed, we may use MAD to estimate se = standard deviation of our forecast errors. The relationship between MAD and se is given in this formula: se = 1.25·MAD 1086
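The MAD-based estimate of the forecast standard deviation can be sketched in two lines of Python; the actual and forecast values in the test are illustrative.

```python
# Sketch: estimating se from MAD via se = 1.25 * MAD.

def mad(actual, forecasts):
    """Mean absolute deviation of the forecast errors."""
    errors = [a - f for a, f in zip(actual, forecasts)]
    return sum(abs(e) for e in errors) / len(errors)

def estimated_sigma(mad_value):
    # se = 1.25 * MAD for normally distributed forecast errors
    return 1.25 * mad_value
```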

1087 24.5 Ad Hoc Forecasting Suppose we want to determine how many tellers a bank must have working each day to provide adequate service. Can we develop a simple forecasting model to help the bank predict the number of customers who will enter each day? Let xt = number of customers entering the bank on day t. 1087

1088 We postulate that xt = B × DWt × Mt × εt, where
B = base level of customer traffic corresponding to an average day DWt = day-of-the-week factor corresponding to the day of the week on which day t falls Mt = month factor corresponding to the month during which day t falls εt = random error term whose average value equals 1 To begin, we estimate B = average number of arrivals per day the bank is open. We illustrate the estimation by computing one day-of-the-week factor as the ratio of that day’s average traffic to B. 1088

1089 Similarly, we find the factors for the rest of the days that the bank is open.
To estimate Mt (say, for May), we compute the ratio of average daily traffic during May to the overall average. In a similar fashion, we find the Mt for the remaining months. Our simple model yielded a MAD of 79.1. If this method were used to generate forecasts for the coming year, however, the MAD would probably exceed 79.1. 1089

1090 This is because we have fit our parameters to past data; there is no guarantee that future data will “know” that they should follow the same pattern as past data. Suppose the bank manager observes that on the day after a holiday, bank traffic is much higher than the model predicts. How can we use this information to obtain more accurate customer forecasts for days after holidays? We find that the average value of Actual/Forecast for days after a holiday is 1.15. 1090

1091 Thus, for any day after a holiday, we obtain a new forecast simply by multiplying our previous forecast by 1.15. 1091
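The multiplicative model and the after-holiday adjustment just described can be sketched as follows. The base level and factor values in the test are hypothetical; only the 1.15 adjustment comes from the text.

```python
# Sketch of the ad hoc multiplicative forecast xt = B * DWt * Mt,
# with the after-holiday correction from the text.

def forecast_traffic(base, day_factor, month_factor, after_holiday=False):
    forecast = base * day_factor * month_factor
    if after_holiday:
        # days after a holiday run about 15% above the basic model
        forecast *= 1.15
    return forecast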

1092 24.6 Simple Linear Regression
Often, we predict the value of one variable (called the dependent variable) from the value of another variable (the independent variable). If the dependent variable and the independent variable are related in a linear fashion, simple linear regression can be used to estimate this relationship. To illustrate simple linear regression, let’s recall the Giapetto problem. To set up this problem, we need to determine the cost of producing a soldier and the cost of producing a train. 1092

1093 We call ŷ = β̂0 + β̂1x the least squares regression line.
The values β̂0 and β̂1 minimizing F(β0, β1) are called the least squares estimates of β0 and β1. We call ŷ = β̂0 + β̂1x the least squares regression line. Essentially, if the least squares line fits the points well, we will use ŷi as our prediction for yi. 1093

1094 Every least squares line has two properties:
It passes through the point (x̄, ȳ). The least squares line “splits” the data points, in the sense that the sum of the vertical distances from points above the least squares line to the line equals the sum of the vertical distances from points below the line to the line. 1094

1095 How Good a Fit? How do we determine how well the least squares line fits the data points? To answer this question, we need to discuss three components of variation: sum of squares total (SST), sum of squares error (SSE), and sum of squares regression (SSR). Sum of squares total is given by SST = Σ(yi − ȳ)². SST measures the total variation of yi about its mean. Sum of squares error is given by SSE = Σ(yi − ŷi)². If the least squares line passes through all the data points, SSE = 0. 1095

1096 The sum of squares regression is SSR = Σ(ŷi − ȳ)²; it can be shown that SST = SSR + SSE.
SST is a function only of the values of y. For a good fit, SSE will be small, so the identity SST = SSR + SSE shows that SSR will be large for a good fit. More formally, we define the coefficient of determination for y by R² = SSR/SST. A measure of linear association between x and y is the sample linear correlation rxy. 1096
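A minimal pure-Python sketch of the least squares estimates and the coefficient of determination, using R² = 1 − SSE/SST (equivalently SSR/SST). The data points in the test are illustrative.

```python
# Sketch: least squares line and R² for simple linear regression.

def least_squares(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared errors."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx              # slope
    b0 = ybar - b1 * xbar       # intercept
    return b0, b1

def r_squared(xs, ys, b0, b1):
    ybar = sum(ys) / len(ys)
    sst = sum((y - ybar) ** 2 for y in ys)                       # total variation
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    return 1 - sse / sst   # equivalently SSR / SST
```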

1097 A sample correlation near +1 indicates a strong positive linear relationship between x and y; a sample correlation near -1 indicates a strong negative linear relationship between x and y; and a sample correlation near 0 indicates a weak linear relationship between x and y. 1097

1098 Forecasting Accuracy A measure of the accuracy of predictions derived from regression is given by the standard error of the estimate (se). If we let n = number of observations, se = sqrt(SSE/(n − 2)). Any observation for which y is not within 2se of its predicted value ŷ is called an outlier. 1098

1099 T-Tests in Regression Using a t-test, we can test the significance of a linear relationship. We compute the t-statistic t = β̂1/StdErr(β̂1). 1099

1100 Assumptions Underlying the Simple Linear Regression Model
Assumption 1 The variance of the error term should not depend on the value of the independent variable x. This assumption is called homoscedasticity. If the variance of the error term depends on x, then we say that heteroscedasticity is present. Assumption 2 Errors are normally distributed. Assumption 3 The errors should be independent. This assumption is often violated when data are collected over time. 1100

1101 Independence of the errors implies that knowing the value of one error should tell us nothing about the value of the next (or any other) error. A sequence in which a positive error is usually followed by another positive error, and a negative error is usually followed by another negative error, indicates that successive errors are not independent; this pattern is referred to as positive autocorrelation. 1101

1102 In other words, positive autocorrelation indicates that successive errors have a positive linear relationship and are not linearly independent. If the sequence of errors in time sequence resembles Figure 13 in the book, we have negative autocorrelation. Here the successive errors have a negative linear relationship and are not independent. When there is no obvious pattern present, the independence assumption appears to be satisfied. 1102

1103 Observe that the errors “average out” to 0, so we would expect about half our errors to be positive and half to be negative. Thus, if there is no pattern in the errors, we would expect the errors to change sign about half the time. This observation enables us to formalize the preceding discussion as follows. If the errors change sign very rarely, they probably violate the independence assumption, and positive autocorrelation is probably present. 1103

1104 If the errors change sign very often, they probably violate the independence assumption, and negative autocorrelation is probably present. If the errors change sign about half the time, they probably satisfy the independence assumption. If positive or negative autocorrelation is present, correcting for the autocorrelation will often result in much more accurate forecasts. 1104
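The sign-change heuristic above can be sketched as a small function. The cutoffs used below (fewer than half, or more than one and a half times, the expected number of sign changes) are illustrative assumptions, not thresholds from the text.

```python
# Sketch of the sign-change check for autocorrelation in residuals.

def sign_changes(errors):
    """Count how often successive errors change sign."""
    return sum(1 for a, b in zip(errors, errors[1:]) if a * b < 0)

def autocorrelation_hint(errors):
    changes = sign_changes(errors)
    expected = (len(errors) - 1) / 2   # about half the time under independence
    if changes < 0.5 * expected:       # hypothetical cutoff for illustration
        return "positive autocorrelation suspected"
    if changes > 1.5 * expected:       # hypothetical cutoff for illustration
        return "negative autocorrelation suspected"
    return "independence plausible"
```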

1105 Running Regressions with Excel
Cost.xls illustrates how to run a regression with Excel. Important numbers in the output include:
R Square This is r².
Multiple R This is the square root of r², with the sign of Multiple R being the same as the slope of the regression line.
Standard Error This is se.
Observations This is the number of data points.
SS column The Regression entry is SSR. The Residual entry is SSE. The Total entry is SST. 1105

1106 Coefficients column The Intercept entry gives the estimated intercept, and the trains entry gives the estimated slope.
t stat This gives the observed t-statistic for the Intercept and the trains variable.
Standard Error column The Intercept entry gives the standard error of the intercept, and the trains entry gives the standard error of the slope. The coefficient entry divided by the standard error entry yields the t-statistic for the intercept or slope.
P-value For the intercept and slope, this gives Probability(|tn-2| ≥ |Observed t-statistic|). 1106

1107 Obtaining a Scatterplot with Excel
To obtain a scatterplot with Excel, let the range where your independent variable is be the X range. 1107

1108 24.7 Fitting Nonlinear Relationships
Often, a plot of points of the form (xi,yi) indicates that y is not a linear function of x. The plot may indicate that there is a nonlinear relationship between x and y. 1108

1109 Using a Spreadsheet to Fit a Nonlinear Relationship
VCR.xls shows how we can use a spreadsheet to fit a curve to the data. 1109

1110 Utilizing the Excel Trend Curve
The Excel Trend Curve makes it easy to fit an equation to a set of data. After creating an X-Y scatterplot, click on the points in the graph until they turn gold. Then select Chart > Add Trendline. Choosing Linear yields the straight line that best fits the points. Choose Logarithmic if the scatterplot looks like e or f from Figure 17 in the book. Choose Power if the scatterplot looks like a or b in Figure 17 of the book. 1110

1111 Choose Exponential if the scatterplot looks like c or d from Figure 17.
Choosing Polynomial of order n (n = 1, 2, 3, 4, 5, or 6) yields the best-fitting equation of the form y = a0 + a1x + a2x² + … + anxⁿ. 1111
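As one example of fitting a nonlinear trend outside Excel, here is a sketch that mimics the Exponential trendline by regressing ln y on x (fitting y = a·e^(bx)); the data in the test are illustrative.

```python
# Sketch: fit an exponential trend y = a * exp(b x) via a log transform.
import math

def fit_exponential(xs, ys):
    """Return (a, b) for y = a * exp(b x) by least squares on ln(y)."""
    lny = [math.log(y) for y in ys]
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(lny) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (ly - ybar) for x, ly in zip(xs, lny))
    b = sxy / sxx                 # growth rate
    a = math.exp(ybar - b * xbar) # level
    return a, b
```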

1112 24.8 Multiple Regression In many situations, more than one independent variable may be useful in predicting the value of a dependent variable. We then use multiple regression. In multiple regression, we model the relationship between y and the k independent variables by yi = β0 + β1x1i + β2x2i + … + βkxki + εi, where εi is an error term with mean 0, representing the fact that the actual value of yi may not equal β0 + β1x1i + … + βkxki. 1112

1113 Estimation of the βi’s We call ŷ = β̂0 + β̂1x1 + … + β̂kxk the least squares regression equation.
1113

1114 Goodness of Fit Revisited
For multiple regression, we define SSR, SSE, and SST as we did earlier. We also find that R² = SSR/SST = percentage of variation in y explained by the k independent variables, and 1 − R² = percentage of variation in y not explained by the k independent variables. We define the standard error of the estimate as se = sqrt(SSE/(n − k − 1)). 1114

1115 Hypothesis Testing If we have included independent variables x1, x2, …, xk in a multiple regression, we often want to test H0: βi = 0 against Ha: βi ≠ 0. To test this hypothesis, we compute t = β̂i/StdErr(β̂i), where StdErr(β̂i) measures the amount of uncertainty present in our estimate of βi. 1115

1116 Choosing the Best Regression Equation
How can we choose between several regression equations having different sets of independent variables? We usually want to choose the equation with the lowest value of se, since that will yield the most accurate forecasts. We also want the t-statistics for all variables in the equation to be significant. These two objectives may conflict, in which case it is difficult to determine the “best” equation. The Cp statistic can also be used as a criterion; the regression chosen should then have a Cp value close to k + 1, the number of estimated parameters. 1116

1117 Multicollinearity If an estimated regression equation contains two or more independent variables that exhibit a strong linear relationship, we say that multicollinearity is present. A strong linear relationship between some of the independent variables may make the computer’s estimates of the βi’s unreliable. By the way, if an exact linear relationship exists between two or more independent variables, there are an infinite number of combinations of the β̂i’s which will minimize the sum of the squared errors, and most computer packages will print an error message. 1117

1118 Dummy Variables Often, a nonquantitative (qualitative) independent variable may influence the dependent variable. To model the effect of a categorical variable with c categories on a dependent variable, we define c − 1 dummy variables as follows: 1118
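The c − 1 dummy-variable encoding can be sketched as a small helper; the category names in the test are hypothetical, and which category serves as the baseline is a modeling choice (here, the last one listed).

```python
# Sketch: encode a categorical variable with c categories as c-1 dummies.

def make_dummies(values, categories):
    """Return, for each observation, a list of c-1 0/1 indicators.

    `categories` lists all c levels; the last one is treated as the
    baseline and gets all-zero dummies.
    """
    base = categories[:-1]
    return [[1 if v == cat else 0 for cat in base] for v in values]
```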

1119 Interpretation of Coefficients of Dummy Variables
How do we interpret the coefficients of dummy variables? To illustrate, let’s determine how whether or not a day is a payday affects credit union traffic. We model the difference between paydays and nonpaydays by setting the payday variable x5 = 1 for paydays and x5 = 0 for nonpaydays. 1119

1120 Multiplicative Models
Often, we believe that there is a multiplicative relationship of the form y = β0 · x1^β1 · x2^β2 ··· xk^βk · ε. Taking natural logarithms linearizes this relationship. Thus, to estimate the βi’s, we run a multiple regression with the dependent variable being ln y and the independent variables being ln x1, ln x2, …, ln xk. 1120
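The log transformation that converts the multiplicative model into the linear form ln y = ln β0 + β1 ln x1 + … + βk ln xk can be sketched as follows; the example values in the test are illustrative.

```python
# Sketch: transform multiplicative-model data into linear-regression form.
import math

def to_linear_form(ys, X):
    """Transform observations of y = b0 * x1^b1 * ... * xk^bk * eps into
    the linear form ln y = ln b0 + b1 ln x1 + ... + bk ln xk.

    ys is a list of y values; X is a list of rows [x1, ..., xk].
    Returns (ln_y values, ln_x rows) ready for multiple regression.
    """
    log_y = [math.log(y) for y in ys]
    log_X = [[math.log(x) for x in row] for row in X]
    return log_y, log_X
```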

1121 Heteroscedasticity and Autocorrelation in Multiple Regression
By plotting the errors in time-series sequence, we may check whether the errors from a multiple regression are independent. If autocorrelation is present and the errors do not appear to be independent, then correcting for autocorrelation will usually yield better forecasts. By plotting the errors (on the y-axis) against the predicted value of y, we can determine whether homoscedasticity or heteroscedasticity is present. 1121

1122 If homoscedasticity is present, the plot should show no obvious pattern, whereas if heteroscedasticity is present, the plot should show an obvious pattern indicating that the errors somehow depend on the predicted value of y. If heteroscedasticity is present, the t-tests described in this section are invalid. 1122

1123 Implementing Multiple Regression on a Spreadsheet
Credit.xls includes the output for the regression. The regression output has the following interpretation:
Intercept This is β̂0.
Standard Error This is se.
R-Square This is R². It gives the percentage of the variation in the number of customers arriving daily that is explained by all the independent variables together.
Observations This is the number of data points.
Total df This is the degrees of freedom used for the t-test of H0: βi = 0 against Ha: βi ≠ 0. 1123

1124 STANDARD ERROR For each independent variable, this row yields StdErr(β̂i).
Coefficients For each independent variable, this column yields the coefficient of the independent variable in the least squares equation.
t stat This gives the observed t-statistic for the Intercept and all independent variables.
Standard Error column The Intercept entry gives the standard error of the intercept, and the coefficient entries give the standard error for each independent variable. The coefficient entry divided by the standard error entry yields the t-statistic for the intercept or slope.
P-value For the intercept and each independent variable in a regression with k independent variables, this gives Probability(|tn-k-1| ≥ |Observed t-statistic|). 1124

