In session one, we compared decision making processes in business settings with and without significant uncertainty. We have used the wireless data plan problem as an example illustrating a high uncertainty business setting. We have learned about reward and risk measures, and in this session, we will look at how we can estimate the values of reward and risk for any course of action we choose using a simulation toolkit. Let's begin. Okay, in section two, we're going to build an algebraic model for evaluating a wireless data plan, and then use the simulation toolkit to obtain estimates of the reward and the risk measures associated with this plan. Here's a reminder of the business context we're looking at. The current wireless data plan that our consultant has charges $10 per gigabyte of data. The new data plan charges a fixed fee for all data usage up to 20 gigabytes, and then it also charges $15 per gigabyte of data above 20. For example, if the data usage in a particular month is 22 gigabytes, she'll have to pay $160 for the first 20 gigabytes and $30 for the next two. So her total payment in such month will be $190. If, on the other hand, her data usage in a particular month does not exceed 20 gigabytes, her monthly payment will be just $160. Based on historical data usage values, our consultant estimates that her family's monthly data usage is normally distributed with mean 23 and a standard deviation of 5 gigabytes. So she also knows the distribution of her monthly payments under the current plan, normal with mean $230 and a standard deviation of $50. What about the distribution of monthly payments under the new plan? If we want to estimate the reward and the risk measures for this new plan, in other words, the expected monthly payment and the standard deviation of monthly payments, what are they? Let us use the verbal description of the new plan to connect the random input, data usage U to the random output, whose probability distribution we're interested in, monthly payment P. If U is below or at 20, then the monthly payment P is 160. If U is above 20, then the monthly payment is 160 + 15 x (U- 20). In other words, extra payment of $15 for every gigabyte above 20. Now we can combine both cases using one EXCEL formula, IF. The IF formula has the following form, IF(Condition, Choice1, Choice2). This function evaluates condition and IF happens to be true, then U becomes equal to Choice1. And if the condition happens to be false, then IF becomes equal to Choice2. So in this case, if U is greater than 20, then IF selects Choice1, which is 15 x (U- 20). Otherwise, it selects Choice2, no extra payment. We have a formula that expresses the monthly payment P as a function of monthly data usage U. So let's return to our main question. If U is distributed as a normal random variable with mean 23 gigabytes and standard deviation of 5 gigabytes, what is the probability distribution of P? What reward value does our consultant get if she subscribes to the new plan? What risk will she be facing? As far as the reward is concerned, the expected payment value, can we just plug in the expected value of U 23 into our formula, and get the expected value of P? It's very tempting to try and get the estimate of the expected value of a random key performance indicator by replacing all the random factors it depends on by the expected values. For example, if we plug in 23 gigabytes into that formula, we get a value of $205. Is this the expected value of our monthly payment P? Unfortunately, while there are cases when you can get the correct estimate for the expected value of a key performance indicator in this way, in many other cases, this is the wrong way to go. If you have a contract where payment, P, is a linear function of U, in other words, P is equal to U times constant plus or minus another constant, then this approach will work. For example, in the old data plan, our consultant pays $10 for each gigabyte. So under that plan, the monthly payment P is equal to 10 times the data usage U. 10 times U is a linear function of U, and the expected value of P in this case is just 10 times the expected value of U, or 10 times 23, which is 230. But here's a simple example showing us how it all can go wrong. Let's say our data usage U can only take 2 values, each with 50% probability, 18 gigabytes and 28 gigabytes. The expected value of U under this probability distribution is still 23 gigabytes, and the standard deviation is still 5 gigabytes. Let's see what kind of distribution we will get for the monthly payment amount, P. If U is 18, then P is equal to 160. If U is 28, then P is 280. The expected value of P, is 0.5 x 160 + 0.5 x 280, $220. Now this is very different from $205 we obtained earlier by replacing the random variable, U, by its expectation. Please keep this observation in mind. In general, we cannot simply replace random variables in our formulas and hope to get the right estimates of the key performance measures. So we're back to asking the same question. What are the reward and risk measures that describe our new plan? How do we estimate them? The answer is use simulation. Simulation is a tool for converting probability distributions of random factors we cannot perfectly control, like data usage, into probability distributions for outcomes we're interested in, like monthly payment. In simulation, we'll call random factors like data usage random inputs, and outcomes with distribution would like to obtain random outputs. Here's how simulation works. In each simulation step, we generate a random instance of the input quantity like data usage U. In other words, we know the distribution of this random input, in our case, it is normal with mean 23 and standard deviation of 5 gigabytes. And we will instruct Excel or any other simulation tool we use to pull one value from that distribution. Then, we use the formula that connects our random input U, in our case, and the random output P, in our case, to calculate the value of the output random variable corresponding to that instance of the random input variable we just generated. In other words, we ask a simulation tool to generate an instance of a random data usage value U, and then calculate the corresponding monthly payment value, P. We can repeat the simulation step, which we will call a simulation run As many times as we like. Since at every simulation run, we convert a random input value, such as data usage U into the corresponding output value, such as payment P, the simulation basically converts the set of random input numbers into a set of random output numbers. We will use the term input sample and output sample to describe the sets of numbers generated during the simulation. Once we generate a sample of output values, for example, is sample of payment values. Or can use the sample to estimate the expected value of the output, standard deviation etc. In other words, we can estimate reward and risk measures that we will use later to choose the best course of action. We will use Excel for running simulations and for analysing their results. Specifically, we will use an Excel add-in called Analysis Tool Pak to run simulations. This add-in is a part of standard Excel installation on Windows. If you're using Mac, here's a link to a free software that has similar capabilities as Analysis Tool Pak. If you're using Google Sheets, an add-on called XLMiner Will have similar functionality. Okay. We're ready to use Excel to set up and run a simulation for the monthly payment values under the new wireless data plan. We have created an Excel template, dataplan_0, you can use to follow our setup. In this first example, we'll set up and run a simulation with just ten simulation runs to help us understand how simulation works. Okay let's go to dataplan_0. We start with a template dataplan_0 that contains all the data we need to set up a simulation. In our analysis, we'll use Excel 2013 on Windows. We'll be setting up our simulation using Analysis Tool Pack, which is a standard add-in in Excel. It is usually located under data tab in the portion called analysis, right next to the solver button. If you do not see data analysis there, you should go to file, options, add-ins. And here what says, manage Excel add-ins, you click go, and you wanna make sure that the Analysis Tool Pak is checked. While setting up our simulation, keep in mind that the ultimate purpose of running a simulation is to obtain estimates for the measures of reward and risk associated with a new data plan. As we discussed earlier, we will use the expected monthly payment under the new plan as a measure of reward, and the standard deviation of monthly payment as a measure of risk. Okay, in order to prepare our simulation let's first add headers to cells C1, D1, and E1. In C1, let's enter simulation run. This header will indicate that in column C, we will count the instances of our random variables. In D1, let's put data usage, U in gigabytes. In column D, we will be generating random instances of the monthly data usage. Finally, in E1, let's put payment, P in dollars. Column E will contain the monthly payment amounts calculated using the random instances of the monthly data usage from column D. In this example, we will generate ten random instances of monthly data usages U and calculate ten corresponding values of monthly payment P. Every time we generate a random instance of data usage, we will count it as a simulation run. So let's number the simulation runs will conduct 1 through 10, and place the simulation run identifiers in column C, in cells C2 through C11. Okay, the actual random instances of monthly data usage for each simulation run will be store in the cells D2 through D11. In particular, the cell D2 will contain the first random instance of monthly data usage, the cell D3 the second random instance of monthly data usage, and so on. But how do we generate those random instances? Well, Analysis Tool Pad provides us with the tool to generate random numbers. And that's the tool we're going to use. Let's go to data, click on data analysis, select random number generation, and click okay. In the random number generation dialogue, let's put in one, into the number variables box. This tells Excel that we're going to generate the instances of a single, random variable monthly data usage in our case. Next, we'll put ten. Into number of random numbers box. This will instruct Excel to generate ten random instances of monthly data usage. Next, in the distribution pull down menu, we choose normal and specify 23 and five as mean and standard deviation of the normal random variable. Now, we have instructed Excel to generate ten random instances of monthly data usage using a normal distribution with the parameters we specified. As you can see from this drop down menu, Excel can generate random variables using a number of different probability distributions, in addition to the normal. Now, let's put some combination of numbers say, one, two, three into the random seed box. Random seed instructs itself to generate random numbers from the distribution with specified in a particular way. We do not need to worry to much about the seed value in order to run and interpret the simulation. Just keep in mind that if you are running Excel 2013 and Windows, and you set up your model in the same way as I do, and use the same number of simulation runs and the same seed, you will generate exactly the same sequence of monthly usage values that I do. So, if you want to compare your simulation results to mine, you should be setting up your simulation model in the same way as I do. Use the same number of simulation runs, like ten that I use in this case, and the same seed, like 1, 2, 3 that I use in this case. In practice, it does not matter much what seed you select as long as you run a fairly long simulation. In otherwords, as long as the value in the number of random numbers box is high. We'll talk more about short versus long simulations and about different seed values later this week. Okay. One last thing, in the output options, let's select output range starting in the cell D2. This way Excel will put the first random monthly usage number it generates into the cell D2. The second random monthly usage number into the cell D3 and so on. Now, when you click OK, Excel will generate 10 random monthly data usage numbers in cells D2 from D11, using the distribution and the parameters we specified. The numbers in the cells D2 through D11 are all instances of a normal random variable with mean 23 and standard deviation 5. In other words Excel generated, or using another word, simulated for us, ten instances of what the future might hold in terms of monthly data usage. Let's make sure that the values in the cells D2 through D11 are visually distinguishable from the other values. Let's change the font in the cells into green and bold. We'll use this same visual designation for the random inputs into our simulation models. Now, what about the simulation outputs? The quantity we're interested in is the monthly payment amount, P. Let's go into the cells E2 through E11 and put in formulas that will calculate the monthly payment value P, for any monthly Data Usage, U. Let's start with the cell E2. The way the new data plan works is by charging $160 up front, and then by adding $15 for each gigabyte above 20. Let's put this formula into the cell E2 using the value in the cell D2 as the first possible instance of the monthly data usage. We have already discussed an algebraic formula that calculates the monthly payment for any value of monthly data usage. The algebraic formula we put in the cell E2 is $160, that's B5, plus if the data usage, D2, exceeds 20, B4, then we're charged extra $15 for each gigabyte of usage above 20. Otherwise, there are no extra charges. The result in this instance is $160 since the monthly usage in this instance fell below 20. If we want to calculate the values of monthly payment corresponding to the remaining nine random instances of data usage We need to copy and paste the formula in E2 into the cells E3 through E11. But before we do this we must use absolute cell references for the cells B4, B5, and B6 for the formula in the cell E2. Those are the parameters of our data plan and they do not change when we copy and paste the payment formula. So we go to the cell E2, And use the shortcut F4 to put the dollar signs around B4, B5, and B6. Let's start with B5, And then move to B4, and then B6 and then B4 again. Okay. After that, we can copy and paste the payment formula into the cells E3 through E11. So we now have 10 random monthly payment values that correspond to 10 random monthly data usage that Excel generated. In other words, Excel generated a random sample of the future data usage values and we have used the payment formula to convert the sample into a sample of future payment values. This random sample of payment values is the output of our simulation. We will use blue color and bold font to make sure that those values look different from the random input values, and from other values on the spreadsheet. To facilitate the future analysis of the results of our simulation, let us calculate the average and the standard deviation of the simulated samples of the monthly data usage, and the corresponding monthly payment values. The average of a sample of random numbers is also called sample mean. And the standard deviation of a sample of random numbers is also called a sample standard deviation. Let's put out these headers into the cells C13 and C14. Sample mean, and sample standard deviation. Let's look first at the sample of monthly data usage, where we'll calculate the average of these ten numbers in the cell D13. So we put in the formula, average of D2 through D11. As we can see, the sample mean in this case is about 25. 25 gigabytes. And in the cell D14, we will calculate the value of the standard deviation of the same sample. The formula we put in the cell D14 is, STDEV of the same cells, (D2:D11). There's several formulas for calculating the standard deviation in Excel, and those formulas are applicable in different settings. The STDEV formula is used for a sample of random numbers. The sample standard deviation for this particular sample is around 7.8 gigabytes. Let's use the font for the cells D13 and D14, as we did for the cells D2 and D11. So it's a green font and bold. Now we're ready to compute the estimates of the reward and risk measures associated with the new data plan. As we discussed earlier, we will use the estimate of the expected value of the monthly payments as a reward measure, and the estimate of the standard deviation of the monthly payments as the measure of risk. In order to calculate the sample mean and sample standard deviation for our sample of payment values, all we have to do now is to just copy and paste the formulas from the cells D13 and D14 into the cells E13 and E14. We'll use the same font scheme for E13 and E14 as we did for E2 through E11, so it's blue and bold. So, The estimate for the measure of reward associated with the new data plan is about $253 and the estimate for the measure of risk is about $92. For completeness, here's the picture of the Excel file data plan 10 we created with all the formulas that helped us to set up and run the simulation. In the next session, we'll have a more detail look at the simulation results. In this session, we have learned to use simulation to estimate the reward and risk measure associated with any potential decision we can make. Next time, we'll look at the interpretation of the simulation results. In particular, we will compare the results of short and long simulations, to see how precise our simulation estimates are. We will also look at histograms as a convenient way of presenting the results of the simulation. See you next time.