So, in this section, let's look at some examples related to the material in lecture set six, and this will be fun. What I want to show you, for two different simulations we did, is how the results from our simulations compare to what would be predicted by the central limit theorem. So, for the first sampling distribution I showed for means, we looked at sample mean heights across multiple samples of different sizes taken from a population where the true mean is 167 centimeters and the standard deviation in the population height values is 2.5 centimeters. So, just to review the results, here are the means and standard deviations of the sample means in each simulated distribution. The mean of the 5,000 sample means, each based on a sample of size 20, was 167 centimeters, the underlying truth, but there was variation in these values around that mean. As we've said several times now, as we increase the sample size that each of the 5,000 means was based upon, the center or average of the sample means remained the same, but the variation in those sample means around that average was reduced with increased sample size. So, the central limit theorem tells us that the theoretical sampling variability of sample means based on samples of size n is a function of the true population standard deviation, divided by the square root of the sample size. In our case the population standard deviation is known because we created this population: it's 2.5 centimeters; and the size of each sample varies across these simulations. So, let's see how the observed variability in our 5,000 sample means matches up with what would be predicted by the central limit theorem in the situation where we know the true population standard deviation. Again, in real life we won't know this, but we will be able to estimate it based on a single sample using the sample standard deviation.
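As a rough sketch, the height simulation described here can be re-created in a few lines of Python. The population values (167 cm mean, 2.5 cm standard deviation) are from the lecture, but the code itself is hypothetical, not the course's actual simulation:

```python
import numpy as np

# Hypothetical re-creation of the lecture's simulation: repeatedly draw
# samples of heights and compare the spread of the 5,000 sample means
# to the central limit theorem's prediction, sigma / sqrt(n).
rng = np.random.default_rng(0)

true_mean, true_sd = 167.0, 2.5   # population values from the lecture
n_sims = 5_000                    # number of simulated samples

for n in (20, 50, 150):
    # 5,000 samples of size n; one sample mean per row
    sample_means = rng.normal(true_mean, true_sd, size=(n_sims, n)).mean(axis=1)
    empirical_se = sample_means.std(ddof=1)
    theoretical_se = true_sd / np.sqrt(n)
    print(f"n={n:3d}  empirical SE={empirical_se:.3f}  "
          f"theoretical SE={theoretical_se:.3f}")
```

With 5,000 replicates, the empirical standard deviation of the sample means should land very close to the theoretical value (about 0.56 cm for n = 20), matching what the lecture reports.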
So, for the 5,000 means, each based on 20 observations, the mean of the 5,000 means was 167 centimeters, as we said before, the underlying true mean, but there was variation in the estimates around this true mean: they varied around the mean of themselves, which happened to equal the true mean, and this is the observed variation in them. On average, any single sample mean fell 0.57 centimeters from the mean of the 5,000 sample means. That is the empirical standard error estimate based on the simulation of 5,000 means. What is predicted? The theoretical standard error is 2.5 centimeters, the true population variability in individual height values, divided by the square root of the size of the samples we used here, each of size 20, and you can see that it's very close: 0.56 centimeters predicted, compared to the observed 0.57 in our simulation. In the other two examples, with 50 and 150 observations per sample mean across the 5,000 samples, you can see that the observed variability in the 5,000 sample means equals exactly what the central limit theorem says it should be. So it's just kind of neat not only to see the empirical results of these simulations visually, but to see that the actual variability of our estimates syncs up with what is predicted by the central limit theorem. Again, in real life, we're only going to take one sample of any given size, and we're not going to know the true standard error because we don't know the true variability of individual values in the population. We can only estimate it based on the variability of individual values in our sample. But that will tend to be a good estimate, and so we will be able, based on any single sample, to get a good estimate of the standard error of the mean based on samples of our size. Let's do the same thing with our simulation based on proportions. So, the true population proportion of persons in poverty in Baltimore was 25 percent.
We looked at several different simulations to look at the behavior, or variability, of sample proportions based on samples of given sizes taken from a population where the true population proportion is 25 percent. So, when we looked at samples of size 50, the mean of our 5,000 sample proportions was equal to 25 percent, the underlying truth, but these estimates varied around that mean, and the standard deviation observed in our 5,000 proportions, each based on 50 persons, was 0.061. Similarly, for the other two simulations, where we used samples of size 150 and 500, respectively, the mean of the 5,000 sample proportions stayed at 0.25, but the variability in the observed 5,000 sample proportion estimates decreased with increasing sample size. So, again, the theoretical standard error given by the central limit theorem is given by this formula here: the square root of the true proportion times one minus that proportion, divided by the sample size. Let's see how it compares to the observed variability in our 5,000 sample proportions, each based on these respective sample sizes. So, for samples of size 50, we had 5,000 proportions, each based on 50 persons; the average of these values was again the true proportion in the population, but there was variation in these values around that average on the order of 0.061, or 6.1 percent. That's exactly in sync with the theoretical standard error estimate. Similarly, we see the same sort of lineup between the variability we observed in our 5,000 proportions and what is expected or given by the central limit theorem. So, essentially, I could have filled out this entire table without doing the simulations if I knew I was sampling from a population with a true proportion of 25 percent and I knew the sizes of each sample I was working with, the sizes of the samples I was taking multiple replicates of.
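The proportion simulation can be sketched the same way. This is an assumed re-creation (not the course's actual code), using the lecture's true poverty proportion of p = 0.25 and the theoretical standard error formula √(p(1−p)/n):

```python
import numpy as np

# Sketch of the proportion simulation: each sample proportion is the
# fraction of n sampled persons who are in poverty, with true p = 0.25.
rng = np.random.default_rng(0)
p, n_sims = 0.25, 5_000

for n in (50, 150, 500):
    # binomial count of "in poverty" out of n, divided by n -> proportion
    sample_props = rng.binomial(n, p, size=n_sims) / n
    empirical_se = sample_props.std(ddof=1)
    theoretical_se = np.sqrt(p * (1 - p) / n)
    print(f"n={n:3d}  empirical SE={empirical_se:.4f}  "
          f"theoretical SE={theoretical_se:.4f}")
```

For n = 50 the theoretical value is √(0.25 × 0.75 / 50) ≈ 0.061, the same 6.1 percent the lecture observed empirically.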
So, in real life, of course, we're only going to observe one sample; we're not going to do a study 5,000 times, and we won't be able to get the true standard error exactly, but we can certainly estimate the standard error of the proportion estimate in our sample by replacing the p's, the true proportions, in the theoretical standard error formula with our best estimate, p hat. So we'll get an estimate of the standard error, and we can use that, as we'll see in section seven, to create what's called a confidence interval for the true proportion. So, hopefully this was kind of cool and helpful, just to see the central limit theorem at work: we demonstrated some of its results empirically in the previous lectures, namely that the resulting estimates were bell-shaped and approximately normally distributed around their mean, regardless of whether they were based on continuous measures or binary ones, and now we're showing that the variability in the simulations lines up with what it should be based on the central limit theorem.
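The plug-in step described here, substituting p hat for p in the standard error formula, amounts to one line of arithmetic. The sample numbers below (38 of 150 persons in poverty) are made up purely for illustration:

```python
import math

# Hypothetical single sample: 38 of 150 people in poverty.
# Plug p-hat into sqrt(p * (1 - p) / n) to estimate the standard error.
n = 150
p_hat = 38 / n                                # sample proportion, ~0.253
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
print(round(p_hat, 3), round(se_hat, 4))
```

Because p hat from one sample tends to be close to the true p, this estimated standard error tends to be close to the true one, which is what makes the confidence intervals in section seven work.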