So, in the previous section we talked about the potential impact of confounding, and at least one example showed that one way to handle it is to break out and measure separate estimates of the association for different levels of the confounder. For example, we looked at the relationship between disease and smoking separately for males and females. But if there are similar associations in those respective levels of the confounder, so for example, we saw positive associations between smoking and disease in both males and females, wouldn't it be nice to pool those consistent estimates across the different levels of the confounder into one overall estimate? And that's what we're going to talk about here now, something called an adjusted estimate. In this section, we'll talk about the presentation of such estimates, their interpretation, and their utility in assessing confounding. And then the next section will give us a conceptual overview of the idea behind adjustment.
So, for this section, I'd like you to come out understanding how to interpret estimates of association that have been adjusted to control for a confounder, and how to compare and contrast the comparisons being made by unadjusted and adjusted association estimates. So, adjustment is a method for making comparable comparisons between groups in the presence of a confounder or confounding variables. We will discuss the basics of the mechanics behind adjustment in the next section, but here we'll focus on how to interpret adjusted estimates.
So, again recall the results from our first fictitious study from the previous section, the study that was done to investigate the association between smoking and a certain disease outcome in male and female adults. We saw that when we looked at everyone in one two-by-two table relating smoking and disease, the relative risk of disease for smokers to non-smokers was slightly less than one, at 0.93.
But we saw that this is being influenced by the difference in sex distributions among smokers and non-smokers: smokers were more likely to be male, and males were less likely to have disease. So, the relative risk that we looked at compares all smokers to all non-smokers. That relative risk of 0.93 did not take any other factors into account. Such an estimate is called an unadjusted or crude estimate of association; in this case, the unadjusted or crude estimated association between disease and smoking. Adjustment provides a mechanism for estimating the outcome-exposure relationship after removing the potential distortion or negation that comes from a confounder or multiple confounders. In the fictional example, the relationship between disease and smoking can be adjusted for sex. And since I made up this example, I had access to all the raw data, so even though this isn't based on a real scenario, there are data behind it.
So, frequently what you'll see in journal articles, and we'll explore this again in more detail in the multiple regression sections, is that the abridged presentation of results from a non-randomized study will include a table that shows unadjusted and adjusted measures of association side by side. So, one way this may manifest itself, if our primary concern is about disease and smoking, is a table that looks like this. It shows the relative risk of disease, and now I've added 95 percent confidence intervals; these can be computed, and we'll discuss that idea in the next section as well. All I'm showing here is the relative risk of disease for smokers to non-smokers. The second column here is the crude or unadjusted estimate. The setup declares non-smokers to be the reference group, and the implication is that this relative risk of 0.93 compares the risk of disease for smokers to the reference of non-smokers, with a confidence interval of 0.68 to 1.27.
When we do the adjustment and adjust only for sex, the relative risk comparing smokers to non-smokers goes up; it's now 1.57. You can think of it, and we'll talk more about this in the next section, as sort of a weighted average of the sex-specific relative risks we saw for smokers to non-smokers, the 1.8 for males and the 1.5 for females.
So, this unadjusted relative risk, this 0.93, compares the risk of disease for all smokers to all non-smokers in the sample regardless of sex or any other characteristic, and hence estimates the comparison of all smokers to all non-smokers in the population sampled. This adjusted estimated relative risk of 1.57 compares the risk of disease for smokers to non-smokers of the same sex in the sample, and hence estimates the comparison of smokers to non-smokers of the same sex in the population sampled. So, this 1.57 is the estimated relative risk of disease for male smokers to male non-smokers and for female smokers to female non-smokers. As long as we're comparing individuals of the same sex, whether it be female smokers to female non-smokers or male smokers to male non-smokers, the relative risk of disease for smokers to non-smokers, adjusted for sex differences between smokers and non-smokers, is 1.57; it's the same for males and females.
The unadjusted and adjusted associations can be compared both numerically and qualitatively to assess confounding by at least some of the adjustors. In this case we've only adjusted for one thing, sex. And we can see the estimated association goes from something close to the null value of one to something substantially larger than one. So, I would say there is some confounding in these data.
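
To make the crude versus adjusted comparison concrete, here is a minimal sketch in Python. The counts are hypothetical, not the raw data behind the lecture's table; they were picked only so that the sex-specific relative risks come out to 1.8 for males and 1.5 for females, with smokers mostly male and males at lower risk. The adjusted estimate uses the Mantel-Haenszel weighted average, which is one common way to pool stratum-specific relative risks and is an assumption here, since the lecture does not say which method produced its adjusted value.

```python
# Hypothetical counts (NOT the lecture's raw data), chosen only so that the
# sex-specific relative risks are 1.8 (males) and 1.5 (females).
# Each stratum: (diseased smokers, total smokers,
#                diseased non-smokers, total non-smokers)
strata = {
    "male":   (54, 300, 10, 100),
    "female": (48, 100, 96, 300),
}


def relative_risk(cases_exp, n_exp, cases_ref, n_ref):
    """Risk among smokers divided by risk among non-smokers (the reference)."""
    return (cases_exp / n_exp) / (cases_ref / n_ref)


# Sex-specific relative risks: smokers vs. non-smokers within each sex.
rr_by_sex = {sex: relative_risk(*counts) for sex, counts in strata.items()}

# Crude (unadjusted) relative risk: collapse the two tables, ignoring sex.
totals = [sum(column) for column in zip(*strata.values())]
rr_crude = relative_risk(*totals)

# Mantel-Haenszel adjusted relative risk: a weighted average of the
# sex-specific relative risks (assumed pooling method, see lead-in).
mh_num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
mh_den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
rr_adjusted = mh_num / mh_den

for sex, rr in rr_by_sex.items():
    print(f"RR, {sex}s only: {rr:.2f}")
print(f"Crude RR:            {rr_crude:.2f}")     # below 1 with these counts
print(f"Sex-adjusted RR:     {rr_adjusted:.2f}")  # between 1.5 and 1.8
```

With these made-up counts the crude relative risk falls just below one while the adjusted value sits between the two sex-specific estimates, which is the same qualitative pattern as in the table: the crude comparison mixes the smoking effect with the sex difference, and the adjusted comparison is made within sex.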
From a statistical standpoint, one could argue that once we account for the uncertainty in the estimates, there's some crossover in the confidence intervals here, so it's not clear whether these are statistically distinguishable. And if they weren't, then from a statistical standpoint, the difference could just be sampling variability. But generally speaking, since the estimate changes by such a large amount, and these confidence intervals are a function of sample size as well as other things, I would go ahead and say there is some evidence of confounding in these data, given that the adjusted estimate is qualitatively different from the unadjusted one.
Let's go back to the observational study that we looked at to examine the associations between arm circumference and height and weight in Nepalese children. Again, it was 150 randomly selected children, age zero to 12 months, who had arm circumference, weight, and height measured. In this table I'm going to compare the unadjusted and adjusted associations between arm circumference and both height and weight. Let's focus on height first. What we have here are the regression slopes from models with arm circumference as the outcome. In this row, we have regression slopes for height. This first model is from a regression where y hat is arm circumference, and there's some intercept, I don't remember what it is offhand, but the slope for height was positive. Then the second model, after adjusting for weight, is another regression model where we have some intercept, but now the slope for height is negative. In the first scenario, the crude or unadjusted association, the relationship between arm circumference and height is positive. But after adjusting for weight, it's negative. Similarly, in this line here, I have the regression coefficients for weight when the outcome is arm circumference.
And we can see here the unadjusted association is positive; we saw a graphic of this in the previous lecture set. When we adjust for height differences in the weight groups, it's still positive but gets larger in value. So, both the relationship between arm circumference and height and the relationship between arm circumference and weight change from their unadjusted versions when we adjust each for the other.
So, let's talk about, with regard to height, what the difference in the interpretation of these slopes is. The unadjusted linear regression slope estimate for height of positive 0.16 estimates the average difference in arm circumference between two groups of children who differ by one centimeter in height: the average change in arm circumference per one centimeter increase in height. After adjustment for weight, this slope becomes negative 0.16. I call this beta hat height with a little star to indicate that it's different from the unadjusted one. This estimates the average difference in arm circumference between two groups of children who differ by one centimeter in height but are of the same weight: the average change in arm circumference per one centimeter increase in height, adjusted for weight.
So, why would this become negative? Let's think about that for a moment. After adjusting for weight, this estimates the common relationship between arm circumference and height in people of the same weight. Well, you can think of it this way: if people are of the same weight, the taller they get, it's almost as if somebody's being stretched out while maintaining the same weight. So, the taller one is within a comparable weight group, the more stretched out or lanky they are, and hence potentially the more lanky their arms are. So, that may make some sense. This is a different comparison than the unadjusted one, where we ignore differences in weight between the height groups being compared.
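
The sign flip for height can be reproduced with a small simulation. This is only a sketch under stated assumptions: the data below are simulated, not the Nepalese study data, and the generating coefficients (weight rising with height, arm circumference rising with weight but falling with height at fixed weight) are invented for illustration. The helper `ols` is a hypothetical convenience wrapper around NumPy's least-squares solver.

```python
import numpy as np

# Simulated data (NOT the Nepalese study): 150 children, with made-up
# coefficients chosen so taller children tend to be heavier, and arm
# circumference increases with weight but decreases with height at a
# fixed weight.
rng = np.random.default_rng(0)
n = 150
height = rng.uniform(50, 75, size=n)                      # cm
weight = 0.2 * height - 6 + rng.normal(0, 0.8, size=n)    # kg
arm = 10 + 1.0 * weight - 0.1 * height + rng.normal(0, 0.5, size=n)  # cm


def ols(y, *predictors):
    """Least-squares coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Unadjusted (crude) slopes: one predictor at a time.
b_height_crude = ols(arm, height)[1]
b_weight_crude = ols(arm, weight)[1]

# Adjusted slopes: height and weight in the same model.
_, b_height_adj, b_weight_adj = ols(arm, height, weight)

print(f"height slope, unadjusted:          {b_height_crude:+.2f}")
print(f"height slope, adjusted for weight: {b_height_adj:+.2f}")
print(f"weight slope, unadjusted:          {b_weight_crude:+.2f}")
print(f"weight slope, adjusted for height: {b_weight_adj:+.2f}")
```

In this simulated setup the unadjusted height slope is positive because taller children are also heavier, while the adjusted height slope compares children of the same weight and comes out negative. The weight slope stays positive in both models and is larger after adjusting for height, mirroring the pattern described for the table.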
Again, the unadjusted and adjusted associations can be compared both numerically and qualitatively to assess confounding by at least some of the adjustors. We can see very clearly that the estimated slope changed not only in value but in sign: the relationship between arm circumference and height was positive ignoring weight and negative when taking weight into account. So, a very different association after adjustment. The relationship between arm circumference and weight is interesting. It was positive and statistically significant when ignoring height, and after accounting for height it was still positive and statistically significant, but larger, with a confidence interval shifted up relative to the unadjusted one. So it looks to me like both the weight and height relationships with arm circumference were influenced or confounded by each other.
So, in summary, adjustment is a method for making comparable comparisons between groups in the presence of a confounder or confounding variables. The group comparisons made by adjusted associations are more specific than those made by unadjusted or crude associations. And comparing the magnitude, significance, and confidence intervals of unadjusted and adjusted associations is useful for identifying confounding. We'll spend more time doing such comparisons when we get into the realm of multiple regression, which begins in a lecture set after this one. In the next section, we'll look conceptually at the idea behind what is going on when we get an adjusted association: what the process of adjustment is like.