Hi, now I'd like to introduce the issue of representativeness. As you may remember, the goal of a study is to develop a general understanding of a population. A population that we are interested in making claims about might be made up of people, or it might be made up of families, or firms, or countries. Now most of the time, it's pretty hard to study a population in its entirety. We can't go out and interview everybody in a country if we want to say something about that country. So what we do most of the time, when conducting a quantitative study, is recruit a sample, that is, a subset of that larger population, to be part of our study. Here, we can look at the white figures as the population and the circled figures as the ones in our sample. Now, when people are conducting qualitative studies, they may select individual cases that are intended to be representative of other similar cases in the population.
So to generalize, it's really important that the people we recruit into our study, that is, the ones that we've circled here, are in some way representative of the larger population about which we would like to make claims. By representative, we mean that the characteristics of the people in our sample are similar to those of the population as a whole. So here we've got a sample where the proportion green is roughly similar to the proportion green in the larger population. So we might think this is a reasonably representative sample. If we draw a sample and the characteristics of the people or other entities that make up our sample are very different from those in the larger population, as here, where everyone in the sample is colored white and we have not included any green people, then we have a problem. It may be hard to generalize from the contents of our sample to processes in the larger population. So proper study design requires careful attention to what we call sampling, or the selection of cases, to ensure representativeness.
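To make the green and white figures concrete, here is a minimal sketch in Python. The population size, the 30 percent green share, the sample size of 100, and all of the names are assumptions chosen for illustration; they do not come from the slide. The sketch compares the share of green figures in a simple random sample with the share in a sample that only ever recruits white figures.

```python
import random


def green_share(group):
    """Return the fraction of members labelled 'green'."""
    return sum(1 for member in group if member == "green") / len(group)


# Hypothetical population: 1,000 figures, 30% green and 70% white.
population = ["green"] * 300 + ["white"] * 700

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simple random sample: every figure has the same chance of being selected.
random_sample = rng.sample(population, 100)

# Unrepresentative sample: only white figures are ever recruited.
white_only_sample = [m for m in population if m == "white"][:100]

print("population share green:       ", green_share(population))
print("random sample share green:    ", green_share(random_sample))
print("white-only sample share green:", green_share(white_only_sample))
```

The random sample's share of green figures will usually land close to the population's 30 percent, while the white-only sample reports zero no matter how it is analyzed afterward.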
For quantitative research, there is an entire field devoted to the process of sampling. That is, if you're trying to conduct a study which will tell you something about a larger population, and it's impossible to measure everybody in that population, what's the best way to select people from that larger population into your study to maximize the chances that you can actually make a more general claim about the population as a whole? For qualitative research, researchers have to spend a lot of time justifying their selection of cases, perhaps the countries or the cities that they are studying, to try to convince their audience that these cases are meaningful, in terms of telling us something about a broader set of similar cases.
Let's look at some common problems with representativeness in the sorts of studies you might have been exposed to over the last few years. You've probably, at some point in the past, seen some claims based on results from an online survey. You may very well have participated in an online survey. Now there are real questions about how representative the respondents, the people that participate in an online survey, are of the larger populations about which we would like to make claims. We really don't know how well internet users represent the larger population. And then, among the internet users that have the time and the inclination to answer the questions in a study, are they even representative of internet users overall? Why did they even have the time to take part in the study? Why are they motivated? In a customer survey, maybe it's only the angry, upset people that bothered to complete the survey, and we're losing the people that are satisfied. A related issue comes up with the predecessors of contemporary online surveys, which are reader polls in magazines and newspapers.
It used to be the case that newspapers and magazines, when they were trying to learn about, say, people's opinions on some contemporary phenomenon, would conduct reader polls, where they included a form in their magazine or newspaper and asked their readers to fill it in and send it back. Results were tabulated. But of course there were questions about whether the readers of any particular newspaper or any particular magazine were actually representative of the larger population. Obviously, certain types of people are more likely to read certain types of newspapers or magazines.
Finally, it's very common to see people making claims based on what we call convenience samples. So you sometimes see studies where, rather than engaging in a formal procedure to recruit respondents for a survey, people conduct man on the street interviews, or person on the street interviews, where they stand on a sidewalk and try to recruit passers-by, get their attention, and then ask them to complete a survey. Perhaps they hang out in subway stops or airports if they're trying to conduct some sort of customer satisfaction survey. Well, there are real issues about the representativeness of the types of people that can be recruited for these kinds of surveys. In a particular place, a subway stop, a busy sidewalk, or an airport, are the people that are passing through there even representative of the larger population? And then, are the people that are actually willing to stop and take the time to complete the survey representative of other people in that location? Maybe they just have a lot of time on their hands. Or they really enjoy sharing their opinions. So there's a real question about the representativeness of these, what we call, convenience samples.
There are a couple of other issues that I'd like to highlight. One is that a larger sample does not compensate for problems with representativeness.
So sometimes people that are collecting online surveys will claim that because they have tens of thousands, or hundreds of thousands, or perhaps even millions of respondents, their survey is actually better than a small, carefully designed study that only includes 2,000 people. That's not correct. Study size, that is, the number of people surveyed, cannot compensate for fundamental problems with unrepresentativeness. It turns out that the results of a small survey of just 500, 1,000, or 2,000 people, where measures have been taken during the design to ensure the representativeness of the sample, are easier to generalize to a larger population than those of a survey of hundreds of thousands of people where no attention has been given to the issue of representativeness.
Now sometimes people that conduct large surveys, for example online reader polls, claim that through a technical fix they can re-weight the data to make some claim about the population by fixing the problems with representativeness. This is extremely controversial. I can't get into the technical details here, but re-weighting involves making a lot of assumptions about how the people in a casually recruited sample may or may not differ from the larger population. It involves scaling responses up and down. Most of the time, claims based on re-weighted samples are met with a great deal of skepticism, so you should be wary as well.
So overall in this module, I've tried to sensitize you to the importance of the issue of representativeness, and to identify some of the common problems that emerge with, you might say, casual approaches to study design. I hope that you'll become a more informed consumer of the social science research that is reported in the news and the media, and that you will look at claims made from studies with an eye to seeing whether any of them actually suffer from any of the issues that I've just identified.
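The point that sample size cannot make up for unrepresentativeness can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population of 200,000 customers, the 70 percent satisfaction rate, and the response rates for the opt-in poll (dissatisfied customers are assumed to be five times as likely to respond) are all hypothetical numbers invented for illustration, not figures from any real survey.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 200,000 customers; True means "satisfied".
# Each customer is satisfied with probability 0.70 (an assumed figure).
population = [rng.random() < 0.70 for _ in range(200_000)]
true_share = sum(population) / len(population)

# Assumed response behavior for an opt-in online poll:
# dissatisfied customers are much more likely to bother responding.
RESPOND_IF_SATISFIED = 0.10
RESPOND_IF_DISSATISFIED = 0.50

# Large opt-in sample: people select themselves into the survey.
opt_in_sample = [
    satisfied
    for satisfied in population
    if rng.random() < (RESPOND_IF_SATISFIED if satisfied else RESPOND_IF_DISSATISFIED)
]

# Small designed sample: 1,000 customers drawn completely at random.
random_sample = rng.sample(population, 1_000)


def satisfied_share(sample):
    """Return the fraction of respondents who are satisfied."""
    return sum(sample) / len(sample)


print("true share satisfied:", round(true_share, 3))
print("opt-in poll: n =", len(opt_in_sample),
      "estimate =", round(satisfied_share(opt_in_sample), 3))
print("random sample: n =", len(random_sample),
      "estimate =", round(satisfied_share(random_sample), 3))
```

Under these assumptions the opt-in poll collects tens of thousands of responses yet badly understates satisfaction, while the random sample of 1,000 lands near the true share. Collecting even more opt-in responses would not help, because the extra respondents are selected by the same skewed process.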