This video is intended to describe how to specify a performance measure and to identify some common data sources for measurement. To begin, one needs to decide what level of performance one hopes to measure. On this slide I've listed a number of different levels of analysis for performance measures. These are ordered generally from the smallest to the largest, with some gray area there in the middle. So, at the most micro level, one could measure the performance of an individual clinician. A step up from that would be a group of clinicians. So that might be a practice group of 10 clinicians, 15 clinicians. A step up from that would be facilities, things like hospitals and skilled nursing facilities. A step up from that would be an accountable care organization or an integrated delivery system. A step up from that would be the health plan level. And at the most macro level of performance, one can measure the health of a population. In addition to the levels of performance, one also needs to think about the types of performance measurement scores, and performance measures can actually take on many different forms. One type is adherence rates. And this is what I think most people commonly think of when they think of performance measures. An example of this would be that 83 percent of heart attack patients received aspirin on their arrival at the hospital. Another way of measuring performance is counts. This, I think, is really useful and helpful when you're looking at things such as never events or serious reportable events, things that don't happen very often and for which a single count might be a meaningful way of tracking performance. So, 15 patients had surgery performed on the wrong body part. The third way of looking at performance would be composite scores. These can be things like star ratings or letter grades. And this may be some sort of assessment of quality of care for either cost or a particular condition.
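As a minimal sketch of the first two score types, the Python below computes an adherence rate and a count from a list of patient records. The record fields (heart_attack, aspirin_on_arrival, wrong_site_surgery) and the function names are assumptions made up for illustration, not a standard data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Encounter:
    """Hypothetical patient record; field names are illustrative only."""
    heart_attack: bool         # patient arrived with a heart attack
    aspirin_on_arrival: bool   # aspirin was given on arrival
    wrong_site_surgery: bool   # a never event occurred


def adherence_rate(encounters: List[Encounter]) -> Optional[float]:
    """Percent of heart attack patients who received aspirin on arrival."""
    eligible = [e for e in encounters if e.heart_attack]
    if not eligible:
        return None  # no eligible patients, so the rate is undefined
    met = sum(e.aspirin_on_arrival for e in eligible)
    return 100.0 * met / len(eligible)


def never_event_count(encounters: List[Encounter]) -> int:
    """Simple count of wrong-site surgeries; no denominator is involved."""
    return sum(e.wrong_site_surgery for e in encounters)


if __name__ == "__main__":
    # 100 heart attack patients, 83 of whom received aspirin on arrival.
    records = [Encounter(True, i < 83, False) for i in range(100)]
    print(adherence_rate(records))
    print(never_event_count(records))
```

The rate needs a population of eligible patients to divide by, while the count is reported as is, which is why counts suit rare events.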
When one thinks about specifying a performance measure, one typically thinks of having to define a numerator and a denominator. My guidance to you, which seems counterintuitive to a lot of folks, is to start by defining the denominator. In specifying the denominator, you're actually identifying the population of interest. For example, you might define the denominator as the number of diabetic patients who had a health care encounter in the last 12 months. Denominators in particular are important in understanding and interpreting the data. So it's very important that you're careful to use the appropriate denominator. If you don't select the correct denominator, you may understate or overstate performance. For example, if you wanted to calculate the percentage of diabetic patients with low-density lipoprotein below 100, you'd want your denominator to be the number of diabetic patients with an LDL test, not just the number of diabetic patients. So the numerator reflects the desired performance, things such as patient survival, or whether the patient received aspirin on arrival at the emergency department. So there may be appropriate times to actually exclude certain populations from your measure. And I'm going to outline for you some reasons you may want to have exclusions for your performance measure, and reasons you may want to consider not having exclusions. One reason you might want to have exclusions is that exclusions allow you to narrow the target population to a more homogeneous subgroup. So, if you wanted to look at a very targeted population, you may exclude certain patients from that subgroup. There also may be certain situations that are seen as being outside of the provider's control.
For example, if a patient leaves the hospital against medical advice, it's often considered that the patient's outcomes may not necessarily be attributable to the care that was provided in the hospital. One reason you may want to limit your exclusions is that exclusions are often used as a method to weed out more difficult cases. So you want to make sure that you aren't trying to eliminate the cases in which it is difficult to achieve good performance. Also, if you have too many exclusions, you can actually make the measure really difficult to understand or apply. So you do want to be careful with how many exclusions you have, and be thoughtful about why you're using those exclusions. One of the common questions I receive from health care providers is, how do you ensure that my outcome measures are fair? Health care providers often feel that their patients are sicker than patients that are seen by other providers, and they want to make sure that they aren't penalized for the acuity of their patients. There are really three approaches for addressing this issue of fairness in outcome measures. We don't necessarily put much focus on fairness as it relates to structural measures and process measures, as those are typically independent of patient acuity. So one approach that we often use for ensuring fairness in outcome measures is this idea of risk adjustment. This is where we use statistical methods to "level the playing field" by adjusting for the effects of patient characteristics that may vary across providers. Some examples of patient characteristics include age, gender, the patient's medical history, maybe the patient's comorbid illnesses, their behavioral and social factors, as well as other factors.
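One common way to do this kind of adjustment is to compare the outcomes a provider actually observed with the outcomes a risk model would expect given that provider's patient mix, often reported as an observed-to-expected ratio. The sketch below assumes that form; it is not necessarily the method any particular program uses. The risk model, its coefficients, and the field names are invented purely for illustration, and a real model would be fit to data.

```python
def expected_risk(age, comorbidities):
    """Hypothetical risk model. The coefficients are made up for illustration."""
    risk = 0.02                      # assumed baseline risk of death
    if age >= 75:
        risk += 0.04                 # assumed extra risk for older patients
    risk += 0.03 * comorbidities     # assumed extra risk per comorbid illness
    return min(risk, 1.0)


def observed_to_expected(patients):
    """Ratio of observed deaths to deaths expected from the patient mix.

    A ratio above 1 means more deaths than the patient mix would predict;
    a ratio below 1 means fewer.
    """
    observed = sum(1 for p in patients if p["died"])
    expected = sum(expected_risk(p["age"], p["comorbidities"]) for p in patients)
    if expected == 0:
        raise ValueError("expected deaths is zero; ratio is undefined")
    return observed / expected


if __name__ == "__main__":
    # Hospital A: younger, healthier patients, 4 deaths out of 100.
    hospital_a = [{"age": 60, "comorbidities": 0, "died": i < 4} for i in range(100)]
    # Hospital B: older, sicker patients, 10 deaths out of 100.
    hospital_b = [{"age": 80, "comorbidities": 2, "died": i < 10} for i in range(100)]
    print(observed_to_expected(hospital_a))
    print(observed_to_expected(hospital_b))
```

In this made-up data, hospital B has the higher crude death rate (10 percent versus 4 percent), but its ratio comes out at roughly 0.83 against roughly 2.0 for hospital A, because its patients carried more expected risk.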
If we didn't use risk adjustment, users might draw incorrect conclusions, because the hospitals or the physician organizations that appear to have the worst outcomes may simply have the sickest patients. But one limitation of risk adjustment, and perhaps the biggest limitation, is that we can only account for measurable and reported risk factors. Another approach for addressing the fairness of outcome measures is risk stratification. This is where we divide patients into two or more groups according to their expected risk of the process or outcome of interest. As one example, the Centers for Medicare & Medicaid Services' Nursing Home Compare website includes two measures, stratifying patients by their risk of pressure sores. So, one measure is the percentage of high-risk long-stay residents who had a pressure sore. The other measure is the percentage of low-risk long-stay residents who had a pressure sore. And they identify those as separate measures of performance. The advantage of risk stratification is that it may be particularly helpful for exposing possible disparities in care between groups. So one of the concerns with risk adjustment is that you're potentially adjusting away important information about the patient, and you may be adjusting away potential disparities. By using stratification, you can understand how performance varies among groups, and see if there are potential disparities by race, gender, or socioeconomic status. The disadvantage of risk stratification is that it typically requires larger sample sizes than reporting the data in aggregate. The third way of addressing this issue of fairness in outcome measures is to use exclusions. And we talked a little bit about this on the previous slide, where we talked about exclusions for performance measures. So, one option is to exclude patients who don't qualify for the process of care in question, or for whom the process of care has not been shown to confer a clear benefit.
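To make stratification and exclusions a bit more concrete, here is a minimal Python sketch that drops excluded residents and then reports a separate pressure sore rate for each risk group. The field names (risk_group, pressure_sore, long_stay) and the exclusion rule are assumptions for illustration; they are not the actual Nursing Home Compare specifications.

```python
from collections import defaultdict


def stratified_rates(residents, excluded=lambda r: False):
    """Percent of residents with a pressure sore, reported per risk group.

    `excluded` is a caller-supplied rule; residents for whom it returns True
    are left out of both the numerator and the denominator.
    """
    tallies = defaultdict(lambda: [0, 0])  # group -> [numerator, denominator]
    for r in residents:
        if excluded(r):
            continue
        tally = tallies[r["risk_group"]]
        tally[1] += 1                      # resident counts in the denominator
        if r["pressure_sore"]:
            tally[0] += 1                  # resident counts in the numerator
    return {group: 100.0 * n / d for group, (n, d) in tallies.items()}


if __name__ == "__main__":
    # Made-up data: 50 high-risk and 150 low-risk long-stay residents.
    residents = (
        [{"risk_group": "high", "pressure_sore": i < 10, "long_stay": True}
         for i in range(50)]
        + [{"risk_group": "low", "pressure_sore": i < 3, "long_stay": True}
           for i in range(150)]
    )
    # Assumed exclusion rule: keep long-stay residents only.
    print(stratified_rates(residents, excluded=lambda r: not r["long_stay"]))
```

In this made-up data the aggregate rate would be 6.5 percent (13 of 200), which hides the gap between 20 percent in the high-risk group and 2 percent in the low-risk group. Each stratum also has a smaller denominator than the aggregate, which is the sample size cost mentioned above.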
One example of using exclusions is to define very narrow patient cohorts, where the baseline risk is similar for all patients in that cohort. So one example of this is some of the work I've done with the Leapfrog Group. We've looked at mortality for surgical patients. And what we've done is we've actually defined very narrow cohorts of patients. That way, we don't necessarily need to risk adjust, because the risk within each of those very narrow cohorts is seen as being roughly equivalent across patients. There are many common data sources for performance measures. So, once you define the measure, you actually need to find a data source to populate that measure. Some of these data sources are perhaps unique to the United States, but I think a lot of them could be broadly applicable to a global audience. Here in the United States we use administrative claims data for a number of performance measures. These are the billing data that health care providers and hospitals submit to the payer, reflecting perhaps ICD-10 diagnosis and procedure codes. And that information can actually be used in a number of ways to assess the quality and safety of care. Another common data source is chart or record review. So, this would be going into the individual patient's chart and looking for information about the quality and safety of care that was provided. As we as a global health care system move towards electronic data sources, this represents, I think, really exciting opportunities for data. These include things like electronic health records, lab systems, pharmacy systems, and clinical registries. One area of potential future research and interest is this idea of clinically enhanced claims data. So, could we take the billing data and add some information, say from pharmacy systems or lab systems, to that administrative claims data, and could a limited number of data elements actually really enhance those claims data to be a much more useful source?
And one area that I think everyone needs to think about is the burden of performance measurement, and the burden of collecting data. So, think about sampling. Where is it appropriate? Is that something that can be used to ease the measurement burden?
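As one hedged illustration of how sampling could ease that burden, the sketch below reviews a simple random sample of charts instead of every chart and reports the estimated rate with a rough 95 percent interval. The normal-approximation interval, the field name met_measure, and the function name are assumptions chosen for illustration; whether sampling is appropriate for a given measure is a separate judgment.

```python
import math
import random


def sample_estimate(charts, sample_size, seed=0):
    """Estimate a measure rate from a simple random sample of charts.

    Returns (rate, lower, upper) as proportions between 0 and 1, using a
    normal-approximation 95 percent interval (an assumed, rough choice).
    """
    if not 0 < sample_size <= len(charts):
        raise ValueError("sample_size must be between 1 and the number of charts")
    rng = random.Random(seed)              # fixed seed so the draw is repeatable
    sample = rng.sample(charts, sample_size)
    rate = sum(1 for c in sample if c["met_measure"]) / sample_size
    half_width = 1.96 * math.sqrt(rate * (1 - rate) / sample_size)
    return rate, max(0.0, rate - half_width), min(1.0, rate + half_width)


if __name__ == "__main__":
    # Made-up population: 1,000 charts, 830 of which met the measure.
    charts = [{"met_measure": i < 830} for i in range(1000)]
    # Review 200 charts instead of all 1,000.
    print(sample_estimate(charts, sample_size=200))
```

The trade-off is precision: a smaller sample means fewer charts to abstract but a wider interval around the estimate.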