Research and Reporting Methods, 12 May 2020

Harmonized Outcome Measures for Use in Depression Patient Registries and Clinical Practice

    Abstract

    Major depressive disorder is a common mental health condition that affects an estimated 16.2 million adults and 3.1 million adolescents in the United States. Yet, a lack of uniformity remains in measurements and monitoring for depression both in clinical practice and in research settings. This project aimed to develop a minimum set of standardized outcome measures relevant to both patients and clinicians that can be collected in depression registries and clinical practice. Twenty-nine depression registries and related data collection efforts were identified and invited to submit outcome measures. Additional measures were identified through literature searches and reviews of quality measures. A multistakeholder panel representing clinicians; payers; government agencies; industry; and medical specialty, health care quality, and patient advocacy organizations categorized the 27 identified measures using the Agency for Healthcare Research and Quality's supported Outcome Measures Framework. The panel identified 10 broadly relevant measures and harmonized definitions for these measures through in-person and virtual meetings. The harmonized measures represent a minimum set of outcomes that are relevant to clinicians and patients and appropriate for use in depression research and clinical practice. Routine and consistent collection of these measures in registries and other systems would support creation of a national research infrastructure to efficiently address new questions, improve patient management and outcomes, and facilitate care coordination.

    Major depressive disorder (MDD) is a common mental health condition characterized by heterogeneous presentations in mood, cognitive function, and physical function that persist for 2 weeks or longer. Many questions about depression treatment and outcomes remain unanswered. Evidence is lacking about the comparative effectiveness of treatment approaches, how to select the most appropriate initial course of treatment for an individual patient, and when to modify treatment approaches or discontinue pharmacotherapy for patients whose symptoms do not improve. It is unclear why depression recurs in some patients while others achieve sustained remission, and consensus does not exist on how to define or treat treatment-resistant depression (1) or how treatment-resistant depression may be related to antidepressant treatment (2).

    Longitudinal observational studies, such as patient registries, already capture a wealth of data on depression treatment patterns and outcomes. Registries are increasing in number (3), and linkage with electronic health records and other sources would provide researchers with the foundation for a harmonized research infrastructure to consistently and efficiently collect high-quality data. Yet, variation in the outcome measures captured in registries and routine clinical practice makes it challenging, if not impossible, to link and compare the data.

    To address these issues, the U.S. Department of Health and Human Services, led by the Agency for Healthcare Research and Quality and in collaboration with the Food and Drug Administration and National Library of Medicine, has supported the development of the Outcome Measures Framework, a conceptual model for classifying outcomes that are relevant to patients and providers across most conditions (4). This project had the following 4 objectives: to test the utility of the Outcome Measures Framework for categorizing depression outcomes and supporting harmonization across treatment pathways and care settings, to identify a minimum set of outcome measures that could be captured in depression patient registries and clinical practice, to agree on harmonized definitions for each outcome in the minimum measure set, and to map the harmonized definitions to standardized terminologies to support consistent implementation and collection of the outcome measures within electronic health records.

    Methods

    This harmonization effort focused on outcome measures that are currently collected in depression registries, other observational studies, and clinical practice. The Appendix describes the harmonization methodology.

    Results

    Twenty-seven registries and quality improvement efforts were invited to participate. Representatives of 13 of these efforts, which span multiple purposes, patient populations, and care settings, agreed to participate in the registry workgroup (Table 1). Appendix Table 1 describes registries and other efforts that declined to participate. The 15 stakeholders from 12 participating organizations represented patient advocacy organizations, health information technology, professional societies, payers, and federal agencies (Appendix Table 2).

    Table 1. Registry Workgroup Participants
    Appendix Table 1. Invited Registries That Declined to Participate
    Appendix Table 2. Stakeholder Participants

    The workgroup identified 27 outcome measures from registries and other sources and categorized them according to the Outcome Measures Framework. Eleven (41%) were categorized as clinical response; 10 (37%) were categorized as patient-reported outcome measures; and the remaining 6 were divided evenly into survival (7%), events of interest (7%), and resource use (7%). No measures of patient experience of care were identified as currently being captured in the participating registries. Patient-reported outcome measures that are used primarily to measure depression severity and response to treatment are included in the clinical response category, whereas measures that capture other domains (such as quality of life) are included in the patient-reported category.
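
    The percentages in this breakdown follow directly from the category counts. As a quick arithmetic check, the short Python sketch below recomputes them. The counts of 2 each for survival, events of interest, and resource use are inferred from the statement that the remaining 6 measures were divided evenly, and the function name `percent_share` is purely illustrative.

```python
# Counts per Outcome Measures Framework category, taken from the text.
# The three counts of 2 are inferred from "the remaining 6 were divided evenly".
counts = {
    "clinical response": 11,
    "patient-reported": 10,
    "survival": 2,
    "events of interest": 2,
    "resource use": 2,
    "experience of care": 0,
}

total = sum(counts.values())  # all measures identified by the workgroup


def percent_share(n, total):
    """Return n as a whole-number percentage of total."""
    return round(100 * n / total)


shares = {category: percent_share(n, total) for category, n in counts.items()}

for category, share in shares.items():
    print(f"{category}: {counts[category]} of {total} ({share}%)")
```

    Because each share is rounded independently, the rounded percentages need not sum to exactly 100.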

    The minimum measure set comprises 10 measures that are intended to apply to all patients with a diagnosis of MDD or its equivalent at any time (5). Table 2 lists measure definitions, and the following sections describe the rationale for selection of the measures and definitions. The workgroup emphasized the importance of documenting enrollment procedures, including criteria for inclusion, exclusion, and MDD diagnosis. The measures could apply to patients with comorbid psychiatric conditions, but information on comorbid conditions and other patient characteristics must be collected to support risk adjustment and to identify factors that influence outcomes in subpopulations (Figure). In addition, although this workgroup focused on MDD, the measures may be relevant for all clinically significant depression; further work is needed to explore their relevance for other types of depression. Last, registries should report clearly on efforts to minimize loss to follow-up and the proportions of patients who are lost to follow-up at the specified intervals.
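
    Reporting loss to follow-up at each specified interval amounts to a simple proportion. The Python sketch below illustrates one way a registry might compute it, assuming the registry knows how many patients were enrolled and how many were assessed at each interval. The function name `proportion_lost`, the interval labels, and all counts are hypothetical and are not drawn from any participating registry.

```python
def proportion_lost(enrolled, assessed):
    """Proportion of enrolled patients with no assessment at an interval."""
    if enrolled <= 0:
        raise ValueError("enrolled must be a positive count")
    if not 0 <= assessed <= enrolled:
        raise ValueError("assessed must be between 0 and enrolled")
    return (enrolled - assessed) / enrolled


# Hypothetical registry: 200 patients enrolled, with fewer assessed over time.
enrolled = 200
assessed_by_interval = {"6 months": 170, "12 months": 150}

lost_by_interval = {
    interval: proportion_lost(enrolled, assessed)
    for interval, assessed in assessed_by_interval.items()
}

for interval, lost in lost_by_interval.items():
    print(f"{interval}: {lost:.0%} lost to follow-up")
```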

    Table 2. Depression Minimum Measure Set and Harmonized Definitions
    Figure. Outcome Measures Framework, as completed for depression characteristics, treatments, and outcomes.

    The Outcome Measures Framework depicts the minimum set of outcome measures recommended by the workgroup (right column), as well as the key characteristics of the participant, disease, and provider that should be captured to support risk adjustment (left column). Treatments of interest are listed in the center column.

    * Area for future investigation.

    Survival

    Two survival measures, all-cause mortality and death from suicide, are included in the minimum measure set. Some research has suggested that depression is associated with higher mortality rates (6), and data from large, diverse patient registries could be helpful to describe any such association. However, information on cause of death may not be readily available, and a different cause of death may be listed for suicides because of perceived stigma. Without systematic ascertainment, survival measures may be inaccurate. The workgroup recognized these limitations and acknowledged that these measures may be difficult to collect in some settings. However, all-cause mortality and death from suicide are highly relevant from both the patient and the clinician perspective, and a minimum measure set would be incomplete without these measures. The workgroup recommended that they be captured when systematic ascertainment is possible, and registries should provide clear guidance about appropriate methods of ascertainment.

    Clinical Response

    The clinical response measures are grouped into 2 categories: improvement in depressive symptoms and worsening of depressive symptoms. Patients whose symptoms neither improve nor worsen are considered unchanged. The workgroup noted the importance of capturing both remission and response in alignment with widely used quality measures (7, 8). However, it broadened the definitions to allow a wider time frame for measurement and substitution of other instruments.

    The clinical response measures are defined on the basis of the Patient Health Questionnaire–9 (PHQ-9). Participating registries use various instruments, including the Hamilton Depression Rating Scale, PHQ-9 (9), and Montgomery–Åsberg Depression Rating Scale (10), that differ in mode of administration, domains covered, length, and time to administer. The use of different instruments among the registries reflects the broader variation seen in research and clinical practice. Clinical trials use such instruments as the Hamilton Depression Rating Scale and Montgomery–Åsberg Depression Rating Scale as primary end points, but these instruments require specialized training and clinician time to administer, making them impractical in many routine clinical care settings.

    Because the proposed measure set is intended for wide use in the settings of primary care and mental health care, the workgroup focused on recommending an instrument that is short, easy to score, patient-reported, and publicly available. The PHQ-9 meets these criteria and is an acceptable instrument for the remission and response quality measures. The U.S. Preventive Services Task Force recommendation also discusses the PHQ as an instrument for screening. Recognizing that researchers may wish to use other tools in some settings, the workgroup noted that instruments for which a crosswalk to the PHQ-9 exists are acceptable for these measures. Registries may also include other instruments as supplemental measures.
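
    To make PHQ-9-based classification concrete, the following Python sketch scores the instrument and labels a follow-up assessment. It is an illustration only. The PHQ-9 has 9 items, each scored 0 to 3. The cutoffs used here (a follow-up total below 5 for remission and a reduction of at least 50% from baseline for response) are assumptions based on conventions commonly used in quality measures; the harmonized definitions in Table 2, including their measurement time frames, take precedence. The names `phq9_total` and `classify_improvement` are hypothetical.

```python
from typing import Sequence

PHQ9_ITEMS = 9
REMISSION_CUTOFF = 5        # assumption: follow-up total below 5 is remission
RESPONSE_REDUCTION = 0.50   # assumption: drop of 50% or more is response


def phq9_total(item_scores: Sequence[int]) -> int:
    """Sum the 9 PHQ-9 item scores (each 0 to 3) into a total of 0 to 27."""
    if len(item_scores) != PHQ9_ITEMS:
        raise ValueError("PHQ-9 requires exactly 9 item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each PHQ-9 item is scored 0, 1, 2, or 3")
    return sum(item_scores)


def classify_improvement(baseline_total: int, followup_total: int) -> str:
    """Label a follow-up score as 'remission', 'response', or 'neither'.

    Assumes the baseline total reflects an elevated score at the index visit.
    """
    if followup_total < REMISSION_CUTOFF:
        return "remission"
    if baseline_total > 0:
        reduction = (baseline_total - followup_total) / baseline_total
        if reduction >= RESPONSE_REDUCTION:
            return "response"
    return "neither"


baseline = phq9_total([2, 2, 2, 2, 2, 2, 2, 1, 0])
print(baseline, classify_improvement(baseline, 4))
```

    A registry that collects a different instrument would first convert its scores through a crosswalk to the PHQ-9 and then apply the same logic.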

    Measurement at regular intervals is important to capture changes in depressive symptoms. As a patient-administered instrument, the PHQ-9 can be captured directly from patients at regular intervals without a provider encounter.

    Although remission and response are widely captured in registries, analogous measures for worsening of depressive symptoms were difficult to identify. The workgroup identified recurrence as an important outcome to collect consistently and defined the concept on the basis of the PHQ-9. This definition requires further validation. In addition, consistent measurement of worsening of depressive symptoms would provide information that helps us understand outcomes in patients who do not achieve remission and for whom treatment efficacy wanes over time. This concept is considered supplemental because it requires additional refinement.

    Events of Interest

    Suicidal ideation and behavior and adverse events are captured in the minimum measure set. To facilitate consistent measurement across patient populations and care settings, the workgroup recommended a stepped approach to measuring suicidal ideation, in which all patients complete the PHQ-9 and those with a positive score on the suicidal ideation item (question 9) receive additional screening and possibly intervention. Additional screening should be completed using an appropriate, brief, validated instrument, such as the Concise Health Risk Tracking Scale (11). Suicidal behaviors are also important to capture and should be included in this measure when systematic ascertainment from all possible sites of care is possible. Because of the practical challenges of systematic ascertainment, this measure may be infeasible in some settings.
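
    The stepped approach to suicidal ideation can be sketched as a simple decision rule. The Python example below is illustrative and is not clinical guidance. It assumes that any nonzero response on PHQ-9 question 9 (scored 0 to 3) triggers additional screening; the names `ScreeningDecision` and `stepped_suicide_screen` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScreeningDecision:
    needs_additional_screening: bool
    reason: str


def stepped_suicide_screen(phq9_item9_score: int) -> ScreeningDecision:
    """First step: use PHQ-9 question 9 to decide whether to screen further.

    Assumption: any score above 0 on question 9 counts as positive.
    """
    if phq9_item9_score not in (0, 1, 2, 3):
        raise ValueError("PHQ-9 question 9 is scored 0, 1, 2, or 3")
    if phq9_item9_score > 0:
        return ScreeningDecision(
            True,
            "positive response on question 9; administer a brief validated instrument",
        )
    return ScreeningDecision(False, "no suicidal ideation reported on question 9")


decision = stepped_suicide_screen(1)
print(decision.needs_additional_screening, "-", decision.reason)
```

    The second step, the follow-up instrument and any intervention, is left to the registry or practice and is not modeled here.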

    Many questions exist about adverse events related to long-term medication use and discontinuation of medication therapy (12). Clinicians and patient representatives commented that patients may have adverse events (side effects) that they do not recognize as treatment-related or are reluctant to discuss with providers (such as changes in sexual function). However, a single measure to capture all possible side effects across all treatment approaches does not exist. The workgroup recommended that registries consider using the Frequency, Intensity, and Burden of Side Effects Rating Scale, a brief, validated, patient-reported measure designed to capture the side effects of depression medications.

    Patient-Reported Outcome Measures

    Measurement in depression often focuses on core symptoms, but other factors, such as work and social engagement and quality of life, are important to patients. Participating registries capture a range of patient-reported outcome instruments depending on their patient population and objectives. The workgroup discussed these and other instruments but did not identify an instrument that is patient-centered, widely used in clinical practice, relevant across a range of populations and care settings, validated, and publicly available. The Quality of Life Enjoyment and Satisfaction Questionnaire (13) is provided as an example of a quality-of-life instrument that meets most of the criteria, although it is not commonly used outside research settings.

    Resource Use

    Depression-related resource use captures payer and patient costs related to treatment or management of depression. The workgroup cautioned that access to care plays an important role in resource use for many patients. Many factors may limit a patient's access to mental health care, including lack of health insurance and shortage of mental health professionals, particularly in rural areas. A patient's ability to access appropriate care, as well as the number and type of visits, should be considered when calculating or interpreting this measure. In addition to direct costs, absenteeism from work and reduced productivity are relevant when considering the overall economic burden of depression. The workgroup recommended the Work Productivity and Activity Impairment Questionnaire for this purpose (14) but noted that few examples exist of the use of this measure in depression registries.

    Treatments and Characteristics

    The workgroup identified depression-specific characteristics of the participant, disease, and provider for which published evidence shows a correlation with patient outcomes (Figure). It also identified treatments of interest. Collection of these characteristics and of detailed treatment data is critical for risk adjustment when measuring depression outcomes and comparing outcomes across care settings and patient populations. Although we do not define a specific risk adjustment approach here, registries and other systems that use the measures must evaluate differences in patient populations and consider the effect on outcome measures.

    Standardized Library

    The narrative definitions produced by the workgroup were translated into standardized definitions, including data elements, value sets, and the accompanying logic necessary to consistently capture and extract the data from electronic health records (15). Standard codes and value sets for the Quality of Life Enjoyment and Satisfaction Questionnaire and the Work Productivity and Activity Impairment Questionnaire do not exist.

    Discussion

    We identified patient- and clinician-relevant measures that are feasible to capture in routine clinical practice across care settings and are sufficiently robust to support research and quality improvement efforts related to depression. Of note, this minimum measure set is intended to create the foundation for a learning health care system by bridging the gap between outcomes that are considered relevant to clinicians and patients and those that are relevant in research, so that information from research studies could be tied directly to decisions in clinical practice and measurement of meaningful outcomes in quality initiatives. These measures are also designed to provide a consistent framework for sharing meaningful information across providers to facilitate care coordination for patients with depression.

    A major strength of this effort was participation by a wide range of stakeholders who provided different perspectives on the outcomes that are most important to measure and potentially feasible to capture across registries and care settings. A second strength is the translation of the narrative definitions into standardized terminologies. Standardization is intended to reduce duplicate data collection by harmonizing data requirements across the learning health care system.

    The minimum measure set is similar to the recommendations for measuring health outcomes in depression and anxiety developed by the International Consortium for Health Outcomes Measurement (ICHOM). Although the objectives of this project differ from those of ICHOM, the proposed case-mix variables, outcomes, and measurement time frames align in most areas. Of note, both proposals recommend the PHQ-9 to measure symptoms and include measures of medication side effects and absenteeism. Significant differences exist in 3 areas. First, the ICHOM proposal includes a measure of functioning, whereas this workgroup prioritized depression-related quality of life. Second, this workgroup included a measure of suicidal ideation and behavior, noting the increasing rate of suicide in the United States. Last, the minimum measure set includes all-cause mortality, which may be challenging to assess but is necessary to provide an understanding of the possible association between depression and mortality. These concepts are not included in the ICHOM proposal.

    The minimum measure set has limitations. First, 4 measures rely on the PHQ-9, but workgroup members noted that other instruments may be more appropriate for some purposes, such as efficacy trials. Given the complexity of the instruments used in efficacy trials, such instruments are unlikely to become widely used in routine practice. To bridge this gap, development of new tools to allow comparisons across the PHQ-9 and other instruments is critical to allow researchers to select the most appropriate instrument for their objectives while maintaining the ability to link data from routine clinical practice with research findings. Work is ongoing to develop such crosswalks or common metrics; of note, the depression metric used in the Patient-Reported Outcomes Measurement Information System is a common reporting metric that can be used to link 3 depression scales, including the PHQ-9 (16).

    Second, the workgroup did not reach consensus on any patient-reported outcome domains other than depression-specific quality of life. The workgroup believed that the most appropriate domain depended on the patient population of interest. Even within a specific population, individual patients have different treatment goals, and a patient-reported outcome should ideally align with those goals. The workgroup also noted that more information is needed to understand the acceptability from the patient perspective of completing the 3 patient-reported instruments included in the minimum measure set on a regular basis. Clinicians and patient representatives cautioned that some patients may find it difficult to answer these questions repeatedly, especially if their symptoms are not improving.

    Third, additional work is needed to operationalize some of the measures. More work is necessary to validate the definition of recurrence, develop a clinician- and patient-relevant definition for worsening symptoms of depression, and facilitate use of an instrument for measuring treatment side effects. Questions remain about how some patient characteristics influence outcomes and how to risk-adjust the measures, and further work is needed to develop a framework for capturing the necessary data to describe and compare psychotherapy or other behavioral health treatments.

    Fourth, the scope of this project was limited to registries and other efforts collecting data in the United States. Engagement with registries and other efforts that declined to participate would have provided more perspective on the feasibility of adopting the measures in industry-funded research and the value of the measures for collaborative care.

    Last, demonstration of the feasibility of capturing the minimum measure set and the value and validity of these measures for clinical care and research is essential. The Agency for Healthcare Research and Quality recently awarded a pilot project to implement the measures within a health system and 2 patient registries, with the goal of assessing feasibility, burden, and value. That project should provide important information to guide the further adoption of the minimum measure set.

    The harmonized measures represent a minimum set of outcomes that are relevant in depression research and multiple clinical practice settings. Consistent collection of these measures in registries and other data collection efforts would create opportunities to link and compare data across sources, potentially enabling new research and assisting in the development of learning health care systems.

    Appendix: Methods

    This effort focused on harmonizing outcome measures that are currently used in patient registries, quality improvement initiatives, and clinical practice. The methods resemble those used in earlier efforts to harmonize outcome measures for atrial fibrillation (17) and asthma (18). The Agency for Healthcare Research and Quality (AHRQ) selected depression as the condition area for this project. With support from project staff, investigators from OM1 identified existing depression registries through systematic searches of the Registry of Patient Registries (https://effectivehealthcare.ahrq.gov/topics/registry-of-patient-registries/) and ClinicalTrials.gov; reviews of the qualified clinical data registries list maintained by the Centers for Medicare & Medicaid Services, the postmarketing commitment studies list on the Food and Drug Administration website, and projects funded by the Patient-Centered Outcomes Research Institute; and searches of the published medical literature using PubMed and Google Scholar. Investigators also reviewed registries suggested by AHRQ and other experts. They identified registries meeting definitional criteria for a patient outcomes–focused registry (3) and collecting data in the United States and invited these registries to participate as voluntary members of the registry workgroup; investigators also invited 2 clinical experts in depression treatment (1 psychiatrist and 1 psychologist) who were not affiliated with a specific registry.

    Participating registries provided specifications for outcome measures. Investigators also reviewed depression-related quality measures identified in the National Quality Forum database (www.qualityforum.org) and the COMET (Core Outcome Measures in Effectiveness Trials) Initiative database (www.comet-initiative.org), as well as published outcome measure definitions from other clinical studies, including from registries that declined to participate. Investigators organized the collected outcome measures and presented them to the registry workgroup.

    The registry workgroup met virtually and in person 5 times over a 6-month period to develop the harmonized measures. Investigators from OM1 did background research, prepared meeting materials, and moderated the meetings. The workgroup began by categorizing all identified measures using the Outcome Measures Framework categories of survival, clinical response, events of interest, patient-reported, resource use, and experience of care. Within each category, measures representing similar concepts were identified and grouped accordingly. Workgroup members rated the priority of each measure concept, and the workgroup used the weighted average of the ratings as the starting point for developing the minimum measure set.
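
    The prioritization step can be illustrated with a small Python sketch of a weighted average of ratings. The rating scale, the vote counts, and the concept names below are all hypothetical, because the article does not report them. The sketch assumes that each rating level is weighted by the number of workgroup members who chose it.

```python
from typing import Dict


def weighted_average_rating(votes: Dict[int, int]) -> float:
    """Average rating, weighting each rating level by its number of votes.

    votes maps a rating level (for example 1 = low, 3 = high) to a vote count.
    """
    total_votes = sum(votes.values())
    if total_votes == 0:
        raise ValueError("at least one vote is required")
    return sum(level * n for level, n in votes.items()) / total_votes


# Hypothetical ratings for two measure concepts on a 1-to-3 priority scale.
example_votes = {
    "concept A": {1: 1, 2: 3, 3: 8},
    "concept B": {1: 6, 2: 4, 3: 2},
}

averages = {
    concept: weighted_average_rating(votes)
    for concept, votes in example_votes.items()
}

# Highest-priority concepts first, as a starting point for discussion.
ranked = sorted(averages, key=averages.get, reverse=True)
print(ranked)
```

    In this sketch the ranking only orders concepts for discussion; the final set still depends on workgroup consensus.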

    The minimum measure set is intended for use as a core set of outcomes that is feasible to collect in all depression registries and in clinical practice across care settings. For each measure in the set, investigators prepared detailed comparisons of existing definitions. The workgroup discussed the clinical significance of the differences, reasons for the differences, and possible approaches to harmonization (for example, recommending use of an existing definition or modifying an existing definition to incorporate concepts from other definitions). Harmonized definitions proposed during workgroup meetings were circulated to participants afterward for additional review and presented at subsequent meetings to ensure that all participants agreed with the definition.

    To provide broader perspective, investigators identified and invited representatives of organizations that are interested in depression treatment and outcomes but are not directly involved with patient registries to participate in a stakeholder group. The combined workgroup and stakeholder group met in person to review the proposed minimum measure set and identify other data that should be captured (such as patient, disease, and provider characteristics and treatments).

    Clinical terminologists then mapped the narrative definitions produced by the workgroup to standardized terminologies to produce a library of common data definitions suitable for implementation within electronic health records. Where possible, existing common data elements and value sets were used. The narrative definitions and standardized definitions were posted on the AHRQ website for public comment, after which the measure set was finalized. This effort was funded by AHRQ, and AHRQ staff participated in project planning, in workgroup meetings, and as authors of this article.

    References

    1. Gaynes BN, Asher G, Gartlehner G, et al. Definition of Treatment-Resistant Depression in the Medicare Population. Technology Assessment Program. Agency for Healthcare Research and Quality; 2018. Accessed at www.ncbi.nlm.nih.gov/books/NBK526366 on 7 April 2020.
    2. Ali ZA, Nuss S, El-Mallakh RS. Antidepressant discontinuation in treatment resistant depression. Contemp Clin Trials Commun. 2019;15:100383. [PMID: 31193850] doi:10.1016/j.conctc.2019.100383
    3. Gliklich RE, Dreyer NA, Leavy MB, eds. Registries for Evaluating Patient Outcomes: A User's Guide. 3rd ed. Agency for Healthcare Research and Quality; 2014.
    4. Gliklich RE, Leavy MB, Karl J, et al. A framework for creating standardized outcome measures for patient registries. J Comp Eff Res. 2014;3:473-80. [PMID: 25350799] doi:10.2217/cer.14.38
    5. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 5th ed. American Psychiatric Assoc; 2013.
    6. Machado MO, Veronese N, Sanches M, et al. The association of depression and all-cause and cause-specific mortality: an umbrella review of systematic reviews and meta-analyses. BMC Med. 2018;16:112. [PMID: 30025524] doi:10.1186/s12916-018-1101-z
    7. National Quality Forum. Depression remission at twelve months. Updated 6 March 2015. Accessed at www.qualityforum.org/QPS/0710e on 19 June 2019.
    8. National Quality Forum. Depression response at twelve months - progress towards remission. Updated 6 March 2015. Accessed at www.qualityforum.org/QPS/1885 on 19 June 2019.
    9. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16:606-13. [PMID: 11556941]
    10. Montgomery SA, Asberg M. A new depression scale designed to be sensitive to change. Br J Psychiatry. 1979;134:382-9. [PMID: 444788]
    11. Trivedi MH, Wisniewski SR, Morris DW, et al. Concise Health Risk Tracking scale: a brief self-report and clinician rating of suicidal risk. J Clin Psychiatry. 2011;72:757-64. [PMID: 21733476] doi:10.4088/JCP.11m06837
    12. Maund E, Stuart B, Moore M, et al. Managing antidepressant discontinuation: a systematic review. Ann Fam Med. 2019;17:52-60. [PMID: 30670397] doi:10.1370/afm.2336
    13. Endicott J, Nee J, Harrison W, et al. Quality of life enjoyment and satisfaction questionnaire: a new measure. Psychopharmacol Bull. 1993;29:321-6. [PMID: 8290681]
    14. Reilly MC, Zbrozek AS, Dukes EM. The validity and reproducibility of a work productivity and activity impairment instrument. Pharmacoeconomics. 1993;4:353-65. [PMID: 10146874]
    15. Gliklich RE, Leavy MB, Li F. Standardized Library of Depression Outcome Measures: Research White Paper. Agency for Healthcare Research and Quality; 2018. AHRQ Publication No. 18(19)-EHC026-EF.
    16. Choi SW, Schalet B, Cook KF, et al. Establishing a common metric for depressive symptoms: linking the BDI-II, CES-D, and PHQ-9 to PROMIS depression. Psychol Assess. 2014;26:513-27. [PMID: 24548149] doi:10.1037/a0035768
    17. Calkins H, Gliklich RE, Leavy MB, et al. Harmonized outcome measures for use in atrial fibrillation patient registries and clinical practice: endorsed by the Heart Rhythm Society Board of Trustees. Heart Rhythm. 2019;16:e3-e16. [PMID: 30449519] doi:10.1016/j.hrthm.2018.09.021
    18. Gliklich RE, Castro M, Leavy MB, et al. Harmonized outcome measures for use in asthma patient registries and clinical practice. J Allergy Clin Immunol. 2019;144:671-681.e1. [PMID: 30857981] doi:10.1016/j.jaci.2019.02.025

    This article was published at Annals.org on 12 May 2020.
