On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Hannah Penton, who has a PhD from the University of Sheffield. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.
An investigation into the psychometric performance of existing measures of health, quality of life and wellbeing in older adults
Tracey Young, Claire Hulme, Christopher Dayson
How is outcome measurement different for older people?
It has been shown that older adults have different conceptualisations and priorities over what is important to their health and quality of life. This could be due to the different health problems they have experienced, varying generational attitudes, and shifting expectations of their health. These differences can affect their understanding of concepts and wording, change response behaviour, and alter the ability of a set of items to comprehensively and relevantly describe an individual’s health or quality of life.
Old age is associated with frailty, characterised by declining health and a growing number of health conditions that require complex combinations of health and social care services over the long term. These aged care services often impact broader elements of quality of life beyond solely health, such as independence and social participation. Traditionally, outcome measures such as the EQ-5D and SF-12 have focussed on health-related quality of life. In recent years, broader measures have been developed to capture aspects beyond health, such as wellbeing in the Warwick Edinburgh Mental Wellbeing Scale (WEMWBS) and the Office for National Statistics Personal Wellbeing Questions (ONS-4). It is important to assess how these measures of health and wellbeing compare in capturing what is important to the health and quality of life of older adults.
Were appropriate datasets readily available for your study?
Given the range of measures of interest and the sample size requirements of item response theory analyses, no single large dataset included all of the measures. As a solution, three datasets were used in the thesis. The Health Improvements and Patient Outcomes (HIPO) dataset contained the EQ-5D-5L, SF-12v2, and the ONS-4; the Adult Social Care Survey contained the Adult Social Care Outcomes Toolkit (ASCOT); and the Health Survey for England provided the WEMWBS. Unfortunately, this meant that not all measures could be compared directly in the same participants. However, the subset of measures from the HIPO survey could be directly compared.
Hasn’t the psychometric performance of these measures been assessed before?
The first study of this thesis was a systematic review, which investigated the psychometric performance of the EQ-5D, SF-12, ASCOT, WEMWBS, and ONS-4 in older adults. The amount of psychometric evidence available varied between measures. A broad range of psychometric aspects were tested in older adults for the ASCOT during its development, including testing of content and construct validity, reliability, and responsiveness. Some psychometric studies were identified for the EQ-5D and SF-12, but these were mostly focused on construct validity. No studies investigating the psychometric performance of the WEMWBS and ONS-4 in older adults were identified. Of course, older adults are also included in studies not specifically focused on this population. However, they often make up a very small percentage of participants and results are not presented separately, which makes it difficult to draw conclusions specific to older adults from these studies.
Where psychometric studies have been performed specifically in older adults, these used classical test theory methods and not item response theory. Item response theory can provide additional important information, such as a more detailed picture of how each item's response levels perform and how respondents use them, and an investigation of differential item functioning, where respondents from different groups answer an item differently despite having the same underlying level of the construct being measured.
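To make the idea of item response levels concrete, here is a minimal sketch of a graded response model, one common item response theory model for ordered response options. It is an illustration only and is not taken from the thesis: the function name, the discrimination value, and the thresholds are all hypothetical, and the latent trait is assumed to be scaled so that higher values mean more severe problems.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def grm_level_probabilities(severity, discrimination, thresholds):
    """Probability of each response level under a graded response model.

    severity: latent trait value (assumed: higher = more severe problems).
    discrimination: how sharply the item separates respondents.
    thresholds: ordered boundaries between adjacent response levels.
    """
    # P(response at or above each boundary), bracketed by 1 and 0.
    cumulative = (
        [1.0]
        + [logistic(discrimination * (severity - b)) for b in thresholds]
        + [0.0]
    )
    # Each level's probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical five-level item (four boundaries), values chosen for illustration.
example_thresholds = [-2.0, -1.0, 0.5, 1.5]
for severity in (-2.0, 0.0, 2.0):
    probs = grm_level_probabilities(severity, 1.5, example_thresholds)
    print(severity, [round(p, 2) for p in probs])
```

Plotting these probabilities across the latent trait shows whether each response level is ever the most likely choice, which is the kind of level-by-level detail that classical test theory summaries do not provide.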
Traditionally, patient reported outcome measures were developed by groups of experts, without testing the content validity in target groups of respondents. Where content validity has been examined, older adults are often underrepresented as they are considered a vulnerable and hard to reach group. However, older adults represent an intensive group of health and social care users, whose conceptualisation of quality of life can differ substantially from other groups. Therefore, it is important to examine the performance of measures in this group.
What did your qualitative research tell you that the quantitative analysis didn’t?
While item response theory signalled various psychometric issues in the measures, it was not able to explain the causes of those issues. An individual’s responses to patient reported outcome measures depend on their understanding of the items, their view on the underlying concept, the relevance and comprehensiveness of the items included, and the respondent’s attitude when answering questions. Therefore, interpretation of psychometric problems requires an understanding of the thought processes of respondents. These thought processes can be investigated using qualitative cognitive interviews.
Cognitive interviews ask respondents to think aloud when completing a questionnaire, allowing the researcher to record respondents’ understanding of the questions and the type of information considered when selecting a response option. Cognitive interviews also allow the researcher to ask additional questions about the respondent’s understanding of the concepts, the relevance of those concepts, and what is missing from a measure. This information gives a much fuller understanding of the way respondents interact with a measure and how measures could be improved, which cannot be fully understood from quantitative results.
Can you recommend one measure above all others?
It is difficult to recommend one single measure as all of them have their own issues in different areas.
In general, participants tended to find the function-focused EQ-5D-5L items easier to answer and mostly relevant to their situation. The relevance of broader subjective wellbeing items, such as those of the WEMWBS and ONS-4, was commonly questioned as they did not reflect priorities in older adults’ conceptualisation of their quality of life. However, dimensions such as social contact and independence were considered of paramount importance, with respondents suggesting that these concepts should be included in a comprehensive measure of the quality of life of older adults.
The item response theory analyses identified issues with differential item functioning in both the EQ-5D and SF-12v2. These issues are likely caused by the response behaviour of older adults, who are more likely than younger adults to signal lower levels of problems for many items, after controlling for their underlying health-related quality of life. The cognitive interviews suggested that this was likely because older adults had lowered their benchmark of what they considered to be good health and functioning as they aged. Relative to this lower benchmark, they would therefore rate their current state higher than would a younger adult with higher expectations. Future research should focus on either reducing this effect, perhaps through different item wording, or controlling for it in data analysis.
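The pattern described above can be sketched as uniform differential item functioning in a simple two-parameter logistic item, where the only difference between groups is the threshold at which a respondent reports a problem. This is an illustrative sketch under stated assumptions, not the analysis used in the thesis: the function names and all parameter values are hypothetical, and the older group's higher threshold stands in for the lowered benchmark of good health.

```python
import math


def prob_reports_problem(severity, discrimination, threshold):
    """Probability of reporting a problem on a two-parameter logistic item."""
    return 1.0 / (1.0 + math.exp(-discrimination * (severity - threshold)))


def dif_gap(severity, younger_threshold, older_threshold, discrimination=1.5):
    """Difference in reporting probability (younger minus older) at equal severity.

    A non-zero gap at the same underlying severity is what uniform
    differential item functioning looks like for a single item.
    """
    younger = prob_reports_problem(severity, discrimination, younger_threshold)
    older = prob_reports_problem(severity, discrimination, older_threshold)
    return younger - older


# Hypothetical thresholds: the older group needs more severity before
# reporting a problem, mimicking a lowered benchmark of good health.
for severity in (-1.0, 0.0, 1.0):
    gap = dif_gap(severity, younger_threshold=0.0, older_threshold=0.5)
    print(severity, round(gap, 3))
```

In this sketch the gap is positive at every severity level, meaning the older group reports fewer problems than the younger group for the same underlying state. Controlling for the effect in analysis would amount to estimating group-specific thresholds rather than assuming a shared one.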
How did your research change from your initial ideas at the start of your PhD?
I come from a quantitative background and therefore, at the start of my PhD, my main interest lay in the item response theory element of the study. As I uncovered issues in psychometric performance from the item response theory results, I could theorise why these issues occurred. However, I quickly became aware that the only way to truly understand the reasons behind psychometric issues was by talking directly with respondents. I did not have prior experience of qualitative research and, like many quantitative researchers, I undervalued the insight it could provide. Only when I started the interviews did I truly understand the level of information and understanding of respondent behaviour and content validity that could be gleaned from cognitive interviewing. It is no surprise that this type of testing is increasingly understood to be a vital tool in the development of patient reported outcome measures.