36th EuroQol Plenary Meeting

The 36th EuroQol Plenary Meeting will be held on 18-21 September 2019 in Brussels, Belgium.

  • 10 April 2019: Deadline for submitting abstracts
  • 11 April – 21 April 2019: Review and selection of abstracts
  • 29 April 2019: Abstract acceptance notification
  • 12 June 2019: Deadline for submitting papers and posters
  • 13 June – 26 June 2019: Review of submitted papers and posters
  • 8 July 2019: Papers and posters published on EuroQol members’ website

Chris Sampson’s journal round-up for 25th March 2019

Every Monday our authors provide a round-up of some of the most recently published peer-reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

How prevalent are implausible EQ-5D-5L health states and how do they affect valuation? A study combining quantitative and qualitative evidence. Value in Health Published 15th March 2019

The EQ-5D-5L is able to describe a lot of different health states (3,125, to be precise), including some that don’t seem likely to ever be observed. For example, it’s difficult to conceive of somebody having extreme problems in pain/discomfort and anxiety/depression while also having no problems with usual activities. Valuation studies exclude these kinds of states because it’s thought that their inclusion could negatively affect the quality of the data. But there isn’t much evidence to help us understand how ‘implausibility’ might affect valuations, or which health states are seen as implausible.
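To make the 3,125 figure concrete, here is a minimal sketch that enumerates every EQ-5D-5L profile (five dimensions, five levels each). The `looks_divergent` rule is a hypothetical flag of my own for states that mix a level 1 with a level 5; it is not the definition of implausibility used in the paper.

```python
from itertools import product

# The five EQ-5D dimensions, each scored from 1 (no problems) to 5 (extreme problems).
DIMENSIONS = (
    "mobility",
    "self_care",
    "usual_activities",
    "pain_discomfort",
    "anxiety_depression",
)
LEVELS = range(1, 6)

# Every possible profile as a five-digit string, e.g. "11155".
states = ["".join(map(str, p)) for p in product(LEVELS, repeat=len(DIMENSIONS))]


def looks_divergent(state: str) -> bool:
    """Hypothetical flag: True when a state mixes a level 1 with a level 5."""
    return "1" in state and "5" in state


n_states = len(states)
n_divergent = sum(looks_divergent(s) for s in states)
print(n_states, n_divergent)
```

The state "11155" corresponds to the example above: extreme problems with pain/discomfort and anxiety/depression, but no problems with usual activities.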

This study is based on an EQ-5D-5L valuation exercise with 890 students in China. The valuation was conducted using the EQ VAS, rather than the standard EuroQol valuation protocol, with up to 197 states being valued by each student. Two weeks after conducting the valuation, participants were asked to indicate (yes or no) whether or not the states were implausible. After that, a small group were invited to participate in a focus group or interview.

No health state was unanimously identified as implausible. Only four states were unanimously rated as not being implausible. 910 of the 3,125 states defined by the EQ-5D-5L were rated implausible by at least half of the people who rated them. States more commonly rated as implausible were of moderate severity overall, but with divergent severities between dimensions (i.e. 5s and 1s together). Overall, implausibility was associated with lower valuations.

Four broad themes arose from the qualitative work, namely i) reasons for implausibility, ii) difficulties in valuing implausible states, iii) strategies for valuing implausible states, and iv) values of implausible states. Some states were considered to have logical conflicts, with some dimensions being seen as mutually inclusive (e.g. walking around is a usual activity). The authors outline the themes and sub-themes, which are a valuable contribution to our understanding of what people think when they complete a valuation study.

This study makes plain the fact that there is a lot of heterogeneity in perceptions of implausibility. But the paper doesn’t fully address the issue of what plausibility actually means. The authors describe it as subjective. I’m not sure about that. For me, it’s an empirical question. If states are observed in practice, they are plausible. We need meaningful valuations of states that are observed, so perhaps the probability of a state being included in a valuation exercise should correspond to the probability of it being observed in reality. The difficulty of valuing a state may relate to plausibility – as this work shows – but that difficulty is a separate issue. Future research on implausible health states should be aligned with research on respondents’ experience of health states. Individuals’ judgments about the plausibility of health states (and the accuracy of those judgments) will depend on individuals’ experience.

An EU-wide approach to HTA: an irrelevant development or an opportunity not to be missed? The European Journal of Health Economics [PubMed] Published 14th March 2019

The use of health technology assessment is now widespread across the EU. The European Commission recently saw an opportunity to rationalise disparate processes and proposed new regulation for cooperation in HTA across EU countries. In particular, the proposal targets cooperation in the assessment of the relative effectiveness of pharmaceuticals and medical devices. A key purpose is to reduce duplication of efforts, but it should also make the basis for national decision-making more consistent.

The authors of this editorial argue that the regulation needs to provide more clarity in the definition of clinical value and of the quality of evidence that is acceptable, both of which vary across EU Member States. They also see a need for the EU to support early dialogue and scientific advice, and scope to support the generation and use of real-world evidence. The challenges for medical device assessment are, they argue, particularly difficult because many medical device companies cannot – or are not incentivised to – generate sufficient evidence for assessment.

As the final paragraph argues, EU cooperation in HTA isn’t likely to be associated with much in the way of savings. This is because appraisals will still need to be conducted in each country, as well as an assessment of country-specific epidemiology and other features of the population. The main value of cooperation could be in establishing a stronger position for the EU in negotiating in matters of drug design and evidence requirements. Not that we needed any more reasons to stop Brexit.

Patient-centered item selection for a new preference-based generic health status instrument: CS-Base. Value in Health Published 14th March 2019

I do not believe that we need a new generic measure of health. This paper was always going to have a hard time convincing me otherwise…

The premise for this work is that generic preference-based measures of health (such as the EQ-5D) were not developed with patients. True. So the authors set out to create one that is. A key feature of this study is the adoption of a framework that aligns with the multiattribute preference response model, whereby respondents rate their own health state relative to another. This is run through a mobile phone app.

The authors start by extracting candidate items from existing health frameworks and generic measures, which doesn’t seem to be a particularly patient-centred approach. Some domains were excluded for reasons that are not at all clear. After overlapping candidates were removed, 47 domains remained, classified as physical, mental, social, or ‘meta’. An online survey was conducted by a market research company. 2,256 ‘patients’ (people with diseases or serious complaints) were asked which 9 domains they thought were most important. Why 9? Because the authors figured it was the maximum that could fit on the screen of a mobile phone.

Of the candidate items, 5 were regularly selected in the survey: pain, personal relationships, fatigue, memory, and vision. Mobility and daily activities were also judged important enough to be included. Independence and self-esteem were added as paired domains and hearing was paired with the vision domain. The authors also added anxiety/depression as a pair of domains because they thought it was important. Thus, 12 items were included altogether, of which 6 were parts of pairs. Items were rephrased according to the researchers’ preferences. Each item was given 4 response levels.

It is true to say (as the authors do) that most generic preference-based measures (most notably the EQ-5D) were not developed with direct patient input. The argument goes that this somehow undermines the measure. But there are a) plenty of patient-centred measures for which preference-based values could be created and b) plenty of ways in which existing measures can be made patient-centred post hoc (n.b. our bolt-on study).

Setting aside my scepticism about the need for a new measure, I have a lot of problems with this study and with the resulting CS-Base instrument. The defining feature of its development seems to be arbitrariness. The underlying framework (as far as it is defined) does not seem well-grounded. The selection of items was largely driven by researchers. The wording was entirely driven by the researchers. The measure cannot justifiably be called ‘patient-centred’. It is researcher-centred, even if the researchers were able to refer to a survey of patients. And the whole thing has nothing whatsoever to do with preferences. The measure may prove fantastic at capturing health outcomes, but if it does it will be in spite of the methods used for its development, not because of them. Ironically, that would be a good advert for researcher-centred outcome development.

Proximity to death and health care expenditure increase revisited: a 15-year panel analysis of elderly persons. Health Economics Review [PubMed] [RePEc] Published 11th March 2019

It is widely acknowledged that – on average – people incur a large proportion of their lifetime health care costs in the last few years of their life. But there’s still a question mark over whether it is proximity to death that drives costs or age-related morbidity. The two have very different implications – we want people to be living for longer, but we probably don’t want them to be dying for longer. There’s growing evidence that proximity to death is very important, but it isn’t clear how important – if at all – ageing is. It’s important to understand this, particularly in predicting the impacts of demographic changes.

This study uses Swiss health insurance claims data for around 104,000 people over the age of 60 between 1996 and 2011. Two-part regression models were used to estimate health care expenditures conditional on them being greater than zero. The author analysed both birth cohorts and age classes to look at age-associated drivers of health care expenditure.
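In general terms, a two-part model splits expected spending into the chance of spending anything at all and the amount spent by those who do. A generic sketch follows. The symbols are my own notation (y for a person's health care expenditure in a period, x for covariates such as age and time to death), and the functional form of each part in the paper is not specified here.

```latex
\mathbb{E}[y \mid x] = \Pr(y > 0 \mid x) \times \mathbb{E}[y \mid y > 0,\, x]
```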

As expected, health care expenditures increased with age. The models imply that proximity-to-death has grown in importance over time. For the 1931-35 birth cohort, for example, the proportion of expenditures explained by proximity-to-death rose from 19% to 31%. Expenditures were partly explained by morbidity, and this effect appeared to be relatively constant over time. Thus, proximity to death is not the only determinant of rising expenditures (even if it is an important one). Looking at different age classes over time, there was no clear picture in the trajectory of health care expenditures. For the oldest age groups (76-85), health care expenditures were growing, but for some of the younger groups, costs appeared to be decreasing over time. This study paints a complex picture of health care expenditures, calling for complex policy responses. Part of this could be supporting people to commence palliative care earlier, but there is also a need for more efficient management of chronic illness over the long term.


My quality-adjusted life year

Why did I do it?

I have evaluated lots of services and been involved in trials where I have asked people to collect EQ-5D data. During this time several people have complained to me about having to collect EQ-5D data so I thought I would have a ‘taste of my own medicine’. I measured my health-related quality of life (HRQoL) using EQ-5D-3L, EQ-5D-VAS, and EQ-5D-5L, every day for a year (N=1). I had the EQ-5D on a spreadsheet on my smartphone and prompted myself to do it at 9 p.m. every night. I set a target of never being more than three days late in doing it, which I missed twice through the year. I also recorded health-related notes for some days, for instance, 21st January said “tired, dropped a keytar on toe (very 1980s injury)”.

By doing this I wanted to illuminate issues around anchoring, ceiling effects and ideas of health and wellness. With a big increase in wearable tech and smartphone health apps this type of big data collection might become a lot more commonplace. I have not kept a diary since I was about 13 so it was an interesting way of keeping track of what was happening, with a focus on health. Starting the year I knew I had one big life event coming up: a new baby due in early March. I am generally quite healthy, a bit overweight, don’t get enough sleep. I have been called a hypochondriac by people before, typically complaining of headaches, colds and sore throats around six months of the year. I usually go running once or twice a week.

From the start I was very conscious that I shouldn’t grumble too much, and that the EQ-5D is mainly used to measure functional health in people with disease, not in well people (ceiling effects are a known feature of the EQ-5D). I immediately felt the ‘freedom’ of the greater sensitivity of the EQ-5D-5L compared with the 3L: I could score myself as having slight problems on the 5L when they were not bad enough to count as ‘some problems’ on the 3L.

There were days when I felt a bit achey or tired because I had been for a run, but unless I had an actual injury I did not score myself as having problems with pain or mobility because of this; generally if I feel achey from running I think of that as a good thing as having pushed myself hard, ‘no pain no gain’. I also started doing yoga this year which made me feel great but also a bit achey sometimes. But in general I noticed that one of the main problems I had was fatigue which is not explicitly covered in the EQ-5D but was reflected sometimes as being slightly impaired on usual activities. I also thought that usual activities could be impaired if you are working and travelling a lot, as you don’t get to do any of the things you enjoy doing like hobbies or spending time with family, but this is more of a capability question whereas the EQ-5D is more functional.

How did my HRQoL compare?

I matched up my levels on the individual domains to EQ-5D-3L and 5L index scores based on UK preference scores. The final 5L value set may still change; I used the most recent published scores. I also matched my levels to a personal 5L value set, which I generated using this survey; it uses discrete choice experiments and involves comparing a set of pairs of EQ-5D-5L health states. I found doing this fascinating and it made me think about how mutually exclusive the EQ-5D dimensions are, and whether some health states are actually implausible: for instance, is it possible to be in extreme pain but not have any impairment on usual activities?

Surprisingly, my average EQ-5D-3L index score (0.982) was higher than the population average for my age group (for England, age 35-44, it is 0.888 based on Szende et al 2014); I expected it to be lower. In fact my average index score was higher than the average for 18-24 year olds (0.922). I thought that measuring EQ-5D more often and having more granularity would lead to lower average scores, but it actually led to high average scores.

My average score from the personal 5L value set was slightly higher than that from the England population value set (0.983 vs 0.975). Digging into the data, the main differences were that I thought that usual activities were slightly more important, and pain slightly less important, than the general population. The 5L (England tariff) correlated more closely with the VAS than the 3L (r² = 0.746 vs. r² = 0.586) but the 5L (personal tariff) correlated most closely with the VAS (r² = 0.792). So my N=1 sample suggests that the 5L is a better predictor of overall health than the 3L, and that the personal value set has validity in predicting VAS scores.
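For readers who want to check this kind of comparison on their own data, r² here can be computed as the squared Pearson correlation between a series of index scores and the matching VAS scores. A minimal sketch follows. The function name `r_squared` and the five data points are made up for illustration and are not the diary data.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov ** 2 / (var_x * var_y)


# Made-up values for five days, NOT the diary data.
index_scores = [1.00, 0.95, 0.88, 0.95, 0.72]
vas_over_100 = [0.95, 0.90, 0.80, 0.85, 0.60]
r2 = r_squared(index_scores, vas_over_100)
print(round(r2, 3))
```

A value closer to 1 means the index score tracks the VAS more closely. The function raises an error on a constant series, since the correlation is undefined there.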

Figure 1. My EQ-5D-3L index score [3L], EQ-5D-5L index score (England value set) [5L], EQ-5D-5L index score (personal value set) [5LP], and visual analogue scale (VAS) score divided by 100 [VAS/100].

Reflection

I definitely regretted doing the EQ-5D every day and was glad when the year was over! I would have preferred to have done it every week but I think that would have missed a lot of subtleties in how I felt from day to day. On reflection, the way I approached it was that at the end of each day I would try to recall if I was stressed, or if anything hurt, and adjust the level on the relevant dimension. But I wonder: if I had been prompted at any moment during the day as to whether I was stressed, had some mobility issues, or pain, would I have said I did? It makes me think about Kahneman and Riis’s ‘remembering self’ and ‘experiencing self’. Was my EQ-5D profile a slave to my ‘remembering self’ rather than my ‘experiencing self’?

One time when my score was low for a few days was when I had a really painful abscess on my tooth. At the time I felt like the pain was unbearable, so I gave a high pain score; looking back I wonder if it was that bad, but I didn’t want to retrospectively change my score. Strangely, I had the flu twice this year, which gave me some health decrements; I don’t think that has ever happened to me before (I don’t think it was just ‘man flu’!).

I knew that I was going to have a baby this year but I didn’t know that I would spend 18 days in hospital, despite not being ill myself. This has led me to think a lot more about ‘caregiver effects’ – the impact of close relatives being ill; it is unnerving spending night after night in hospital, in this case because my wife was very ill after giving birth, and then when my baby son was two months old, he got very ill (both are doing a lot better now). Being in hospital with a sick relative is a strange feeling, stressful and boring at the same time. I spent a long time staring out of the window or scrolling through Twitter. When my baby son was really ill he would not sleep and did not want to be put down, so my arms were aching after holding him all night. I was lucky that I had understanding managers at work and I was not significantly financially disadvantaged by caring for sick relatives. I was also glad of the NHS, and of not getting a huge bill when family members are discharged from hospital.

Health, wellbeing & exercise

Doing this made me think more about the difference between health and wellbeing; there might be days where I was really happy but it wasn’t reflected in my EQ-5D index score. I noticed that doing exercise always led to a higher VAS score – maybe subconsciously I was thinking exercise was increasing my ‘health stock’. I probably used the VAS score more like an overall wellbeing score rather than just health which is not correct – but I wonder if other people do this as well, and that is why there are less pronounced ceiling effects with the VAS score.

Could trials measure EQ-5D every day?

One advantage of EQ-5D and QALYs over other health outcomes is that they should be measured over a schedule and use the area under the curve. Completing an EQ-5D every day has shown me that health does vary every day, but I still think it might be impractical for trial participants to complete an EQ-5D questionnaire every day. Perhaps EQ-5D data could be combined with a simple daily VAS score, possibly out of ten rather than 100 for simplicity.
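The area-under-the-curve idea can be sketched in a few lines. This assumes linear interpolation between measurements (the trapezoid rule) and a 365-day year. The function name `qalys_from_measurements` and the example schedules are hypothetical and are not taken from any trial or from my own data.

```python
def qalys_from_measurements(measurements, days_per_year=365.0):
    """Trapezoidal area under a utility curve, expressed in QALYs.

    `measurements` is a list of (day, utility) pairs; at least two are needed.
    """
    points = sorted(measurements)
    if len(points) < 2:
        raise ValueError("need at least two (day, utility) measurements")
    # Sum the trapezoids between each pair of consecutive measurements.
    area_in_days = sum(
        (d1 - d0) * (u0 + u1) / 2
        for (d0, u0), (d1, u1) in zip(points, points[1:])
    )
    return area_in_days / days_per_year


# Hypothetical sparse schedule: baseline and 12 months only.
sparse = [(0, 1.0), (365, 1.0)]

# Hypothetical denser schedule that catches a short dip around day 101.
dense = [(0, 1.0), (100, 1.0), (101, 0.5), (102, 1.0), (365, 1.0)]

print(qalys_from_measurements(sparse), qalys_from_measurements(dense))
```

The sparse schedule misses the dip entirely, while the denser one picks it up. That is the trade-off between measurement burden and sensitivity discussed above.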

Joint worst days: 6th and 7th October: EQ-5D-3L index 0.264; EQ-5D-5L index 0.724; personal EQ-5D-5L index 0.824; VAS score 60 – ‘abscess on tooth, couldn’t sleep, face swollen’.

Joint best days: 27th January, 7th September, 11th September, 18th November, 4th December, 30th December: EQ-5D-3L index 1.00; both EQ-5D-5L index scores 1.00; VAS score 95 – notes include ‘lovely day with family’, ‘went for a run’, ‘holiday’, ‘met up with friends’.