Why did I do it?
I have evaluated lots of services and been involved in trials where I have asked people to collect EQ-5D data. During this time several people have complained to me about having to collect EQ-5D data so I thought I would have a ‘taste of my own medicine’. I measured my health-related quality of life (HRQoL) using EQ-5D-3L, EQ-5D-VAS, and EQ-5D-5L, every day for a year (N=1). I had the EQ-5D on a spreadsheet on my smartphone and prompted myself to do it at 9 p.m. every night. I set a target of never being more than three days late in doing it, which I missed twice through the year. I also recorded health-related notes for some days, for instance, 21st January said “tired, dropped a keytar on toe (very 1980s injury)”.
By doing this I wanted to illuminate issues around anchoring, ceiling effects and ideas of health and wellness. With a big increase in wearable tech and smartphone health apps this type of big data collection might become a lot more commonplace. I have not kept a diary since I was about 13 so it was an interesting way of keeping track on what was happening, with a focus on health. Starting the year I knew I had one big life event coming up: a new baby due in early March. I am generally quite healthy, a bit overweight, don’t get enough sleep. I have been called a hypochondriac by people before, typically complaining of headaches, colds and sore throats around six months of the year. I usually go running once or twice a week.
From the start I was very conscious that I felt I shouldn’t grumble too much, that EQ-5D was mainly used to measure functional health in people with disease, not in well people (and ceiling effects were a feature of the EQ-5D). I immediately felt a ‘freedom’ of the greater sensitivity of the EQ-5D-5L when compared to the 3L so I could score myself as having slight problems with the 5L, but not that they were bad enough to be ‘some problems’ on the 3L.
There were days when I felt a bit achey or tired because I had been for a run, but unless I had an actual injury I did not score myself as having problems with pain or mobility because of this; generally if I feel achey from running I think of that as a good thing as having pushed myself hard, ‘no pain no gain’. I also started doing yoga this year which made me feel great but also a bit achey sometimes. But in general I noticed that one of the main problems I had was fatigue which is not explicitly covered in the EQ-5D but was reflected sometimes as being slightly impaired on usual activities. I also thought that usual activities could be impaired if you are working and travelling a lot, as you don’t get to do any of the things you enjoy doing like hobbies or spending time with family, but this is more of a capability question whereas the EQ-5D is more functional.
How did my HRQoL compare?
I matched up my levels on the individual domains to EQ-5D-3L and 5L index scores based on UK preference scores. The final 5L value set may still change; I used the most recent published scores. I also matched my levels to a personal 5L value set which I did using this survey which uses discrete choice experiments and involves comparing a set of pairs of EQ-5D-5L health states. I found doing this fascinating and it made me think about how mutually exclusive the EQ-5D dimensions are, and whether some health states are actually implausible: for instance, is it possible to be in extreme pain but not have any impairment on usual activities?
Surprisingly, my average EQ-5D-3L index score (0.982) was higher than the population averages for my age group (for England age 35-44 it is 0.888 based on Szende et al 2014); I expected them to be lower. In fact my average index scores were higher than the average for 18-24 year olds (0.922). I thought that measuring EQ-5D more often and having more granularity would lead to lower average scores but it actually led to high average scores.
My average score from the personal 5L value set was slightly higher than the England population value set (0.983 vs 0.975). Digging into the data, the main differences were that I thought that usual activities were slightly more important, and pain slightly less important, than the general population. The 5L (England tariff) correlated more closely with the VAS than the 3L (r2 =0.746 vs. r2 =0.586) but the 5L (personal tariff) correlated most closely with the VAS (r2 =0.792). So based on my N=1 sample, this suggests that the 5L is a better predictor of overall health than the 3L, and that the personal value set has validity in predicting VAS scores.

Reflection
I definitely regretted doing the EQ-5D every day and was glad when the year was over! I would have preferred to have done it every week but I think that would have missed a lot of subtleties in how I felt from day to day. On reflection the way I was approaching it was that the end of each day I would try to recall if I was stressed, or if anything hurt, and adjust the level on the relevant dimension. But I wonder if I was prompted at any moment during the day as to whether I was stressed, had some mobility issues, or pain, would I say I did? It makes me think about Kahneman and Riis’s ‘remembering brain’ and ‘experiencing brain’. Was my EQ-5D profile a slave to my ‘remembering brain’ rather than my ‘experiencing brain’?
One thing when my score was low for a few days was when I had a really painful abscess on my tooth. At the time I felt like the pain was unbearable so had a high pain score, but looking back I wonder if it was that bad, but I didn’t want to retrospectively change my score. Strangely, I had the flu twice in this year which gave me some health decrements, which I don’t think has ever happened to me before (I don’t think it was just ‘man flu’!).
I knew that I was going to have a baby this year but I didn’t know that I would spend 18 days in hospital, despite not being ill myself. This has led me to think a lot more about ‘caregiver effects‘ – the impact of close relatives being ill; it is unnerving spending night after night in hospital, in this case because my wife was very ill after giving birth, and then when my baby son was two months old, he got very ill (both are doing a lot better now). Being in hospital with a sick relative is a strange feeling, stressful and boring at the same time. I spent a long time staring out of the window or scrolling through Twitter. When my baby son was really ill he would not sleep and did not want to be put down, so my arms were aching after holding him all night. I was lucky that I had understanding managers in work and I was not significantly financially disadvantaged by caring for sick relatives. And glad of the NHS and not getting a huge bill when family members are discharged from hospital.
Health, wellbeing & exercise
Doing this made me think more about the difference between health and wellbeing; there might be days where I was really happy but it wasn’t reflected in my EQ-5D index score. I noticed that doing exercise always led to a higher VAS score – maybe subconsciously I was thinking exercise was increasing my ‘health stock‘. I probably used the VAS score more like an overall wellbeing score rather than just health which is not correct – but I wonder if other people do this as well, and that is why there are less pronounced ceiling effects with the VAS score.
Could trials measure EQ-5D every day?
One advantage of EQ-5D and QALYs over other health outcomes is that they should be measured over a schedule and use the area under the curve. Completing an EQ5D every day has shown me that health does vary every day, but I still think it might be impractical for trial participants to complete an EQ-5D questionnaire every day. Perhaps EQ-5D data could be combined with a simple daily VAS score, possibly out of ten rather than 100 for simplicity.
Joint worst day: 6th and 7th October: EQ-5D-3L index 0.264, EQ-5D-5L index 0.724; personal EQ-5D-5L index 0.824; VAS score 60 – ‘abscess on tooth, couldn’t sleep, face swollen’.
Joint best day: 27th January, 7th September, 11th September, 18th November, 4th December, 30th December: EQ-5D-3L index 1.00; both EQ-5D-5L index scores 1.00; VAS score 95 – notes include ‘lovely day with family’, ‘went for a run’, ‘holiday’, ‘met up with friends’.
Thanks for the efforts. I was always wondering about this. I have also been wondering if QALYs would be different when we have different number of data points. Obviously, if the condition of interest has a tendency to fluctuate, it would be better to have more data points. But what about chronic conditions? Usually, we have month 0, 6 and 12 in a trial. But what if we have month 0, 3, 6, 9, 12?
Thanks. One to find out might be to collect at 0,3,6,9,12 and compare the results with only using the data from 0.6.12. I suppose there is an interaction between sample size and number of measurements, so having more frequent measurements may be a more efficient way of reducing the uncertainty around the total QALYs than increasing the sample size, but at the moment I think the uncertainty around QALYs doesn’t factor this in, it is just accepted that we interpolate between point in time estimates of HrQOL.
Hi Blythe thanks for reading the blog.
For your first point, the variance in index scores will be partly constrained by the ceiling effects and ‘lumpiness’ of the increments in reference scores with respect to the EQ5D profiles so not completely by me. I haven’t tried to fit scores to a distribution which might be interesting; a lot of days were just 11111 and a VAS score of 90-95 so not that much variation.
For your second point, I agree I can’t be sure about this without more data. Firstly I thought my health was worse than average for my age. And actually I still do, despite this being evidence to the contrary. But also, without any comparator I know that I very quickly forget days when I feel sub par, so if I did the EQ5D once a week, then if it fell on a day when I was 11111, I think there is a good chance I night incorrectly remember the whole week as being 11111; I don’t know if this is a general issue or specific to me.
If you know any journals who might want to publish it then please let me know!
Thanks for reading David and thanks for this paper; very interesting. I think having some kind of more frequent measure is important in measuring the impact of diseases with unpredictable or constantly fluctuating patterns of symptoms.
Excellent work and post. I want to learn more about the variance around your means using each measure. Which measure variance do you feel best represented the real flux in health stock you experienced in feeling and function?
It surprised me when you concluded “I thought that measuring EQ-5D more often and having more granularity would lead to lower average scores but it actually led to high average scores” instead of considering yourself a person with above-average health. It is hard to make a case for causality attributed to the frequency of measurement without some other data from a period when you measured less often.
Consider writing this material as a letter to an editor for a journal that cares about this stuff.
This is great – sorry about the opportunity cost to you, but it’s outweighed by the external benefits to the rest of the world. Lots to think about and learn from, but a point about your suggestion of using it in trials is that it may depend on the frequency of fluctuations in health related to the condition being studied. Some years ago, we used a daily patient diary, including a VAS, for a study of people who have MS, in whom symptoms can change a lot from day to day. We had a good response rate for completion every day for 6 weeks.
https://doi.org/10.1016/j.socscimed.2003.10.015