# Sam Watson’s journal round-up for 10th September 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Probabilistic sensitivity analysis in cost-effectiveness models: determining model convergence in cohort models. PharmacoEconomics [PubMed] Published 27th July 2018

Probabilistic sensitivity analysis (PSA) is rightfully a required component of economic evaluations. Deterministic sensitivity analyses are generally biased; averaging the outputs of a model based on a choice of values from a complex joint distribution is not likely to be a good reflection of the true model mean. PSA involves repeatedly sampling parameters from their respective distributions and analysing the resulting model outputs. But how many times should you do this? Most times, an arbitrary number is selected that seems “big enough”, say 1,000 or 10,000. But these simulations themselves exhibit variance; so-called Monte Carlo error. This paper discusses making the choice of the number of simulations more formal by assessing the “convergence” of simulation output.

In the same way as sample sizes are chosen for trials, the number of simulations should provide an adequate level of precision; anything more wastes resources without improving inferences. For example, if the statistic of interest is the net monetary benefit, then we would want the confidence interval (CI) to exclude zero, as this should be a sufficient level of certainty for an investment decision. The paper therefore proposes conducting a number of simulations, examining whether the CI is ‘narrow enough’, and conducting further simulations if it is not. However, I see a problem with this proposal: the variance of a statistic from a sequence of simulations itself has variance. The stopping points at which we might check the CI are themselves arbitrary: additional simulations can increase the width of the CI as well as reduce it. This is easy to see with simulations from a simple ratio of random variables such as $ICER = gamma(1,0.01)/normal(0.01,0.01)$. The “stopping rule” proposed therefore doesn’t necessarily indicate “convergence”, as a few more simulations could lead to a wider, as well as narrower, CI. The heuristic approach is undoubtedly an improvement on the current way things are usually done, but I think there is scope here for a more rigorous method of assessing convergence in PSA.
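To put the point in symbols, here is a sketch in my own notation rather than the paper’s. Let $x_1, \dots, x_n$ be the simulated values of the statistic of interest, and assume the usual normal-approximation interval:

$$
\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}_n\right)^2, \qquad
\mathrm{CI}_n = \bar{x}_n \pm 1.96\,\frac{s_n}{\sqrt{n}} .
$$

The width of the interval, $2 \times 1.96\, s_n/\sqrt{n}$, only falls steadily with $n$ if $s_n$ is stable. But $s_n$ is itself estimated from the simulations, so a few extreme draws can raise it by more than the growth in $\sqrt{n}$ offsets. A ratio whose denominator can fall close to zero, like the one above, is heavy-tailed and so particularly prone to this.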

Mortality due to low-quality health systems in the universal health coverage era: a systematic analysis of amenable deaths in 137 countries. The Lancet [PubMed] Published 5th September 2018

Richard Horton, the oracular editor-in-chief of the Lancet, tweeted last week:

There is certainly an argument that academic journals are good forums to make advocacy arguments. Who better to interpret the analyses presented in these journals than the authors and audiences themselves? But, without a strict editorial bulkhead between analysis and opinion, we run the risk that the articles and their content are influenced or dictated by the political whims of editors rather than scientific merit. Unfortunately, I think this article is evidence of that.

No-one debates that improving health care quality will improve patient outcomes and experience. It is in the very definition of ‘quality’. This paper aims to estimate the numbers of deaths each year due to ‘poor quality’ in low- and middle-income countries (LMICs). The trouble with this is two-fold: given the number of unknown quantities required to get a handle on this figure, the definition of quality notwithstanding, the uncertainty around this figure should be incredibly high (see below); and, attributing these deaths in a causal way to a nebulous definition of ‘quality’ is tenuous at best. The approach of the article is, in essence, to assume that the differences in fatality rates of treatable conditions between LMICs and the best performing health systems on Earth, among people who attend health services, are entirely caused by ‘poor quality’. This definition of quality would therefore seem to encompass low resourcing, poor supply of human resources, a lack of access to medicines, as well as everything else that’s different in health systems. Then, to get to this figure, the authors have multiple sources of uncertainty including:

• Using a range of proxies for health care utilisation;
• Using global burden of disease epidemiology estimates, which have associated uncertainty;
• A number of data slicing decisions, such as truncating case fatality rates;
• Estimating utilisation rates based on a predictive model;
• Estimating the case-fatality rate for non-users of health services based on other estimated statistics.

Despite this, the authors claim to estimate a 95% uncertainty interval with a width of only 300,000 people, with a mean estimate of 5.0 million, due to ‘poor quality’. This seems highly implausible, and yet it is claimed to be a causal effect of an undefined ‘poor quality’. The timing of this article coincides with the Lancet Commission on care quality in LMICs and, one suspects, had it not been for the advocacy angle on care quality, it would not have been published in this journal.
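As a rough check on how precise that claim is, suppose (my assumption, not the paper’s) that the interval is symmetric and approximately normal. The implied standard error is then

$$
\mathrm{SE} \approx \frac{300{,}000 / 2}{1.96} \approx 76{,}500,
\qquad
\frac{\mathrm{SE}}{5{,}000{,}000} \approx 1.5\% .
$$

Put another way, the interval spans roughly plus or minus 3% of the point estimate, which is very little given the list of uncertain inputs above.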

Embedding as a pitfall for survey‐based welfare indicators: evidence from an experiment. Journal of the Royal Statistical Society: Series A Published 4th September 2018

Health economists will be well aware of the various measures used to evaluate welfare and well-being. These typically take the form of surveys comprising questions relating to a number of different dimensions, which could include emotional and social well-being or physical functioning. Similar types of surveys are also used to collect population preferences over states of the world or policy options. For example, Kahneman and Knetsch conducted a survey of WTP for different environmental policies. These surveys can exhibit what is called an ‘embedding effect’, which Kahneman and Knetsch described as when the value of a good varies “depending on whether the good is assessed on its own or embedded as part of a more inclusive package.” That is to say that the way people value single dimensional attributes or qualities can be distorted when they’re embedded as part of a multi-dimensional choice. This article reports the results of an experiment involving students who were asked to weight the relative importance of different dimensions of the Better Life Index, including jobs, housing, and income. The randomised treatment was whether they rated ‘jobs’ as a single category, or were presented with individual dimensions, such as the unemployment rate and job security. The experiment shows strong evidence of embedding – the overall weighting substantially differed by treatment. This, the authors conclude, means that the Better Life Index fails to accurately capture preferences and is subject to manipulation should a researcher be so inclined – if you want evidence to say your policy is the most important, just change the way the dimensions are presented.


# Method of the month: Mobile data collection with a cloud server

Once a month we discuss a particular research method that may be of interest to people working in health economics. We’ll consider widely used key methodologies, as well as more novel approaches. Our reviews are not designed to be comprehensive but provide an introduction to the method, its underlying principles, some applied examples, and where to find out more. If you’d like to write a post for this series, get in touch. This month’s method is mobile data collection with a cloud server.

## Principles

This post departs from the usual format of this feature to bring you more of a tutorial-based outline of the method. Surveys are an important data collection tool for social scientists. Among the surveys most commonly used by health economists are the British Household Panel Survey and the Health Survey for England. But we also ask patients in trials and follow-up to complete well-being and quality of life surveys like the EQ-5D. Sometimes, when there’s funding, we’ll even do our own large-scale household survey. Many studies use paper-based methods of data collection when doing these surveys, but this is more expensive than its digital equivalent and more error-prone.

## Implementation

Rather than printing out survey forms and then having to transcribe the results back into digital format, the surveys can be completed on a tablet or smartphone, uploaded to a central server, automatically error checked, and then saved in a ready-to-use format. However, setting up a system to do digital data collection can seem daunting and difficult. Most people do not know how to run a server or what software to use. In this post, we will describe the process of setting up a server using a cloud server provider and using open source survey software. Where other tutorials exist on individual steps we link to them; otherwise, the information is provided here.

### Software

OpenDataKit (ODK) is a family of programs designed for the collection of data on Android devices using a flexible system for designing survey forms. The programs include ODK Collect, which is installed on phones or tablets to complete forms; ODK Aggregate, which collates responses on a server; and ODK Briefcase, which works on a computer and ‘pulls’ data from the server for use on the computer.

Using ODK, mobile data collection can be implemented using a cloud server in the following 7 steps.

### 1. Set up a cloud server

For this tutorial we’ll be using Digital Ocean, a cloud server provider that is easy to use, permits you to deploy as many ‘droplets’ (i.e. cloud servers) as you want, and enables you to specify your hardware requirements. ODK Aggregate will run on Amazon Web Services and Google App Engine as well and is, in fact, easier to deploy on these platforms. However, we’ve chosen to go with Digital Ocean to make sure we control where our data are being stored – a key issue to ensure we’re compliant with EU data protection regulations, especially the GDPR. Digital Ocean met all our needs for information security.

You will need an account with Digital Ocean with a credit card to pay for services. The server we use costs around \$15 a month to run (or 2 cents an hour), but can be ‘destroyed’ when not in use and then rebooted when needed by storing a ‘snapshot’ before you destroy it. Once you are logged in, create a new droplet with Ubuntu 16.04.4 x64. For our purposes, 2 GB of memory and 2 vCPUs will be sufficient. You will receive an email with the root password of the droplet and IP address. More info can be found here.

To log into the server from a Windows computer, you can download and run PuTTY. From Linux-based systems or Mac OS X you can use the ssh command in the terminal. We recommend you follow this tutorial to perform the initial server set up, i.e. creating a user and password.

ODK Aggregate requires Tomcat to run, so we will install that. We then provide an optional step to allow access to the server only over https: (i.e. encrypted internet connections). This provides an extra layer of security; however, ODK also features RSA-key encryption of forms when transmitting to the server, so https: can be avoided if required. You will require a registered domain name to use https:.

### 2. Install Tomcat 8

Once you’re logged into the server and using your new user (it’s better to avoid using the root account when possible), run the following code:

```
sudo apt-get update
sudo apt-get install default-jdk
sudo groupadd tomcat
sudo useradd -s /bin/false -g tomcat -d /opt/tomcat tomcat
cd /tmp
```

In the first line of the next block, you may want to replace the link with one to a more recent version of Tomcat:

```
wget http://mirror.ox.ac.uk/sites/rsync.apache.org/tomcat/tomcat-8/v8.5.27/bin/apache-tomcat-8.5.27.tar.gz -O apache-tomcat-8.5.27.tar.gz
sudo mkdir /opt/tomcat
sudo tar xzvf apache-tomcat-8*tar.gz -C /opt/tomcat --strip-components=1
cd /opt/tomcat
sudo chgrp -R tomcat /opt/tomcat
sudo chmod -R g+r conf
sudo chmod g+x conf
sudo chown -R tomcat webapps/ work/ temp/ logs/
```
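If you do swap in a newer release, it can help to keep the version number in one place so that the download link and the saved file name cannot drift apart. The sketch below is one way of doing that. The `TOMCAT_VERSION` variable and the `tomcat_url` helper are our own illustrative names, and the Apache archive URL pattern is an assumption about where older releases are kept, so check it against the Tomcat download page before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a Tomcat 8 download link from a version number.
# The archive.apache.org layout is an assumption; confirm it before use.
TOMCAT_VERSION="${TOMCAT_VERSION:-8.5.27}"

tomcat_url() {
  # $1 is the full version number, e.g. 8.5.27
  echo "https://archive.apache.org/dist/tomcat/tomcat-8/v$1/bin/apache-tomcat-$1.tar.gz"
}

# Print the command rather than running it, so nothing is downloaded by accident.
echo "wget $(tomcat_url "$TOMCAT_VERSION") -O /tmp/apache-tomcat-${TOMCAT_VERSION}.tar.gz"
```

Copy the printed command into the terminal once you are happy with the link.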

Now we’re going to open up a file and edit the text using the nano text editor:

`sudo nano /etc/systemd/system/tomcat.service`

Then copy and paste the following:

```
[Unit]
Description=Apache Tomcat Web Application Container
After=network.target

[Service]
Type=forking

Environment=JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64/jre
Environment=CATALINA_PID=/opt/tomcat/temp/tomcat.pid
Environment=CATALINA_HOME=/opt/tomcat
Environment=CATALINA_BASE=/opt/tomcat
Environment='CATALINA_OPTS=-Xms512M -Xmx1024M -server -XX:+UseParallelGC'

ExecStart=/opt/tomcat/bin/startup.sh
ExecStop=/opt/tomcat/bin/shutdown.sh

User=tomcat
Group=tomcat
RestartSec=10
Restart=always

[Install]
WantedBy=multi-user.target
```

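Then save and close (Ctrl+X, then Y, then Enter). Next, reload systemd, start Tomcat and check its status, open port 8080 in the firewall, and enable Tomcat to start on boot: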

```
sudo systemctl daemon-reload
sudo systemctl start tomcat
sudo systemctl status tomcat
sudo ufw allow 8080
sudo systemctl enable tomcat
```

Next, open the Tomcat users file, where users of the Tomcat manager web apps are defined:

`sudo nano /opt/tomcat/conf/tomcat-users.xml`

```
<tomcat-users . . .>
</tomcat-users>
```

Now, we need to comment out two blocks of text in two different files. The block of text is

```
<Valve className="org.apache.catalina.valves.RemoteAddrValve"
allow="127\.\d+\.\d+\.\d+|::1|0:0:0:0:0:0:0:1" />
```

and the two files can be accessed respectively with

```
sudo nano /opt/tomcat/webapps/manager/META-INF/context.xml
sudo nano /opt/tomcat/webapps/host-manager/META-INF/context.xml
```

then restart Tomcat

`sudo systemctl restart tomcat`

### 3. Install an SSL certificate (optional)

Skip this step if you don’t want to use an SSL certificate. If you do, you will need a domain name (e.g. www.aheblog.com), and you will need to point that domain name to the IP address of your droplet. Follow these instructions to do that. It is possible to self-sign an SSL certificate and so not need a domain name; however, this will not work with ODK Collect as the certificate will not be trusted by Android. SSL certificates are issued by trusted authorities, a service for which they typically charge. However, Let’s Encrypt does it for free. To use it we need to install and use certbot:

```
sudo apt-get update
sudo apt-get install software-properties-common
sudo add-apt-repository ppa:certbot/certbot
sudo apt-get update
sudo apt-get install certbot
```

Now, use certbot to get the certificates for your domain:

`sudo certbot certonly`

Answer the questions as they are prompted, choosing the ‘standalone’ option when asked how to obtain the certificate.

Now, we need to convert the certificate files into the Java Key Store format so that it can be used by Tomcat. We will need to do this as the root user (replace the domain name and passwords as appropriate):

```
su - root
cd /etc/letsencrypt/live/www.domainnamehere.com

openssl pkcs12 -export -in fullchain.pem -inkey privkey.pem -out fullchain_and_key.p12 -name tomcat
keytool -importkeystore -deststorepass PASSWORD -destkeypass PASSWORD -destkeystore MyDSKeyStore.jks -srckeystore fullchain_and_key.p12 -srcstoretype PKCS12 -srcstorepass PASSWORD -alias tomcat

mkdir /opt/tomcat/ssl
cp MyDSKeyStore.jks /opt/tomcat/ssl/
```

We can then switch back to our user account

`su - USER`

Open the following file

`sudo nano /opt/tomcat/conf/server.xml`

and find the place in the document where you see

`<!-- Define a SSL Coyote HTTP/1.1 Connector on port 8443 -->`

Replace the text at that point with the following, making sure to input the correct password:

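```
<Connector port="8443" protocol="org.apache.coyote.http11.Http11NioProtocol"
maxThreads="150" SSLEnabled="true" scheme="https" secure="true"
clientAuth="false" sslProtocol="TLS" keystoreFile="/opt/tomcat/ssl/MyDSKeyStore.jks"
keystoreType="JKS" keystorePass="PASSWORD" />
```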

Exit the file (Ctrl+X, Y, and Enter) and restart the service

`sudo systemctl restart tomcat`

Now, we want to block unsecured (HTTP) connections and allow only encrypted (HTTPS) connections:

```
sudo ufw delete allow 8080
sudo ufw allow 8443
```

If you use an SSL certificate, it will need to be renewed every 90 days. Renewal is an automatic process; converting the renewed certificate to JKS and saving it into the Tomcat directory is not. You can either do this manually each time or write a Bash script to do it.
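As a starting point for such a script, here is a minimal sketch that repeats the conversion commands from this step after a renewal. The `DOMAIN` and `KEYSTORE_PASS` variables, the `rebuild_keystore` function name, and the `run` argument are our own placeholders rather than anything certbot or ODK defines, and we have not tested it against every certbot set-up, so treat it as an outline to adapt.

```shell
#!/usr/bin/env bash
# Sketch of a renewal helper: rebuild the Java Key Store after certbot renews.
# DOMAIN and KEYSTORE_PASS are placeholders; replace them with your own values.
DOMAIN="${DOMAIN:-www.domainnamehere.com}"
KEYSTORE_PASS="${KEYSTORE_PASS:-PASSWORD}"
LIVE_DIR="/etc/letsencrypt/live/${DOMAIN}"
SSL_DIR="/opt/tomcat/ssl"

rebuild_keystore() {
  # Bundle the renewed certificate chain and private key into a PKCS12 file.
  openssl pkcs12 -export -in "${LIVE_DIR}/fullchain.pem" \
    -inkey "${LIVE_DIR}/privkey.pem" \
    -out "${LIVE_DIR}/fullchain_and_key.p12" \
    -name tomcat -passout "pass:${KEYSTORE_PASS}" || return 1

  # Import the bundle into the keystore Tomcat reads; -noprompt overwrites
  # the existing 'tomcat' entry without asking.
  keytool -importkeystore -noprompt \
    -deststorepass "${KEYSTORE_PASS}" -destkeypass "${KEYSTORE_PASS}" \
    -destkeystore "${SSL_DIR}/MyDSKeyStore.jks" \
    -srckeystore "${LIVE_DIR}/fullchain_and_key.p12" \
    -srcstoretype PKCS12 -srcstorepass "${KEYSTORE_PASS}" \
    -alias tomcat || return 1

  # Restart Tomcat so that it picks up the new keystore.
  systemctl restart tomcat
}

# Only do the work when explicitly asked to, e.g. `sudo ./renew_keystore.sh run`.
if [ "${1:-}" = "run" ]; then
  certbot renew && rebuild_keystore
fi
```

The script needs to run as root because the Let’s Encrypt directory is only readable by root. Keeping the password in the script is a convenience that you may prefer to replace with something more secure.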

### 4. Install ODK Aggregate

First, we’ll install the database software that will hold the collected data. We’ll use PostgreSQL (you could use MySQL instead). The following code will install the right packages:

```
sudo apt-get update
sudo apt-get install postgresql-9.6
```

Now we are going to install ODK Aggregate. This is straightforward as the ODK Aggregate installer provides a walkthrough. It is important to select the correct options during this process, which should be obvious from the above tutorial. You may need to replace the link in the first command below if the download does not work.

```
sudo wget https://opendatakit.org/download/4456 -O /tmp/linux-x64-installer.run
sudo chmod +x /tmp/linux-x64-installer.run
sudo /tmp/linux-x64-installer.run
```

then follow the instructions. Now, we are going to connect the database to ODK Aggregate:

```
sudo -u postgres psql
\cd '/ODK/ODK Aggregate'
\i create_db_and_user.sql
\q
```

Now copy the ODK installation to the Tomcat directory

`sudo cp /ODK/ODK\ Aggregate/ODKAggregate.war /opt/tomcat/webapps`

And the final step is to restart Tomcat

`sudo systemctl restart tomcat`

To test whether the installation has been successful go to (replacing the URL as appropriate):

`https://www.domainnamehere.com:8443/ODKAggregate`

or if you do not have a domain name use:

`http://<IP address>:8080/ODKAggregate`

Use this URL for accessing ODK Aggregate.

To log on, use the ‘admin’ account and the password ‘aggregate’. Once you are logged in you can change the admin account password and create new user accounts as required.

### 5. Programme a survey

Surveys can be complex, with logical structures that skip questions for certain responses. They can require different types of response, like numbers, text, or multiple choice, and can need signatures or images. Multiple languages are often needed. All of this is possible in ODK. To programme your survey into a format that ODK can use, you can use XLSForm, a standard for authoring forms in Excel. The website has a good tutorial. It is worth learning as much of the standard as possible, as it is very flexible. A few key things and tips to note:

• Skip and logical sequences are managed with the ‘Relevant’ column;
• If you want to be able to skip big groups of questions, you should use ‘groups’;
• If the same question is to be asked multiple (but an unknown number of) times you should use ‘Repeats’. Note that when you download the data the multiple responses from repeat type questions will be grouped in their own .csv files rather than with the rest of the survey;
• Encryption of a survey is managed by putting an RSA public key in the Settings tab;
• There are multiple different question types, learn them!;
• You can use the response from one question as text in another question, for example, someone’s name – using \${} syntax to refer to questions as with the ‘Relevant’ column;
• You can automatically collect default data like start and end time to check there’s been no cheating by data collectors.
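Several of the tips above can be seen together in a very small form. The sketch below writes out the ‘survey’ sheet of a hypothetical three-question form as a CSV file, purely to show the column layout; in practice you would type the same rows into the survey worksheet of your Excel file. The question names (`respondent_name`, `has_job`, `job_secure`), the group name, and the `yes_no` choice list are invented for illustration, and the `yes_no` list would still need to be defined on the choices sheet.

```shell
#!/usr/bin/env bash
# Illustration only: the 'survey' sheet of a made-up XLSForm, saved as CSV.
survey_csv="${TMPDIR:-/tmp}/example_survey_sheet.csv"

# The quoted 'EOF' stops the shell from expanding the ${...} question references.
cat > "$survey_csv" <<'EOF'
type,name,label,relevant
start,start,,
end,end,,
text,respondent_name,What is your name?,
select_one yes_no,has_job,"${respondent_name}, do you currently have a job?",
begin group,job_details,Job details,${has_job} = 'yes'
select_one yes_no,job_secure,Do you feel your job is secure?,
end group,,,
EOF

echo "Wrote $survey_csv"
```

The `start` and `end` rows record when the form was opened and closed, the second question reuses the respondent’s name in its label, and the whole `job_details` group is skipped unless the respondent answers ‘yes’ to `has_job`.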

Once your form is complete, you can convert it here; the converter will notify you of any errors. Once you have a .xml file ready to go, you can upload it in the ‘Form Management’ section of the ODK Aggregate interface.

### 6. Use ODK Collect

Submissions to the ODK Aggregate server need to be downloaded to a computer in order to be decrypted and the data processed and analysed. Exporting files from the server requires a number of pieces of software on the computer to which it is being downloaded:

• Unlimited Strength JCE Policy Files. These files are necessary if you are using encrypted forms and can be downloaded from here. To install the files, extract the contents of the compressed file, and copy and paste the files in the folder to the following location [java-home]/lib/security/policy/unlimited.

When you first launch ODK Briefcase the first thing you must do is choose a storage location where it will save all data downloads. Once you have done this go to the ‘Pull’ tab to download the surveys. Click ‘Connect…’ to input the URL of the ODK Aggregate instance. Select which forms you will need to download data for and click ‘Pull’ in the lower right-hand side of the window.

To download data submissions go to the ‘Export’ tab. The available forms are listed in this window. By the form for which you want to download submissions, input a storage location for the data. In the next row, select the location of the private RSA key if you are using encrypted forms, which must be in .pem format. Many programs will generate .pem keys, but some provide the keys as strings of text that will be saved as text files. If the text begins with ‘-----BEGIN RSA PRIVATE KEY-----’, then simply change the file extension to .pem.
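If you would rather check the key file from the command line before renaming it, something like the following would do. This is a sketch under our own assumptions: `as_pem` is a made-up helper name, and it only looks at the first line of the file rather than validating the key.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a text-file RSA key to a .pem file if it has the
# expected header. It prints the path of the .pem file on success.
as_pem() {
  key_file="$1"
  pem_file="${key_file%.*}.pem"   # swap the extension, e.g. key.txt -> key.pem

  if ! head -n 1 "$key_file" | grep -q -e '-----BEGIN RSA PRIVATE KEY-----'; then
    echo "Not an RSA private key in PEM format: $key_file" >&2
    return 1
  fi

  # Nothing to copy if the file already has the .pem extension.
  if [ "$key_file" != "$pem_file" ]; then
    cp "$key_file" "$pem_file" || return 1
  fi
  echo "$pem_file"
}
```

For example, `as_pem ~/keys/private_key.txt` would create `~/keys/private_key.pem`, which can then be selected in ODK Briefcase.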

## Applications

Now you should be good to go. Potential applications are extensive. The ODK system has been employed in evaluative studies (of performance-based financing, for example), to conduct discrete choice experiments, or for more general surveillance of health service use and outcomes.

There are ways of extending these tools further, such as collecting GPS, location, and map data through Open Map Kit, which links to Open Street Map. There are also private companies, like SurveyCTO, that use OpenDataKit-based products to offer data collection and management services. However, we have found that a key part of complying with data protection rules involves knowing exactly where data will be stored and having complete control over access to it, which many services cannot offer. The flexibility of managing your own server permits more control. For example, you can write scripts to check for data submissions, process them, and upload them to another server, or you can host other data collection tools when you need to. Many universities or institutions may provide these services ‘in-house’, but if they do not support the software it can be difficult to use institutional servers. A cloud server provider gives us an alternative solution that can be up and running in an hour.


# Method of the month: Q methodology

Once a month we discuss a particular research method that may be of interest to people working in health economics. We’ll consider widely used key methodologies, as well as more novel approaches. Our reviews are not designed to be comprehensive but provide an introduction to the method, its underlying principles, some applied examples, and where to find out more. If you’d like to write a post for this series, get in touch. This month’s method is Q methodology.

## Principles

There are many situations when we might be interested in people’s views, opinions or beliefs about an issue, such as how we allocate health care resources or the type of care we provide to dementia patients. Typically, health economists may think about using qualitative methods or preference elicitation techniques, but Q methodology could be your new method to examine these questions. Q methodology combines qualitative and quantitative techniques which allow us to first identify the range of the views that exist on a topic and then describe in-depth those viewpoints.

Q methodology was conceived as a way to study subjectivity by William Stephenson and is detailed in his 1953 book The Study of Behaviour. A more widely available book by Watts and Stenner (2012) provides a great general introduction to all stages of a Q study and the paper by Baker et al (2006) introduces Q methodology in health economics.

## Implementation

There are two main stages in a Q methodology study. In the first stage, participants express their views through the rank-ordering of a set of statements known as the Q sort. The second stage uses factor analysis to identify patterns of similarity between the Q sorts, which can then be described in detail.

### Stage 1: Developing the statements and Q sorting

The most important part of any Q study is the development of the statements that your participants will rank-order. The starting point is to identify all of the possible views on your topic. Participants should be able to interpret the statements as opinion rather than facts, for example, “The amount of health care people have had in the past should not influence access to treatments in the future”. The statements can come from a range of sources including interview transcripts, public consultations, academic literature, newspapers and social media. Through a process of eliminating duplicates, merging and deleting similar statements, you want to end up with a smaller set of statements that is representative of the population of views that exist on your topic. Pilot these statements in a small number of Q sorts before finalising and starting your main data collection.

The next thing to consider is from whom you are going to collect Q sorts. Participant sampling in Q methodology is similar to that of qualitative methods where you are looking to identify ‘data rich’ participants. It is not about representativeness according to demographics; instead, you want to include participants who have strong and differing views on your topic. Typically this would be around 30 to 60 people. Once you have selected your sample you can conduct your Q sorts. Here, each of your participants rank-orders the set of statements according to an instruction, for example from ‘most agree to most disagree’ or ‘highest priority to lowest priority’. At the end of each Q sort, a short interview is conducted asking participants to summarise their opinions on the Q sort and give further explanation for the placing of selected statements.

### Stage 2: Analysis and interpretation

In the analysis stage, the aim is to identify people who have ranked their statements in a similar way. This involves calculating the correlations between the participants’ Q sorts (the full ranking of all statements) to form a correlation matrix, which is then subject to factor analysis. The software outlined in the next section can help you with this. The factor analysis will produce a number of statistically significant solutions and your role as the analyst is to decide how many factors you retain for interpretation. This will be an iterative process where you consider the internal coherence of each factor (does the ranking of the statements make sense, and does it align with the comments made by the participants following the Q sort?) as well as statistical considerations like eigenvalues. The factors are idealised Q sorts that are a complete ranking of all statements, essentially representing the way a respondent who had a correlation coefficient of 1 with the factor would have ranked their statements. The final step is to provide a descriptive account of the factors, looking at the positioning of each statement in relation to the other statements and drawing on the post Q sort interviews to support and aid your interpretation.
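For readers who like to see the first step written down, here is a sketch that assumes the usual Pearson correlation; the notation is ours and the software does the calculation for you. Let $x_{ik}$ be the position that participant $i$ gave to statement $k$, for $k = 1, \dots, N$ statements, and let $\bar{x}_i$ be the mean position in participant $i$’s Q sort. The correlation between the Q sorts of participants $i$ and $j$ is

$$
r_{ij} = \frac{\sum_{k=1}^{N} (x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)}
{\sqrt{\sum_{k=1}^{N} (x_{ik} - \bar{x}_i)^2}\,\sqrt{\sum_{k=1}^{N} (x_{jk} - \bar{x}_j)^2}} .
$$

If every participant sorts the statements into the same fixed grid, all Q sorts share the same mean and the same sum of squared deviations $S$, and the expression simplifies to

$$
r_{ij} = 1 - \frac{\sum_{k=1}^{N} (x_{ik} - x_{jk})^2}{2S} .
$$

With $P$ participants, the resulting $P \times P$ matrix of $r_{ij}$ values is what goes into the factor analysis.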

### Software

There are a small number of software packages available to analyse your Q data, most of which are free to use. The most widely used programme is PQMethod. It is a DOS-based programme which often causes nervousness for newcomers due to the old school black screen and the requirement to step away from the mouse, but it is actually easy to navigate when you get going and it produces all of the output you need to interpret your Q sorts. There is the newer (and also free) KenQ that is receiving good reviews and has a more up-to-date web-based navigation, but I must confess I like my old time PQMethod. Details on all of the software and where to access these can be found on the Q methodology website.

## Applications

Q methodology studies have been conducted with patient groups and the general public. In patient groups, the aim is often to understand their views on the type of care they receive or options for future care. Examples include the views of young people on the transition from paediatric to adult health care services and the views of dementia patients and their carers on good end of life care. The results of these types of Q studies have been used to inform the design of new interventions or to provide attributes for future preference elicitation studies.

We have also used Q methodology to investigate the views of the general public in a range of European countries on the principles that should underlie health care resource allocation as part of the EuroVaQ project. More recently, Q methodology has been used to identify societal views on the provision of life-extending treatments for people with a terminal illness. This programme of work highlighted three viewpoints, and a connected survey found that there was not one dominant viewpoint. This may help to explain why – after a number of preference elicitation studies in this area – we still cannot provide a definitive answer on whether an end of life premium exists. The survey mentioned in the end of life work refers to the Q2S (Q to survey) approach, which is a linked method to Q methodology… but that is for another blog post!
