STATISTICS AND ME
Malcolm Hooper
September 2011
Statistics
and the conclusions drawn from them are nowhere more crucial than in
the delivery of medical care. Drawing appropriate conclusions from
correctly processed and interpreted data is vital. Where this
doesn’t happen, the consequences can be devastating.
Randomised controlled trials (RCTs)
are seen as the gold standard – so how is it possible to arrive at a
translation into clinical practice that presents a nightmare to some
of the sickest people in the country? This is what happened with the
PACE Trial, in which it was possible for a participant to
deteriorate physically over the course of the trial yet still be
reported as having “recovered”.
PACE is the acronym for “Pacing,
Activity, and Cognitive behavioural therapy, a
randomised Evaluation”; it cost over £5 million and was
described as a Government-funded RCT of rehabilitation strategies
for patients with Chronic Fatigue Syndrome (CFS)/Myalgic
Encephalomyelitis (ME).
Introduction
Imagine the following fictitious
scenario: a high impact medical journal publishes the results of a
clinical trial described as a large RCT of “Eutensia”, a new drug
for refractory hypertension; the results are impressive, with 30% of
those given “Eutensia” reported as having normal blood pressure at
the end of the trial. However, the trial investigators had
redefined “the normal range” of blood pressure such that it was
possible for a person receiving “Eutensia” to leave the trial with
higher blood pressure than before treatment but still be
counted as having blood pressure “within the normal range”.
Statistics and how they are reported
are currently a hot topic for people with ME following publication
of the report of the PACE trial in The Lancet earlier this year (1),
in which a similar scenario actually occurred.
Prior to publication, the Principal
Investigators deviated from the statistical analysis described in
the Trial Protocol (2) with the
result that a participant could deteriorate on both primary outcome
measures following treatment and still fall within the redefined “normal
range” (interpreted as “normal” health).
Even worse, an
accompanying Comment (3) in The
Lancet described around one third of participants as “recovered”,
an error that The Lancet’s senior editor acknowledged in writing but
which has still not been corrected, so it remains on the record to
be cited uncritically.
The SF-36
physical function score of 60 used by the Investigators to define
the threshold of the “normal range” specifically for the PACE
Trial (discussed below) contradicts how the authors themselves
previously defined the markers of recovery in the same disorder
using the same measure – in 2007 they stated: “A patient had
to score 80 or higher to be considered as recovered”
(4) and in 2009 asserted: “A
cut-off of less than or equal to 65 was considered to reflect severe
problems with physical functioning” (5).
Moreover “recovery” is described in the Protocol as a
physical function score of 85 or above.
In a post-publication letter sent by
the Investigators to The Lancet, they acknowledge that: “Being
within a ‘normal range’ is not necessarily the same as being
‘recovered’”, yet they have failed to correct this
widely-reported misperception in the media and the medical press.
Indeed one of the PACE Principal Investigators added to it at the
press conference convened to launch the paper: “twice as many
people on graded exercise therapy and cognitive behaviour therapy
got back to normal”; this was reported verbatim the
following day in The Guardian, whose health correspondent stated: “More
people recover if they are helped to try to do more than they think
they can” (6). To date, no
“recovery” data have been published.
What is ME?
The World Health Organisation has
classified ME as a neurological disorder since 1969 and codes the
term “chronic fatigue syndrome” (CFS) only to ME and
explicitly not to other syndromes of chronic fatigue such as
those seen in psychiatric disorders (7).
Taxonomically and clinically, chronic fatigue and CFS are not the
same, as confirmed in 1990 by the American Medical Association (8);
by the WHO in 2001, 2004 and 2009 (9)
and by the US Centres for Disease Control in 2011 (10).
The International Consensus Criteria
for ME (11) produced by 26 world
experts from 13 countries point to widespread inflammation and
multisystemic neuropathology, of which the cardinal symptom is
post-exertional neuroimmune exhaustion: “Myalgic
encephalomyelitis (ME), also referred to in the literature as
chronic fatigue syndrome, is a complex disease involving profound
dysregulation of the central nervous system and immune system,
dysfunction of cellular energy metabolism and ion transport and
cardiovascular abnormalities. The underlying pathophysiology
produces measurable abnormalities in physical and cognitive
function and provides a basis for understanding the symptomatology”.
PACE – what’s it all about?
From its inception, PACE was
controversial because it was based on the Investigators’ belief that
ME/CFS is a psychogenic illness that is reversible with cognitive
behavioural therapy (CBT) to “change the behavioural and
cognitive factors assumed to be responsible for perpetuation of the
participants’ symptoms and disability” (1), together with
graded exercise therapy (GET) designed to reverse their “deconditioning”.
The Investigators’ beliefs are
undermined by substantial clinical and biomedical evidence,
including that of international experts such as Professor Paul
Cheney from the US, who has reported that “We see cardiac
diastolic dysfunction in almost every case” and that some ME
patients’ heart function “is so poor they would fit well into a
cardiac ward awaiting transplant”. On graded exercise he is
unequivocal: “The whole idea that you can take a disease like
this and exercise your way to health is foolishness. It is insane”
(12).
The PACE participants’ leaflet stated:
“Medical authorities are not certain that CFS is exactly the same
illness as ME, but until scientific evidence shows that they are
different they have decided to treat CFS and ME as if they are one
illness”. However, the Investigators’ standpoint on CFS bears no
relationship to the WHO classification nor to what biomedical
experts mean by the same term. This has created confusion amongst
clinicians and unnecessary suffering for ME patients.
People with ME have long been saying
that conflating ME/CFS with psychogenic fatigue is at the root of
public and medical misperception and mistreatment.
Despite many submissions of concern,
and whilst insisting that they were studying ME, the Investigators
used entry criteria for chronic “fatigue” known as the Oxford
criteria (13) which have neither an
appropriate degree of sensitivity to identify those with ME, nor the
specificity to separate them from the wider “fatigued” population.
Writing to The Lancet’s
editor-in-chief following publication, the Investigators implicitly
acknowledge this: “The PACE trial paper…does not purport to be
studying CFS/ME but CFS defined simply as a principal complaint of
fatigue that is disabling, having lasted six months, with no
alternative medical explanation (Oxford criteria)”.
The Trial Protocol, however, clearly
refers to patients with “CFS/ME”.
Despite their letter to The Lancet
confirming that they were not studying ME, the Investigators assert
that the results of the PACE trial are generalisable to those who
meet either the Oxford or alternative criteria for ME “but only
if fatigue is their main symptom”. This has been interpreted as
meaning that CBT and GET are effective no matter how the disorder is
defined, an illogical assertion. There is a direct link
between such conceptual confusion and the likelihood of iatrogenic
harm.
In professionally analysed surveys
conducted by various ME charities, a large proportion of respondents
reported that CBT and GET were harmful, resulting in substantial
deterioration; certainly it has been demonstrated that incremental
exercise induces prolonged and accentuated oxidative stress,
compounding the existing cellular damage.
THE PACE TRIAL FINDINGS
Scrutiny of the chosen definition of the “normal
range” and the entry criteria reveals a manifest contradiction,
which the table below illustrates.
It also shows that the benchmarks used differed
considerably from those to which the Investigators had committed
themselves in the Protocol.
PACE Trial Benchmarks | SF-36 Physical Function sub-scale (lower scores mean poorer physical functioning) | Chalder Fatigue Questionnaire (higher scores mean more fatigue) |
Entry Criteria | 60 or below when recruiting began; subsequently raised to 65 “to increase recruitment” | ‘Bimodal’ score of at least 6 → this can translate to a score as low as 12 on ‘Likert’ (scale) rating method |
Analysis Conducted: Threshold of “The Normal Range” | 60 and above | ‘Likert’ score of 18 or less → this can translate to a score as high as 9 on ‘bimodal’ (binary) rating method |
Analysis Proposed (Trial Protocol): “a positive outcome” | 75 and above (or a 50% improvement over baseline) | ‘Bimodal’ score of 3 or less (or a 50% improvement over baseline) |
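The overlap shown in the table can be checked with simple arithmetic. The sketch below is illustrative only and rests on stated assumptions: that the Chalder Fatigue Questionnaire has 11 items (consistent with the 0 – 33 Likert range), that each item is scored 0, 1, 2, 3 under ‘Likert’ scoring and 0, 0, 1, 1 under ‘bimodal’ scoring, and that SF-36 physical function scores move in steps of 5. All names in the code are hypothetical.

```python
from itertools import combinations_with_replacement

N_ITEMS = 11                          # assumed: 11 items, giving a 0-33 Likert range
LIKERT = (0, 1, 2, 3)                 # assumed per-item Likert scores
BIMODAL = {0: 0, 1: 0, 2: 1, 3: 1}    # assumed bimodal scoring of the same responses


def score_pairs():
    """Yield every achievable (bimodal total, Likert total) pair."""
    # Item order does not change the totals, so multisets of responses suffice.
    for responses in combinations_with_replacement(LIKERT, N_ITEMS):
        yield sum(BIMODAL[r] for r in responses), sum(responses)


pairs = set(score_pairs())

# Entry criterion: bimodal score of at least 6.
min_likert_at_entry = min(lik for bim, lik in pairs if bim >= 6)

# "Normal range" threshold: Likert score of 18 or less.
max_bimodal_in_normal_range = max(bim for bim, lik in pairs if lik <= 18)

# SF-36 physical function: entry at 65 or below, "normal range" at 60 or above.
sf36_scores = range(0, 101, 5)        # assumed 5-point steps
overlap = [s for s in sf36_scores if 60 <= s <= 65]

print(min_likert_at_entry)            # 12
print(max_bimodal_in_normal_range)    # 9
print(overlap)                        # [60, 65]
```

On these assumptions, a participant could enter the trial with a physical function score of 65, finish at 60, and still be counted within the “normal range”; likewise a fatigue score that satisfies the entry criterion (bimodal 6 or more) can also satisfy the “normal range” threshold (Likert 18 or less).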
Meaningful Benchmarks?
The Protocol
set specific benchmarks against which findings were to be judged;
additionally, in 2006 the Chief Principal Investigator assured the
Multicentre Regional Ethics Committee that “a categorical
positive outcome” would be an SF-36 score of at least 75, saying
that this would “[reassert] a ten-point score gap between
entry criterion and positive outcome”, and that it
“would bring the PACE trial into line with the FINE trial, an MRC
funded trial for CFS/ME and the sister study to PACE” (14).
When in April
2010 the FINE (Fatigue Intervention by Nurses
Evaluation) Trial reported, the difference between
intervention and comparison groups at the primary outcome point was
not statistically significant, so it is notable that when the PACE
report was published in February 2011, these same benchmarks of “a
positive outcome” had been dropped.
Remarkably, in view of the complexity
of much of the analysis presented in The Lancet article, the
Investigators offered this explanation: “Changes to the original
published protocol were made to improve…interpretability”
(15).
What is “Normal”?
“The normal range” and the lay
term “normal” are not the same. “The normal range” is
a statistical concept, with a technical definition – the range of
values encompassed by the mean plus or minus one standard deviation
from the mean. For the Investigators to infer that “within the
normal range” equates to normal health is misleading, because “normal”
in lay terms means high physical function with little or no
impairment.
Around 90% of the general population
are within the “normal range” according to the benchmark used
to gauge PACE participants’ outcomes – ie. 60 and above for SF-36
physical function, with only 10% functioning at a lower level. In
stark contrast to the general population, around 70% of PACE
participants who underwent CBT/GET failed to reach the
Investigators’ redefined “normal range” and remained in the
poorest-functioning 10% of the population. It is the remaining 30% that
has been repeatedly quoted as evidence that around one third of
participants “recovered” with CBT and GET.
However, what the
Investigators failed to clarify was that this 30% figure related to
participants who received both CBT and Specialist Medical
Care (SMC): as 15% of the SMC alone group were in the “normal
range”, in reality CBT added 15% to that figure (GET added 13%),
so to allow the media to believe the 30% figure relates to
effectiveness of CBT/GET is misleading.
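The arithmetic behind this point can be set out as a minimal sketch, using only the percentages quoted above; the variable names are hypothetical, and the GET total is derived from the stated increment rather than quoted directly.

```python
# Percentages of participants within the "normal range", as quoted in the text.
smc_alone = 15       # Specialist Medical Care alone
cbt_plus_smc = 30    # CBT delivered alongside SMC
get_added = 13       # increment attributed to GET in the text

# Increment attributable to CBT over and above SMC alone.
cbt_added = cbt_plus_smc - smc_alone

# Implied GET-plus-SMC total; derived here, not stated directly in the text.
get_plus_smc = smc_alone + get_added

# Share of the headline 30% figure that SMC alone already accounts for.
smc_share_of_headline = smc_alone / cbt_plus_smc

print(cbt_added)              # 15
print(get_plus_smc)           # 28
print(smc_share_of_headline)  # 0.5
```

In other words, half of the headline figure was achieved in the comparison group without CBT.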
Moreover, the Investigators present an
uncommonly low threshold of “the normal range” on physical
function:
“In another
post-hoc analysis, we compared the proportions of participants who
had scores of both primary outcomes within the normal range at 52
weeks. This range was defined as less than the mean plus 1 SD scores
of adult attendees to UK general practice of 14.2 (+4.6) for fatigue
(score of 18 or less) and equal to or above the mean minus 1 SD
scores of the UK working age population of 84 (–24) for physical
function (score of 60 or more)”.
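The two thresholds in this passage follow directly from the quoted means and standard deviations. A minimal sketch, assuming whole-number questionnaire scores and using hypothetical function and variable names:

```python
import math


def one_sd_bounds(mean, sd):
    """Return (mean - 1 SD, mean + 1 SD) for a reference sample."""
    return mean - sd, mean + sd


# Fatigue: "less than the mean plus 1 SD" of 14.2 (SD 4.6).
_, fatigue_upper = one_sd_bounds(14.2, 4.6)
# The upper bound is about 18.8, so the largest whole-number score below it
# is its floor.
fatigue_threshold = math.floor(fatigue_upper)

# Physical function: "equal to or above the mean minus 1 SD" of 84 (SD 24).
physical_function_threshold, _ = one_sd_bounds(84, 24)

print(fatigue_threshold)            # 18
print(physical_function_threshold)  # 60
```

The sketch reproduces the Investigators’ arithmetic only; it says nothing about whether the reference samples or the one-standard-deviation rule were appropriate choices.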
This is curious because the paper
cited in support of this figure reviews normative data from various
sources, none of which appears to provide a mean of 84.
Contrary to what is stated in The
Lancet, the reference group included elderly people, a fact which
the Investigators had no option but to acknowledge:
“We did however make a descriptive
error in referring to the sample…as a ‘UK working age population’,
whereas it should have read ‘English adult population’” (16).
The “English adult population”
includes not only the elderly but also sick people: the appropriate
comparator should have been data from an age-matched healthy
population.
In a radio interview, one of
the Investigators stated candidly:
“What this
trial isn’t able to answer is how better are these treatments than
really not having very much treatment at all”
(17).
After a trial that screened over 3,000 patients, lasted 9 years and
cost £5 million, that is an extraordinary statement.
Evidence of Efficacy?
Mean Improvements
The Investigators conclude that CBT
and GET “moderately improve outcomes for chronic fatigue
syndrome”.
This claim
rests on relatively better average outcomes on measures of
fatigue and physical function among those who received CBT or GET
alongside “Specialist Medical Care” compared with the group who
received SMC alone (SMC consisted of advice on balancing activity
and rest, and also help with sleep and pain control). A fourth arm
of the trial – adaptive pacing therapy (APT), the Investigators’ own
version of “pacing” – did not emerge favourably.
Two primary outcome measures were used
to assess the Trial: fatigue was assessed using the Chalder Fatigue
Questionnaire (18) and physical
function was assessed using the Short-Form (SF-36) physical function
subscale (19).
The Investigators determined a “clinically
useful difference” for the two primary outcomes to be an
improvement of 2 points on the Chalder Fatigue scale (Likert scoring
0 – 33) and 8 points on the SF-36 physical function scale (0 – 100).
CBT and physical function: the CBT group failed to achieve a
“clinically useful” mean improvement: the mean difference from SMC
was only 7.1 points, below the 8-point threshold. This was mentioned
only indirectly by the Investigators: “Mean differences between
groups on primary outcomes almost always exceeded predefined
clinically useful differences for CBT and GET when compared with APT
and SMC” (where the words “almost always” refer to the failure of
CBT to achieve a clinically useful difference).
CBT and fatigue: the mean difference from SMC was -3.4, a marginal
1.4 points better than the clinically useful threshold of 2.
GET and physical function: the mean difference from SMC was 9.4,
trivially better than for CBT and a marginal 1.4 points above the
clinically useful difference of 8 points on a scale of 0 – 100.
GET and fatigue: the mean difference from SMC was -3.2 (i.e. 1.2
points better than the clinically useful threshold of 2 points on a
scale of 0 – 33).
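The margins quoted above can be reproduced with a short sketch. It uses only the figures given in the text; the names are hypothetical, and the sign convention (lower fatigue scores and higher physical function scores count as improvement) is taken from the table of benchmarks.

```python
# Clinically useful differences defined by the Investigators.
CLINICALLY_USEFUL = {"fatigue": 2.0, "physical function": 8.0}

# Direction of improvement: fatigue improves as the score falls,
# physical function improves as the score rises.
IMPROVEMENT_SIGN = {"fatigue": -1, "physical function": 1}

# Mean differences from the SMC-alone group, as quoted in the text.
mean_differences = {
    ("CBT", "physical function"): 7.1,
    ("CBT", "fatigue"): -3.4,
    ("GET", "physical function"): 9.4,
    ("GET", "fatigue"): -3.2,
}


def margin(outcome, difference):
    """Improvement minus the clinically useful difference, to one decimal."""
    improvement = IMPROVEMENT_SIGN[outcome] * difference
    return round(improvement - CLINICALLY_USEFUL[outcome], 1)


margins = {key: margin(key[1], diff) for key, diff in mean_differences.items()}

for (therapy, outcome), m in margins.items():
    verdict = "exceeds" if m >= 0 else "falls short of"
    print(f"{therapy}, {outcome}: {verdict} the threshold by {abs(m)} points")
```

A negative margin marks an outcome that did not reach the clinically useful difference; on these figures only CBT on physical function does so, by 0.9 points.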
These
results challenge the Investigators’ assertions that psychological
interventions should be the primary management strategy for patients
with ME/CFS.
Both
primary outcomes – physical function and fatigue – were
self-reported, but studies of graded exercise for ME/CFS patients by
other investigators have demonstrated that self-report
questionnaires do not relate well to actual activity (20).
Indeed, one US study found that when objective actigraphy measures
were used, there was a numerical decrease in activity from the
pre-treatment baseline (21).
Secondary outcome measures
A secondary
outcome measure was the 6 minute walking distance test. The mean
distance recorded by those who had undergone CBT was 354 metres. For
those who had undergone GET the mean distance was 379 metres, the
latter being a 67-metre increase from baseline.
These distances
were lower than those documented in many other serious
diseases: in patients being assessed for lung transplantation, a six
minute walking distance of less than 400 metres is regarded as a
marker for placing a patient on the transplant list (22),
and the mean distance of those in class III heart failure is 402
metres (23). PACE Trial participants did not
achieve a mean six minute walking distance of 518 metres, a level
considered abnormal for healthy people aged 50-85 years (24).
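To put the walking distances side by side, the following sketch compares the two trial means with the three comparison distances cited above. The figures are those quoted in the text; the GET baseline is derived from the stated 67-metre increase rather than quoted, and the names are hypothetical.

```python
# Mean six minute walking distances (metres) reported in the text.
cbt_mean = 354
get_mean = 379
get_gain_from_baseline = 67

# Implied GET baseline; derived here, not quoted in the text.
get_baseline = get_mean - get_gain_from_baseline

# Comparison distances cited in the text (references 22 to 24).
benchmarks = {
    "lung transplant listing marker": 400,
    "class III heart failure mean": 402,
    "level considered abnormal in healthy people aged 50-85": 518,
}

# Shortfall of each trial arm against each comparison distance, in metres.
shortfalls = {
    name: {"CBT": metres - cbt_mean, "GET": metres - get_mean}
    for name, metres in benchmarks.items()
}

print(f"Implied GET baseline: {get_baseline} m")
for name, gap in shortfalls.items():
    print(f"{name}: CBT {gap['CBT']} m short, GET {gap['GET']} m short")
```

Every shortfall is positive, so both arms finished below all three comparison distances.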
Moreover, data on the 6 minute walking
test was available for only 69% - 76% of participants, a completion
figure roughly 20% lower than for the other secondary outcome
measures, for which the Investigators offer no explanation.
Significantly, the
CBT group managed less of an average increase in walking distance
than those in the SMC alone group.
The Distortion Continues
At the press
conference, both the lay and medical press picked up on the PACE
Trial as a resounding success with no caveats whatsoever.
On 18th
February 2011 The Independent proclaimed: “Got ME? Just get out
and exercise”; the Daily Mail reported that “scientists have
found encouraging people with ME to push themselves to their limits
gives the best hope of recovery” and on-line medical
sources such as NHS Choices and NHS Evidence exaggerated reports of
a successful outcome.
A nightmare for ME patients
Given
(i) the inability of the
recruitment criteria to distinguish between ME/CFS and psychogenic
fatigue,
(ii) the illogical overlap of the entry criteria with
“the normal range”,
(iii) the failure of CBT to achieve a
clinically useful difference for one of the primary outcomes and the
trivial improvement produced by GET,
(iv) the failure to recognise
that an “averaged” improvement often masks very different responses
to an intervention, and
(v) the fact that around two thirds of
participants who received CBT/GET remained in the lowest functioning
10% of the general population,
the international ME community
wonders why the PACE Trial is being hailed as a “gold standard”
study which demonstrated the efficacy of CBT and GET for ME/CFS
patients (although the Protocol refers to it as an RCT, The Lancet
paper at no point describes PACE as a controlled trial, yet it was
described in the press release as “the highest grade of clinical
evidence” and as “extremely rigorous (and) carefully
conducted”).
CBT and GET are
being actively and inappropriately applied to people with ME or CFS;
the PACE press release states that the results suggest: “everyone
with the condition should be offered the treatment” and that
every patient “who wishes to be helped” should be willing to
take part in such regimes. Non-compliance (for example, if a person
has already found that exercise exacerbates their condition) is
deemed to demonstrate lack of desire to recover, which in some
instances has already led to the withdrawal of state and/or
insurance benefits.
The PACE
Trial is a travesty of science and a tragedy for patients with ME.
References
1. Comparison of adaptive pacing therapy, cognitive behaviour therapy, graded exercise therapy, and specialist medical care for chronic fatigue syndrome (PACE): a randomised trial. PD White et al. The Lancet 5th March 2011:377:823-836; published online 18th February 2011: DOI:10.1016/S0140-6736(11)60096-2 (FAST TRACKED)
2. PACE Trial Protocol http://www.biomedcentral.com/1471-2377/7/6
3. Chronic fatigue syndrome: where to PACE from here? G Bleijenberg and H Knoop. The Lancet: published online February 18, 2011 DOI:10.1016/S0140-6736(11)60172-4
4. Is Full Recovery Possible after Cognitive Behavioural Therapy for Chronic Fatigue Syndrome? Hans Knoop, Gijs Bleijenberg, Marieke FM Gielissen, Jos van der Meer, Peter D White. Psychotherapy and Psychosomatics 2007:76:171-176
5. Fatigue and chronic fatigue syndrome-like complaints in the general population. Marjolein van’t Leven, Gerhard A Zielhuis, Jos van der Meer, Andre L Verbeek, Gijs Bleijenberg. European Journal of Public Health 2009:20:3:251-257
6. Study finds therapy and exercise best for ME. Sarah Boseley. The Guardian, 18th February 2011
7. WHO International Classification of Diseases (ICD-10 G93.3)
8. Following an erroneous News Release in 1990 about this point, the American Medical Association issued a correction which said: “A news release in the July 4 packet confused chronic fatigue with chronic fatigue syndrome; the two are not the same. We regret the error and any confusion it may have caused”. JAMA issues correction (referring to the article entitled Chronic fatigue: A prospective clinical and virologic study by Deborah Gold et al: JAMA 1990:264:1:48-53).
9. On 16th October 2001 the WHO provided written clarification: “I wish to clarify the situation regarding the classification of neurasthenia, fatigue syndrome, post-viral fatigue syndrome and benign myalgic encephalomyelitis. Let me state clearly that the World Health Organisation (WHO) has not changed its position on these disorders since the publication of (ICD-10) in 1992 and versions of it during later years. Post viral fatigue syndrome remains under the diseases of the nervous system as G93.3. Benign myalgic encephalomyelitis is included within this category. Neurasthenia remains under mental and behavioural disorders as F48.0 and fatigue syndrome (note: not The Chronic Fatigue Syndrome) is included within this category. However, post viral fatigue syndrome is explicitly excluded from F48.0”
On 23rd January 2004 the WHO provided further written clarification: “This is to confirm that according to the taxonomic principles governing the Tenth Revision of the World Health Organisation’s International Classification of Diseases and Related Health problems (ICD-10) it is not permitted for the same condition to be classified to more than one rubric as this would mean that the individual categories and subcategories were no longer mutually exclusive”
On 30th January 2009 the WHO re-confirmed the position: “I confirm that the WHO has not changed its position regarding benign myalgic encephalomyelitis. Statements made in the past…regarding coding and classification of the aforementioned condition are still valid. There is no evidence that any change should be made to this in ICD-11”
10. “CFS is different than fatigue. CFS is a long-lasting debilitating illness with impact similar to heart disease, multiple sclerosis and AIDS”: US Centres for Disease Control; Emergency Preparedness: Consideration in CFS; power point for physicians, 18th August 2011
11. Myalgic Encephalomyelitis: International Consensus Criteria. Carruthers BM, van de Sande MI, de Meirleir KL, Klimas NG, Broderick G, Mitchell T, Staines D, Powles ACP, Speight N, Vallings R, Bateman L, Baumgarten-Austrheim B, Bell DS, Carlo-Stella N, Chia J, Darragh A, Jo D, Lewis D, Light AR, Marshall-Gradisnik S, Mena I, Mikovits JA, Miwa J, Murovska M, Pall ML, Stevens S. J. Intern Med. 2011 doi:10.1111/j.1365-2796.2011.02428.x
12. DVD of “Invest in ME” CPD (Continuing Professional Development-accredited) Conference, 2010 http://www.investinme.org/IiME%20Conference%202011/IiME%20International%20ME%20Conference%202011%20DVD%20Orders.htm
13. A report - Chronic Fatigue Syndrome: Guidelines for Research. M Sharpe et al. JRSM: 1991: 84:118-121
14. Letter dated 9th February 2006 sent by Professor Peter White to Mrs Anne McCullough, Administrator, West Midlands Multi-centre Research Ethics Committee
15. The PACE trial in chronic fatigue syndrome – Authors’ reply. The Lancet: doi:10.1016/S0140-6736(11)60651-X
16.
17. Comparison of treatments for chronic fatigue syndrome – the PACE trial. ABC National Radio: The Health Report. http://tinyurl.com/84a9vf3
18. Development of a Fatigue Scale. Trudie Chalder, Simon Wessely et al. J Psychosom Res 1993:37:2:147-153
19. The MOS 36 item short form health survey (SF-36): II. Psychometric and clinical tests of validity in measuring physical and mental health constructs. McHorney CA, Ware JE, Raczek AE; Med Care 1993. 31: 247–63
20. Physical activity in chronic fatigue syndrome: assessment and its role in fatigue. Vercoulen JH et al. J Psychiat Res 1997: 31(6):661-673
21. Cognitive behaviour therapy in chronic fatigue syndrome: is improvement related to increased physical activity? Friedberg F et al. J Clin Psychol: 2009:65(4):423-442
22. The six minute walk test: a guide to assessment for lung transplantation. Kadikar A et al; J Heart Lung Transplant 1997:16(3):313-319
23. Six minute walking test for assessing exercise capacity in chronic heart failure. DP Lipkin et al; BMJ 1986:292:653
24. Six minute walking distance in healthy elderly subjects. T Troosters et al; Eur Respir J 1999:14:270-274.