Does EA fail in GAD and IBS?

Inspired by Mak et al J Gastroenterol Hepatol 2019.[1]

Standing on a glass floor at this height generates a degree of anxiety!

This paper caught my eye because the title seems to pronounce a definitive negative result to a useful question concerning the value of EA (electroacupuncture) in GAD (generalised anxiety disorder) in patients with IBS (irritable bowel syndrome). I was curious because definitive negative results, as opposed to neutral ones, are difficult to achieve. The majority of trials show neutral results, ie neither positive nor negative, yet they are unlikely to be powered for equivalence. So acupuncture might fail to be significantly better or worse than a control or comparator, but cannot be said to be equivalent since that usually requires twice the statistical power. Non-inferiority is another matter, and it is easier to achieve, but you have to plan that from the outset. Trials are generally powered to be able to discard a one-sided null hypothesis, eg the null hypothesis would be that acupuncture is not better than a comparator. A two-sided hypothesis might be that acupuncture is not better and not worse than a comparator, and hence equivalent. If that is too much, don’t worry, this trial used a simple one-sided hypothesis: is EA more effective than non-penetrating sham EA in comorbid GAD and IBS?

The EA included pairs of needles in all four limbs: my favourite ST36 linked with ST37; as well as PC6 linked to HT7 (a bit fiddly to perform I expect). The sham used a foam block to conceal the non-penetrating retractable needle. It is not clear whether or not the foam blocks were used in the real EA group.

The primary outcome was referred to as the 7-item Patient Health Questionnaire (PHQ) section for anxiety; to avoid confusion, note that this is the instrument also known as the GAD-7.

The results are stated as follows:

80 subjects, 40 in each arm, were randomized. All but 2 in the sham group completed 10 weekly sessions. There was no significant difference in the proportion of patients experiencing significant (>/= 50%) reduction of anxiety symptoms between the two groups immediately after intervention (32.4% vs 21.6%, p=0.06) and at 6-week follow-up (25.7% in electroacupuncture vs 27% in sham, p=0.65).

from the abstract of Mak et al.[1]

There certainly seems to be a trend favouring EA immediately after the treatment course, but that clearly wears off by 6 weeks. Could the authors have missed a short-term effect of EA? The usual threshold adopted for a one-sided test is p<0.05, ie a 1:20 chance; p=0.06 is roughly 1:16-17. To me this is screaming out as a possible type 2 statistical error, which means there is a real difference between groups but the trial fails to show it statistically because it is underpowered (too small). Just looking at the figures you would suspect that: in acupuncture for chronic pain we need 400 in each arm to measure the difference of real over sham, so 40 looks as though it falls rather short of the mark.
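To put a rough number on that suspicion, here is a minimal sketch of an approximate power calculation for comparing two proportions. It is an illustration only: it treats the observed post-treatment response rates (32.4% vs 21.6%) as if they were the true rates, which is a strong assumption, and it uses a textbook normal approximation rather than anything taken from the paper. The function name `approx_power` is my own.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p1, p2, n_per_arm, alpha=0.05, one_sided=True):
    """Approximate power of a z-test comparing two proportions (equal arms).

    Normal approximation; p1 and p2 are the ASSUMED true response rates.
    """
    norm = NormalDist()
    # Critical value for the chosen significance level
    z_alpha = norm.inv_cdf(1 - alpha if one_sided else 1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    sd_null = sqrt(2 * p_bar * (1 - p_bar))        # spread under the null
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))   # spread under the alternative
    z = (abs(p1 - p2) * sqrt(n_per_arm) - z_alpha * sd_null) / sd_alt
    return norm.cdf(z)


# Assumption: the observed rates are the true rates
power_40 = approx_power(0.324, 0.216, 40)
power_400 = approx_power(0.324, 0.216, 400)
print(f"approximate power with 40 per arm:  {power_40:.2f}")
print(f"approximate power with 400 per arm: {power_400:.2f}")
```

Under these assumptions, 40 per arm gives a power well below the conventional 80%, whereas 400 per arm would be comfortably above it.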

How do you tell if a trial was adequately powered (big enough)? Your next stop should be the power calculation:

In the trial of Eich et al, 60.7% of patients with GAD produced clinically significant response to acupuncture, while response rate to sham acupuncture was 21.4%. To achieve a power of 80% and assuming Type I error of 0.05, the sample size required to detect this difference in response, was estimated to be 29 for each arm. Presuming a 30% drop-out rate, a total of 80 participants were randomized.

from the methods section of Mak et al.[1]
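For readers who want to see how such a figure arises, the following is a minimal sketch of a standard sample size formula for comparing two proportions, fed with the response rates quoted above. The paper does not say which formula or software was used, so the choice of a normal approximation, the optional continuity correction, and the function name `n_per_arm` are all my assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p1, p2, alpha=0.05, power=0.80, two_sided=True, continuity=False):
    """Participants needed per arm to detect p1 vs p2 (normal approximation).

    Illustrative only: the trial's actual method is not stated in the paper.
    """
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)
    z_beta = norm.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    diff = abs(p1 - p2)
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff ** 2
    if continuity:
        # Continuity correction: inflates n, most noticeably in small trials
        n = n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2
    return n


# Response rates taken from the quoted power calculation (Eich et al)
plain = n_per_arm(0.607, 0.214)
corrected = n_per_arm(0.607, 0.214, continuity=True)
print("per arm, no continuity correction:  ", ceil(plain))
print("per arm, with continuity correction:", ceil(corrected))
```

With these inputs the two-sided, continuity-corrected version comes out at 29 per arm, the figure quoted in the paper, although that may or may not be how the authors arrived at it. Either way the required number is only this small because the assumed difference between groups (60.7% vs 21.4%) is so large.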

A clinically significant response in over 60%! Well, that is a good result from Eich et al.[2] So now we need to see what exactly Eich et al did and what was measured. They had only 10 patients with GAD and used a CGI (clinical global impression) scale for the outcome. Not only was it a different (very likely softer) outcome, but a sample of n=10 carries a severe risk of overestimating the effect, and there is no mention of any estimate of SD (standard deviation), an essential requirement for a power calculation.[3]

Well, that seems to be the major oversight in this otherwise well-reported study. In conclusion, I think it is premature to say that EA is ineffective in GAD, but it certainly looks as though any effects are short-lived.


