By John D. Perry, PhD
In
the first month following HCFA's Medical-Surgical Panel Public Hearings on
April 12-13, 2000, attention was focused on the various procedural
irregularities that surrounded the formal process. Everyone was stunned by the outcome, and we waited to see the
promised protest of the Consumer Representative, Ms. Greenberger. Eventually a large number of individuals and
professional organizations submitted additional protest letters as well,
including the American Medical Association, which had not entered the process
in April.
On
May 1st I attended a poster presentation by Dr. Bary Berghmans at
the AUA meeting in Atlanta, and for the first time got a grasp on the conceptual
confusion ("biofeedback is a form of physical therapy") that
underlies the BC/BS TEC report (#1). Soon
thereafter HCFA published the Executive Committee's March 1st transcripts
and reports, and the basis for the Med-Surg Panel's activities finally became
clear. Also clear was that the EC
recommendations were honored more in the breach than the promise. (See #2-5). In #6 we point out that Burton et al (1988) didn't qualify, since
he didn't do any PME! (a TEC oversight.)
In #7 we discuss levels of evidence and the controversy with the EC. #8 concerns the claim that there were
"no relevant outcomes" in two studies. In #9 we take issue with TEC's claims of bias, and also point out
that Franke (2000) also did not have a PME-alone control group - another TEC
error. Finally, in #10 we introduce
three new forms of bias that apply to behavioral interventions such as
biofeedback, and fault the TEC heroes on all three forms of bias.
1. Berghmans' EGG and Conceptual Confusion (5/11/00)
2. What the EC recommended to Panels (5/26/00)
3. Dr. Landy's "report" (5/26/00)
4. Dr. Zendle's "report" (5/26/00)
5. The slide they asked to see (5/27/00)
6. Another BC/BS error (Burton & Burgio) (5/30/00)
Burton's Urge patients didn't get PME Alone!
7. What happened to the AHCPR Guidelines? (5/31/00)
8. No relevant outcomes? (5/31/00)
9. TEC's three forms of bias (6/1/00)
Franke didn't have a PME alone control group
10. Three additional forms of bias (6/4/00)
Bary
Berghmans presented a poster at the AUA convention in Atlanta last week that
was essentially a re-hash of his February 2000 paper on Biofeedback and Urge
Urinary Incontinence.
But
there was one important difference -- as a poster presentation he included a
graphic that was not part of that BJU paper.
The graphic showed his concept of "biofeedback", and explains
the conceptual confusion that was included in the Monaco Report, and
subsequently became the cornerstone of the BC/BS TEC report on Biofeedback.
The
key is Berghmans' drawing of the relationship between "physical
therapy" and "biofeedback".
Picture,
if you will, an egg, with a yolk. The
entire egg is labeled "physical therapy", while the yolk is labeled
"biofeedback".
In
Berghmans' view, everything that is called "biofeedback" is a sub-set
of "physical therapy". There
is no part of biofeedback that isn't a "physical therapy"
activity. Thus Berghmans set out to
"assess the efficacy of *physical therapies* for first-line use in the treatment
of urge urinary incontinence..." (Feb.
2000 abstract, emphasis added).
And,
as you know, he concludes that there are "too few studies to evaluate the
effects of PFM exercise with or without biofeedback...". That's notwithstanding that the very best
study, in terms of methodological criteria defined by Berghmans, is Burgio's
1998 JAMA study, which found that biofeedback was vastly superior to drugs or
to placebo.
Burgio's
problem is that, in spite of her study being the largest study ever done of
Urge Incontinence, it is only ONE good study, and in Berghmans' so-called
evidence-based model you have to have three high-quality studies to
"prove" a point.
[In
Berghmans' model, you need at least an n of 50; Burgio had nearly 200. In other words, if the Burgio authors had
published their results in three groups of over 65 subjects each, we would have
"three high quality studies" and biofeedback would be acknowledged as
superior to PME alone. If that isn't
arbitrary, I don't know what is.]
Interestingly enough, in the Berghmans
graphic "physical therapies" and
"electrical stimulation" are shown as two only partially overlapping
circles. In other words, there are activities
that are uniquely electrical stimulation, there are activities that are
uniquely physical therapy, and there are activities that are part of both
electrical stimulation and part of physical therapy. I don't have any problem with that.
But
what I don't understand is Berghmans' insistence that the entire world of
biofeedback can be described as a circle WITHIN the sphere of Physical
Therapy.
It
is this very assumption that leads Monaco and the BC/BS TEC report to assume
that biofeedback can be considered an "additive" to physical therapy,
and one can legitimately investigate the value of this additive by comparison
to physical therapy (PME) that does NOT include the additive, biofeedback.
But
if biofeedback is not merely a branch of physical therapy, the entire process
unravels. Suppose that biofeedback
includes elements or activities that are not properly or commonly considered
aspects of "physical exercise", and are not provided by physical
therapists? The entire PME with
Biofeedback vs. PME Alone comparison collapses.
And
there is evidence that this is so.
Ironically, it comes from one of Berghmans' own frequent collaborators,
Norwegian physiotherapist Prof. Kari Bo.
Bo argues that biofeedback therapy for incontinence cannot work by
"strengthening pelvic muscles" because biofeedback produces results
in too quick a time to be explained by changes in muscle physiology as a result
of exercise. And, since EMG readings go
up at a much faster rate than can be explained by what is known about exercise
in general sports medicine, Bo concludes, quite mistakenly, that EMG scores are
NOT a reliable indicator of pelvic muscle strength. She considers them "invalid".
There
is, of course, another explanation, which is already widely understood within
the field of biofeedback, if not in physical therapy. That is that what the EMG device is measuring is the
"effective" strength of the muscles; that is, the combination of
basic muscle physiology and vastly improved central nervous system function
resulting from biofeedback training.
[This
point is drawn from my essay "Are we really 'strengthening muscles' down
there?" in the March, 2000 issue of California Biofeedback, Jeff Cram,
editor.] It is the combination of improved CNS functioning and slight
improvements in muscle physiology that gives "biofeedback" the
competitive edge over plain pelvic muscle exercise (physical therapy)
alone.
Improvement
in CNS functioning can be readily shown, based on Bo's argument, by increases
in measured EMG Pelvic Muscle Strength that are TOO rapid to be accounted for
by exercise physiology alone. If the
EMG goes up by 50% in seven days, and exercise can only account for (say) a 5%
increase in seven days, then the other 45% clearly comes from improved CNS functioning.
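The arithmetic in that example can be written out as a minimal sketch. It assumes, as the example does, that the exercise and CNS contributions simply add together; the 5% exercise figure is the illustrative "(say)" value from the text, not a measured one, and the function name `cns_share` is hypothetical.

```python
def cns_share(total_gain_pct: float, exercise_gain_pct: float) -> float:
    """Portion of an EMG gain (in percentage points) left over after
    subtracting what exercise physiology alone could explain.

    Assumes the two contributions are simply additive, which is a
    simplification made for illustration only.
    """
    if exercise_gain_pct < 0 or total_gain_pct < exercise_gain_pct:
        raise ValueError("exercise gain must lie between 0 and the total gain")
    return total_gain_pct - exercise_gain_pct


# Illustrative figures from the text: a 50% EMG rise in seven days,
# of which exercise can account for (say) 5%.
residual = cns_share(50.0, 5.0)
print(f"Attributed to improved CNS functioning: {residual:.0f} percentage points")
```

The same subtraction applies to any other pair of figures, provided one is willing to accept the additive assumption.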
Improvements
of 30-100% in the first few weeks of biofeedback training are not uncommon,
especially after the first or second week of home biofeedback training.
Publications
promoting PMEs alone, such as those of the National Association For Continence,
are quite consistent in stating that patients must do daily PMEs for
"several months" before they will notice significant
improvement. But with biofeedback
results are obtained in a matter of a few weeks. Why?
Even
Berghmans noted "the positive trend in speed of improvement with the
addition of biofeedback (1998, p.
188)".
Ceresoli
1993, for example, compared six weeks of biofeedback with 13 weeks of plain
PMEs. The biofeedback group was
slightly, though not significantly, better than the plain group. (And Ceresoli used an inferior form of biofeedback -- perineal
measurements rather than vaginal feedback, further depressing the biofeedback
results.)
Further
support for the notion that biofeedback adds CNS improvements that are not part
of plain PME alone comes from comparison with the results of electrical
stimulation studies.
When
the muscle is only *passively* exercised, as in electrical stimulation (whether
magnetically coupled or directly coupled), the results (1) take longer, more
like plain PMEs, and (2) are not as dramatic as biofeedback results.
The
latest Neotonus results, for instance, show only a 25% reduction in pad weights
(20 -> 15 g). In another such study, leaks per day declined only 48% (1999 AUA paper). Sand et al (1995) reported a 29% reduction in leakage reports
after 14 weeks of stim therapy. In
other stim studies passive treatment required 3.5 months (Bergman and Eriksen,
1986) to "4 to 19 months" (Fall et al, 1977) to obtain good
improvements.
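For readers who want to check a percentage like the Neotonus figure, the reduction is just the drop divided by the starting value. The following is a minimal sketch using only the pad weights quoted above (20 g falling to 15 g); the function name `percent_reduction` is hypothetical.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from a starting value to a later value."""
    if before <= 0:
        raise ValueError("starting value must be positive")
    return (before - after) / before * 100.0


# Pad weights quoted in the text for the Neotonus results: 20 g -> 15 g.
pad_weight_reduction = percent_reduction(20.0, 15.0)
print(f"Pad weight reduction: {pad_weight_reduction:.0f}%")
```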
The
average training time in our biofeedback program was 4.3 visits over 8 weeks;
since patients were required to train until they had been "dry"
(without any leaks) for 30 days, the average time to dry was 4 weeks. [See http://www.incontinet.com/effective.htm
for details.]
The role of CNS enhancement in the treatment of incontinence is
already the topic of anecdotal reports on the internet, where several
researchers have reported success using EEG biofeedback alone for incontinence.
In these reports, improvement in CNS
function alone, not muscle exercise, was used to treat incontinence.
In
a panel discussion at the AAPB Convention in Denver last month, two clinicians
(Linda Kirk and Louise Marks) reported on difficult cases treated with a
combination of EMG pelvic muscle biofeedback AND EEG biofeedback (now called
"neurofeedback").
There
is an interesting parallel tradition that supports the role of CNS enhancement
in the treatment of incontinence. The
work of Moshe Feldenkrais involves a "mental rehearsal" method that
is claimed to be as effective as physical exercise at certain tasks. It provides an interesting parallel
development to biofeedback.
So
we return to the question: is biofeedback a sub-category of physical therapy? It is true, to be sure, that biofeedback is
sometimes a "modality" that is used by physical therapists, just like
hot packs and ultrasound.
Consider
an analogy: If physical therapists pray with and for their patients, does that
make "prayer" a form of physical therapy? Of course not.
In
other words, just because physical therapists sometimes do it doesn't make
biofeedback a form of PT. Like
religion, biofeedback exists outside the realm of PT. Berghmans' egg model for the relationship is simply wrong. Biofeedback should be described, like
electrical stimulation, as a partly overlapping but distinct circle of
activity.
The
HCFA-TEC presentation focused on the issue of whether ADDING biofeedback to PME
resulted in increased benefit. They
repeated the common mistake of thinking of biofeedback as a technique ADDED TO
"plain pelvic muscle exercises".
[Unfortunately,
the titles of important papers by both Burgio and Tries reinforce this
ahistorical conception of "PMEs 'enhanced' by biofeedback".] The TEC
report says:
"PMEs are the main component of treatment.
PMEs derive from the Kegel exercises developed in
the 1940s and 1950s."
But
that is simply FALSE. The "Kegel
exercises" were developed by nurses in the 1970s and 1980s when Arnold
Kegel's perineometer (the first biofeedback device) became commercially
unavailable. [See the historical essay
"The Bastardization of Kegel's Exercises" at http://www.incontinet.com/articles/art_urin/bastard.htm]
Originally, "biofeedback" was the main component of treatment
developed by Kegel in the late 1940s and 1950s. (The term "biofeedback" wasn't coined until the late
1960s, but the process Kegel used has long been recognized as the first example
of biofeedback). [See http://www.incontinet.com/articles/art_urin/20yearbf.htm]
Kegel NEVER advocated the use of the non-biofeedback exercises that were only
developed after his death.
Therefore,
an historically-correct formulation would address the question of whether
SUBTRACTING biofeedback (including daily home training with a biofeedback
device) from Kegel's Program DECREASED the effectiveness of it.
Unfortunately,
the highly-touted research projects of Berghmans and Burns DO NOT address that
question, since they did NOT use a biofeedback program like Kegel's. Kegel required the daily at-home use of a
biofeedback device. (Berghmans and
Burns did not). Kegel understood that
patients learn at different rates, and did not use a fixed number of training
sessions. (Berghmans and Burns
did.)
The
only study that did follow the Kegel model was Shepherd, Montgomery and
Anderson (1983) which found an 83% symptom reduction rate for biofeedback
compared with a 25% rate for plain PMEs.
But this study is dismissed by TEC on the grounds that they did not
perform a test of statistical significance!
[Sorry, but back in 1983 when you got clear-cut results like 83% vs. 25%
no one EVER thought it was necessary to run stats. The computer revolution didn't come until 1984, when the Mac was
introduced! Hello?]
Is "biofeedback" a sub-class of "physical therapies"? The biofeedback society (AAPB) doesn't think
so.
While
there are many prominent PTs who use biofeedback (Susan Middaugh and Stephen
Wolfe come quickly to mind), physical therapists have always been a minority in
the biofeedback world.
The
most prominent group in biofeedback is psychologists -- the same people who
developed behavior modification and behavioral medicine. Three quarters of the experts in the classic
text "Biofeedback", edited by Mark Schwartz, are psychologists. Physical therapy is an important application
of biofeedback, but biofeedback is not a branch of physical therapy.
When
Berghmans et al. and BC/BS assume that biofeedback is a special form of
physical therapy, they make a conceptual mistake that produces faulty
questions, faulty comparisons, and misleading conclusions.
I
recently recommended that listmembers review the recently-published
"Discussion Paper" at http://www.hcfa.gov/quality/8b1-i6.htm
as well as the transcript of the 3/1/00
MCAC Executive Committee ("next" button) and the "Interim
Recommendations" ("next" again) that followed the meeting. Finally I took my own advice, and here is
the result of a more careful reading.
-----------------------------------------------------
The
Working Group Report of 2/21/00 states clearly that the effectiveness of a
treatment should be evaluated "relative to other items or services" (p. 3) but this was not done in the case of
biofeedback, which was only compared to physical therapy (leading to the
exclusion of the Burgio 1998 study).
The
report also states, with respect to known IDEAL levels of evidence: "This
level of evidence will likely be unavailable for many of the interventions that
the MCAC panels will evaluate." It
further states that "in some cases the panel will determine that
observational evidence is sufficient to draw conclusions about
effectiveness" (p. 4).
In
the "Interim Recommendations", the same point is stated this way:
"However in many cases the panel will determine
that observational evidence is sufficient to draw conclusions about
effectiveness."
Yet
the April 12-13 panel was told that ONLY RCTs could be considered, and in
the case of biofeedback, only trials within physical therapy models could be
evaluated. But there is no basis in the
discussion paper for the rigid stand that HCFA staff took in evaluating
incontinence evidence.
Likewise,
the Interim Report appears fully aware of the complexities of
non-pharmacological research when it says:
"For example, the outcomes of a complex
surgical procedure can depend heavily on the skills of the surgeons and other
staff caring for the patient."
The
AAPB's testimony to HCFA in January discussed at some length the ways in which
surgery and behavioral treatments were similar, and both differ from drug
research. For instance, clinician skill
is critical in surgery and biofeedback, while relatively or completely
unimportant in electrical stimulation and pharmacology studies. (http://www.incontinet.com/isestimadrug.htm)
HCFA
and TEC apparently did not agree.
The
Interim Report makes specific recommendations for evidence review. It states:
"The panel chair should assign at least two
panel members to work closely with the authors of the evidence reports. The rationale for this recommendation is to
ensure that the evidence report covers a sufficient scope of studies, that it
considers relevant alternative interventions, and that it will be useful to the
panels in other respects. The panel should include some people who have
acquired expertise in the topic of a coverage recommendation"
There
is no indication that the first part was done.
As
far as the record goes, only HCFA staff worked with the TEC report staff. (See below for the second part.)
The
Interim Recommendations also state:
"In addition, the Executive Committee
recommends that the panel chair assign two primary reviewers for each
topic. These reviewers will not be the
individuals who assist in the development of the evidence report; they should
be new to the topic. They will evaluate
the evidence independently of one another.
Each will write a 1-2-page report ..."
Dr. Garber, panel chair, explained that this was
a new process that was only "partially" implemented for the
incontinence panel:
"There is an extensive review process that the executive committee asked for, which we have implemented partially for this panel meeting, not entirely. The review process that they recommended includes both internal and external review, and I believe that we have come very close to meeting their requests for the internal review. And we have two panel members, Dr. Lisa Landy and Dr. Les Zendle, who are essentially the internal reviewers from the panel of the topic at hand."
Others
will be less enthusiastic about how well the panel followed the
recommendations. The main problem was
that no one in the public had heard of the "Interim Recommendations"
until the hearing was already underway, so they were not aware of what was happening. [The Interim Recommendations were not
published on the HCFA website until a week AFTER the hearings.] Dr. Zendle, far
from presenting a formal 1-2 page report, made a few almost casual remarks at
the conclusion of the BCBS presentation.
But it appears that he spoke out of turn, because the agenda had listed
"open committee deliberation" to follow the Simon and Lefevre
presentations.
He
assumed it was his turn next. But after
his remarks, which were generally in full support of the TEC report, Chairman
Garber said:
DR. GARBER: Thank you. Before we proceed with other questions and comments from panelists, I think Ken Simon had a few other things to add to finish off the HCFA presentation.
The
program did not indicate that Simon would speak twice. Only after Simon finished the HCFA staff
presentation did Garber say that the open committee discussion would begin,
adding:
"As I mentioned at the outset, two panel members were designated as reviewers, Les Zendle is one of them, Lisa Landy is the other. Les, I assume that was your opening statement. And I would like to ask Lisa to speak before we open up to the entire panel to ask questions and make comments."
It
is important to note that neither Zendle nor Landy was listed on the printed
program, so the audience was not aware of the formal nature of their
"reviews".
Dr.
Zendle is a "geriatric medicine specialist" currently working for
Kaiser Permanente. He did not claim to
have ever worked with biofeedback or electrical stimulation.
Dr.
Landy, on the other hand, did meet the Interim Recommendations criteria as
expert in the topic at hand. She is a
urogynecologist who uses these techniques in her work. She was not a member of the med-surg panel,
but was imported for this one hearing, apparently to have at least one panel
member who actually knew something about biofeedback.
Landy's
testimony is worth reading in its own right, but may be best summarized by
noting that (1) she was the only panel member with professional experience in
the subject, and (2) she was the only panel member to vote in the affirmative
(and against BCBS-TEC) on both biofeedback and electrical stimulation reports.
That
tells you something!
None
of the panel members who voted against these modalities claimed to have any
professional (or personal) knowledge of them.
(For
that matter, none of the "experts" at BlueCross/BlueShield claimed
any such experience, either.)
[Note:
Diane Smith, RN, a well known expert on biofeedback and electrical stimulation,
was a "guest" on the panel, but she was not allowed to vote.]
The Interim Recommendations also contain a provision for expert opinion prior to
the public hearing. The panel...
"should ask independent experts to comment upon
the evidence report in advance of panel meetings. The opinion of experts is the best way to assure everyone, the
public and the panel, that the evidence report is complete and fair. ... The Executive Committee envisions that
the panel will choose a small number of expert reviewers (perhaps no more than
six) ... A reviewer may ask the panel's industry representative to obtain
additional information from industry sources."
Clearly,
NONE of these recommendations were followed.
Experts
in the field of Biofeedback and Electrical Stimulation were never consulted
prior to the hearing.
In
fact, the BCBS-TEC report's existence was not even made public until some two
weeks before the hearing. There was no
systematic review by any independent professionals with any clinical experience
in either field, as recommended, and therefore, no expert opinion to be made
part of the public hearing PRIOR to the panel's deliberations.
In
spite of the short notice, many biofeedback and electrical stimulation experts
did in fact review the TEC reports, and they were universal in condemning them
as inadequate and misleading. In the
end, their hands-on testimony was disregarded in favor of the arm-chair
research of "literature reviewers" employed by BlueCross/BlueShield.
There
is only one established expert group concerned with biofeedback, the
Association for Applied Psychophysiology and Biofeedback, and in spite of
frequent communications, these experts were not informed of the existence of
the TEC reports until a few days before their content was made public, just
prior to the hearings. The AAPB experts
did submit eight separate documents pertaining to the subject of incontinence
research, but this expert opinion was NEVER distributed to the Panel members,
in clear violation of both the letter and the spirit of the Interim Report.
We
can only hope that the Executive Committee will call HCFA staff on this blatant
violation of their recommendations.
On
the other hand, the EC had been warned that their role was only advisory; the
HCFA preamble to the Discussion Paper (2/21/00) stated:
"When the panels offer comments to HCFA about
medical evidence, both HCFA and the public should understand the panels’ basis
for making those judgments. Those
standards are the MCAC’s; we do not take them to be criteria or processes
binding to HCFA."
Obviously
they meant it.
Below
is the testimony of Dr. Lisa Landy, the
only voting member of the Med-Surg Panel who had ANY actual professional
experience with biofeedback and electrical stimulation. She was acting in the role of "official
reviewer" of the TEC report for the panel, although at the time the
meaning of this role wasn't clear.
Dr. Landy was not a 'regular' member of the med-surg panel, but was brought in for this one hearing because NONE of the regular
panel members had ANY experience with biofeedback or stim.
Dr. Landy was the only panel member to vote in
favor of both biofeedback and electrical stimulation at the hearing. Being experienced, she felt that the
evidence was compelling enough.
Following
Dr. Landy's comments, at page 193, is
the only humorous event in the whole two days.
Triggered
by one of Landy's remarks, Dr. Epstein
asked me to elaborate on one of my slides in "two minutes". When he clarified his request, he changed it
to "five minutes", whereupon the Chairman interjected "Not five
minutes though. Let's keep this brief", which brought a round of laughter from the panel. The stenographer did not record the
laughter.
As
I remember it, that was the ONLY laughter in two full days.
It is
noteworthy that although Dr. Epstein voted with the male majority against
biofeedback on the first day of the hearings (a few minutes after the excerpted
part), on the second day he abstained on the electrical stimulation vote.
==== from the official transcript ========
DR. LANDY: Yeah. I had some opening remarks. Some of them are kind of reiterating what's been said already today, but I kind of want to summarize things.
The first one is, the task set before us is a very specific one, and it's to answer a series [page 188] of efficacy and additional benefit. The MCAC committee has helped us and set forth guidelines for us as panel members specifically to follow, and these guidelines were set up to assess new technologies and compare them to established practices. And we're to use evidence based medicine as the foundation for our decisions.
And as we can see from today's presentations, multiple presentations, that there are several levels of evidence that we can consider and weigh appropriately when we answer these questions. We've heard today from representatives of multiple professional societies and specialty organizations presenting their consensus statements regarding efficacy of this behavioral intervention.
The 1998 [sic: 1988] NIH consensus statement recognized the efficacy of behavioral intervention and specifically biofeedback. There are guidelines of practice that we all use when we practice in this field based from the AHCPR guidelines which recommend the use of behavioral interventions, including biofeedback, as first line therapy. We also heard presentations of a technology assessment which confirmed biofeedback efficacy, and then [page 189] focused on answering the question of whether there is additional benefit achieved from biofeedback over PME alone.
I would like to summarize some of these key points that come out of today's presentations before we go into our discussion, and use this as a launching point for our deliberation. One of the points is that biofeedback is not a new technology and that the guidelines that were set up to do is to compare to established practice. Biofeedback is a very well established practice. And that goes back to the issue of why is PME alone chosen as the standard for comparison? In the original presentation by the statistician, there was the question of choosing appropriate standards. And I think we should keep that in the back of our head when we look at all this information and data.
From 1948 on, when PME was introduced, Kegel himself recognized the need of using a device to assist and be adjuvant to the PME alone. And from the very beginning of therapy in this area, a device or perineometer, or some kind of intervention was utilized. So it has always been a part of established care and standard to use some form of biofeedback method. It really isn't a new [page 190] technology.
And we have been given evidence from multiple sources, the Bump study in 1991, Kerri [sic] Bo's study in 1990, and most recently, the Sampselle study, 2000, showing the drawbacks of doing Kegel exercise with just verbal instruction, and I think that was brought up very clearly.
In 1992 and 1996 updates, the AHCPR guidelines for treatment was more developed, and this was a panel of experts in the field, who came up with these guidelines and recommendations, and they came up with these guidelines based on strong scientific evidence, rated their evidence, and this is akin to our task set before us today. Their job as panel of experts back in 1996 was very similar to what we are being charged with today. And they felt that based on their review and the strength of evidence, they've made recommendations regarding pelvic muscle rehabilitation and bladder inhibition using biofeedback therapy as recommendations for treatment of these patient groups. They specifically did not sort out biofeedback and remove it from the formula. And I think there is something flawed with that whole question of taking away a therapy that's always been part of the [page 191] treatment from the very beginning.
The technology assessment has come to certain conclusions. I think in our discussions, we can critically analyze the data. Like they said, the AHCPR guidelines specifically did not address the issue of whether the addition of biofeedback to PME is more effective than PME alone, and I think it specifically was avoided as to not take that out of therapeutic treatment modalities. We have to treat people, because we treat people in this area with multimodality treatment.
Since then though, the question has come up and been the focus of several evidence based reviews. In de Kruif and van Wegen, one in 1996; Berghmans in 1998; and the meta-analysis by Weatherall in 1999, as well as the current technology assessment, all of them with varying conclusions.
I would like to make a point too. This panel was initially charged with addressing the issue of efficacy of biofeedback as an incontinence intervention, and now we are being asked to compare it as an adjunct therapy to PME versus PME alone. Now the question is asking about efficacy as an [page 192] adjunct to a therapy, and this is an important distinction when looking at the literature. And when we reviewed this before we came here, we may not have looked at the literature in quite the same way as this nuance brings up. But for the question at hand, those studies comparing PME alone to biofeedback and PME are the ones we really need to critically review.
And we have to look at them for comparison of groups, methodology, and outcome measures. And while analyzing the data, we need to keep in mind that the PME alone groups show variability between the studies as to what the treatment intervention was in those groups, and consist of interventions other than PMEs, and that may influence the results of the data. And that brings me back to the issue of, did we select an appropriate standard to compare it to?
So that -- in one of the presentations by Dr. Perry, he gave us some slides and I think we critically need to look at those, but he brought out some of the potential information about methodology, about the PME alone group.
So, I thought that was a good launching point now for us to open up discussion.
(00193)
DR. GARBER: Thank you, Lisa. Arnie?
DR. EPSTEIN: Even without the prompting by Lisa, I was thinking the same thing, that the final slide you brought out, you actually brought out two, but the final one was particularly interesting to me, where you talked about the 25 percent, 50 to 60, and 55 to 70 percent, and he had very little time when he did that, and I wonder if we could give him two minutes to get him to expand on where those numbers came from and the strength of the studies behind them?
DR. PERRY: I didn't really get the question.
DR. GARBER: I think Dr. Epstein is asking if you can show us the last slide, is that correct, or the second to the last?
MS. SMITH: He means this one, the levels of PME where you compared the written instruction from Sampselle, Berghmans in '96, and Burgio, where you had 27 percent, then 51 to 60 percent.
DR. HILL: We have it in our handout.
MS. SMITH: We have it in our handout.
DR. EPSTEIN: Yeah, and I was really -- I have the handout and I have the visual memory, (00194) and I didn't have the Sampselle study that I can recall beforehand. It's partly because of that but also partially because I think it makes potentially an interesting case, and I wonder if you can take the talking points that you would have used five minutes for but were forced not to, and now take them.
DR. GARBER: Not five minutes though. Let's keep this brief.
DR. PERRY: The Sampselle study is especially interesting because they avoid all the problems with contamination and really did do PMEs alone. They just had a handout, here it is, a one-pager and you know, this is your education. And I'm amazed, you know, really the differences between us all come down to one thing. TEC wants to use a rigid definition of biofeedback and a catchall definition of PME alone. It's interesting because it was sort of the other way around back in the guidelines where they used surgery, clear; drugs, clear; everything else is behavioral, including stim. Does that answer? So, you have a really rigid category of biofeedback, and a catchall category of everything else counts as PME alone, and when you do that, you get nonsignificant (00195) results.
DR. LANDY: A comment I'd like to make. I think the importance of sorting out the PME alone group is that if it truly is an intervention, then what you're looking at is the result of an intervention, as opposed to how we clinically use the descriptive term of PME alone. And when clinically applied, most clinicians in this area would do some form of verbal instruction, written instruction sheet and send the patient home, and that's truly what the studies are not comparing. The studies are comparing one intervention to another, so that PME alone is not really a good standard. The best standard we have are looking at the studies with, comparing a waiting list control group, because that most represents what we see clinically, because those are people who on their own, at some point in their association with a physician were taught or told to do Kegel exercises, or they read it in a magazine article, and that's what they're doing on their own. And that best represents the result we get with PME alone clinically.
============== end of excerpt ===============
The
full transcripts are available on the hcfa.gov/quality website, or
incontinet.com.
Following
are the comments of Dr. Zendle at the
Incontinence hearings in Baltimore. As
mentioned in a previous post, Dr. Zendle was one of two panel members asked to
"review" the TEC report and prepare a recommendation to the full
panel.
Due
to a lack of clarity in the printed program, Dr. Zendle made these remarks in
the middle of the HCFA/BCBS presentation.
Taken by themselves, his remarks do not constitute much of a
"review" of the TEC report.
He first praises the AHCPR Guidelines (which TEC rejected), and then
agrees with TEC that "there isn't enough evidence". Then he complains about the lack of research
in this area. Apparently he isn't aware
that there isn't a lot of money to be made in behavioral treatment of
incontinence, and that the potential for profit -- big profit -- is what drives
research.
DR. ZENDLE: Well, this has been a very interesting day. It's hard to believe we have already been here for six hours; it's gone pretty quickly. I want to thank and congratulate the presenters and the organizers of this. I've learned a lot.
After today -- you know, I went over the questions myself beforehand and I have listened very carefully to what people had to say. And I have no problem accepting the AHCPR '96 guidelines, and I have no problem agreeing with the clinicians who feel that some patients do better with feedback and PMEs than with the exercises alone. And I actually think it should be made available to those patients who are so identified, especially if a guideline is being followed that tells you which patients it works best on and which form of biofeedback and what the regimen should be.
(00182)
But I have to agree with the TEC assessment that there isn't sufficient evidence, scientific evidence of sufficient quality really, to conclude that adding biofeedback to the exercises is better or not better than doing the exercises alone. And I guess the only other point I would make is that the statistical definition of what's enough evidence isn't really a matter of opinion, it's a scientific matter, that science has already made agreements as to what is scientifically relevant, and I don't think this meets the magnitude of that.
It does leave me with one important question, though, and that's why hasn't there been more research in this area? It's not like this is a rare problem, and it's not like these are mild symptoms. This is a common problem that is a major life disruption not only for the patient, but for families and for society. And it's shocking to me actually that there are so few patients that have been looked at in a rigorous way and therefore, we can't reach conclusions with statistical validity. And I'm not sure who's to blame for that, but it's just a question that I'm left with and frustrated with.
====== end of remarks =========
See
the full transcript on the hcfa.gov or incontinet.com websites for proper
context.
Previous
comments in this series mentioned a request by Dr. Epstein for elaboration of a
slide I had presented showing the increasing levels of effectiveness of adding
various interventions to "PME Alone".
Although
my handout containing the content of the slides was actually given to the
stenographer, the transcript as published gives only the verbal presentation,
omitting the visual presentation that was the basis for that verbal
presentation.
Since
each of us had only eight minutes to present our testimony, most people made
the same assumption that I did, namely, that we should use our limited verbal
opportunity to elaborate, rather than repeat, our visual presentations.
But
the result is that the very precise and legal official transcript is of limited
value, because it does not include the words that were presented visually at
the hearing. The reader can only guess
as to what the audience was reading while the speaker was speaking.
(IncontiNet.com
has already published, as a public service, all the testimony of expert
witnesses that was submitted for publication.
See the opening paragraphs at: http://www.incontinet.com/home.htm
for details and links.)
The
following is the text of the slide that Drs. Epstein and Landy asked me to
elaborate.
Levels of Pelvic Muscle Exercise
| Written instruction alone | 27% | Sampselle 2000 |
| ADD vaginal palpation | 51% to 60% | Berghmans 1996 |
| ADD EMG testing | 54% to 77% | Burns 1993 |
| ADD formal biofeedback training | 80% to 94% | Burgio 1998, 1986; Sussett, Kegel, etc. |
Commentary:
The numbers represent the percentage reduction in symptoms reported by the
noted researchers for the intervention added beyond "mere verbal instruction
alone", which is best exemplified in Sampselle (March, 2000).
Sampselle
et al reported on the success of a genuine "PME alone" intervention
-- they gave subjects a written handout describing how to do PMEs. They had no confounding interventions, such
as therapist's manual palpation of the pelvic muscles and verbal feedback of
success. It was a case of PURE
"PME Alone", and the results were dismal -- 27% symptom change --
barely above the expected placebo effect for therapeutic attention.
The
Wells 1991 report got the best results for "PME Alone" (77%) --
because she did much more than PME Alone. She actually "tested" subjects' pelvic muscles with an
EMG biofeedback device, before "PME Alone" and every month thereafter
for six months -- a total of *7 EMG sessions*.
TEC
2000 failed to notice this important fact, and as a result, erroneously ascribed
to "PME Alone" greater success than it deserved.
The
original slide is visible in a PowerPoint presentation referenced on the
IncontiNet home page.
We
have previously called attention to the BC/BS-TEC error in considering Burton
et al as an instance of "PME Alone vs.
PME plus biofeedback" in the treatment of Urge Incontinence. The vast majority (11 of 14, or 79%) of
Burton's control group was given "education"; but only 3 of the 14
(21%), those who had stress incontinence, were given instruction in pelvic
muscle exercise.
The
"education" resembles modern bladder retraining:
"Patients...(were) taught to respond to an urge
sensation by: relaxing; tightening their urethral sphincter; relaxing their
abdominal musculature, and, when the urge passes, walking slowly to a lavatory
to void." (p. 695)
There
is no way that can be construed as a form of "pelvic muscle exercise
alone".
(One has to read the text carefully and compare the text with the tables to discern precisely who got what. The issue is not spelled out clearly because the purpose of the 1988 research was not to address the issue TEC tried to study 12 years later.)
In
any case, Burton was invoked as an example (the only example) of a controlled
study of PME on Urge Incontinence, and NONE of Burton's Urge patients got ANY
PME at all!
In
reviewing the TEC report yet one more time, we note that a similar mistake was
made by TEC in the classification of Burgio et al, 1986 ("The role of
biofeedback...").
TEC
says:
"The nonrandomized trial by Burgio et al. (1986) assigned 24 patients to PME alone or
PME plus biofeedback after stratifying by age and frequency of
incontinence. The authors reported a
significantly greater percent improvement in incontinent episodes for patients
treated with PME plus biofeedback (76% improvement versus 51%, p <
0.05)."
But
Burgio et al, 1986 says:
"This study examined the effectiveness of
teaching pelvic floor exercises with the use of bladder-sphincter biofeedback
compared with training with VERBAL FEEDBACK BASED ON VAGINAL PALPATION in 24
women..." (Abstract, emphasis added).
Since
the point is made in the abstract, TEC didn't even have to read the article
itself to realize that this study did NOT qualify for inclusion in a field that
was supposed to be limited (albeit arbitrarily) to "PME Alone vs.
PME+BFB".
Burgio
clearly states:
"Verbal feedback training consists of
instructing the patient to squeeze the vaginal muscles around the examiner's
fingers and providing her with verbal performance feedback." (Abstract)
The Biofeedback technique of Burgio is well known, being the subject of a documentary film by NIA in 1984. Burgio herself provides substantial verbal feedback when helping the patient to interpret and understand the tracings of the polygraph.
The
study showed that patients did much better when they were given instantaneous
biofeedback with verbal interpretation instead of delayed, relayed feedback
from a human being alone.
In
addition, it seems plausible that many of the women in this study were slightly
uncomfortable performing exercises with another woman's hand inside their
vaginas, thus reducing the effectiveness of the technique.
In
any case, it is worth noting that this kind of "verbal feedback" is
more labor intensive than "PME with biofeedback", since it requires a
much higher level of clinician experience.
In other words, biofeedback costs about the same as manual-verbal
feedback therapy, but delivers 50% more results.
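The "50% more" figure can be checked against the two improvement rates TEC quotes from Burgio et al. (1986): 76% with biofeedback versus 51% with verbal feedback. The sketch below is only an arithmetic check under that reading; the function name is illustrative and does not come from any of the cited reports.

```python
def relative_gain(treatment_pct: float, comparison_pct: float) -> float:
    """Return how much larger treatment_pct is than comparison_pct, in percent."""
    return (treatment_pct - comparison_pct) / comparison_pct * 100.0


# Improvement rates as quoted by TEC for Burgio et al. (1986).
BIOFEEDBACK_PCT = 76
VERBAL_FEEDBACK_PCT = 51

gain = relative_gain(BIOFEEDBACK_PCT, VERBAL_FEEDBACK_PCT)
print(round(gain))  # 49, i.e. roughly "50% more"
```

The absolute difference is 25 percentage points; the "50% more" claim refers to the relative gain over the verbal-feedback result.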
1. What happened to the AHCPR Guidelines?
Many
people have noted that the famous AHCPR Guidelines on Incontinence (1992 &
1996) were not distributed to the April 12-13 Panel discussing
Incontinence. Both Diane Smith and Lisa
Landy, guest members of the panel, remarked about this omission explicitly, and
several of the speakers mentioned the Guidelines as having shaped and
established the accepted standards of treatment in this country.
The
omission is complex. None of the people
involved in the 2/21/00 committee report, nor the 3/1/00 Executive Committee
meeting, nor the 4/17/00 "Interim Recommendations" had any direct
awareness of incontinence treatments or research and practice in
incontinence. They were reaching for
broad principles that could be applied to ALL areas of medicine.
These
people, mostly physicians, attempted to bring the latest standards and values
in medical research into a broad guide for Medicare-related research in the new
Millennium. The latest trends emphasize
"evidence-based-medicine", which had not been invented when the AHCPR
guidelines were prepared and published.
So
the AHCPR guidelines are considered "primitive" by today's new
standards. Indeed, they are no longer
"politically correct". They
have become an embarrassment to the medical establishment.
This
is clear in item 4 in the "Discussion Report" of 2/21/00, which states:
(Note the omission of AHCPR, [now "AHRQ"])
"The standard of excellence for the evidence
report should be the best work in the private sector (e.g., Blue Cross-Blue
Shield), by professional organizations (e.g., ACP-ASIM), and for other
Federally sponsored panels (e.g., the Evidence-based Practice Centers technical
support for the U.S. Preventive
Services Task Force)."
The
politically correct view is reinforced in Dr. Sox's opening remarks to the
3/1/00 Executive Committee meeting:
And we feel that the standard for HCFA should be the best that's out there in other settings, such as the private sector where Blue Cross Blue Shield has a long track record of doing evaluations of the evidence and making coverage decisions in what is a process that's both efficient and I think highly regarded by professional organizations such as the ACP-ASIM and by other federally sponsored panels. The Agency for Health Research and Quality has a series of evidence-based practice centers in various universities, and I think there are a (00027) couple of private settings around the country, and they provide technical support for the U.S. Preventive Services Task Force on which I serve.
So
it was no accident that HCFA staff members suppressed the AHCPR report (i.e., failed
to distribute it to the panel, in spite of its historical importance in the
treatment of incontinence). They didn't
like the way the 1992 and 1996 Guidelines considered ALL levels of evidence to
have potential merit.
[The guidelines ranked evidence as
"A-B-C", based on:
controlled trials, clinical series, or mere expert opinion.]
Panel
members, for the most part, were not familiar with the history of incontinence
work and did not request the AHCPR Guidelines from HCFA staff.
2. Who decides what level of "evidence" is needed on a particular review?
In the Executive Committee deliberations, it was clearly up to the PANEL itself to decide what level of evidence was needed for the topic under review. For example, in Dr. Sox's opening remarks about the 2/21/00 committee report, he said (on 3/1):
(00019)
But in some other cases, perhaps many cases, the panel will determine that observational evidence is sufficient to draw conclusions about effectiveness.
At
the Executive Committee on March 1, Dr. Sox (chairman) is very clear that it is
up to the PANEL to decide what level of evidence should be required to reach a
decision.
But the Med-Surg panel was not told this. They were led to think that the BC/BS TEC report set the standard for them. TEC said that only RCT studies should be considered, and the Panel never questioned that.
This
point is closely related to the next one.
3. Defining the Question.
Even
more important than the "level of evidence" is the Question for which
evidence is collected in the first place.
Consider
the following exchange that took place after the AHRQ presentation on April
12th:
DR. RATHMELL: I would also like to clarify. I want to be very clear about the question that we're being asked. We are not being asked whether or not there is adequate evidence to support the effectiveness of biofeedback. We are being asked if there is adequate evidence, and now I'll quote. Here's the question in the technology: Does adding biofeedback to PME -- now biofeedback by the definition of the process, includes PMEs -- and we're being asked if adding biofeedback results in a greater improvement in health outcome, okay?
It's very important because none of the panelists have any of the evidence, and I (00051) understand there is a large body of evidence looking directly at the effectiveness of biofeedback versus control, okay? So it's very important, we're looking at a very very small subsection and we have been given the evidence only on a small subsection. So we can't answer the question about whether biofeedback is effective; all we can do is compare it to PMEs, a very very specific question.
DR. GARBER: Yeah. Thank you for that clarification. And let me emphasize that the question posed is in a sense deliberate, because our entire set of classifications for effectiveness are based on comparative statements, and as Dr. Zarin had mentioned, what you compare it to is critical in analyzing the data and making the determination. Yes?
Actually, NO. Dr. Rathmell had noticed that the panel was being asked to focus on "a very very small subsection" of the evidence, and that the panel was NOT addressing the original question, "Is Biofeedback effective?".
Does
Chairman Garber's answer make sense?
Not to me. Apparently it didn't
convince Rathmell, either. He went on
to elaborate the point:
DR. RATHMELL: This is Dr. Rathmell. Our technology assessment doesn't look at biofeedback versus control, except tangentially. There are many additional studies that look at biofeedback versus control, like a waiting list control, various control groups. And so I don't think we can answer the question as to whether there's adequate evidence, as to whether biofeedback versus control, but biofeedback versus the PMEs alone, that's all we can assess, that's (00054) all the technology addresses.
DR. GARBER: Yeah. I think that the question that we were posed by HCFA is the one that the evidence report attempts to answer. I'm not sure that it's, the question is -- I see. This question does not spell out that it's compared to PMEs alone. Is that your concern, that HCFA's question doesn't state that?
DR. RATHMELL: HCFA's question does very specifically say that what we're comparing to is PMEs alone. So I would say, that's the only question we're addressing. Versus someone sitting on a waiting list and doing nothing, they are not instructed in anything, we are not answering that question.
DR. GARBER: That's correct.
According
to Garber, the question (PME vs PME+BFB) was posed "by the evidence
report", not by HCFA. Look more
closely at what Garber said:
.....I think that the question that we were posed by HCFA is the one that the evidence report attempts to answer. I'm not sure that it's, the question is -- I see. This question does not spell out that it's compared to PMEs alone.
Garber is, of course, wrong on the first point. HCFA raised the question about Biofeedback, but the TEC report only addresses what Rathmell called a "very very small subsection" of the possible questions that could be asked about biofeedback.
HCFA's
question to the panel was about the effectiveness of biofeedback in the
Medicare population.
TEC's
report addressed a very small sub-set of the possible data that could be
marshaled to answer HCFA's question.
Electrical Stimulation was compared to (1) placebo, and (2)
alternatives, such as "PME, vaginal cones, bladder training, pharmacologic
agents". In contrast, Biofeedback
was only compared to PME. Why? Rathmell tried to raise this question, but
he didn't get very far.
4. What types of evidence should be considered?
The Executive Committee had left the door open for panels to determine that less-than-perfect evidence, such as observational evidence, might be sufficient to draw conclusions.
In contrast, the TEC report implied that there is only one standard, the "gold standard" RCT of today's evidence-based-medicine, and the panel was not given any hint that different standards might apply in different disciplines.
The
most quotable remark on April 12th was the protest of a panel member who was
aware that his negative vote on "science" could be misconstrued as a
policy recommendation. He said:
"the standard of evidence in science is .05, but the standard of evidence
in policy decisions is .50."
In
other words, we understand that the ONLY standard in science is a RCT study
with statistically significant results, but when making policy decisions, we
expect HCFA to decide on the basis of probabilities. Several panel members voiced support for this intermediate
position, and they expressed frustration over being forced to vote as they
did.
The current fad in medicine is called "evidence-based-medicine", and tries to suggest that in the past medicine wasn't "really" based on evidence -- but now it will be.
Those
who know the history of philosophy will recognize this as a late reincarnation
of the school of "Logical Positivism" of the mid-20th century. Bertrand Russell and his peers maintained,
in effect, that "If you can't kick it, it isn't real". The limitations of this philosophy soon
became apparent, and it is the object of ridicule and scorn among philosophers
today.
Those
who are trained in the Philosophy of Science understand that science includes
many levels of research that become progressively more focused and
refined. Systematic observation is NOT
"unscientific", it is just a lower level of scientific certainty than
RCTs. Sometimes it's the only evidence
available.
The
problem is that good research is expensive, and many important issues are not
addressed in the higher or highest levels of scientific research because no one
stands to profit from the investment.
In this context, it is worth noting that Dr. Zendle's review of the BC/BS TEC report consisted of two points: (1) He accepted their conclusion that there was no "real" scientific evidence in support of behavioral treatments of incontinence, and (2) he wondered why there was no research.
While
informed readers will question the first point (BC/BS phrased the question
prejudicially to exclude important studies such as Burgio 1998), there is
really no mystery about the second; there is no big money to be made funding
such research.
At the 3/1/00 Executive Committee meeting, Dr. Wayne Roe, Chairman of Covance Health Economics & Outcome Services in Washington, D.C., spoke on behalf of the Health Industry Manufacturers Association. He lamented the current trend:
(00077)
........Far too much weight on randomized controlled trials as the desired level of evidence. We're going to have them, we're going to have more of them, but they're going to be rare. And we can't afford them all. And we all know there are lots and lots and lots of reasons why we can't do them. And the FDA doesn't require them every time even for drugs. So I think you have to recognize that. There's lots of good science being done far better than before. Overemphasis on randomized controlled trials is going to make other research seem inadequate, and I think it will lead to some research not being done, some good research not being done, and things not being developed.
HCFA
staff tried to respond: Dr. Hugh Hill
said:
(00122)
....As the subcommittee report suggests, observations alone may sometimes allow a panel to make conclusions about effectiveness. Such suboptimal evidence may allow us to conclude that Medicare should cover the service.
And
Jeffrey Kang said:
(00133)
The first is I did not read in this document that there's an implication that everyone has to have a randomized controlled trial. What this document in my mind says is that's the gold standard, but to the extent that you deviate from the gold standard, you have to explain biases, how you dealt with it et cetera.
Dr.
Garber added:
(00161)
If it is uncontrolled, it is not valid evidence by itself, yet there are plenty of studies that could have valid controls that are not randomized, and I would hate for the readers of this document to think that this paragraph means you have to have randomized controlled trials.
But
six weeks later the tone stiffened.
HCFA and BC/BS-TEC adopted a hard line:
Finally, the subcommittee made, I think, a very strong statement saying that a body of evidence that consisted only of uncontrolled (00020) studies, whether based on anecdotal evidence, testimonials or case series or disease registries without adequate historical controls, is never adequate. So we really feel strongly there needs to be some form of control even if it's only historical controls.
The
Interim Recommendations report is more specific:
[The highest] level of evidence will likely be
unavailable for many of the interventions that the MCAC panels will
evaluate. There may be randomized trials
conducted in other populations (e.g., middle-aged men rather than men and women
65 years of age and older), randomized trials with important design flaws
(e.g., they are not double-blinded), or nonrandomized studies with concurrent
controls. Deciding whether such studies
constitute valid, applicable evidence can be very difficult.
...
[But
not impossible!]
In some cases, the panel may decide that it cannot
draw firm conclusions about effectiveness without randomized trials.
[Actually,
the panel never addressed this issue; they assumed it was true because TEC said
so.]
Although they do not have randomized controls, all
well designed observational studies include some form of control. Controls may consist of an implicit or
explicit control group or statistical controls.
[TEC
got around this by insisting that there was only one question: "PME
vs. PME+BFB". This was a clear violation of the recommendation,
which said that many types of control might be valid, as Dr. Rathmell tried to
point out.]
A body of evidence consisting solely of studies with
no controls whatsoever - whether based on anecdotal evidence, testimonials, or
case series - is never adequate.
However in many cases the panel will determine that observational
evidence is sufficient to draw conclusions about effectiveness. When these circumstances apply, the panel
must describe possible sources of bias and explain the basis for its decision
that bias is unlikely to account for the results.
Since
the Panel accepted TEC's definition of the "only" question, they
didn't have to address the questions of "possible sources of
bias". We will examine the role of
classic concepts of bias in a future installment.
This report examines some of the statements made in the BC/BS-TEC report on Biofeedback that may be incorrect. It does NOT directly change the bottom line, and it may be safely disregarded by those with less than obsessive interest in the accuracy of the TEC report. But it does raise some questions about TEC's accuracy, and apparent reliance on secondary sources.
==================================
The
BC/BS-TEC report states:
"The final two trials included in the Berghmans
review did not meet the selection criteria for this assessment—Castleden et
al. (1984) had no concurrent control
group and reported no relevant outcomes and Taylor and Henderson (1986)
reported no relevant outcomes."
As for Castleden, it is true that there was no "concurrent" control group; they used a "cross-over" design. But even more to the point, there was no "biofeedback" treatment group! TEC was apparently confused by Berghmans' 1998 analysis, which states:
"Five RCTs were identified comparing PFM
exercises with biofeedback (BF) against PFM alone [refs include
Castleden]" (185).
This
study keeps popping up in literature searches, apparently because the subjects
used a "perineometer". But
they did NOT use it for "biofeedback" during their empty-vagina
exercises several times a day. Instead
they simply "checked" their muscle strength once a day. A novel idea, which had never been used
before and has never been repeated in the 16 years since. (It didn't work!)
Lest
this distinction seem arbitrary, we should note that the Castleden study fails
to meet the inclusion criteria of TEC itself, which defines biofeedback (in
part) as "to assist patients in the performance of pelvic floor muscle
exercises."
Castleden's
subjects did not use the perineometer to help in the performance of their daily
exercise. Berghmans et al failed to
notice this detail.
TEC
should have read the original, instead of relying on Berghmans.
As
for "no relevant outcomes", Castleden reported that 14 out of 19
subjects no longer used protective pads.
That certainly seems like a relevant outcome.
On
the other hand, this study does not meet the inclusion requirements because
there was no "PME alone" group.
That's a different issue.
As
for the Taylor and Henderson study, what Berghmans et al (1998) actually said
was "Because data was poorly reported in the study by Taylor and Henderson
[61], it is not possible to isolate the comparisons between the
treatments" and "Again, the analysis in the study by Taylor and
Henderson [61] cannot be assessed" (185)
Taylor
and Henderson had reported a small pilot study that would have led to a
full-scale study except for the untimely death of Dr. Taylor.
They divided 12 subjects into four groups: three experimental groups (daily home biofeedback, daily home exercise with a resistive device, and weekly office biofeedback) and a fourth control (no treatment) group. Their report, in the Journal of Gerontological Nursing, states that the daily biofeedback group got a 100% continence rate, while the control group got 67%, "as was the rate obtained by the experimental groups as a whole".
The
quoted sentence is, admittedly, ambiguous.
Do they really mean that the average of ALL (3) experimental groups was
"67%"? Or should
"experimental groups as a whole" have been stated as "the REMAINING
experimental groups"?
After
all, the first interpretation would require us to assume that, since Group 1
got 100%, the other two Experimental Groups got much lower scores, which the
authors are trying to conceal.
The
more logical explanation is that the authors, having already described TWO of
the groups, are now describing the other TWO (experimental) groups -- with a
poor choice of words.
Professional
courtesy demands that any authors be given ethical credit over linguistic
skill.
According to recent discussions on the evidence-based-health email list, it is commonly accepted practice in EBM for a reviewer to CONTACT the authors in the case of textual ambiguity.
It
is perhaps understandable that Berghmans did not call Henderson from the
Netherlands, but less clear why BC/BS in Chicago didn't make the call, or at
least send her an email.
BTW,
I personally did contact Henderson at her home in Denton, Texas and she
confirms that the language used was less than ideal. They weren't trying to hide anything; they just didn't say it as
clearly as they should have.
So
the results of this small study were that ALL of the 3 members of the daily
home biofeedback group got dry, but only 2 out of three in each of the other
groups did. That sounds like a "relevant
outcome" to me.
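The two readings of "experimental groups as a whole" can be compared with simple arithmetic. The sketch below assumes what the text implies: 12 subjects split evenly into four groups of 3, with all 3 daily home biofeedback subjects dry. The variable and function names are illustrative only and do not come from the Taylor and Henderson report.

```python
from fractions import Fraction

GROUP_SIZE = 3      # assumed: 12 subjects split evenly across 4 groups
DAILY_BFB_DRY = 3   # all 3 members of the daily home biofeedback group got dry


def pct(dry: int, total: int) -> int:
    """Continence rate as a whole-number percentage."""
    return round(Fraction(dry, total) * 100)


# Reading 1: the 67% rate covers ALL three experimental groups together.
experimental_total = 3 * GROUP_SIZE                      # 9 subjects
dry_overall = int(Fraction(2, 3) * experimental_total)   # 6 dry subjects
other_two_dry = dry_overall - DAILY_BFB_DRY              # 3 dry among the other 6
reading_1_other_groups = pct(other_two_dry, 2 * GROUP_SIZE)

# Reading 2: the 67% rate describes each REMAINING group (2 of 3 dry).
reading_2_other_groups = pct(2, GROUP_SIZE)

print(pct(DAILY_BFB_DRY, GROUP_SIZE))  # 100
print(reading_1_other_groups)          # 50
print(reading_2_other_groups)          # 67
```

Under the first reading the other two experimental groups would have averaged only 50%, which the report never states; the second reading matches the 2-out-of-3 result that Henderson confirmed.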
Thus
when BC/BS says "Taylor and Henderson (1986) reported no relevant
outcomes" they did not read the primary source closely enough.
Incidentally,
the Taylor and Henderson "control" group was given pelvic muscle EMG
evaluations "before and after", so they were not a pure "no
treatment" group.
Which
leads directly to my final point.
Taylor and Henderson should never have been mentioned in the first
place, because they DID NOT HAVE a "PME-Alone" condition! Why didn't BC/BS notice that? That would have been sufficient grounds to
exclude this study under their narrow definition of the question.
While
not of earth-shattering significance in themselves, these errors do cast a
shadow over the assumed infallibility of the BC/BS technology assessment
process.
BC/BS-TEC
identified three types of "bias" in research:
1. Selection bias - Imbalances in
patient characteristics between groups with potential for differences to affect
outcomes
2. Performance bias - Inequality in the
intensity of treatment given between groups
3. Attrition bias - Significant number
of dropouts in one or more study arms, not taken into account in the
statistical analysis
Then, in summarizing the Stress Incontinence papers, they
report:
In four of the trials, one or more potential sources
of bias was identified (Shepherd et al.
1983; Burgio et al. 1986;
Ceresoli et al. 1993; Glavind et
al. 1996), while in two trials no
obvious potential sources of bias were identified (Burns et al. 1993; Berghmans et al. 1996).
In
their summary tables we read:
Burgio
1986
1. Potential for selection bias (not random,
was matched)
And
in their methodology review, TEC says:
The trial by Burgio et al. (1986), while stratified to balance the arms on age and frequency
of incontinence, was not randomized.
To
fault Burgio here seems to contradict what they had promised in the
introductory sections, where they admitted:
"Controlled
trials that are nonrandomized, while prone to selection bias, may also provide
sufficient evidence of efficacy if the comparability of the treatment arms can
be adequately assessed."
Since
Burgio DID report at least the most important characteristics of the two
groups, and they were not significantly different on age and incontinence, this
would seem to qualify under the rules TEC set down.
The same is true of the other study that Burgio participated in, Burton et al 1988,
where patients were matched and the distribution into treatment groups
was also shown to be without prejudice.
However, we have previously shown that Burton did not qualify for this
review, since NONE of Burton's control-group urge patients got ANY PME-Alone,
as required for this analysis.
Other
studies were faulted mostly for performance and attrition bias; or rather, for
the "potential" for them.
Specifically:
Shepherd
1983
1. Potential for performance bias. (5.7 vs. 3.5 sessions)
2. Potential for attrition bias. (0 vs. 27% dropouts; 3/11)
Ceresoli
1993
1. Potential for performance bias. (6 weeks vs. 3 months)
2. Potential for selection bias. (not randomized)
Glavind
1996
1. Potential for performance bias. (BF got 4 more sessions)
2. Potential for attrition bias. (5% vs. 25% dropouts; 1/20, 5/20)
(Note
that the two studies that got better BFB results got more BFB time (attention);
but the apparent exception, Ceresoli, got equal results when the PME group got more
time.)
Berghmans
1996 - none
Burns
1993 - none
The concepts of "performance bias" and "attrition bias" are
especially troublesome when evaluating biofeedback research. The Burgio, Shepherd, and Glavind
biofeedback treatments resulted in significantly better outcomes, but they are
faulted because they also took longer than PME alone (even when it wasn't really
"alone").
Ceresoli
got "similar" results with biofeedback vs. PME alone, but PME
subjects needed twice as much time as the biofeedback subjects did to get that
level. If the PME subjects had been
evaluated at 6 weeks, instead of 13, it seems most likely that biofeedback
would have produced "superior" results, just like the other
three. So Ceresoli really belongs in
the "biofeedback" column, leaving only Berghmans and Burns with null
results.
PERFORMANCE BIAS
The concept of "performance bias" is, on closer examination, much more
complex than it first appears. The assumption is that treatments can be
directly compared in a raw, quantitative way: for instance, 6 weeks of
biofeedback vs. 6 weeks of PME alone.
By
this logic, one could also compare the effects of 10 mg of aspirin with 10 mg
of morphine. But that would be
inappropriate, since the drugs operate by very different mechanisms and there
are very different standard dosages.
But
biofeedback and pelvic muscle exercise also operate by very different
mechanisms. Biofeedback relies on a
neurological enhancement that goes beyond the mere passive exercise of the
muscle fibers.
[We
have previously discussed Kari Bo's argument that biofeedback results are too
rapid to be explained by principles of sports physiology. We agree.
Biofeedback involves neurological training which is much more than just
physical exercise alone.]
In
recently published research, Detrol was given in 2 mg bid, whereas Ditropan XL
was given in a single 10 mg pill. Would
we call that a "performance bias"?
After all, they DID get more than twice as much Ditropan as Detrol!
Obviously
a solution to this dilemma is to disregard the time factor and ask "how
much improvement can a patient get with all the biofeedback they need",
vs. how much improvement can they get with PMEs alone?
Then
we compare the time factors and see which is more efficient. This is slightly awkward for researchers who
want to get results in a "reasonable" time frame, but it is
infinitely more scientific.
Note
that the only published protocol for the use of Biofeedback in clinical
practice, the Perry Protocol (1990), states that all patients are entitled to
be treated until their incontinence is resolved. "One-size-fits-all" treatments are considered unethical
in clinical practice.
Another
approach would be to use more sophisticated statistics which would project
these time differences. For instance,
if 75% of biofeedback patients are dry after 8 weeks, how long would it take to
achieve the same level with PME alone?
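As a rough sketch of what such a projection could look like, the Python fragment below assumes (purely for illustration) that in each arm a constant fraction of the still-incontinent patients becomes dry each week, so that the proportion dry after t weeks is 1 - (1 - h)^t. The 75%-at-8-weeks figure is the example just given; the 40%-at-8-weeks figure for PME alone is invented, and the function names are my own.

```python
import math

def weekly_rate(dry_fraction, weeks):
    """Weekly rate h implied by `dry_fraction` dry after `weeks`,
    under the ASSUMED model dry(t) = 1 - (1 - h) ** t."""
    return 1.0 - (1.0 - dry_fraction) ** (1.0 / weeks)

def weeks_to_reach(target_fraction, h):
    """Weeks needed to reach `target_fraction` dry at weekly rate h."""
    return math.log(1.0 - target_fraction) / math.log(1.0 - h)

# Example from the text: 75% of biofeedback patients dry after 8 weeks.
h_bfb = weekly_rate(0.75, 8)

# HYPOTHETICAL figure for PME alone: 40% dry after the same 8 weeks.
h_pme = weekly_rate(0.40, 8)

# Projected time for PME alone to reach the biofeedback level of 75%.
weeks_pme = weeks_to_reach(0.75, h_pme)
print(f"PME alone would need about {weeks_pme:.1f} weeks to reach 75% dry")
```

The constant-rate model is only one of many possible assumptions; the point is that the time difference can be estimated rather than ignored.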
ATTRITION BIAS
The other sword hanging over
biofeedback studies in the TEC report is "attrition bias". Two studies that otherwise demonstrated the
superiority of biofeedback over PMEs alone were faulted because too many
patients in the PME-alone condition dropped out [27% in Shepherd, 25% in
Glavind].
How
should we understand this? What do we
know from clinical experience about "dropouts"?
I'm
trying hard to remember, but I can't recall ever hearing a patient say
"I'm dropping your biofeedback program because I don't need it any more --
I'm cured!" What I do remember is
some patients dropping out because they thought the effort -- we required an
hour a day of home practice -- too time consuming. In the words of one patient, "Why should I keep doing these
stupid exercises when my surgeon says he can fix me in a few minutes,
permanently?"
I think we can estimate that 95% of dropouts are dissatisfied with the
effort-to-results ratio they have seen.
According
to NAFC and similar groups, it can take "several months" before PMEs
Alone produce results. Most biofeedback
programs claim results in "several weeks". Therefore, it would be a reasonable hypothesis to predict that
more subjects would drop out of the PME Alone condition than the biofeedback
condition, and that is precisely what Shepherd and Glavind found. Should they be faulted for that?
In future research we recommend the inclusion of the "dropout
hypothesis" in all comparisons.
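One way such a "dropout hypothesis" might be tested is sketched below, as a one-sided Fisher exact test written in plain Python. This is an illustration, not a method taken from any of the studies: the function name is my own, and the example counts (5 of 20 PME-alone dropouts vs. 1 of 20 biofeedback dropouts) are the Glavind figures as I read them from TEC's table quoted above.

```python
from math import comb

def one_sided_dropout_p(drop_ctrl, n_ctrl, drop_bfb, n_bfb):
    """One-sided Fisher exact p-value for the directional hypothesis that
    the control (PME-alone) arm loses more subjects than the biofeedback arm.

    It is the chance of at least `drop_ctrl` of all dropouts falling in the
    control arm if dropping out were unrelated to the treatment arm."""
    total = drop_ctrl + drop_bfb
    denom = comb(n_ctrl + n_bfb, total)
    upper = min(n_ctrl, total)
    # Sum the hypergeometric probabilities of the observed split and of
    # every split that is even more lopsided toward the control arm.
    return sum(comb(n_ctrl, x) * comb(n_bfb, total - x)
               for x in range(drop_ctrl, upper + 1)) / denom

# Counts as read from the TEC table (treat the arm assignment as an assumption).
p = one_sided_dropout_p(drop_ctrl=5, n_ctrl=20, drop_bfb=1, n_bfb=20)
print(f"one-sided p = {p:.3f}")
```

Stating the direction of the hypothesis in advance is what justifies the one-sided test; with arms this small, any such test has limited power.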
There
was one other study cited by BC/BS-TEC that, at first glance, appears to
contradict this discussion. Their one
example of PME+BFB vs. PME Alone for post-prostate incontinence was
Franke JJ, Gilbert WB, Grier J et al.
(2000).
Early
post-prostatectomy pelvic floor biofeedback. J Urol, 163(1):191-3.
And
they charge:
1. Potential for attrition bias. (33% (BFB) vs. 13% dropouts)
2. Effect of treatment possibly diluted by spontaneous improvement in both groups.
They
further state:
"Pts
randomized into one of two groups.
"Randomization
process not described.
"PME
alone – educational materials given, no specific instruction in PME."
Excuse me? If no instructions were given for
PME Alone, how can they be called a "PME alone" control group?
The
text is clear:
"Those in the control arm received no
instruction and were asked to return a voiding diary and 48-hour pad test at
the routine followup visits. It is not known whether controls performed pelvic
floor exercises without instruction to do so.
(191)"
In
addition, they got a standard packet of information for prostate surgery
patients, but....
"There is no mention of pelvic floor exercises
in this literature."
Two
things are obvious. First, it was
virtually impossible to "drop out" of the control group, since the
evaluation of incontinence was merely a routine part of their post-operative
care. Who wouldn't go back for an
insurance-paid post-op inspection?
Second,
the "control group" was not asked to do PMEs and they were not asked
if they had done any. So how can BC/BS-TEC
claim that this is a "PME-alone" control group? Even the authors don't make that claim.
In
retrospect, it is interesting to note that BC/BS-TEC has included two null
reports -- Burton for Urge and Franke for Post-prostate -- that in fact don't
even meet their own criteria for having a "PME-alone" control group.
It
is hard to believe that BC/BS-TEC could make such a gross scholarly error, not
once, but twice.
On the other hand, perhaps it was a calculated risk; they may have assumed that no
one would notice that Burton and Franke didn't have PME-alone control groups,
and should have been excluded from the get-go.
In
defense of this theory, it certainly would have been an embarrassment if they
had reported that the "Question" (PME vs PME+BFB) proved so narrow
that there was NO research in two of the three categories (Stress, Urge and
P-P) that could be examined. That would
have raised the question "Why not compare 'no treatments' and 'alternative
treatments', just like you did for electrical stimulation?" But that would have brought Burgio 1998 into
the picture.
Based
on Berghmans' assessment of Burgio 1998 as methodologically the best study ever
done on biofeedback (He rated it 8.5 on a 10 point scale, 1.5 points higher
than the next best study), it would have drastically changed the balance of
power in favor of the "effectiveness of biofeedback".
Memo
To: Dr. Hugh Hill, HCFA
Re: A Discussion of Trainer Bias, Protocol Bias,
and Instrumentation Bias and their implications for the BC/BS-TEC report.
Date: June 4, 2000
---------------------------------------------------------------
The
BlueCross/BlueShield-Technology Evaluation Center's Review of Biofeedback
Research for the Treatment of Incontinence concluded that only two
"scientific" (i.e., controlled experimental studies) met their
criteria of NOT containing potential sources of methodological bias --
Berghmans 1996 and Burns 1993.
Since both studies also failed to show a statistically significant difference
between Biofeedback experimental groups and so-called "PME-Alone" control
groups, TEC concluded that there was no evidence that biofeedback contributed
anything to "PME Alone".
The
fact that biofeedback showed a "trend" towards better results (Burns)
or faster results (Berghmans) is not considered convincing evidence.
There
are, theoretically, two broad reasons why experimental and control groups could
be statistically indistinguishable as in these two studies.
First,
the results in the experimental group could be exceptionally BAD.
Second,
the results in the control group could be exceptionally GOOD.
And,
of course, it is possible that BOTH of these conditions could occur
simultaneously.
The
Burns and Berghmans studies appear to include both of these
features. At the time of publication,
Burns set a new record LOW for published success using EMG biofeedback, 61%
symptom reduction.
Three
years later, her bad results were beaten by a NEW record LOW of 54% set by
Berghmans et al.
In both cases, "PME-alone" control groups -- which actually received
much more than "PME-alone" -- performed unusually well, thus leading
to non-significant group differences.
TEC
concludes from this that "no differences" is the norm, but TEC
overlooked important sources of bias in these studies.
TEC's "external" methodological considerations of possible sources of bias
consisted of three possible factors -- Performance (attention) bias, Attrition bias, and
Selection bias. According to TEC, none
of these sources of bias were present in the "B&B" studies, and,
therefore, their non-significant conclusions were ruled valid.
We
have already pointed out that scientific procedure does NOT permit the fallacy
of "ACCEPTING the null hypothesis", which in this case means drawing
the conclusion that THERE IS NO DIFFERENCE.
All
that we can scientifically say is that we CANNOT prove that there IS a
difference, which is a very different matter.
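The distinction can be made concrete with a small power calculation. The sketch below is illustrative only: the arm size (20) and the "true" success rates (75% vs. 50%) are invented, and the function names are my own. It enumerates every possible outcome of such a trial and adds up the probability that a two-sided Fisher exact test would reach p < 0.05.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials at success rate p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def fisher_two_sided(a, b, n1, n2):
    """Two-sided Fisher exact p-value for a/n1 vs. b/n2 successes."""
    total = a + b
    denom = comb(n1 + n2, total)

    def prob(x):
        return comb(n1, x) * comb(n2, total - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, total - n2), min(n1, total)
    # Add up every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

def power(n, p1, p2, alpha=0.05):
    """Chance that a trial with n subjects per arm and TRUE success rates
    p1 and p2 yields a statistically significant result."""
    return sum(binom_pmf(a, n, p1) * binom_pmf(b, n, p2)
               for a in range(n + 1) for b in range(n + 1)
               if fisher_two_sided(a, b, n, n) < alpha)

# HYPOTHETICAL trial: 20 subjects per arm, true rates 75% vs. 50%.
print(f"chance of detecting the difference: {power(20, 0.75, 0.50):.2f}")
```

Whatever that chance turns out to be, a non-significant result in a single small trial of this kind says only that a difference was not detected, not that none exists.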
In
testimony presented to HCFA (but NOT distributed to the Med-Surg Panel, in violation
of announced procedures), we have also pointed out that most surgical and
behavioral treatments share several characteristics with each other that are
not common to pharmacological treatments upon which the "RCT"
methodology is based.
In
drug research, the purity of the active ingredient, and therefore its potency,
is subject to prior external control by the FDA. "Good Manufacturing Procedures" require constant and
complete testing by chemists to assure that when a research subject is given,
say, 10 mg of Detrol, they really ARE getting 10 mg of Detrol. Thus the evidence-based analysis of Detrol
studies does NOT need to look at the composition and quality of the drug
treatment furnished to subjects; it can safely be assumed.
For
a variety of reasons discussed below, this assumption can NOT be made in
EITHER surgical OR behavioral research analysis.
We
leave it to surgeons to articulate the implications of this for surgical
studies, and present three sources of bias that need to be evaluated in the
analysis of behavioral research designs.
In the present report we will focus on biofeedback, but most of these
points apply to the entire field of behavioral interventions.
-------------------------------------------------------------
1. TRAINER BIAS - Trainer bias refers to
the demonstrated professional skills level of the person making a biofeedback
or other behavioral intervention.
Surgeons
would be outraged if a study of the efficacy of collagen injections was
presented in which the clinician's credentials were that of a Nurse
Practitioner with no additional training by the manufacturer. (NPs give "injections",
right?) When such
"injections" fail to achieve the expected therapeutic result, they
would rightly complain that it wasn't the collagen's fault but, most likely,
the clinician's lack of general skills and specific training that caused the
failure.
In the same way, biofeedback clinicians are outraged when persons with no
biofeedback credentials at all, AND no additional training from the
manufacturer, are allowed to do "research" with biofeedback
instruments.
When
such "biofeedback training" fails, they rightly complain that it
isn't the instrument's fault, but, most likely, the clinician's lack of general
skills and specific training that caused the failure.
We
note in this regard that the American Psychological Association has long held
that a mere weekend workshop is usually not sufficient training for a
psychologist to undertake a new specialty, such as biofeedback or incontinence training.
For
nearly two decades the Biofeedback Certification Institute of America has
conducted a national program which guides the formal (classroom and
apprenticeship) training of biofeedback practitioners. A comprehensive written professional examination
AND a hands-on demonstration of clinical skills are required in order to use
the designation "BCIA-C" after one's name to indicate such formal
certification.
In
addition, continuing education requirements must be met to re-certify every
three years.
In
order to be above suspicion in biofeedback research, the clinician delivering
the experimental treatment should be BCIA-Certified (just as the surgeon should
be Board-Certified in his or her specialty).
In
addition, the treatment of pelvic muscle dysfunctions with EMG biofeedback
presents a variety of special problems not found in general EMG muscle
rehabilitation, so specialized training beyond BCIA is essential to properly
use these instruments and treat these patients.
Several
of the major incontinence-equipment manufacturers, as well as American and
European biofeedback professional organizations, and both nursing and physical
therapy associations, offer such specialized training on a regular basis,
usually two or three times a year, EACH.
Mere
attendance at an "incontinence workshop", without BCIA certification,
would not remove the heavy cloud of suspicion about "Trainer Bias"
that otherwise renders a biofeedback research project inconclusive.
In
addition to these technologically-oriented professional skills, it is widely
acknowledged (by the AHCPR guidelines, for example) that one of the
requirements of a good biofeedback trainer is the ability to motivate patients
to a high level of performance.
In
this regard, a biofeedback trainer is much like an athletic coach, or more
aptly, a "personal fitness trainer" who generates enthusiasm for the
biofeedback process and motivates the patient to put in many hours of home
practice between office visits.
Since one of the training tools involves ensuring that the patient understands and
appreciates the intermediate success of the biofeedback training, it should be
obvious that not only double but also single blinding is impossible in this
field, and sham feedback would be impossible as a practical matter. (The therapist would have to explain the
significance of false or sham data which did not correspond to the subject's own
proprioception.)
[In
this regard, it is instructive to note that highly touted attempts to create
"virtual reality" scenes through visual presentations have fallen
flat, because the subject knows from inner-ear signals, that s/he is still
sitting in a chair on the floor, and not flying through space!]
2. PROTOCOL BIAS - Protocol bias refers to
the optimization of biofeedback training procedures to ensure that the patient
or subject derives maximum benefit from the experience.
The
original protocol, developed by Arnold Kegel, MD, in the late 1940s, had two
essential elements that are still considered valid today.
First,
every one of Dr. Kegel's patients
engaged in daily at-home biofeedback with a biofeedback instrument, in addition
to periodic office visits to evaluate practice.
Dr. Kegel was convinced that faithful daily
biofeedback practice was critical to the success of his protocol; so convinced,
indeed, that he published a chart showing a decline in measured muscle strength
of a few points following a single day of "skipped" practice.
Any research project that does not afford subjects the opportunity to use
biofeedback on a daily basis may be considered to demonstrate "protocol
bias", such that insignificant results are more likely attributable to the
protocol than to "biofeedback" itself.
In
practice there are reasonable modifications of Kegel's protocol; for instance,
having the patient use the clinic's office-level biofeedback instruments on a
daily basis, as was done in an Italian study some years ago.
But
insofar as the research deviates from Kegel's daily practice protocol, any LACK
of significant results raises the issue of protocol bias. Positive results, of course, can demonstrate
that the new protocol (for example, three office visits a week) was NOT a
handicap. But lack of positive results
is most likely the result of an inferior biofeedback protocol.
The
second feature of Dr. Kegel's protocol
is that the length of the biofeedback treatment was always tailored to the
experience of the individual patient.
Kegel
recognized that, in internet jargon, EPID ("every person is
different"). Since pelvic muscle
rehabilitation is both a learning and an exercise problem, some people will
need more, while others need less, exposure to the biofeedback protocol.
It is possible to statistically control for Protocol Bias, by measuring and
plotting the daily (Kegel) or weekly (most modern systems) progress in both
Treatment effects (also called "intermediate effects") as well as
outcome effects (symptom reduction).
[This is not very different from analyzing Attrition bias for trends.]
In such controlled research, an averaged plot of progress in experimental and
control groups would show when, if ever, their respective trend lines would
cross, somewhat like Ceresoli showed that 6 weeks of biofeedback gave the same
results as 13 weeks of PME-alone.
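A minimal sketch of that kind of analysis follows. The weekly figures are invented for illustration (they merely echo the 6-week vs. 13-week pattern described for Ceresoli, and are not data from any study), and the function name is my own. Given each group's averaged weekly progress, it finds by linear interpolation when each trend line first reaches a chosen level.

```python
def weeks_to_level(weekly_means, target):
    """First time, in weeks (linearly interpolated), at which a group's
    averaged progress reaches `target`.

    weekly_means[i] is the group mean at the end of week i + 1; progress is
    taken to start at 0 at week 0. Returns None if the target is never reached."""
    if target <= 0:
        return 0.0
    prev_week, prev_value = 0, 0.0
    for week, value in enumerate(weekly_means, start=1):
        if value >= target:
            # Interpolate between the previous point and this one.
            fraction = (target - prev_value) / (value - prev_value)
            return prev_week + fraction * (week - prev_week)
        prev_week, prev_value = week, value
    return None

# HYPOTHETICAL mean symptom reduction (%) at the end of each week.
bfb = [15, 30, 42, 52, 58, 62]                               # 6 weeks
pme = [5, 10, 16, 22, 28, 33, 38, 43, 47, 51, 55, 59, 62]    # 13 weeks

target = 60  # percent symptom reduction
print("biofeedback reaches target at week", weeks_to_level(bfb, target))
print("PME alone reaches target at week", weeks_to_level(pme, target))
```

Reporting the two times side by side turns "how long did treatment last" from a fixed design choice into a measured outcome.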
Unfortunately,
this is seldom done; the experimenter determines a priori and with no empirical
justification that 4, 6, or 8 weeks of "treatment" under both conditions
should be sufficient to demonstrate any differences. This is akin to saying that each patient is getting 5 mg of
oxybutynin, regardless of effects.
Many drugs are studied on the basis of a known relationship between drug
concentration and body weight, since (in those cases) body weight may
reasonably affect their potency. In the
behavioral field, interventions might be tailored to "brain weight",
or, more precisely, an "expected speed of skills acquisition" based
on the patient's age and cognitive and physical status. But absent such sophisticated
considerations, it is inappropriate to arbitrarily assume that "one size
fits all"; it usually doesn't.
In
summary, if the subjects are not provided with daily biofeedback of sufficient
duration, and they do not receive apparent benefit from the experimental
treatment, the burden of proof is on the researcher to show that they really
did get "real" biofeedback, just like Dr. Kegel taught.
3. INSTRUMENTATION BIAS -- Instrumentation bias
refers to the importance of having and using appropriate sensors and electronic
instruments to perform minimally-effective biofeedback training.
It
should be obvious, but a report on using common kitchen utensils to perform
suspension surgeries would not reflect badly on the entire field of
"surgery" if it failed.
Before
it can be considered NOT to be a biasing factor, any new instrumentation must
have been demonstrated to be effective, at least once, in producing positive
results. For example, several years ago
a form of "underwear" with embedded EMG sensors was used in a
biofeedback study that did not achieve significant results. Unfortunately the electrode design did not
provide reliable signals, so the only conclusion that can be drawn refers to
the novel equipment, not to "biofeedback" as an intervention.
The sensors that have been shown to be most effective for biofeedback training are
those which are inserted in the vagina or anus (or both). This includes both manometric devices
(Kegel, Burgio, Shepherd) as well as EMG sensors (Binnie, Taylor, Perry,
Baigis-Smith, Williams, etc.)
Binnie et al (1991) showed that electrical stimulation electrodes (i.e.,
circumferential electrodes) were NOT effective at detecting EMG signals,
compared with longitudinal electrodes, but the latter correlated highly
(>0.91) with inserted needle EMG electrodes. Use of proper electrodes is significant in biofeedback.
Very
few studies have dealt with the advantages of modern computerized biofeedback
instruments, although there is universal agreement among clinicians that for
most biofeedback applications, modern equipment has clear advantages in
training -- usually, it is in terms of making it easier for the patient to see
and understand the objective of the training.
In
the PT field of muscle rehabilitation in general, clinical opinion is clearly
in favor of intelligent EMG devices which provide "pattern matching". In pelvic muscle work, clinicians find
somewhat simpler "Work-Rest" mode instruments valuable.
[General
purpose EMG biofeedback instruments are designed with a single target --
relaxation. In pelvic muscle work, both
"contraction" and "relaxation" are targets. The most useful devices record and calculate
"work" and "rest" averages separately to show progress on
both items.]
In
addition, there is considerable theoretical and clinical experience in support
of the fact that both fast twitch and slow twitch muscle fibers have roles in
maintaining continence, so the most useful biofeedback devices calculate both
quick (short, or flick) values and sustained (hold, or 10-second endurance)
values, in order to determine deficiencies and, therefore, to set practice
objectives.
While these issues have not been the subject of empirical study, the fact is that all
biofeedback instrument manufacturers seeking the attention of incontinence
clinicians provide these features on their instruments. In general, experienced clinicians agree
that the standardization of data collection facilitates understanding the
patient's condition and prescribing the most appropriate combination of
therapeutic interventions. Most would
scoff at the research assumption that all patients should be given the same
exercise assignment each week.
A RE-EXAMINATION OF TEC
In view of these three additional sources of bias that may render
research conclusions unscientific, we now propose to reconsider the
"outcomes" tables in the TEC report to see how their "no
bias" studies fare.
------------------------------------------------
Berghmans
et al, 1996
1. Trainer bias
2. Protocol bias
3. Instrumentation bias
1. Trainer bias. The credentials of the clinician in this report are not
specified. It is doubtful if the person
was generally qualified, or specifically trained for this task.
The
therapist did NOT present or discuss EMG values with the subjects, so
opportunities for and evidence of subject motivation were minimal.
2. Protocol bias. Berghmans attempted to use a new intensive physical therapy model
developed by Bo for biofeedback. It has
never been shown effective in any biofeedback study, and was not effective
here.
In
addition, Berghmans did not provide daily biofeedback practice, and limited the
study to an exceptionally short 4 week period.
3. Instrumentation bias. Berghmans used an electrical stimulation electrode (Verimed, USA)
instead
of an EMG sensor, so the quality of the signal was clearly sub-standard.
Berghmans
used a general-purpose EMG instrument which did not quantify work-rest
intervals. (And other details are
impossible to verify.)
Therefore,
with at least six possible sources of bias, the conclusions of this study
cannot be considered as "scientific evidence".
-----------------------------------------------------
Burns
et al, 1993.
1. Trainer bias
2. Protocol bias
3. Instrumentation bias
1. Trainer bias. The therapist treating experimental subjects was supposed to
obtain BCIA certification (according to her grant application) but did NOT do
so.
Nor is there evidence that this therapist received any specialized training from
anyone with established skills in the field.
2. Protocol bias. Burns' subjects were trained in the office with biofeedback
instruments, but were expected to practice their exercises at home with empty
vaginas -- a very different condition.
They
were not given home biofeedback instruments for daily practice, as Kegel had
shown to be effective.
They
were all treated for the same eight weeks, without regard to individual
differences in learning skills.
Compared with the clinical literature, they made very poor progress.
3. Instrumentation bias. Burns used a general-purpose EMG "stand-alone" (early
office) device that was designed for relaxation training, not for muscle
rehabilitation.
Burns
used a private definition of contraction strength, the "best five seconds
out of ten", instead of the standard ten-second average. Therefore, her intermediate result (an
increase from 2.0 to 4.0 microvolts in the experimental group) is much more a
"fast twitch" measurement than a true sustained score. [i.e., if 10-second scores were used, she
probably would not have a significant treatment effect, which would account for
the absence of a significant outcome effect.] We can't really say if Burns'
subjects received any benefit from their "biofeedback
treatment".
An additional note: Burns' study population included
only about 10% of her recruited subjects.
There were so many exclusionary criteria that she ended up with a pool
of not only elderly, but also debilitated subjects (at least in terms of pelvic
conditions). Proof of this is found in
the fact that her research subjects AFTER therapy were still much worse than her
PILOT subjects had been BEFORE her therapy.
Conclusion:
With at least six unresolved sources of potential bias, the Burns study is NOT
of sufficient methodological quality to provide "scientific" evidence
about the effectiveness of biofeedback in the treatment of incontinence.
Summary:
We have shown that the two studies cited by TEC as being "without
bias" when considered only externally were indeed fraught with bias
problems when the quality of their "biofeedback" intervention is
examined.
From
the perspective of persons trained in biofeedback, both the Burns and Berghmans
studies must be dismissed as not providing scientific evidence -- one way or
the other.
Since BC/BS has already dismissed the remaining four (Stress) studies, and we have
previously shown that the one Urge study and the one Post-prostatectomy study did NOT
include "PME-alone" control groups, the only rational conclusion is
that there is NO scientific evidence that can be brought to bear on this
subject.
In
the words of the Industry Representative on the Med Surg Panel, it appears that
TEC has set the bar too high to address the question.
Therefore,
we are forced to agree with the American Medical Association (and others) who
have urged that other levels of evidence, including clinical research, be
considered by HCFA when attempting to address the issue at hand.
We
also agree with the Interim Recommendations, that persons with expert
familiarity with the topic under discussion need to be invited to participate
in the technology assessment.
At
the very least that ought to include consultation with the only professional
organization devoted to the study of biofeedback, the Association for Applied Psychophysiology
and Biofeedback (AAPB). That was not
done.
Respectfully
submitted,
John
D. Perry, PhD, MDiv, BCIA-C (Senior
Fellow)
Dr.
Hill replied:
Copyright (c) 2000 IncontiNet.com
URL: www.incontinet.com/april12.htm
Last edited on 05/04/05