Critical Comments on the BC/BS-TEC report
and HCFA's Medical-Surgical Panel Meeting

 

By John D. Perry, PhD

 


 

In the first month following HCFA's Medical-Surgical Panel Public Hearings on April 12-13, 2000, attention was focused on the various procedural irregularities that surrounded the formal process.  Everyone was stunned by the outcome, and we waited to see the promised protest of the Consumer Representative, Ms. Greenberger.  Eventually a large number of individuals and professional organizations submitted additional protest letters as well, including the American Medical Association, which had not entered the process in April.

 

On May 1st I attended a poster presentation by Dr. Bary Berghmans at the AUA meeting in Atlanta, and for the first time got a grasp on the conceptual confusion ("biofeedback is a form of physical therapy") that underlies the BC/BS TEC report (#1).  Soon thereafter HCFA published the Executive Committee's March 1st transcripts and reports, and the basis for the Med-Surg Panel's activities finally became clear.  Also clear was that the EC recommendations were honored more in the breach than in the observance (see #2-5).  In #6 we point out that Burton et al (1988) didn't qualify, since they didn't do any PME! (a TEC oversight.)  In #7 we discuss levels of evidence and the controversy with the EC.  #8 concerns the claim that there were "no relevant outcomes" in two studies.  In #9 we take issue with TEC's claims of bias, and point out that Franke (2000) also did not have a PME-alone control group - another TEC error.  Finally, in #10 we introduce three new forms of bias that apply to behavioral interventions such as biofeedback, and fault the TEC heroes on all three.

 

 

Contents

1. Berghmans' Egg and Conceptual Confusion (5/11/00)

2. What the EC recommended to Panels (5/26/00)

3. Dr. Landy's "report" (5/26/00)

4. Dr. Zendle's "report" (5/26/00)

5. The slide they asked to see (5/27/00)

6. Another BC/BS error (Burton & Burgio) (5/30/00)

       Burton's Urge patients didn't get PME Alone!

7. What happened to the AHCPR Guidelines? (5/31/00)

8. No relevant outcomes?  (5/31/00)

9. TEC's three forms of bias (6/1/00)

     Franke didn't have a PME alone control group

10. Three additional forms of bias (6/4/00)

 

 


1. Berghmans' Egg and Conceptual Confusion (5/11/00)

 

Bary Berghmans presented a poster at the AUA convention in Atlanta last week that was essentially a re-hash of his February 2000 paper on Biofeedback and Urge Urinary Incontinence. 

 

But there was one important difference -- because it was a poster presentation, he included a graphic that was not part of that BJU paper.  The graphic showed his concept of "biofeedback", and it explains the conceptual confusion that was included in the Monaco Report, and subsequently became the cornerstone of the BC/BS TEC report on Biofeedback.

 

The key is Berghmans' drawing of the relationship between "physical therapy" and "biofeedback".

 

Picture, if you will, an egg, with a yolk.  The entire egg is labeled "physical therapy", while the yolk is labeled "biofeedback".

 

In Berghmans' view, everything that is called "biofeedback" is a sub-set of "physical therapy".  There is no part of biofeedback that isn't a "physical therapy" activity.  Thus Berghmans set out to "assess the efficacy of *physical therapies* for first-line use in the treatment of urge urinary incontinence..." (Feb.  2000 abstract, emphasis added).

 

And, as you know, he concludes that there are "too few studies to evaluate the effects of PFM exercise with or without biofeedback...".  That's notwithstanding that the very best study, in terms of methodological criteria defined by Berghmans, is Burgio's 1998 JAMA study, which found that biofeedback was vastly superior to drugs or to placebo. 

 

Burgio's problem is that, in spite of her study being the largest study ever done of Urge Incontinence, it is only ONE good study, and in Berghmans' so-called evidence-based model you have to have three high-quality studies to "prove" a point.

 

[In Berghmans' model, you need at least an n of 50; Burgio had nearly 200.  In other words, if the Burgio authors had published their results in three groups of over 65 subjects each, we would have "three high quality studies" and biofeedback would be acknowledged as superior to PME alone.  If that isn't arbitrary, I don't know what is.]

Interestingly enough, in the Berghmans graphic "physical therapies" and "electrical stimulation" are shown as two only partially overlapping circles.  In other words, there are activities that are uniquely electrical stimulation, there are activities that are uniquely physical therapy, and there are activities that are part of both electrical stimulation and physical therapy.  I don't have any problem with that.

 

But what I don't understand is Berghmans' insistence that the entire world of biofeedback can be described as a circle WITHIN the sphere of Physical Therapy.

 

It is this very assumption that leads Monaco and the BC/BS TEC report to assume that biofeedback can be considered an "additive" to physical therapy, and one can legitimately investigate the value of this additive by comparison to physical therapy (PME) that does NOT include the additive, biofeedback. 

 

But if biofeedback is not merely a branch of physical therapy, the entire process unravels.  Suppose that biofeedback includes elements or activities that are not properly or commonly considered aspects of "physical exercise", and are not provided by physical therapists?  The entire PME with Biofeedback vs. PME Alone comparison collapses.

 

And there is evidence that this is so.  Ironically, it comes from one of Berghmans' own frequent collaborators, Norwegian physiotherapist Prof. Kari Bo.  Bo argues that biofeedback therapy for incontinence cannot work by "strengthening pelvic muscles" because biofeedback produces results in too quick a time to be explained by changes in muscle physiology as a result of exercise.  And, since EMG readings go up at a much faster rate than can be explained by what is known about exercise in general sports medicine, Bo concludes, quite mistakenly, that EMG scores are NOT a reliable indicator of pelvic muscle strength.  She considers them "invalid". 

 

There is, of course, another explanation, which is already widely understood within the field of biofeedback, if not in physical therapy.  That is that what the EMG device is measuring is the "effective" strength of the muscles; that is, the combination of basic muscle physiology and vastly improved central nervous system function resulting from biofeedback training.

 

[This point is drawn from my essay "Are we really 'strengthening muscles' down there?" in the March, 2000 issue of California Biofeedback, Jeff Cram, editor.]

It is the combination of improved CNS functioning and slight improvements in muscle physiology that gives "biofeedback" the competitive edge over plain pelvic muscle exercise (physical therapy) alone.

 

Improvement in CNS functioning can be readily shown, based on Bo's argument, by increases in measured EMG Pelvic Muscle Strength that are TOO rapid to be accounted for by exercise physiology alone.  If the EMG goes up by 50% in seven days, and exercise can only account for (say) a 5% increase in seven days, then the other 45% clearly comes from improved CNS functioning.
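The arithmetic behind that argument can be sketched in a few lines.  This is only an illustration: the function name is hypothetical, the 5% exercise figure is the "(say)" value used above rather than a measured one, and the sketch assumes the two contributions simply add.

```python
def unexplained_gain(total_gain_pct, exercise_gain_pct):
    """Return the part of a measured EMG gain (in percentage points)
    that exercise physiology alone does not account for.

    Assumes a simple additive split, as in the essay's illustration.
    """
    if total_gain_pct < 0 or exercise_gain_pct < 0:
        raise ValueError("gains must be non-negative")
    if exercise_gain_pct > total_gain_pct:
        raise ValueError("exercise cannot explain more than the total gain")
    return total_gain_pct - exercise_gain_pct


# Illustrative figures from the text: 50% measured gain in seven days,
# of which exercise is assumed to explain 5%.
residual = unexplained_gain(50.0, 5.0)
print(f"Gain attributed to CNS function: {residual:.0f} percentage points")
```

On this reading, the larger and faster the measured EMG gain, the larger the share that has to be attributed to something other than muscle physiology.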

 

Improvements of 30-100% in the first few weeks of biofeedback training are not uncommon, especially after the first or second week of home biofeedback training.

 

Publications promoting PMEs alone, such as those of the National Association For Continence, are quite consistent in stating that patients must do daily PMEs for "several months" before they will notice significant improvement.  But with biofeedback, results are obtained in a matter of a few weeks.  Why?

 

Even Berghmans noted "the positive trend in speed of improvement with the addition of biofeedback (1998, p.  188)". 

 

Ceresoli 1993, for example, compared six weeks of biofeedback with 13 weeks of plain PMEs.  The biofeedback group did slightly, though not significantly, better than the plain group.  (And Ceresoli used an inferior form of biofeedback -- perineal measurements rather than vaginal feedback, further depressing the biofeedback results.)

 

Further support for the notion that biofeedback adds CNS improvements that are not part of plain PME alone comes from comparison with the results of electrical stimulation studies.

 

When the muscle is only *passively* exercised, as in electrical stimulation (whether magnetically coupled or directly coupled), the results (1) take longer, more like plain PMEs, and (2) are not as dramatic as biofeedback results.

 

The latest Neotonus results, for instance, show only a 25% reduction in pad weights (20 -> 15 g).  In another such study (a 1999 AUA paper), leaks per day declined only 48%.  Sand et al (1995) reported a 29% reduction in leakage reports after 14 weeks of stim therapy.  In other stim studies passive treatment required 3.5 months (Bergman and Eriksen, 1986) to "4 to 19 months" (Fall et al, 1977) to obtain good improvements.

 

The average training time in our biofeedback program was 4.3 visits over 8 weeks; since patients were required to train until they had been "dry" (without any leaks) for 30 days, the average time to dry was 4 weeks.  [See http://www.incontinet.com/effective.htm for details.]

The role of CNS enhancement in the treatment of incontinence is already the topic of anecdotal reports on the internet, where several researchers have reported success using EEG biofeedback alone for incontinence.  In these reports, improvement in CNS function alone, not muscle exercise, was used to treat incontinence.

 

In a panel discussion at the AAPB Convention in Denver last month, two clinicians (Linda Kirk and Louise Marks) reported on difficult cases treated with a combination of EMG pelvic muscle biofeedback AND EEG biofeedback (now called "neurofeedback"). 

 

There is an interesting parallel tradition that supports the role of CNS enhancement in the treatment of incontinence.  The work of Moshe Feldenkrais involves a "mental rehearsal" method that is claimed to be as effective as physical exercise at certain tasks.  It provides an interesting parallel development to biofeedback.

 

So we return to the question: is biofeedback a sub-category of physical therapy?  It is true, to be sure, that biofeedback is sometimes a "modality" that is used by physical therapists, just like hot packs and ultrasound.

 

Consider an analogy: If physical therapists pray with and for their patients, does that make "prayer" a form of physical therapy?  Of course not. 

 

In other words, just because physical therapists sometimes do it doesn't make biofeedback a form of PT.  Like religion, biofeedback exists outside the realm of PT.  Berghmans' egg model for the relationship is simply wrong.  Biofeedback should be described, like electrical stimulation, as a partly overlapping but distinct circle of activity.

 

 

The HCFA-TEC presentation focused on the issue of whether ADDING biofeedback to PME resulted in increased benefit.  They repeated the common mistake of thinking of biofeedback as a technique ADDED TO "plain pelvic muscle exercises". 

 

[Unfortunately, the titles of important papers by both Burgio and Tries reinforce this ahistorical conception of "PMEs 'enhanced' by biofeedback".]

The TEC report says:

 

"PMEs are the main component of treatment.

PMEs derive from the Kegel exercises developed in the 1940s and 1950s." 

 

But that is simply FALSE.  The "Kegel exercises" were developed by nurses in the 1970s and 1980s when Arnold Kegel's perineometer (the first biofeedback device) became commercially unavailable.  [See the historical essay "The Bastardization of Kegel's Exercises" at http://www.incontinet.com/articles/art_urin/bastard.htm]

Originally, "biofeedback" was the main component of treatment developed by Kegel in the late 1940s and 1950s.  (The term "biofeedback" wasn't coined until the late 1960s, but the process Kegel used has long been recognized as the first example of biofeedback.)  [See http://www.incontinet.com/articles/art_urin/20yearbf.htm]

Kegel NEVER advocated the use of the non-biofeedback exercises that were only developed after his death.

 

Therefore, an historically-correct formulation would address the question of whether SUBTRACTING biofeedback (including daily home training with a biofeedback device) from Kegel's Program DECREASED the effectiveness of it. 

 

Unfortunately, the highly-touted research projects of Berghmans and Burns DO NOT address that question, since they did NOT use a biofeedback program like Kegel's.  Kegel required the daily at-home use of a biofeedback device.  (Berghmans and Burns did not.)  Kegel understood that patients learn at different rates, and did not use a fixed number of training sessions.  (Berghmans and Burns did.)

 

The only study that did follow the Kegel model was Shepherd, Montgomery and Anderson (1983), which found an 83% symptom reduction rate for biofeedback compared with a 25% rate for plain PMEs.  But this study is dismissed by TEC on the grounds that the authors did not perform a test of statistical significance!

[Sorry, but back in 1983 when you got clear-cut results like 83% vs. 25% no one EVER thought it was necessary to run stats.  The computer revolution didn't come until 1984, when the Mac was introduced!  Hello?]

Is "biofeedback" a sub-class of "physical therapies"?  The biofeedback society (AAPB) doesn't think so.  While there are many prominent PTs who use biofeedback (Susan Middaugh and Stephen Wolfe come quickly to mind), physical therapists have always been a minority in the biofeedback world.

 

The most prominent group in biofeedback is psychologists -- the same people who developed behavior modification and behavioral medicine.  Three quarters of the experts in the classic text "Biofeedback", edited by Mark Schwartz, are psychologists.  Physical therapy is an important application of biofeedback, but biofeedback is not a branch of physical therapy.

 

When Berghmans et al and BC/BS assume that biofeedback is a special form of physical therapy, they make a conceptual mistake that produces faulty questions, faulty comparisons, and misleading conclusions.

 

 


2.  What the EC Recommended to Panels

 

I recently recommended that listmembers review the newly published "Discussion Paper" at http://www.hcfa.gov/quality/8b1-i6.htm as well as the transcript of the 3/1/00 MCAC Executive Committee ("next" button) and the "Interim Recommendations" ("next" again) that followed the meeting.  Finally I took my own advice, and here is the result of a more careful reading.

-----------------------------------------------------

 

The Working Group Report of 2/21/00 states clearly that the effectiveness of a treatment should be evaluated "relative to other items or services" (p. 3), but this was not done in the case of biofeedback, which was only compared to physical therapy (leading to the exclusion of the Burgio 1998 study).

 

The report also states, with respect to known IDEAL levels of evidence: "This level of evidence will likely be unavailable for many of the interventions that the MCAC panels will evaluate."  It further states that "in some cases the panel will determine that observational evidence is sufficient to draw conclusions about effectiveness" (p. 4).

 

In the "Interim Recommendations", the same point is stated this way:

 

"However in many cases the panel will determine that observational evidence is sufficient to draw conclusions about effectiveness."

 

Yet the April 12-13 panel was told that ONLY RCTs could be considered, and in the case of biofeedback, only trials within physical therapy models could be evaluated.  But there is no basis in the discussion paper for the rigid stand that HCFA staff took in evaluating incontinence evidence.

 

Likewise, the Interim Report appears fully aware of the complexities of non-pharmacological research when it says:

"For example, the outcomes of a complex surgical procedure can depend heavily on the skills of the surgeons and other staff caring for the patient."

 

The AAPB's testimony to HCFA in January discussed at some length the ways in which surgery and behavioral treatments were similar, and both differ from drug research.  For instance, clinician skill is critical in surgery and biofeedback, while relatively or completely unimportant in electrical stimulation and pharmacology studies.  (http://www.incontinet.com/isestimadrug.htm)

HCFA and TEC apparently did not agree.

 

The Interim Report makes specific recommendations for evidence review.  It states:

 

"The panel chair should assign at least two panel members to work closely with the authors of the evidence reports.  The rationale for this recommendation is to ensure that the evidence report covers a sufficient scope of studies, that it considers relevant alternative interventions, and that it will be useful to the panels in other respects. The panel should include some people who have acquired expertise in the topic of a coverage recommendation"

 

There is no indication that the first part was done.

 

As far as the record goes, only HCFA staff worked with the TEC report staff.  (See below for the second part.)

 

The Interim Recommendations also state:

 

"In addition, the Executive Committee recommends that the panel chair assign two primary reviewers for each topic.  These reviewers will not be the individuals who assist in the development of the evidence report; they should be new to the topic.  They will evaluate the evidence independently of one another.  Each will write a 1-2-page report ..."

 

Dr.  Garber, panel chair, explained that this was a new process that was only "partially" implemented for the incontinence panel:

 

10 There is an extensive review process

11 that the executive committee asked for, which we

12 have implemented partially for this panel meeting,

13 not entirely.  The review process that they

14 recommended includes both internal and external

15 review, and I believe that we have come very close

16 to meeting their requests for the internal review.

17 And we have two panel members, Dr. Lisa Landy and

18 Dr.  Les Zendle, who are essentially the internal

19 reviewers from the panel of the topic at hand.

 

Others will be less enthusiastic about how well the panel followed the recommendations.  The main problem was that no one in the public had heard of the "Interim Recommendations" until the hearing was already underway, so they were not aware of what was happening.  [The Interim Recommendations were not published on the HCFA website until a week AFTER the hearings.]

Dr. Zendle, far from presenting a formal 1-2 page report, made a few almost casual remarks at the conclusion of the BCBS presentation.  But it appears that he spoke out of turn, because the agenda had listed "open committee deliberation" to follow the Simon and Lefevre presentations.

 

He assumed it was his turn next.  But after his remarks, which were generally in full support of the TEC report, Chairman Garber said:

 

1 DR.  GARBER:  Thank you.  Before we

2 proceed with other questions and comments from

3 panelists, I think Ken Simon had a few other things

4 to add to finish off the HCFA presentation.

 

The program did not indicate that Simon would speak twice.  Only after Simon finished the HCFA staff presentation did Garber announce that the open committee discussion would begin, saying:

 

6 As I mentioned at the outset, two panel

7 members were designated as reviewers, Les Zendle is

8 one of them, Lisa Landy is the other.  Les, I

9 assume that was your opening statement.  And I

10 would like to ask Lisa to speak before we open up

11 to the entire panel to ask questions and make

12 comments.

 

It is important to note that neither Zendle nor Landy was listed on the printed program, so the audience was not aware of the formal nature of their "reviews".

 

Dr. Zendle is a "geriatric medicine specialist" currently working for Kaiser Permanente.  He did not claim to have ever worked with biofeedback or electrical stimulation.

 

Dr. Landy, on the other hand, did meet the Interim Recommendations criteria as an expert in the topic at hand.  She is a urogynecologist who uses these techniques in her work.  She was not a member of the med-surg panel, but was imported for this one hearing, apparently to have at least one panel member who actually knew something about biofeedback.

 

Landy's testimony is worth reading in its own right, but may be best summarized by noting that (1) she was the only panel member with professional experience in the subject, and (2) she was the only panel member to vote in the affirmative (and against BCBS-TEC) on both biofeedback and electrical stimulation reports.

 

That tells you something!

 

None of the panel members who voted against these modalities claimed to have any professional (or personal) knowledge of them.

 

(For that matter, none of the "experts" at BlueCross/BlueShield claimed any such experience, either.)

 

[Note: Diane Smith, RN, a well known expert on biofeedback and electrical stimulation, was a "guest" on the panel, but she was not allowed to vote.]

The Interim Recommendations also contain a provision for expert opinion prior to the public hearing.  The panel...

"should ask independent experts to comment upon the evidence report in advance of panel meetings.  The opinion of experts is the best way to assure everyone, the public and the panel, that the evidence report is complete and fair.  ....The Executive Committee envisions that the panel will choose a small number of expert reviewers (perhaps no more than six)...A reviewer may ask the panel's industry representative to obtain additional information from industry sources."

 

Clearly, NONE of these recommendations were followed.

 

Experts in the field of Biofeedback and Electrical Stimulation were never consulted prior to the hearing.

 

In fact, the BCBS-TEC report's existence was not even made public until some two weeks before the hearing.  There was no systematic review by any independent professionals with any clinical experience in either field, as recommended, and therefore no expert opinion to be made part of the public hearing PRIOR to the panel's deliberations.

 

In spite of the short notice, many biofeedback and electrical stimulation experts did in fact review the TEC reports, and they were universal in condemning them as inadequate and misleading.  In the end, their hands-on testimony was disregarded in favor of the arm-chair research of "literature reviewers" employed by BlueCross/BlueShield.

 

There is only one established expert group concerned with biofeedback, the Association for Applied Psychophysiology and Biofeedback, and in spite of frequent communications, these experts were not informed of the existence of the TEC reports until a few days before their content was made public, just prior to the hearings.  The AAPB experts did submit eight separate documents pertaining to the subject of incontinence research, but this expert opinion was NEVER distributed to the Panel members, in clear violation of both the letter and the spirit of the Interim Report.

 

We can only hope that the Executive Committee will call HCFA staff on this blatant violation of their recommendations.

 

On the other hand, the EC had been warned that their role was only advisory; the HCFA preamble to the Discussion Paper (2/21/00) stated:

 

"When the panels offer comments to HCFA about medical evidence, both HCFA and the public should understand the panels’ basis for making those judgments.  Those standards are the MCAC’s; we do not take them to be criteria or processes binding to HCFA."

 

Obviously they meant it.

 

 


3. Dr. Landy's Testimony

 

Below is the testimony of Dr.  Lisa Landy, the only voting member of the Med-Surg Panel who had ANY actual professional experience with biofeedback and electrical stimulation.  She was acting in the role of "official reviewer" of the TEC report for the panel, although at the time the meaning of this role wasn't clear.

 

Dr. Landy was not a 'regular' member of the med-surg panel, but was brought in for this one hearing because NONE of the regular panel members had ANY experience with biofeedback or stim.

 

Dr.  Landy was the only panel member to vote in favor of both biofeedback and electrical stimulation at the hearing.  Being experienced, she felt that the evidence was compelling enough. 

 

Following Dr.  Landy's comments, at page 193, is the only humorous event in the whole two days.

 

Triggered by one of Landy's remarks, Dr. Epstein asked me to elaborate on one of my slides in "two minutes".  When he clarified his request, he changed it to "five minutes", whereupon the Chairman interjected "Not five minutes though.  Let's keep this brief.", which brought a round of laughter from the panel.  The stenographer did not record the laughter.

 

As I remember it, that was the ONLY laughter in two full days.

 

It is noteworthy that although Dr. Epstein voted with the male majority against biofeedback on the first day of the hearings (a few minutes after the excerpted part), on the second day he abstained on the electrical stimulation vote.

 

==== from the official transcript ========

 

20 DR.  LANDY:  Yeah.  I had some opening

21 remarks.  Some of them are kind of reiterating

22 what's been said already today, but I kind of want

23 to summarize things.

24 The first one is, the task set before us

25 is a very specific one, and it's to answer a series

00188

1 of efficacy and additional benefit.  The MCAC

2 committee has helped us and set forth guidelines

3 for us as panel members specifically to follow, and

4 these guidelines were set up to assess new

5 technologies and compare them to established

6 practices.  And we're to use evidence based

7 medicine as the foundation for our decisions.

8 And as we can see from today's

9 presentations, multiple presentations, that there

10 are several levels of evidence that we can consider

11 and weigh appropriately when we answer these

12 questions.  We've heard today from representatives

13 of multiple professional societies and specialty

14 organizations presenting their consensus statements

15 regarding efficacy of this behavioral

16 intervention.

17 The 1998 [sic: 1988] NIH consensus statement

18 recognized the efficacy of behavioral intervention

19 and specifically biofeedback.  There are guidelines

20 of practice that we all use when we practice in

21 this field based from the AHCPR guidelines which

22 recommend the use of behavioral interventions,

23 including biofeedback, as first line therapy.  We

24 also heard presentations of a technology assessment

25 which confirmed biofeedback efficacy, and then

00189

1 focused on answering the question of whether there

2 is additional benefit achieved from biofeedback

3 over PME alone.

4 I would like to summarize some of these

5 key points that come out of today's presentations

6 before we go into our discussion, and use this as a

7 launching point for our deliberation.  One of the

8 points is that biofeedback is not a new technology

9 and that the guidelines that were set up to do is

10 to compare to established practice.  Biofeedback is

11 a very well established practice.  And that goes

12 back to the issue of why is PME alone chosen as the

13 standard for comparison?  In the original

14 presentation by the statistician, there was the

15 question of choosing appropriate standards.  And I

16 think we should keep that in the back of our head

17 when we look at all this information and data.

18 From 1948 on, when PME was introduced,

19 Kegel himself recognized the need of using a device

20 to assist and be adjuvant to the PME alone.  And

21 from the very beginning of therapy in this area, a

22 device or perineometer, or some kind of

23 intervention was utilized.  So it has always been a

24 part of established care and standard to use some

25 form of biofeedback method.  It really isn't a new

00190

1 technology.

2 And we have been given evidence from

3 multiple sources, the Bump study in 1991, Kerri [sic]

4 Bo's study in 1990, and most recently, the

5 Sampselle study, 2000, showing the drawbacks of

6 doing Kegel exercise with just verbal instruction,

7 and I think that was brought up very clearly.

8 In 1992 and 1996 updates, the AHCPR

9 guidelines for treatment was more developed, and

10 this was a panel of experts in the field, who came

11 up with these guidelines and recommendations, and

12 they came up with these guidelines based on strong

13 scientific evidence, rated their evidence, and this

14 is akin to our task set before us today.  Their job

15 as panel of experts back in 1996 was very similar

16 to what we are being charged with today.  And they

17 felt that based on their review and the strength of

18 evidence, they've made recommendations regarding

19 pelvic muscle rehabilitation and bladder inhibition

20 using biofeedback therapy as recommendations for

21 treatment of these patient groups.  They

22 specifically did not sort out biofeedback and

23 remove it from the formula.  And I think there is

24 something flawed with that whole question of taking

25 away a therapy that's always been part of the

00191

1 treatment from the very beginning.

2 The technology assessment has come to

3 certain conclusions.  I think in our discussions,

4 we can critically analyze the data.  Like they

5 said, the AHCPR guidelines specifically did not

6 address the issue of whether the addition of

7 biofeedback to PME is more effective than PME

8 alone, and I think it specifically was avoided as

9 to not take that out of therapeutic treatment

10 modalities.  We have to treat people, because we

11 treat people in this area with multimodality

12 treatment.

13 Since then though, the question has come

14 up and been the focus of several evidence based

15 reviews.  In de Kruif and van Wegen, one in 1996;

16 Berghmans in 1998; and the meta-analysis by

17 Weatherall in 1999, as well as the current

18 technology assessment, all of them with varying

19 conclusions.

20 I would like to make a point too.  This

21 panel was initially charged with addressing the

22 issue of efficacy of biofeedback as an incontinence

23 intervention, and now we are being asked to compare

24 it as an adjunct therapy to PME versus PME alone.

25 Now the question is asking about efficacy as an

00192

1 adjunct to a therapy, and this is an important

2 distinction when looking at the literature.  And

3 when we reviewed this before we came here, we may

4 not have looked at the literature in quite the same

5 way as this nuance brings up.  But for the question

6 at hand, those studies comparing PME alone to

7 biofeedback and PME are the ones we really need to

8 critically review.

9 And we have to look at them for

10 comparison of groups, methodology, and outcome

11 measures.  And while analyzing the data, we need to

12 keep in mint that the PME alone groups show

13 variability between the studies as to what the

14 treatment intervention was in those groups, and

15 consist of interventions other than PMEs, and that

16 may influence the results of the data.  And that

17 brings me back to the issue of, did we select an

18 appropriate standard to compare it to?

19 So that -- in one of the presentations

20 by Dr. Perry, he gave us some slides and I think we

21 critically need to look at those, but he brought

22 out some of the potential information about

23 methodology, about the PME alone group.

24 So, I thought that was a good launching

25 point now for us to open up discussion.

00193

1 DR.  GARBER:  Thank you, Lisa.  Arnie?

2 DR.  EPSTEIN:  Even without the prompting

3 by Lisa, I was thinking the same thing, that the

4 final slide you brought out, you actually brought

5 out two, but the final one was particularly

6 interesting to me, where you talked about the 25

7 percent, 50 to 60, and 55 to 70 percent, and he had

8 very little time when he did that, and I wonder if

9 we could give him two minutes to get him to expand

10 on where those numbers came from and the strength

11 of the studies behind them?

12 DR. PERRY:  I didn't really get the

13 question.

14 DR. GARBER:  I think Dr. Epstein is

15 asking if you can show us the last slide, is that

16 correct, or the second to the last?

17 MS. SMITH:  He means this one, the

18 levels of PME where you compared the written

19 instruction from Sampselle, Berghmans in '96, and

20 Burgio, where you had 27 percent, then 51 to 60

21 percent.

22 DR. HILL:  We have it in our handout.

23 MS. SMITH:  We have it in our handout.

24 DR. EPSTEIN:  Yeah, and I was really --

25 I have the handout and I have the visual memory,

00194

1 and I didn't have the Sampselle study that I can

2 recall beforehand.  It's partly because of that but

3 also partially because I think it makes potentially

4 an interesting case, and I wonder if you can take

5 the talking points that you would have used five

6 minutes for but were forced not to, and now take

7 them.

 

8 DR. GARBER:  Not five minutes though.

9 Let's keep this brief.

 

10 DR. PERRY:  The Sampselle study is

11 especially interesting because they avoid all the

12 problems with contamination and really did do PMEs

13 alone.  They just had a handout, here it is, a

14 one-pager and you know, this is your education.

15 And I'm amazed, you know, really the differences

16 between us all come down to one thing.  TEC wants

17 to use a rigid definition of biofeedback and a

18 catchall definition of PME alone.  It's interesting

19 because it was sort of the other way around back in

20 the guidelines where they used surgery, clear;

21 drugs, clear; everything else is behavioral,

22 including stim.  Does that answer?  So, you have a

23 really rigid category of biofeedback, and a

24 catchall category of everything else counts as PME

25 alone, and when you do that, you get nonsignificant

00195

1 results.

2 DR. LANDY:  A comment I'd like to make.

3 I think the importance of sorting out the PME alone

4 group is that if it truly is an intervention, then

5 what you're looking at is the result of an

6 intervention, as opposed to how we clinically use

7 the descriptive term of PME alone.  And when

8 clinically applied, most clinicians in this area

9 would do some form of verbal instruction, written

10 instruction sheet and send the patient home, and

11 that's truly what the studies are not comparing.

12 The studies are comparing one intervention to

13 another, so that PME alone is not really a good

14 standard.  The best standard we have are looking at

15 the studies with, comparing a waiting list control

16 group, because that most represents what we see

17 clinically, because those are people who on their

18 own, at some point in their association with a

19 physician were taught or told to do Kegel

20 exercises, or they read it in a magazine article,

21 and that's what they're doing on their own.  And

22 that best represents the result we get with PME

23 alone clinically.

==============end of excerpt===============

 

The full transcripts are available on the hcfa.gov/quality website, or incontinet.com.

 

 


4. Dr. Zendle's Testimony

 

Following are the comments of Dr.  Zendle at the Incontinence hearings in Baltimore.  As mentioned in a previous post, Dr. Zendle was one of two panel members asked to "review" the TEC report and prepare a recommendation to the full panel.

 

Due to a lack of clarity in the printed program, Dr. Zendle made these remarks in the middle of the HCFA/BCBS presentation.  Taken by themselves, his remarks do not constitute much of a "review" of the TEC report.  He first praises the AHCPR Guidelines (which TEC rejected), and then agrees with TEC that "there isn't enough evidence".  Then he complains about the lack of research in this area.  Apparently he isn't aware that there isn't a lot of money to be made in behavioral treatment of incontinence, and that the potential for profit -- big profit -- is what drives research. 

 

8 DR.  ZENDLE:  Well, this has been a very

9 interesting day.  It's hard to believe we have

10 already been here for six hours; it's gone pretty

11 quickly.  I want to thank and congratulate the

12 presenters and the organizers of this.  I've

13 learned a lot.

14 After today -- you know, I went over the

15 questions myself beforehand and I have listened

16 very carefully to what people had to say.  And I

17 have no problem accepting the AHCPR '96 guidelines,

18 and I have no problem agreeing with the clinicians

19 who feel that some patients do better with feedback

20 and PMEs than with the exercises alone.  And I

21 actually think it should be made available to those

22 patients who are so identified, especially if a

23 guideline is being followed that tells you which

24 patients it works best on and which form of

25 biofeedback and what the regimen should be.

00182

1 But I have to agree with the TEC

2 assessment that there isn't sufficient evidence,

3 scientific evidence of sufficient quality really,

4 to conclude that adding biofeedback to the

5 exercises is better or not better than doing the

6 exercises alone.  And I guess the only other point

7 I would make is that the statistical definition of

8 what's enough evidence isn't really a matter of

9 opinion, it's a scientific matter, that science has

10 already made agreements as to what is

11 scientifically relevant, and I don't think this

12 meets the magnitude of that.

13 It does leave me with one important

14 question, though, and that's why hasn't there been

15 more research in this area?  It's not like this is

16 a rare problem, and it's not like these are mild

17 symptoms.  This is a common problem that is a major

18 life disruption not only for the patient, but for

19 families and for society.  And it's shocking to me

20 actually that there are so few patients that have

21 been looked at in a rigorous way and therefore, we

22 can't reach conclusions with statistical validity.

23 And I'm not sure who's to blame for that, but it's

24 just a question that I'm left with and frustrated

25 with.

 

====== end of remarks =========

 

See the full transcript on the hcfa.gov or incontinet.com websites for proper context.

 

 


5. The Slide They Wanted to See

 

Previous comments in this series mentioned a request by Dr. Epstein for elaboration of a slide I had presented showing the increasing levels of effectiveness of adding various interventions to "PME Alone".

 

Although my handout containing the content of the slides was actually given to the stenographer, the transcript as published gives only the verbal presentation, omitting the visual presentation that was the basis for that verbal presentation.

 

Since each of us had only eight minutes to present our testimony, most people made the same assumption that I did, namely, that we should use our limited verbal opportunity to elaborate, rather than repeat, our visual presentations.

 

But the result is that the very precise and legal official transcript is of limited value, because it does not include the words that were presented visually at the hearing.  The reader can only guess as to what the audience was reading while the speaker was speaking.

 

(IncontiNet.com has already published, as a public service, all the testimony of expert witnesses that was submitted for publication.  See the opening paragraphs at: http://www.incontinet.com/home.htm for details and links.)

 

The following is the text of the slide that Drs. Epstein and Landy asked me to elaborate.

 

Levels of Pelvic Muscle Exercise

Intervention                                 Symptom reduction   Source
Written instruction alone                    27%                 Sampselle 2000
ADD vaginal palpation and verbal feedback    51% to 60%          Berghmans 1996; Burgio 1986
ADD EMG testing                              54% to 77%          Burns 1993; Wells 1991
ADD formal biofeedback training              80% to 94%          Burgio 1998, 1986; Sussett, Kegel, etc.

 

Commentary: The numbers represent the percentage reduction in symptoms reported by the noted researchers for the intervention added beyond "mere verbal instruction alone", which is best exemplified in Sampselle (March, 2000).

 

Sampselle et al reported on the success of a genuine "PME alone" intervention -- they gave subjects a written handout describing how to do PMEs.  They had no confounding interventions, such as therapist's manual palpation of the pelvic muscles and verbal feedback of success.  It was a case of PURE "PME Alone", and the results were dismal -- 27% symptom change -- barely above the expected placebo effect for therapeutic attention.

 

The Wells 1991 report got the best results for "PME Alone" (77%) -- because she did much more than PME Alone.  She actually "tested" subjects' pelvic muscles with an EMG biofeedback device, before "PME Alone" and every month thereafter for six months -- a total of *7 EMG sessions*.

 

TEC 2000 failed to notice this important fact, and as a result, erroneously ascribed to "PME Alone" greater success than it deserved.

 

The original slide is visible in a PowerPoint presentation referenced on the IncontiNet home page.

 

 


6. Burgio and Burton -- NO PMEs at all!

 

We have previously called attention to the BC/BS-TEC error in considering Burton et al as an instance of "PME Alone vs.  PME plus biofeedback" in the treatment of Urge Incontinence.  The vast majority (11 of 14, or 79%) of Burton's control group was given "education"; but only 3 of the 14 (21%), those who had stress incontinence, were given instruction in pelvic muscle exercise. 
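The proportions quoted above can be checked with a few lines of arithmetic.  The following is only an illustrative sketch: the function name `share` is hypothetical, and the counts (11 and 3 out of 14) are the ones given in the preceding paragraph.

```python
def share(part: int, whole: int) -> int:
    """Return part/whole as a percentage, rounded to the nearest whole number."""
    return round(100 * part / whole)


control_group = 14   # size of Burton's control group, as cited in the text
education_only = 11  # patients given "education" only
taught_pme = 3       # stress-incontinence patients taught pelvic muscle exercise

print(share(education_only, control_group))  # percentage given education only
print(share(taught_pme, control_group))      # percentage taught PME
```

The two counts add up to the full control group, so the two percentages account for every patient in it.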

 

The "education" resembles modern bladder retraining:

 

"Patients...(were) taught to respond to an urge sensation by: relaxing; tightening their urethral sphincter; relaxing their abdominal musculature, and, when the urge passes, walking slowly to a lavatory to void." (p.  695)

 

There is no way that can be construed as a form of "pelvic muscle exercise alone". 

 

(One has to read the text carefully and compare the text with the tables to discern precisely who got what.  The issue is not spelled out clearly because the purpose of the 1988 research was not to address the issue TEC tried to study 12 years later.)

 

In any case, Burton was invoked as an example (the only example) of a controlled study of PME on Urge Incontinence, and NONE of Burton's Urge patients got ANY PME at all!

 

In reviewing the TEC report yet one more time, we note that a similar mistake was made by TEC in the classification of Burgio et al, 1986 ("The role of biofeedback..."). 

 

TEC says:

 

"The nonrandomized trial by Burgio et al.  (1986) assigned 24 patients to PME alone or PME plus biofeedback after stratifying by age and frequency of incontinence.  The authors reported a significantly greater percent improvement in incontinent episodes for patients treated with PME plus biofeedback (76% improvement versus 51%, p < 0.05)."

 

But Burgio et al, 1986 says:

 

"This study examined the effectiveness of teaching pelvic floor exercises with the use of bladder-sphincter biofeedback compared with training with VERBAL FEEDBACK BASED ON VAGINAL PALPATION in 24 women..." (Abstract, emphasis added).

 

Since the point is made in the abstract, TEC didn't even have to read the article itself to realize that this study did NOT qualify for inclusion in a field that was supposed to be limited (albeit arbitrarily) to "PME Alone vs. PME+BFB". 

 

Burgio clearly states:

 

"Verbal feedback training consists of instructing the patient to squeeze the vaginal muscles around the examiner's fingers and providing her with verbal performance feedback."  (Abstract)

 

The Biofeedback technique of Burgio is well known, being the subject of a documentary film by NIA in 1984.  Burgio herself provides substantial verbal feedback when helping the patient to interpret and understand the tracings of the polygraph.

 

The study showed that patients did much better when they were given instantaneous biofeedback with verbal interpretation instead of delayed, relayed feedback from a human being alone.

 

In addition, it seems plausible that many of the women in this study were slightly uncomfortable performing exercises with another woman's hand inside their vaginas, thus reducing the effectiveness of the technique. 

 

In any case, it is worth noting that this kind of "verbal feedback" is more labor intensive than "PME with biofeedback", since it requires a much higher level of clinician experience.  In other words, biofeedback costs about the same as manual-verbal feedback therapy, but delivers about 50% more improvement (76% versus 51% in the figures TEC itself quotes from Burgio).
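The "50% more" figure is a relative comparison of the two improvement rates that TEC quotes from Burgio et al (76% versus 51%).  A minimal sketch of that calculation follows; the function name `relative_gain` is hypothetical, and only the two percentages come from the source.

```python
def relative_gain(new: float, baseline: float) -> float:
    """Return how much larger `new` is than `baseline`, as a percentage of baseline."""
    return 100 * (new - baseline) / baseline


biofeedback_improvement = 76.0  # PME plus biofeedback, per the TEC quotation
verbal_improvement = 51.0       # verbal feedback based on palpation

gain = relative_gain(biofeedback_improvement, verbal_improvement)
difference = biofeedback_improvement - verbal_improvement

print(f"relative gain: {gain:.0f}%")
print(f"absolute difference: {difference:.0f} percentage points")
```

Note that the absolute difference (in percentage points) and the relative gain are different quantities; the "50% more" statement refers to the relative gain.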

 

 


7. Levels of Evidence and Who Decides?

 

1.  What happened to the AHCPR Guidelines?

 

Many people have noted that the famous AHCPR Guidelines on Incontinence (1992 & 1996) were not distributed to the April 12-13 Panel discussing Incontinence.  Both Diane Smith and Lisa Landy, guest members of the panel, remarked about this omission explicitly, and several of the speakers mentioned the Guidelines as having shaped and established the accepted standards of treatment in this country. 

 

The omission is complex.  None of the people involved in the 2/21/00 committee report, nor the 3/1/00 Executive Committee meeting, nor the 4/17/00 "Interim Recommendations" had any direct awareness of incontinence treatments or research and practice in incontinence.  They were reaching for broad principles that could be applied to ALL areas of medicine. 

 

These people, mostly physicians, attempted to bring the latest standards and values in medical research into a broad guide for Medicare-related research in the new Millennium.  The latest trends emphasize "evidence-based-medicine", which had not been invented when the AHCPR guidelines were prepared and published.

 

So the AHCPR guidelines are considered "primitive" by today's new standards.  Indeed, they are no longer "politically correct".  They have become an embarrassment to the medical establishment. 

 

This is clear in item 4 in the "Discussion Report" of 2/21/00, which states: (Note the omission of AHCPR, [now "AHRQ"])

 

"The standard of excellence for the evidence report should be the best work in the private sector (e.g., Blue Cross-Blue Shield), by professional organizations (e.g., ACP-ASIM), and for other Federally sponsored panels (e.g., the Evidence-based Practice Centers technical support for the U.S.  Preventive Services Task Force)."

 

The politically correct view is reinforced in Dr. Sox's opening remarks to the 3/1/00 Executive Committee meeting:

 

14 And we feel that the standard for HCFA should be

15 the best that's out there in other settings, such

16 as the private sector where Blue Cross Blue

17 Shield has a long track record of doing

18 evaluations of the evidence and making coverage

19 decisions in what is a process that's both

20 efficient and I think highly regarded by

21 professional organizations such as the ACP-ASIM

22 and by other federally sponsored panels.  The

23 Agency for Health Research and Quality has a

24 series of evidence-based practice centers in

25 various universities, and I think there are a

00027

1 couple of private settings around the country,

2 and they provide technical support for the U.S.

3 Preventive Services Task Force on which I serve.

 

So it was no accident that HCFA staff members suppressed the AHCPR report (i.e., failed to distribute it to the panel, in spite of its historical importance in the treatment of incontinence).  They didn't like the way the 1992 and 1996 Guidelines considered ALL levels of evidence to have potential merit. 

 

[The guidelines ranked evidence as "A-B-C", based on:
controlled trials, clinical series, or mere expert opinion.]

 

Panel members, for the most part, were not familiar with the history of incontinence work and did not request the AHCPR Guidelines from HCFA staff.

 

 

2.  Who decides what level of "evidence" is needed on a particular review? 

 

In the Executive Committee deliberations, it was clearly up to the PANEL itself to decide what level of evidence was needed for the topic under review.  For example, in Dr. Sox's opening remarks about the 2/21/00 committee report, he said (on 3/1):

 

(00019)

9 But in some other cases, perhaps many cases, the

10 panel will determine that observational evidence

11 is sufficient to draw conclusions about

12 effectiveness.

 

At the Executive Committee on March 1, Dr. Sox (chairman) is very clear that it is up to the PANEL to decide what level of evidence should be required to reach a decision. 

 

But the Med-Surg panel was not told this.  They were led to think that the BC/BS TEC report set the standard for them.  TEC said that only RCTs should be considered, and the Panel never questioned that.

 

This point is closely related to the next one.

 

 

3.  Defining the Question.

 

Even more important than the "level of evidence" is the Question for which evidence is collected in the first place.

 

Consider the following exchange that took place after the AHRQ presentation on April 12th:

 

12 DR. RATHMELL:  I would also like to

13 clarify.  I want to be very clear about the

14 question that we're being asked.  We are not being

15 asked whether or not there is adequate evidence to

16 support the effectiveness of biofeedback.  We are

17 being asked if there is adequate evidence, and now

18 I'll quote.  Here's the question in the

19 technology:  Does adding biofeedback to PME -- now

20 biofeedback by the definition of the process,

21 includes PMEs -- and we're being asked if adding

22 biofeedback results in a greater improvement in

23 health outcome, okay?

24 It's very important because none of the

25 panelists have any of the evidence, and I

00051

1 understand there is a large body of evidence

2 looking directly at the effectiveness of

3 biofeedback versus control, okay?  So it's very

4 important, we're looking at a very very small

5 subsection and we have been given the evidence only

6 on a small subsection.  So we can't answer the

7 question about whether biofeedback is effective;

8 all we can do is compare it to PMEs, a very very

9 specific question.

10 DR. GARBER:  Yeah.  Thank you for that

11 clarification.  And let me emphasize that the

12 question posed is in a sense deliberate, because

13 our entire set of classifications for effectiveness

14 are based on comparative statements, and as

15 Dr.  Zarin had mentioned, what you compare it to is

16 critical in analyzing the data and making the

17 determination.  Yes?

 

Actually, NO.  Dr. Rathmell had noticed that the panel was being asked to focus on "a very very small subsection" of the evidence, and that the panel was NOT addressing the original question, "Is Biofeedback effective?".

 

Does Chairman Garber's answer make sense?  Not to me.  Apparently it didn't convince Rathmell, either.  He went on to elaborate the point:

 

16 DR. RATHMELL:  This is Dr. Rathmell.

17 Our technology assessment doesn't look at

18 biofeedback versus control, except tangentially.

19 There are many additional studies that look at

20 biofeedback versus control, like a waiting list

21 control, various control groups.  And so I don't

22 think we can answer the question as to whether

23 there's adequate evidence, as to whether

24 biofeedback versus control, but biofeedback versus

25 the PMEs alone, that's all we can assess, that's

00054

1 all the technology addresses.

2 DR. GARBER:  Yeah.  I think that the

3 question that we were posed by HCFA is the one that

4 the evidence report attempts to answer.  I'm not

5 sure that it's, the question is -- I see.  This

6 question does not spell out that it's compared to

7 PMEs alone.  Is that your concern, that HCFA's

8 question doesn't state that?

9 DR. RATHMELL:  HCFA's question does very

10 specifically say that what we're comparing to is

11 PMEs alone.  So I would say, that's the only

12 question we're addressing.  Versus someone sitting

13 on a waiting list and doing nothing, they are not

14 instructed in anything, we are not answering that

15 question.

16 DR. GARBER:  That's correct.

 

According to Garber, the question (PME vs PME+BFB) was posed "by the evidence report", not by HCFA.  Look more closely at what Garber said:

 

2 .....I think that the

3 question that we were posed by HCFA is the one that

4 the evidence report attempts to answer.  I'm not

5 sure that it's, the question is -- I see.  This

6 question does not spell out that it's compared to

7 PMEs alone.

 

Garber is, of course, wrong on the first point.  HCFA raised the question about Biofeedback, but the TEC report only addresses what Rathmell called a "very very small subsection" of the possible questions that could be asked about biofeedback.

 

HCFA's question to the panel was about the effectiveness of biofeedback in the Medicare population. 

 

TEC's report addressed a very small sub-set of the possible data that could be marshaled to answer HCFA's question.  Electrical Stimulation was compared to (1) placebo, and (2) alternatives, such as "PME, vaginal cones, bladder training, pharmacologic agents".  In contrast, Biofeedback was only compared to PME.  Why?  Rathmell tried to raise this question, but he didn't get very far. 

 

 

4.  What types of evidence should be considered?

 

The Executive Committee had left the door open for panels to determine that less-than-perfect evidence, such as observational evidence, might be sufficient to draw conclusions.

 

In contrast, the TEC report implied that there is only one standard, the "gold standard" RCT of today's evidence-based-medicine, and the panel was not given any hint that different standards might apply in different disciplines.

 

The most quotable remark on April 12th was the protest of a panel member who was aware that his negative vote on "science" could be misconstrued as a policy recommendation.  He said: "the standard of evidence in science is .05, but the standard of evidence in policy decisions is .50." 

 

In other words, we understand that the ONLY standard in science is an RCT with statistically significant results, but when making policy decisions, we expect HCFA to decide on the basis of probabilities.  Several panel members voiced support for this intermediate position, and they expressed frustration over being forced to vote as they did.

 

The current fad in medicine is called "evidence-based-medicine", and tries to suggest that in the past medicine wasn't "really" based on evidence -- but now it will be.

 

Those who know the history of philosophy will recognize this as a late reincarnation of the school of "Logical Positivism" of the mid-20th century.  Bertrand Russell and his peers maintained, in effect, that "If you can't kick it, it isn't real".  The limitations of this philosophy soon became apparent, and it is the object of ridicule and scorn among philosophers today.

 

Those who are trained in the Philosophy of Science understand that science includes many levels of research that become progressively more focused and refined.  Systematic observation is NOT "unscientific", it is just a lower level of scientific certainty than RCTs.  Sometimes it's the only evidence available. 

 

The problem is that good research is expensive, and many important issues are not addressed in the higher or highest levels of scientific research because no one stands to profit from the investment.

 

In this context, it is worth noting that Dr. Zendle's review of the BC/BS TEC report consisted of two points:  (1) He accepted their conclusion that there was no "real" scientific evidence in support of behavioral treatments of incontinence, and (2) he wondered why there was no research.

 

While informed readers will question the first point (BC/BS phrased the question prejudicially to exclude important studies such as Burgio 1998), there is really no mystery about the second; there is no big money to be made funding such research.

 

At the 3/1/00 Executive Committee meeting, Dr. Wayne Roe, Chairman of Covance Health Economics & Outcome Services in Washington, D.C., spoke on behalf of the Health Industry Manufacturers Association.  He lamented the current trend:

 

00077

2 ........Far

3 too much weight on randomized controlled trials

4 as the desired level of evidence.  We're going to

5 have them, we're going to have more of them, but

6 they're going to be rare.  And we can't afford

7 them all.  And we all know there are lots and

8 lots and lots of reasons why we can't do them.

9 And the FDA doesn't require them every time even

10 for drugs.  So I think you have to recognize

11 that.  There's lots of good science being done

12 far better than before.  Overemphasis on

13 randomized controlled trials is going to make

14 other research seem inadequate, and I think it

15 will lead to some research not being done, some

16 good research not being done, and things not

17 being developed.

 

HCFA staff tried to respond.  Dr. Hugh Hill said:

 

00122

15 ....As the subcommittee report suggests,

16 observations alone may sometimes allow a panel to

17 make conclusions about effectiveness.  Such

18 suboptimal evidence may allow us to conclude that

19 Medicare should cover the service. 

 

And Jeffrey Kang said:

 

00133

5 The first is I did not read in this

6 document that there's an implication that

7 everyone has to have a randomized controlled

8 trial.  What this document in my mind says is

9 that's the gold standard, but to the extent that

10 you deviate from the gold standard, you have to

11 explain biases, how you dealt with it et cetera.

 

Dr. Garber added:

00161

3 If it is uncontrolled, it is not valid evidence

4 by itself, yet there are plenty of studies that

5 could have valid controls that are not

6 randomized, and I would hate for the readers of

7 this document to think that this paragraphs means

8 you have to have randomized controlled trials.

 

But six weeks later the tone stiffened.  HCFA and BC/BS-TEC adopted a hard line:

 

23 Finally, the subcommittee made, I

24 think, a very strong statement saying that a body

25 of evidence that consisted only of uncontrolled

00020

1 studies, whether based on anecdotal evidence,

2 testimonials or case series or disease registries

3 without adequate historical controls, is never

4 adequate.  So we really feel strongly there needs

5 to be some form of control even if it's only

6 historical controls.

 

The Interim Recommendations report is more specific:

 

[The highest] level of evidence will likely be unavailable for many of the interventions that the MCAC panels will evaluate.  There may be randomized trials conducted in other populations (e.g., middle-aged men rather than men and women 65 years of age and older), randomized trials with important design flaws (e.g., they are not double-blinded), or nonrandomized studies with concurrent controls.  Deciding whether such studies constitute valid, applicable evidence can be very difficult.

...

[But not impossible!]

 

In some cases, the panel may decide that it cannot draw firm conclusions about effectiveness without randomized trials.

 

[Actually, the panel never addressed this issue; they assumed it was true because TEC said so.]

 

Although they do not have randomized controls, all well designed observational studies include some form of control.  Controls may consist of an implicit or explicit control group or statistical controls.

 

[TEC got around this by insisting that there was only one question: "PME vs.  PME+BFB".  This was a clear violation of the recommendation, which said that many types of control might be valid, as Dr. Rathmell tried to point out.]

 

A body of evidence consisting solely of studies with no controls whatsoever - whether based on anecdotal evidence, testimonials, or case series - is never adequate.  However in many cases the panel will determine that observational evidence is sufficient to draw conclusions about effectiveness.  When these circumstances apply, the panel must describe possible sources of bias and explain the basis for its decision that bias is unlikely to account for the results.

 

Since the Panel accepted TEC's definition of the "only" question, they didn't have to address the questions of "possible sources of bias".  We will examine the role of classic concepts of bias in a future installment. 

 

 


8. No Relevant Outcomes?

 

This report examines some of the statements made in the BC/BS-TEC report on Biofeedback that may be incorrect.  It does NOT directly change the bottom line, and it may be safely disregarded by those with less than obsessive interest in the accuracy of the TEC report.  But it does raise some questions about TEC's accuracy, and apparent reliance on secondary sources.

==================================

 

The BC/BS-TEC report states:

 

"The final two trials included in the Berghmans review did not meet the selection criteria for this assessment—Castleden et al.  (1984) had no concurrent control group and reported no relevant outcomes and Taylor and Henderson (1986) reported no relevant outcomes."

 

As for Castleden, it is true that there was no "concurrent" control group; they used a "cross-over" design.  But even more to the point, there was no "biofeedback" treatment group!  TEC was apparently confused by Berghmans' 1998 analysis, which states:

 

"Five RCTs were identified comparing PFM exercises with biofeedback (BF) against PFM alone [refs include Castleden]" (185).

 

This study keeps popping up in literature searches, apparently because the subjects used a "perineometer".  But they did NOT use it for "biofeedback" during their empty-vagina exercises several times a day.  Instead they simply "checked" their muscle strength once a day.  A novel idea, which had never been used before and has never been repeated in the 16 years since.  (It didn't work!)

 

Lest this distinction seem arbitrary, we should note that the Castleden study fails to meet the inclusion criteria of TEC itself, which defines biofeedback (in part) as "to assist patients in the performance of pelvic floor muscle exercises."

 

Castleden's subjects did not use the perineometer to help in the performance of their daily exercise.  Berghmans et al failed to notice this detail.

 

TEC should have read the original, instead of relying on Berghmans.

 

As for "no relevant outcomes", Castleden reported that 14 out of 19 subjects no longer used protective pads.  That certainly seems like a relevant outcome.

 

On the other hand, this study does not meet the inclusion requirements because there was no "PME alone" group.  That's a different issue.

 

 

As for the Taylor and Henderson study, what Berghmans et al (1998) actually said was "Because data was poorly reported in the study by Taylor and Henderson [61], it is not possible to isolate the comparisons between the treatments" and "Again, the analysis in the study by Taylor and Henderson [61] cannot be assessed" (185).

 

Taylor and Henderson had reported a small pilot study that would have led to a full-scale study except for the untimely death of Dr. Taylor.

 

They divided 12 subjects into four groups: three experimental groups (daily home biofeedback, daily home exercise with a resistive device, and weekly office biofeedback) and a control (no treatment) group.  Their report, in the Journal of Gerontological Nursing, states that the daily biofeedback group got a 100% continence rate, while the control group got 67%, "as was the rate obtained by the experimental groups as a whole".

 

The quoted sentence is, admittedly, ambiguous.  Do they really mean that the average of ALL (3) experimental groups was "67%"?  Or should "experimental groups as a whole" have been stated as "the REMAINING experimental groups"?

 

After all, the first interpretation would require us to assume that, since Group 1 got 100%, the other two Experimental Groups got much lower scores, which the authors are trying to conceal.

 

The more logical explanation is that the authors, having already described TWO of the groups, are now describing the other TWO (experimental) groups -- with a poor choice of words.
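The arithmetic behind the two readings can be checked with a short sketch.  It assumes four equal groups of three subjects (12 subjects in all) and treats "67%" as 2 out of 3; the function names are illustrative only and come from neither Taylor and Henderson nor TEC.

```python
from fractions import Fraction

GROUP_SIZE = 3             # 12 subjects split into 4 equal groups (assumed)
N_EXPERIMENTAL_GROUPS = 3  # daily home BFB, resistive device, weekly office BFB


def reading_a_other_groups_rate():
    """Reading A: 67% is the pooled rate of ALL three experimental groups.

    Returns the continence rate the two remaining groups must then have had.
    """
    pooled_dry = Fraction(2, 3) * GROUP_SIZE * N_EXPERIMENTAL_GROUPS  # 6 of 9
    daily_bfb_dry = GROUP_SIZE                                        # 3 of 3
    remaining_subjects = GROUP_SIZE * (N_EXPERIMENTAL_GROUPS - 1)     # 6
    return (pooled_dry - daily_bfb_dry) / remaining_subjects


def reading_b_pooled_rate():
    """Reading B: 67% describes each of the REMAINING experimental groups.

    Returns the pooled rate across all three experimental groups.
    """
    dry = GROUP_SIZE + 2 + 2  # 3 of 3, then 2 of 3 in each other group
    return Fraction(dry, GROUP_SIZE * N_EXPERIMENTAL_GROUPS)


if __name__ == "__main__":
    print("Reading A, other two groups:", reading_a_other_groups_rate())
    print("Reading B, all experimental groups pooled:", reading_b_pooled_rate())
```

Under Reading A the other two experimental groups would have to average only 3 dry out of 6, well below the untreated controls; under Reading B every group other than daily biofeedback sits at 2 out of 3.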

 

Professional courtesy demands that authors be credited with honesty before they are credited with linguistic skill.

 

According to recent discussions on the evidence-based-health email list, it is commonly accepted practice in EBM for a reviewer to CONTACT the authors in the case of textual ambiguity.

 

It is perhaps understandable that Berghmans did not call Henderson from the Netherlands, but less clear why BC/BS in Chicago didn't make the call, or at least send her an email.

 

BTW, I personally did contact Henderson at her home in Denton, Texas and she confirms that the language used was less than ideal.  They weren't trying to hide anything; they just didn't say it as clearly as they should have.

 

So the results of this small study were that ALL of the 3 members of the daily home biofeedback group got dry, but only 2 out of three in each of the other groups did.  That sounds like a "relevant outcome" to me.

 

Thus when BC/BS says "Taylor and Henderson (1986) reported no relevant outcomes" they did not read the primary source closely enough.

 

Incidentally, the Taylor and Henderson "control" group was given pelvic muscle EMG evaluations "before and after", so they were not a pure "no treatment" group.

 

Which leads directly to my final point.  Taylor and Henderson should never have been mentioned in the first place, because they DID NOT HAVE a "PME-Alone" condition!  Why didn't BC/BS notice that?  That would have been sufficient grounds to exclude this study under their narrow definition of the question.

 

While not of earth-shattering significance in themselves, these errors do cast a shadow over the assumed infallibility of the BC/BS technology assessment process.

 

 


9.  TEC's three forms of bias

 

BC/BS-TEC identified three types of "bias" in research:

 

1.  Selection bias - Imbalances in patient characteristics between groups with potential for differences to affect outcomes

2.  Performance bias - Inequality in the intensity of treatment given between groups

3.  Attrition bias - Significant number of dropouts in one or more study arms, not taken into account in the statistical analysis

Then in summarizing the Stress Incontinence papers, they report:

 

In four of the trials, one or more potential sources of bias was identified (Shepherd et al.  1983; Burgio et al.  1986; Ceresoli et al.  1993; Glavind et al.  1996), while in two trials no obvious potential sources of bias were identified (Burns et al.  1993; Berghmans et al.  1996).

 

In their summary tables we read:

 

Burgio 1986

1.  Potential for selection bias (not random, was matched)

 

And in their methodology review, TEC says:

 

The trial by Burgio et al.  (1986), while stratified to balance the arms on age and frequency of incontinence, was not randomized.

 

To fault Burgio here seems to contradict what they had promised in the introductory sections, where they admitted:

 

"Controlled trials that are nonrandomized, while prone to selection bias, may also provide sufficient evidence of efficacy if the comparability of the treatment arms can be adequately assessed."

 

Since Burgio DID report at least the most important characteristics of the two groups, and they were not significantly different on age and incontinence, this would seem to qualify under the rules TEC set down.

 

The same is true of the other study that Burgio participated in, Burton et al 1988, where patients were matched and the distribution into treatment groups was also shown to be without prejudice.  However, we have previously shown that Burton did not qualify for this review, since NONE of Burton's control-group urge patients got ANY PME-Alone, as required for this analysis.

 

Other studies were faulted mostly for performance and attrition bias; or rather, for the "potential" for them.

Specifically:

 

Shepherd 1983

1.  Potential for performance bias.  (5.7 vs. 3.5 sessions)

2.  Potential for attrition bias.  (0 vs. 27% dropouts 3/11)

 

Ceresoli 1993

1.  Potential for performance bias.  (6 weeks vs. 3 months)

2.  Potential for selection bias.  (not randomized)

 

Glavind 1996

1.  Potential for performance bias.  (BF got 4 more sessions)

2.  Potential for attrition bias.  (5% vs. 25% dropouts; 1/20, 5/20)

 

(Note that the two studies that got better BFB results got more BFB time (attention); but the apparent exception, Ceresoli, got equal results when the PME group got more time.)

 

 

Berghmans 1996 - none

Burns 1993 - none

 

The concepts of "performance bias" and "attrition bias" are especially troublesome when evaluating biofeedback research.  The Burgio, Shepherd, and Glavind biofeedback treatments resulted in significantly better outcomes, but they are faulted because they also took longer than PME alone (even when it wasn't really "alone").

 

Ceresoli got "similar" results with biofeedback vs. PME alone, but PME subjects needed twice as much time as the biofeedback subjects did to get that level.  If the PME subjects had been evaluated at 6 weeks, instead of 13, it seems most likely that biofeedback would have produced "superior" results, just like the other three.  So Ceresoli really belongs in the "biofeedback" column, leaving only Berghmans and Burns with null results.

 

PERFORMANCE BIAS  The concept of "performance bias" is, on closer examination, much more complex than it first appears.  The assumption is that treatments can be directly compared in a raw, quantitative way.  For instance, 6 weeks of biofeedback vs. 6 weeks of PME alone.

 

By this logic, one could also compare the effects of 10 mg of aspirin with 10 mg of morphine.  But that would be inappropriate, since the drugs operate by very different mechanisms and there are very different standard dosages.

 

But biofeedback and pelvic muscle exercise also operate by very different mechanisms.  Biofeedback relies on a neurological enhancement that goes beyond the mere passive exercise of the muscle fibers.

 

[We have previously discussed Kari Bo's argument that biofeedback results are too rapid to be explained by principles of sports physiology.  We agree.  Biofeedback involves neurological training which is much more than just physical exercise alone.]

 

In recently published research, Detrol was given in 2 mg bid, whereas Ditropan XL was given in a single 10 mg pill.  Would we call that a "performance bias"?  After all, they DID get more than twice as much Ditropan as Detrol!

 

Obviously a solution to this dilemma is to disregard the time factor and ask "how much improvement can a patient get with all the biofeedback they need", vs. how much improvement can they get with PMEs alone?

 

Then we compare the time factors and see which is more efficient.  This is slightly awkward for researchers who want to get results in a "reasonable" time frame, but it is infinitely more scientific.

 

Note that the only published protocol for the use of Biofeedback in clinical practice, the Perry Protocol (1990), states that all patients are entitled to be treated until their incontinence is resolved.  "One-size-fits-all" treatments are considered unethical in clinical practice.

 

Another approach would be to use more sophisticated statistics which would project these time differences.  For instance, if 75% of biofeedback patients are dry after 8 weeks, how long would it take to achieve the same level with PME alone?
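A minimal sketch of such a projection follows.  The weekly percentages are invented purely for illustration (they come from no study), and the helper name `weeks_to_reach` is hypothetical; the point is only to show how the time needed to reach a common target can be read off each group's progress curve by linear interpolation.

```python
def weeks_to_reach(weekly_pct_dry, target_pct):
    """Return the (fractional) week at which target_pct is first reached.

    weekly_pct_dry[i] is the percent of subjects dry at the end of week i+1;
    week 0 is assumed to start at 0% dry.  Returns None if never reached.
    """
    if target_pct <= 0:
        return 0.0
    previous = 0.0
    for week, pct in enumerate(weekly_pct_dry, start=1):
        if pct >= target_pct:
            # Interpolate linearly between the last two weekly measurements.
            return (week - 1) + (target_pct - previous) / (pct - previous)
        previous = pct
    return None


# HYPOTHETICAL weekly results, invented for this example only.
biofeedback_pct_dry = [10, 25, 40, 50, 60, 68, 72, 75]
pme_alone_pct_dry = [5, 10, 15, 22, 28, 35, 40, 46, 52, 57, 62, 67, 71, 75, 78]

if __name__ == "__main__":
    target = 75
    bfb_weeks = weeks_to_reach(biofeedback_pct_dry, target)
    pme_weeks = weeks_to_reach(pme_alone_pct_dry, target)
    print(f"Biofeedback reaches {target}% dry at week {bfb_weeks}")
    print(f"PME alone reaches {target}% dry at week {pme_weeks}")
    print(f"Extra weeks needed with PME alone: {pme_weeks - bfb_weeks}")
```

A real analysis would need each study's actual week-by-week data and probably a proper time-to-event method; the sketch only shows the shape of the comparison.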

 

ATTRITION BIAS  The other sword hanging over biofeedback studies in the TEC report is "attrition bias".  Two studies that otherwise demonstrated the superiority of biofeedback over PMEs alone were faulted because too many patients in the PME-alone condition dropped out [27% in Shepherd, 25% in Glavind].

 

How should we understand this?  What do we know from clinical experience about "dropouts"?

 

I'm trying hard to remember, but I can't recall ever hearing a patient say "I'm dropping your biofeedback program because I don't need it any more -- I'm cured!"  What I do remember is some patients dropping out because they thought the effort -- we required an hour a day of home practice -- too time consuming.  In the words of one patient, "Why should I keep doing these stupid exercises when my surgeon says he can fix me in a few minutes, permanently?"

 

I think we can estimate that 95% of dropouts are dissatisfied with the effort-to-results ratio they have seen.

 

According to NAFC and similar groups, it can take "several months" before PMEs Alone produce results.  Most biofeedback programs claim results in "several weeks".  Therefore, it would be a reasonable hypothesis to predict that more subjects would drop out of the PME Alone condition than the biofeedback condition, and that is precisely what Shepherd and Glavind found.  Should they be faulted for that?

 

In future research we recommend the inclusion of the "dropout hypothesis" in all comparisons.
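One way such a "dropout hypothesis" might be tested is with a one-sided Fisher exact test on the dropout counts.  The sketch below is an illustration only, not an analysis performed by TEC or by the original authors.  It uses the Glavind counts quoted from the TEC tables (1 of 20 vs. 5 of 20) and assumes the first figure is the biofeedback arm and the second the PME-alone arm; the function names are hypothetical.

```python
from math import comb


def dropout_rate(dropped, enrolled):
    """Fraction of enrolled subjects who dropped out."""
    return dropped / enrolled


def one_sided_fisher(drop_a, n_a, drop_b, n_b):
    """One-sided Fisher exact test for 'group B drops out more than group A'.

    Returns the probability, with all table margins held fixed, of seeing
    drop_b or more dropouts in group B if dropout were unrelated to group.
    """
    total_drop = drop_a + drop_b
    total_n = n_a + n_b
    denominator = comb(total_n, n_b)
    p_value = 0.0
    for k in range(drop_b, min(total_drop, n_b) + 1):
        # Hypergeometric probability of exactly k dropouts in group B.
        p_value += comb(total_drop, k) * comb(total_n - total_drop, n_b - k) / denominator
    return p_value


if __name__ == "__main__":
    # Glavind 1996 as quoted from TEC: assumed biofeedback 1/20, PME alone 5/20.
    print(f"Biofeedback dropout rate: {dropout_rate(1, 20):.0%}")
    print(f"PME-alone dropout rate:   {dropout_rate(5, 20):.0%}")
    print(f"One-sided p-value:        {one_sided_fisher(1, 20, 5, 20):.3f}")
```

The Shepherd figures cannot be run the same way from the table alone, since only the PME-alone denominator (3/11) is given there.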

 

 

IS THIS "PME-ALONE"?

 

There was one other study cited by BC/BS-TEC that, at first glance, appears to contradict this discussion.  Their one example of PME+BFB vs. PME Alone for post-prostate incontinence was Franke JJ, Gilbert WB, Grier J et al.  (2000).  Early post-prostatectomy pelvic floor biofeedback.  J Urol, 163(1):191-3.

 

And they charge:

 

1.  Potential for attrition bias.  (33% (BFB) vs. 13% dropouts)

2.  Effect of treatment possibly diluted by spontaneous improvement in both groups.

 

They further state:

 

"Pts randomized into one of two groups.

"Randomization process not described.

"PME alone – educational materials given, no specific instruction in PME."

 

Excuse me?  If no instructions were given for PME Alone, how can they be called a "PME alone" control group?

 

The text is clear:

 

"Those in the control arm received no instruction and were asked to return a voiding diary and 48-hour pad test at the routine followup visits. It is not known whether controls performed pelvic floor exercises without instruction to do so.  (191)"

 

In addition, they got a standard packet of information for prostate surgery patients, but....

 

"There is no mention of pelvic floor exercises in this literature."

 

Two things are obvious.  First, it was virtually impossible to "drop out" of the control group, since the evaluation of incontinence was merely a routine part of their post-operative care.  Who wouldn't go back for an insurance-paid post-op inspection?

 

Second, the "control group" was not asked to do PMEs and they were not asked if they had done any.  So how can BC/BS-TEC claim that this is a "PME-alone" control group?  Even the authors don't make that claim.

 

 

In retrospect, it is interesting to note that BC/BS-TEC has included two null reports -- Burton for Urge and Franke for Post-prostate -- that in fact don't even meet their own criteria for having a "PME-alone" control group.

 

It is hard to believe that BC/BS-TEC could make such a gross scholarly error, not once, but twice.

 

On the other hand, perhaps it was a calculated risk; they may have assumed that no one would notice that Burton and Franke didn't have PME-alone control groups, and should have been excluded from the get-go.

 

In defense of this theory, it certainly would have been an embarrassment if they had reported that the "Question" (PME vs PME+BFB) proved so narrow that there was NO research in two of the three categories (Stress, Urge and P-P) that could be examined.  That would have raised the question "Why not compare 'no treatments' and 'alternative treatments', just like you did for electrical stimulation?"  But that would have brought Burgio 1998 into the picture.

 

Based on Berghmans' assessment of Burgio 1998 as methodologically the best study ever done on biofeedback (He rated it 8.5 on a 10 point scale, 1.5 points higher than the next best study), it would have drastically changed the balance of power in favor of the "effectiveness of biofeedback".

 

 


10.  Three additional forms of bias

 

Memo To:  Dr. Hugh Hill, HCFA

 

Re:  A Discussion of Trainer Bias, Protocol Bias, and Instrumentation Bias and their implications for the BC/BS-TEC report.

 

Date:  June 4, 2000

---------------------------------------------------------------

 

The BlueCross/BlueShield-Technology Evaluation Center's Review of Biofeedback Research for the Treatment of Incontinence concluded that only two "scientific" studies (i.e., controlled experimental studies) met their criteria of NOT containing potential sources of methodological bias -- Berghmans 1996 and Burns 1993.

 

Since both studies also failed to show a statistically significant difference between Biofeedback experimental groups and so-called "PME-Alone" control groups, TEC concluded that there was no evidence that biofeedback contributed anything to "PME Alone".

 

The fact that biofeedback showed a "trend" towards better results (Burns) or faster results (Berghmans) is not considered convincing evidence.

 

There are, theoretically, two broad reasons why experimental and control groups could be statistically indistinguishable as in these two studies.

 

First, the results in the experimental group could be exceptionally BAD.

 

Second, the results in the control group could be exceptionally GOOD.

 

And, of course, it is possible that BOTH of these conditions could occur simultaneously.

 

The Burns and Berghmans studies appear to include both of these features.  At the time of publication, Burns set a new record LOW for published success using EMG biofeedback, 61% symptom reduction.

 

Three years later, her bad results were beaten by a NEW record LOW of 54% set by Berghmans et al.

 

In both cases, "PME-alone" control groups -- which actually received much more than "PME-alone" -- performed unusually well, thus leading to non-significant group differences.

 

TEC concludes from this that "no differences" is the norm, but TEC overlooked important sources of bias in these studies.

 

TEC's "external" methodological considerations of possible sources of bias consisted of three possible factors -- Performance bias, Attrition bias, and Selection bias.  According to TEC, none of these sources of bias were present in the "B&B" studies, and, therefore, their non-significant conclusions were ruled valid.

 

We have already pointed out that scientific procedure does NOT permit the fallacy of "ACCEPTING the null hypothesis", which in this case means drawing the conclusion that THERE IS NO DIFFERENCE.

 

All that we can scientifically say is that we CANNOT prove that there IS a difference, which is a very different matter. 

 

In testimony presented to HCFA (but NOT distributed to the Med-Surg Panel, in violation of announced procedures), we have also pointed out that most surgical and behavioral treatments share several characteristics with each other that are not common to pharmacological treatments upon which the "RCT" methodology is based.

 

In drug research, the purity of the active ingredient, and therefore its potency, is subject to prior external control by the FDA.  "Good Manufacturing Procedures" require constant and complete testing by chemists to assure that when a research subject is given, say, 10 mg of Detrol, they really ARE getting 10 mg of Detrol.  Thus the evidence-based analysis of Detrol studies does NOT need to look at the composition and quality of the drug treatment furnished to subjects; it can safely be assumed.

 

For a variety of reasons discussed below, this assumption can NOT be made in EITHER surgical OR behavioral research analysis. 

 

We leave it to surgeons to articulate the implications of this for surgical studies, and present three sources of bias that need to be evaluated in the analysis of behavioral research designs.  In the present report we will focus on biofeedback, but most of these points apply to the entire field of behavioral interventions.

 

-------------------------------------------------------------

1.  TRAINER BIAS - Trainer bias refers to the demonstrated professional skills level of the person making a biofeedback or other behavioral intervention. 

 

Surgeons would be outraged if a study of the efficacy of collagen injections was presented in which the clinician's credentials were that of a Nurse Practitioner with no additional training by the manufacturer.  (NPs give "injections", right?)  When such "injections" fail to achieve the expected therapeutic result, they would rightly complain that it wasn't the collagen's fault but, most likely, the clinician's lack of general skills and specific training that caused the failure.

 

In the same way, biofeedback clinicians are outraged when persons with no biofeedback credentials at all, AND no additional training from the manufacturer, are allowed to do "research" with biofeedback instruments.

 

When such "biofeedback training" fails, they rightly complain that it isn't the instrument's fault, but, most likely, the clinician's lack of general skills and specific training that caused the failure. 

 

We note in this regard that the American Psychological Association has long held that a mere weekend workshop is usually not sufficient training for a psychologist to undertake a new specialty, such as biofeedback or incontinence training.

 

For nearly two decades the Biofeedback Certification Institute of America has conducted a national program which guides the formal (classroom and apprenticeship) training of biofeedback practitioners.  A comprehensive written professional examination AND a hands-on demonstration of clinical skills are required in order to use the designation "BCIA-C" after one's name to indicate such formal certification.

 

In addition, continuing education requirements must be met to re-certify every three years.

 

In order to be above suspicion in biofeedback research, the clinician delivering the experimental treatment should be BCIA-Certified (just as the surgeon should be Board-Certified in his or her specialty).

 

In addition, the treatment of pelvic muscle dysfunctions with EMG biofeedback presents a variety of special problems not found in general EMG muscle rehabilitation, so specialized training beyond BCIA is essential to properly use these instruments and treat these patients.

 

Several of the major incontinence-equipment manufacturers, as well as American and European biofeedback professional organizations, and both nursing and physical therapy associations, offer such specialized training on a regular basis, usually two or three times a year, EACH.

 

Mere attendance at an "incontinence workshop", without BCIA certification, would not remove the heavy cloud of suspicion about "Trainer Bias" that otherwise renders a biofeedback research project inconclusive. 

 

In addition to these technologically-oriented professional skills, it is widely acknowledged (by the AHCPR guidelines, for example) that one of the requirements of a good biofeedback trainer is the ability to motivate patients to a high level of performance. 

 

In this regard, a biofeedback trainer is much like an athletic coach, or more aptly, a "personal fitness trainer" who generates enthusiasm for the biofeedback process and motivates the patient to put in many hours of home practice between office visits. 

Since one of the training tools involves ensuring that the patient understands and appreciates the intermediate success of the biofeedback training, it should be obvious that not only double but also single blinding is impossible in this field, and sham feedback would be impossible as a practical matter.  (The therapist would have to explain the significance of false or sham data which did not correspond to the subject's own proprioception.)

 

[In this regard, it is instructive to note that highly touted attempts to create "virtual reality" scenes through visual presentations have fallen flat, because the subject knows from inner-ear signals, that s/he is still sitting in a chair on the floor, and not flying through space!]

 

2.  PROTOCOL BIAS - Protocol bias refers to the optimization of biofeedback training procedures to ensure that the patient or subject derives maximum benefit from the experience. 

 

The original protocol, developed by Arnold Kegel, MD, in the late 1940s, had two essential elements that are still considered valid today.

 

First, every one of Dr.  Kegel's patients engaged in daily at-home biofeedback with a biofeedback instrument, in addition to periodic office visits to evaluate practice.

 

Dr.  Kegel was convinced that faithful daily biofeedback practice was critical to the success of his protocol; so convinced, indeed, that he published a chart showing a decline in measured muscle strength of a few points following a single day of "skipped" practice. 

 

Any research project that does not afford subjects the opportunity to use biofeedback on a daily basis may be considered to demonstrate "protocol bias", such that insignificant results are more likely attributable to the protocol than to "biofeedback" itself.

 

In practice there are reasonable modifications of Kegel's protocol; for instance, having the patient use the clinic's office-level biofeedback instruments on a daily basis, as was done in an Italian study some years ago.

 

But insofar as the research deviates from Kegel's daily practice protocol, any LACK of significant results raises the issue of protocol bias.  Positive results, of course, can demonstrate that the new protocol (for example, three office visits a week) was NOT a handicap.  But lack of positive results is most likely the result of an inferior biofeedback protocol.

 

The second feature of Dr.  Kegel's protocol is that the length of the biofeedback treatment was always tailored to the experience of the individual patient.

 

Kegel recognized that, in internet jargon, EPID ("every person is different").  Since pelvic muscle rehabilitation is both a learning and an exercise problem, some people will need more, while others need less, exposure to the biofeedback protocol. 

 

It is possible to statistically control for Protocol Bias, by measuring and plotting the daily (Kegel) or weekly (most modern systems) progress in both Treatment effects (also called "intermediate effects") as well as outcome effects (symptom reduction).  [This is not very different from analyzing Attrition bias for trends.]  In such controlled research, an averaged plot of progress in experimental and control groups would show when, if ever, their respective trend lines would cross, somewhat like Ceresoli showed that 6 weeks of biofeedback gave the same results as 13 weeks of PME-alone.

 

Unfortunately, this is seldom done; the experimenter determines a priori and with no empirical justification that 4, 6, or 8 weeks of "treatment" under both conditions should be sufficient to demonstrate any differences.  This is akin to saying that each patient is getting 5 mg of oxybutynin, regardless of effects.

 

Many drugs are studied on the basis of a known relationship between drug concentration and body weight, since (in those cases) body weight may reasonably affect their potency.  In the behavioral field, interventions might be tailored to "brain weight", or, more precisely, an "expected speed of skills acquisition" based on the patient's age and cognitive and physical status.  But absent such sophisticated considerations, it is inappropriate to arbitrarily assume that "one size fits all"; it usually doesn't.

In summary, if the subjects are not provided with daily biofeedback of sufficient duration, and they do not receive apparent benefit from the experimental treatment, the burden of proof is on the researcher to show that they really did get "real" biofeedback, just like Dr.  Kegel taught.

 

 

3.  INSTRUMENTATION BIAS -- Instrumentation bias refers to the importance of having and using appropriate sensors and electronic instruments to perform minimally-effective biofeedback training.

 

It should be obvious, but a report on using common kitchen utensils to perform suspension surgeries would not reflect badly on the entire field of "surgery" if it failed.

 

Before it can be considered NOT to be a biasing factor, any new instrumentation must have been demonstrated to be effective, at least once, in producing positive results.  For example, several years ago a form of "underwear" with embedded EMG sensors was used in a biofeedback study that did not achieve significant results.  Unfortunately the electrode design did not provide reliable signals, so the only conclusion that can be drawn refers to the novel equipment, not to "biofeedback" as an intervention.

 

The sensors that have been shown to be most effective for biofeedback training are those which are inserted in the vagina or anus (or both).  This includes both manometric devices (Kegel, Burgio, Shepherd) as well as EMG sensors (Binnie, Taylor, Perry, Baigis-Smith, Williams, etc.)

 

Binnie et al (1991) showed that electrical stimulation electrodes (i.e., circumferential electrodes) were NOT effective at detecting EMG signals, compared with longitudinal electrodes, but the latter correlated highly (>0.91) with inserted needle EMG electrodes.  Use of proper electrodes is significant in biofeedback.

 

Very few studies have dealt with the advantages of modern computerized biofeedback instruments, although there is universal agreement among clinicians that for most biofeedback applications, modern equipment has clear advantages in training -- usually, it is in terms of making it easier for the patient to see and understand the objective of the training.

 

In the PT field of muscle rehabilitation in general, clinical opinion is clearly in favor of intelligent EMG devices which provide "pattern matching".  In pelvic muscle work, clinicians find somewhat simpler "Work-Rest" mode instruments valuable. 

 

[General purpose EMG biofeedback instruments are designed with a single target -- relaxation.  In pelvic muscle work, both "contraction" and "relaxation" are targets.  The most useful devices record and calculate "work" and "rest" averages separately to show progress on both items.]

 

In addition, there is considerable theoretical and clinical experience in support of the fact that both fast twitch and slow twitch muscle fibers have roles in maintaining continence, so the most useful biofeedback devices calculate both quick (short, or flick) values and sustained (hold, or 10-second endurance) values, in order to determine deficiencies and, therefore, to set practice objectives.

 

While these issues have not been the subject of empirical study, the fact is that all biofeedback instrument manufacturers seeking the attention of incontinence clinicians provide these features on their instruments.  In general, experienced clinicians agree that the standardization of data collection facilitates understanding the patient's condition and prescribing the most appropriate combination of therapeutic interventions.  Most would scoff at the research assumption that all patients should be given the same exercise assignment each week.

 

 

A RE-EXAMINATION OF TEC In view of these three additional sources of bias that may render research conclusions unscientific, we now propose to reconsider the "outcomes" tables in the TEC report to see how their "no bias" studies fare.

 

------------------------------------------------

Berghmans et al, 1996

1.  Trainer bias

2.  Protocol bias

3.  Instrumentation bias

 

1.  Trainer bias.  The credentials of the clinician in this report are not specified.  It is doubtful if the person was generally qualified, or specifically trained for this task.

 

The therapist did NOT present or discuss EMG values with the subjects, so opportunities for and evidence of subject motivation were minimal. 

 

2.  Protocol bias.  Berghmans attempted to use a new intensive physical therapy model developed by Bo for biofeedback.  It has never been shown effective in any biofeedback study, and was not effective here.

 

In addition, Berghmans did not provide daily biofeedback practice, and limited the study to an exceptionally short 4 week period.

 

3.  Instrumentation bias.  Berghmans used an electrical stimulation electrode (Verimed, USA) instead of an EMG sensor, so the quality of the signal was clearly sub-standard.

 

Berghmans used a general-purpose EMG instrument which did not quantify work-rest intervals.  (And other details are impossible to verify.)

 

Therefore, with at least six possible sources of bias, the conclusions of this study cannot be considered as "scientific evidence".

 

-----------------------------------------------------

Burns et al, 1993.

1.  Trainer bias

2.  Protocol bias

3.  Instrumentation bias

 

1.  Trainer bias.  The therapist treating experimental subjects was supposed to obtain BCIA certification (according to her grant application) but did NOT do so. 

 

Nor is there evidence that this therapist received any specialized training from anyone with established skills in the field.

 

2.  Protocol bias.  Burns' subjects were trained in the office with biofeedback instruments, but were expected to practice their exercises at home with empty vaginas -- a very different condition. 

 

They were not given home biofeedback instruments for daily practice, as Kegel had shown to be effective.

 

They were all treated for the same eight weeks, without regard to individual differences in learning skills.  Compared with the clinical literature, they made very poor progress.

 

3.  Instrumentation bias.  Burns used a general-purpose EMG "stand-alone" (early office) device that was designed for relaxation training, not for muscle rehabilitation.

 

Burns used a private definition of contraction strength, the "best five seconds out of ten", instead of the standard ten-second average.  Therefore, her intermediate result (an increase from 2.0 to 4.0 microvolts in the experimental group) is much more a "fast twitch" measurement than a true sustained score.  [i.e., if 10-second scores were used, she probably would not have a significant treatment effect, which would account for the absence of a significant outcome effect.] We can't really say if Burns' subjects received any benefit from their "biofeedback treatment". 
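The difference between the two scoring rules can be illustrated with a short sketch.  The one-second EMG readings below are invented for the example and are not Burns' data, and "best five seconds out of ten" is assumed here to mean the best contiguous five-second window, since the exact rule is not spelled out above.

```python
def ten_second_average(samples):
    """Standard score: mean of all ten one-second readings (microvolts)."""
    return sum(samples) / len(samples)


def best_five_of_ten(samples, window=5):
    """Assumed 'best five seconds' score: highest mean of any contiguous
    five-second window within the ten-second contraction."""
    return max(
        sum(samples[start:start + window]) / window
        for start in range(len(samples) - window + 1)
    )


# HYPOTHETICAL contraction: a strong start that fades, as happens when
# endurance is poor.  One reading per second, in microvolts.
fading_contraction = [6.0, 5.5, 5.0, 4.5, 4.0, 3.0, 2.5, 2.0, 2.0, 1.5]

if __name__ == "__main__":
    print("Ten-second average:", ten_second_average(fading_contraction))
    print("Best five of ten:  ", best_five_of_ten(fading_contraction))
```

For a ten-second trace, the best five-second window can never score lower than the ten-second average, and the gap widens the more the contraction fades, which is why the two measures cannot be compared directly.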

 

An additional note: Burns' study population included only about 10% of her recruited subjects.  There were so many exclusionary criteria that she ended up with a pool of not only elderly, but also debilitated subjects (at least in terms of pelvic conditions).  Proof of this is found in the fact that her research subjects AFTER therapy were still much worse than her PILOT subjects had been BEFORE her therapy.

 

Conclusion: With at least six unresolved sources of potential bias, the Burns study is NOT of sufficient methodological quality to provide "scientific" evidence about the effectiveness of biofeedback in the treatment of incontinence.

 

 

Summary: We have shown that the two studies cited by TEC as being "without bias" when considered only externally were indeed fraught with bias problems when the quality of their "biofeedback" intervention is examined.

 

From the perspective of persons trained in biofeedback, both the Burns and Berghmans studies must be dismissed as not providing scientific evidence -- one way or the other.

 

Since BC/BS has already dismissed the remaining four (Stress) studies, and we have previously shown that the one Urge study and the one Post-prostatectomy study did NOT include "PME-alone" control groups, the only rational conclusion is that there is NO scientific evidence that can be brought to bear on this subject.

 

In the words of the Industry Representative on the Med Surg Panel, it appears that TEC has set the bar too high to address the question.

 

Therefore, we are forced to agree with the American Medical Association (and others) who have urged that other levels of evidence, including clinical research, be considered by HCFA when attempting to address the issue at hand. 

 

We also agree with the Interim Recommendations, that persons with expert familiarity with the topic under discussion need to be invited to participate in the technology assessment. 

 

At the very least that ought to include consultation with the only professional organization devoted to the study of biofeedback, the Association for Applied Psychophysiology and Biofeedback (AAPB).  That was not done. 

 

 

Respectfully submitted,

John D.  Perry, PhD, MDiv, BCIA-C (Senior Fellow)

 

 

Dr. Hill replied:

 

 

 


Copyright (c) 2000 IncontiNet.com

URL: www.incontinet.com/april12.htm

Last edited on 05/04/05
