A day or two ago, I posted some comments on a study
published in the Journal of Child and Adolescent Psychiatric Nursing by Karyn Purvis and colleagues. That study
claimed, on the basis of before-and-after tests of child characteristics, that
a day camp experience had a significant ameliorative effect on symptoms shown
by adopted children; I explained why the design of the study and the nature of
the treatment could not be used to draw this conclusion.
Now I find that Purvis and her colleague David Cross
had earlier published in Adoption
Quarterly another report about the effects of the day camp (“Improvements
in salivary cortisol, depression, and representations of family relationships
in at-risk adopted children utilizing a short-term therapeutic intervention”, 2006,
Vol. 10, pp. 25-43). In this study, the authors looked at 12 children whose age
range was not clearly stated (a younger group with a mean age of a little over
4 years, and an older group with a mean age of a little over 10). These adopted
children had been in their adoptive homes for between 1.5 and 11 years, and had
been gathered through referrals from parent support groups and child and family
therapists. This is a tiny and variable group, and in fact it would be most
surprising if any clear results came out of a properly-done study.
Purvis and Cross used a measure of salivary cortisol
as an indicator of the children’s distress and arousal. They predicted that
there would be a reduction in cortisol levels associated with the camp
experience (although in fact their brief literature review noted one paper showing lower cortisol levels in maltreated children, as well as a number of papers connecting high levels with stress). They measured salivary cortisol
during the week prior to camp and the week after camp, as well as on Mondays
and Wednesdays during the five day-camp weeks.
Reporting the results of the cortisol measures,
Purvis and Cross state that although “salivary cortisol was measured three
times a day, only the morning data are show[n] because there were no statistically
significant differences for the other measurement times (noon, afternoon)” (p.
34)[!]. They then present a table showing the results of t-tests comparing morning measurements for weeks 1, 3, and 5 to
pre-test salivary cortisol measures. Two of these, for weeks 1 and 3, were
significant at the .05 level. In addition, the table shows highly significant
differences between the three weekly measures and the post-test measure-- the post-camp cortisol reading being
significantly higher than the
measures during camp, and indeed rather higher than the pre-camp measure!
Words temporarily fail me as I look at this report,
which resembles an intentionally easy problem I might set as a present to weak
students on a research methods exam. But let’s soldier on. First: is it okay to
leave out of a table comparisons where there was no significant difference
found? No, Virginia, that is not okay, and in the trade we refer to it as “cherry-picking”.
The table as it stands makes it appear that a high proportion of the comparisons came out significant, whereas in fact there should have been ten comparisons shown rather than six, and five significant differences out of those ten rather than five out of six.
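For readers who want to check where the figure of ten comes from, here is a minimal Python sketch; it simply enumerates every possible pairing of the five morning measurement points (the labels are mine, not the authors'):

from itertools import combinations

# The five morning cortisol measurement points discussed above.
# The count of all possible pairings among them is the ten comparisons
# a complete table would have shown.
time_points = ["pre-camp", "week 1", "week 3", "week 5", "post-camp"]

pairs = list(combinations(time_points, 2))
print(len(pairs))  # 10
for a, b in pairs:
    print(a, "vs.", b)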
But hey, that still sounds like a lot, doesn’t it—five
out of ten? Sure, that’s how it sounds, but this is exactly the reason why a
study like this should use analysis of variance to examine all the comparisons
at the same time, rather than multiple t-tests.
Here’s the deal: at the .05 probability level, by definition about 5 tests out of 100 will appear “significant” by chance alone, even when there is no real difference present. With each additional t-test run on the same set of data, you increase the probability of turning up a difference that looks significant but has actually occurred by chance. Analysis of variance examines all the measures in a single test and avoids this problem, and that’s what should have been done here.
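To put a rough number on it, here is a back-of-the-envelope sketch; it assumes, generously, that the ten tests are independent, which repeated measures on the same children are not, so the figure is only an approximation:

# Per-test significance level and number of pairwise comparisons.
alpha = 0.05
n_tests = 10

# Probability of at least one spurious "significant" result across all tests.
family_wise_error = 1 - (1 - alpha) ** n_tests
print(round(family_wise_error, 2))  # about 0.40

In other words, running ten t-tests at the .05 level gives roughly a 40 percent chance of at least one apparently significant difference arising by chance alone, which is exactly why a single omnibus test is called for.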
Let’s also look at what the results seem to
indicate. Taking cortisol level as an indication of troubled mood, we see that
the children improved quickly as seen in weeks 1 and 3-- that their cortisol was significantly lower
at those measures than it had been before they started camp. By week 5,
however, although the cortisol reading is lower than at the pre-test, it is no
longer significantly lower. Did the treatment “work” quickly to begin with,
then “stop working”? And how is it that if this treatment is effective, the
post-camp cortisol readings are significantly higher than the readings taken
during camp, and somewhat higher than they were before the treatment began? Isn’t
a treatment supposed to have a longer-term effect than this?
Considering both the pre- and post-test design used
by Purvis and Cross, and the very peculiar set of differences presented to us,
I can only assume that we are seeing the effect of the confounding variables
natural to this type of design-- the
confounding variables that are the reason this design cannot be used to claim
that an intervention is an evidence-based treatment (EBT). Confounding
variables here could be as simple as the different circumstances under which
the cortisol sample was taken at home and at the camp, although there are many
other possibilities of the kind I discussed with respect to the more recent day
camp publication. The “scientific” study of cortisol levels gives no advantages
when the design is this weak, although, as Dr. Spock used to say about alcohol
rubs, it smells important.
Once again, it seems that a peer-reviewed journal’s
reviewers did not do their job, to the possible detriment of adoptive families
and children. On the evidence presented here, use of this day camp intervention may simply waste family resources and delay access to effective treatment.