Wednesday, November 6, 2013
More Day Camp Pre- and Post-Treatment Claims Outrun the Evidence
A day or two ago, I posted some comments on a study published in the Journal of Child and Adolescent Psychiatric Nursing by Karyn Purvis and colleagues. That study claimed, on the basis of before-and-after tests of child characteristics, that a day camp experience had a significant ameliorative effect on symptoms shown by adopted children; I explained why the design of the study and the nature of the treatment did not support this conclusion.
Now I find that Purvis and her colleague David Cross had earlier published in Adoption Quarterly another report about the effects of the day camp (“Improvements in salivary cortisol, depression, and representations of family relationships in at-risk adopted children utilizing a short-term therapeutic intervention”, 2006, Vol. 10, pp. 25-43). In this study, the authors looked at 12 children whose age range was not clearly stated (a younger group with a mean age of a little over 4 years, and an older group with a mean age of a little over 10). These adopted children had been in their adoptive homes for between 1.5 and 11 years, and had been recruited through referrals from parent support groups and child and family therapists. This is a tiny and variable group, and in fact it would be most surprising if any clear results came out of a properly done study of it.
Purvis and Cross used a measure of salivary cortisol as an indicator of the children’s distress and arousal. They predicted that there would be a reduction in cortisol levels associated with the camp experience (although in fact in their brief literature review they noted a paper showing lower cortisol levels for maltreated children, as well as a number connecting high levels with stress). They measured salivary cortisol during the week prior to camp and the week after camp, as well as on Mondays and Wednesdays during the 5 day camp weeks.
Reporting the results of the cortisol measures, Purvis and Cross state that although “salivary cortisol was measured three times a day, only the morning data are show because there were no statistically significant differences for the other measurement times (noon, afternoon)” (p. 34)[!]. They then present a table showing the results of t-tests comparing morning measurements for weeks 1, 3, and 5 to pre-test salivary cortisol measures. Two of these, for weeks 1 and 3, were significant at the .05 level. In addition, the table shows highly significant differences between the three weekly measures and the post-test measure: the post-camp cortisol reading was significantly higher than the measures during camp, and indeed rather higher than the pre-camp measure!
Words temporarily fail me as I look at this report, which resembles an intentionally easy problem I might set as a present to weak students on a research methods exam. But let’s soldier on. First: is it okay to leave out of a table comparisons where there was no significant difference found? No, Virginia, that is not okay, and in the trade we refer to it as “cherry-picking”. The table as it stands makes it appear that a high proportion of the comparisons were significant, whereas in fact there should have been ten comparisons shown rather than six, and five significant differences out of those ten rather than five out of six.
But hey, that still sounds like a lot, doesn’t it: five out of ten? Sure, that’s how it sounds, but this is exactly the reason why a study like this should use analysis of variance to examine all the comparisons at the same time, rather than multiple t-tests. Here’s the deal: at the .05 probability level, we expect about 5 tests out of every 100 to appear “significant” by chance alone, even when no real difference exists. With each additional t-test on a set of data, you increase the probability that at least one apparently significant difference is really just a chance result. Analysis of variance guards against this problem by testing the whole set of differences at once, and that’s what should have been done here.
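To put rough numbers on the multiple t-test problem, here is a minimal Python sketch. It assumes the tests are independent of one another, which repeated measures on the same 12 children certainly are not, so the figures are illustrative rather than exact. The function name familywise_error_rate is my own invention for this example, not anything taken from the paper.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among num_tests tests.

    Assumption (not from the paper): the tests are independent and
    every null hypothesis is true, so each test has probability
    alpha of looking "significant" purely by chance.
    """
    # Probability that no test is a false positive is (1 - alpha)^k;
    # the complement is the chance that at least one is.
    return 1.0 - (1.0 - alpha) ** num_tests


# One test, the six comparisons in the published table, and the ten
# comparisons that should have been shown.
for k in (1, 6, 10):
    print(f"{k:2d} tests: {familywise_error_rate(k):.3f}")
```

Under that independence assumption, six tests at the .05 level carry roughly a 26% chance of at least one spurious “significant” result, and ten tests carry about a 40% chance.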
Let’s also look at what the results seem to indicate. Taking cortisol level as an indication of troubled mood, we see that the children improved quickly, as seen in weeks 1 and 3: their cortisol was significantly lower at those measures than it had been before they started camp. By week 5, however, although the cortisol reading is lower than at the pre-test, it is no longer significantly lower. Did the treatment “work” quickly to begin with, then “stop working”? And how is it that if this treatment is effective, the post-camp cortisol readings are significantly higher than the readings taken during camp, and somewhat higher than they were before the treatment began? Isn’t a treatment supposed to have a longer-term effect than this?
Considering both the pre- and post-test design used by Purvis and Cross, and the very peculiar set of differences presented to us, I can only assume that we are seeing the effect of the confounding variables natural to this type of design. These confounding variables are the reason this design cannot be used to claim that an intervention is an evidence-based treatment (EBT). Confounding variables here could be as simple as the different circumstances under which the cortisol sample was taken at home and at the camp, although there are many other possibilities of the kind I discussed with respect to the more recent day camp publication. The “scientific” study of cortisol levels gives no advantages when the design is this weak, although, as Dr. Spock used to say about alcohol rubs, it smells important.
Once again, it seems that a peer-reviewed journal’s reviewers did not do their job, to the possible detriment of adoptive families and children. The evidence here seems to be that use of this day camp intervention may simply waste family resources and delay access to effective treatment.