Wednesday 1 September 2010

Harvard misconduct: setting the record straight, Part 2

I have thought long and hard about my comments in my previous post. Did I overstep the mark by using the F-word? I shall freely admit that my interpretation of the information I was given by the relevant Dean’s office is heavily dependent on the knowledge that Harvard found Professor Hauser guilty of misconduct. I am assuming that Harvard abided by the Office of Research Integrity (ORI) policy on misconduct; the relevant information on what constitutes misconduct and how it must be investigated is described here (skip to p. 28384 - the numbering starts at 28370). It is important to realize that the burden of proof is with the accuser, that honest error or difference of opinion is not a basis for misconduct, and that the respondent (the accused) is given the opportunity to contest the findings of the investigation into misconduct and access to the evidence on the basis of which those conclusions were reached. And for ease of exposition, here is the ORI definition of misconduct (p. 28386):

Research misconduct means fabrication, falsification, or plagiarism in proposing, performing, or reviewing research, or in reporting research results. (a) Fabrication is making up data or results and recording or reporting them. (b) Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record. (c) Plagiarism is the appropriation of another person’s ideas, processes, results, or words without giving appropriate credit. (d) Research misconduct does not include honest error or differences of opinion.


I have already given one interpretation of the information I have received. But there is another. It is conceivable that the data were not fabricated, but rather that the experiment was set up wrong, and that nobody realized this until after it was published. As I detailed in my last post, the monkeys received two stimuli at test, and these were meant to be of two different types, but the investigation found that in fact they were both of the same type. So perhaps the computer program that generated the sounds was written incorrectly (this kind of thing happens); perhaps no one checked what sounds it was producing before running the monkeys on the procedure (this would be sloppy); no one scoring the monkeys’ responses listened to the sounds as they were playing (this would be appropriate given the method); and perhaps no one checking the scoring afterwards listened to those sounds either (this would be appropriate if the checking was only to ensure scoring consistency, but sloppy in respect of not verifying, after the experiment, that the monkeys had been listening to the right thing... after all, perhaps NO sound was being played to the monkeys! How could one know without listening? And if one did listen, given that the two test patterns were meant to be of different types but were in fact always the same, it would only require listening to one pair of test trials to know that the stimuli were wrong). So the error could go unnoticed if there were several breakdowns in experimental rigor, and the data would have been analysed assuming that the two test trials were of different types, when in fact they were of the same type.

So how come the paper retracted from Cognition reported a significant difference between the two types? Well... in principle, if you split the data into two halves, assuming that one half was of one type and the other half of another, even though both halves are in fact from the same condition, you could get a difference just through chance. Indeed, the statistic reported in the paper suggests that you’d get a difference due to chance around 1 in 50 times. So it’s not totally implausible.

Is this what happened? Well... if it is, the raw data would still show this chance effect (that the monkeys responded, just through chance, more on some of the trials than others) - it should be possible to recreate the data reported in the retracted Cognition article from the raw data. Evidently, though, that was not the case. If it had been, the investigation would have found an explanation for how one could get from the raw data to the published data. And the findings, as presented to me, do not suggest that videotapes had been lost (which would explain why the pattern of data reported in the article could not be replicated from the raw data) - the information I was given makes reference to the examination of the video recordings and the raw data. If the raw data are intact (and I have no reason to believe otherwise, on the basis of the information I have been given), then I am satisfied that there is no straightforward (or even more complex) explanation for how the published data were generated, except for the obvious one.
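To make the “difference through chance” point concrete, here is a minimal simulation sketch in Python. Everything numerical in it is my own illustrative assumption, not a detail from the paper or the investigation: I assume 14 monkeys, one continuous response score per monkey per test-trial “type”, and a paired t-test comparing the two “types”. The question it asks is simply: if both test trials in fact come from the same condition, how often does a conventional test nevertheless declare a “significant” difference at the roughly 1-in-50 level mentioned above?

```python
# A sketch of the "difference through chance" scenario. All numbers are
# illustrative assumptions: the post does not report the sample size, the
# response measure, or the statistical test actually used.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_monkeys = 14    # assumed sample size
n_sims = 100_000  # number of simulated experiments
alpha = 0.02      # roughly the 1-in-50 level implied by the reported statistic

# In this scenario the two test "types" were in fact the same stimulus, so
# both sets of scores are drawn from one and the same distribution.
type_a = rng.normal(size=(n_sims, n_monkeys))
type_b = rng.normal(size=(n_sims, n_monkeys))

# Test each simulated experiment for a "difference between types" with a
# paired t-test across monkeys.
_, p_values = stats.ttest_rel(type_a, type_b, axis=1)

print(f"Proportion of spurious 'significant' differences: "
      f"{np.mean(p_values < alpha):.3f}")
# Prints a value near 0.020: about 1 experiment in 50 shows a "difference"
# at this level even though both halves of the data come from the same
# condition.
```

Unsurprisingly, the proportion hovers around 0.02: a spurious split of same-condition data will, about one time in fifty, look just like the effect the paper reported. That is what makes this alternative scenario possible in principle, even if, as I argue below, it is not probable.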

So what should we conclude? It is conceivable that there was in fact no intent to deceive or fabricate, if we assume a whole chain of procedural errors including the loss of some of the original data (perhaps I am reading more into the Dean’s letter to me than I should). So if we suppose that data might have been lost, is it still more plausible to assume that there was misconduct here? We still do not know what actually did or did not take place in the Hauser lab. We do not know what the charges against Hauser actually were - that is, which aspects of the ‘workflow’ were deemed to constitute misconduct. We don’t even know for sure whether any of the charges are associated with the Cognition article specifically. The Dean’s publicly released letter said that “problems” were found, but it was not clear whether these problems amounted to misconduct. Instead, they could just have been as I described above: sloppy science compounded by a chance result and perhaps data loss (though, had the investigation concluded that data were indeed missing, I would have described the information I was given quite differently). So perhaps the charges of misconduct are about something else. And in that case, I would be the first to revise my view that, in respect of the article published in Cognition, there was an intention to deceive. I just find it hard to believe that a top lab, run by such a smart person, would compound error after error. But anything’s possible. The issue is whether it’s probable. And given what I have been told, I think the scenario I have outlined is, quite simply, not probable (this is not to say that the investigating team will not have considered it; I would be surprised if they had not).

It is time now, I believe, to step back and allow due process to conclude. Most likely, neither of the parties involved (Harvard, Hauser) is able to say anything publicly while federal investigations are still underway. My hope is that the investigation’s results will be published (I believe they will be), and that when he is able to, Hauser will himself give an account of what he did or did not do. But further conjecture is unlikely to yield new conclusions. My own interpretation may be wrong, in which case, with the right information, I will be the first to wish to correct it.

Unless something major happens to change my thinking about all this, I intend to resume my normal blog-specific activities as soon as is practicable - I much prefer to speculate publicly about the mundane and the personal. I am looking forward to the time when the F-word can resume its usual connotation.