Retracted papers – what happened at USENIX ATC’12?
At the opening session of this year’s USENIX Annual Technical Conference I mentioned that accepted papers were withdrawn, and commended the authors on their integrity. This created some interest, including people tweeting about it. I think the episode deserves some publicity.
There were, in fact, two papers which were withdrawn. Paper A was a “full” paper which was conditionally accepted by the program committee (PC), while Paper B was a “short” paper which was accepted. What happened?
Paper A
Paper A is actually a good example of the scientific peer-review process working well. The paper investigated a particular performance effect in storage devices, and at the PC meeting there was intense discussion of its merits.
The experts debated whether the result could be true and there was significant concern that the effect might be a fluke, potentially an issue with a particular set of devices. On the other hand, one PC member with access to confidential manufacturer data from a similar (yet unreleased) device had evidence that it might be real. If the effect was real, the work would be quite interesting, and definitely worth publishing.
After much debate, we decided to accept the paper conditionally, subject to the authors providing satisfactory answers to critical questions from the shepherd (who was an expert on these devices and the main sceptic).
I should add that, like most systems conferences, ATC uses shepherding of accepted papers. The shepherd is a PC member who needs to approve the final (camera-ready) version of the paper. Specifically, the shepherd ensures that all issues raised by the reviewers are dealt with by the authors. In this sense, all acceptances are somewhat conditional, but this case was different: there were fundamental doubts about whether the paper was sound, and the shepherd’s job was to investigate and either put the doubts to rest or kill the paper.
As it turned out, the reviewers (PC members) had indeed touched a sore point, as the authors noticed when trying to answer the shepherd’s questions. In the end, they found that the student doing the experiments had stuffed up and misrepresented a crucial number by two orders of magnitude! And, for good measure, the student had made another mistake of similar magnitude, which had confused things further. (This is really bad, of course, but the student is still learning the ropes, and it is the professors’ responsibility to ensure that everything is correct.)
To their credit, the authors reacted quickly: within less than a week they admitted that they had stuffed up, apologised profusely, and withdrew the paper. Btw, the “sceptic” on the PC still thinks that the work is likely to lead to publishable results, so the authors should definitely complete their investigation!
Paper B
Paper B’s story was a bit more surprising. It was also about a performance issue, this time with networking equipment. Since it was a “short” paper, no earth-shaking results were expected. Yet it was somewhat borderline, with most reviewers being lukewarm at best, if not negative, thinking it didn’t really amount to much of significance. It got up because one reviewer championed it, arguing that it was a surprising result coupled with some nice performance-tuning work by the authors. In the end, the paper was accepted, subject to standard shepherding.
When preparing the final version, and re-running experiments to address issues raised by reviewers, the authors found that they could not reproduce the original results. This meant they had no real story that they could back with solid experimental data. So they had to withdraw the paper, and did so without any attempt to bullshit their way out of the situation.
Lessons
Of course, neither team should have got into this situation in the first place. In both cases there was a failure of the authors’ internal process which allowed (potentially) incorrect results to get out.
However, nobody is perfect, and shit happens. And more of it happens when you’re under time pressure (as we almost always are when working towards a paper deadline). And shit happens in the best of families (one of the papers was from very well-known and respected authors).
But it is important to be alert, particularly in a university environment, where the experimental work and the low-level evaluation are inevitably done by students, who have limited experience. The profs must make sure they know what’s going on, and that they can be confident of the integrity of the experimental data.
The important thing is what both teams did once they realised they had a problem: if you have stuffed up, own up to it rather than making it worse by trying to hide it! Retracting an accepted paper is very painful and highly embarrassing, but it is the lesser evil compared to publishing nonsense. Someone is bound to find out, and then you’re looking really bad! In this case, only a few people on the program committee knew about it, and they weighed the integrity of the authors’ actions more highly than their mistakes.
So, let this be an example of what to avoid, but also of how to recover in an honourable way from a bad situation. I commend both teams on their actions.
Thank you for this summary! As someone who is keenly interested in repeatable systems research, I was very glad to finally read what happened. Yes, commendations to the authors of the affected papers.