Science’s Significant Stats Problem

Researchers’ rituals for assessing probability may mislead as much as they enlighten.

In medicine, as in most other realms of science, observing low-probability data like that in the HIV study is cause for celebration. Typically, scientists in fields such as biology, psychology, and the social sciences rejoice when the chance of a fluke is less than 1 in 20. In some fields, however, such as particle physics, researchers are satisfied only with much lower probabilities, on the order of one chance in 3.5 million. But whatever the threshold, recording low-probability data (data unlikely to be seen if nothing is there to be discovered) is what entitles you to conclude that you’ve made a discovery. Observing low-probability events is at the heart of the scientific method for testing hypotheses.
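For readers who want to see where the two thresholds sit relative to each other, here is a minimal Python sketch. It rests on assumptions the text does not state: that the test statistic follows a standard normal distribution, that the tail is one-sided, and that the 1-in-3.5-million figure corresponds to the "five sigma" convention commonly cited in particle physics. The function names are illustrative only.

```python
import math
from statistics import NormalDist


def one_sided_tail(z: float) -> float:
    """Probability that a standard normal variable exceeds z (assumed model)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def one_in(p: float) -> float:
    """Express a probability p as 'one chance in N'."""
    return 1.0 / p


# Assumption: the particle-physics threshold is a one-sided five-sigma tail.
five_sigma_p = one_sided_tail(5.0)
print(f"5 sigma: p = {five_sigma_p:.3g}, about 1 in {one_in(five_sigma_p):,.0f}")

# The 1-in-20 threshold (p = 0.05) expressed on the same sigma scale.
z_for_1_in_20 = NormalDist().inv_cdf(1 - 0.05)
print(f"1 in 20: about {z_for_1_in_20:.2f} sigma (one-sided)")
```

Under these assumptions the five-sigma tail comes out at roughly one chance in 3.5 million, while the 1-in-20 threshold sits well under two sigma, which gives a sense of how much stricter the physics convention is.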
Scientists use elaborate statistical significance tests to distinguish a fluke from real evidence. But the sad truth is that the standard methods for significance testing are often inadequate to the task. In the case of the HIV vaccine, for instance, further analysis showed that the findings were not as solid as the original statistics suggested. The chances were probably 20 percent or higher that the vaccine was not effective at all.
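One way to see how a result that clears the 1-in-20 bar can still carry a sizable chance of being a fluke is a simple Bayes' rule calculation. The sketch below is not a reconstruction of the HIV vaccine reanalysis; the prior probabilities and the 80 percent power are hypothetical values chosen purely for illustration, and the function name is made up for this example.

```python
def false_positive_risk(prior_real: float, alpha: float = 0.05,
                        power: float = 0.8) -> float:
    """Share of 'significant' results that come from a nonexistent effect.

    prior_real: assumed probability, before the study, that the effect is real.
    alpha:      significance threshold (chance of a false alarm when nothing is there).
    power:      assumed chance of a significant result when the effect is real.
    """
    false_alarms = alpha * (1 - prior_real)
    true_hits = power * prior_real
    return false_alarms / (false_alarms + true_hits)


# Hypothetical priors, for illustration only.
for prior in (0.5, 0.2, 0.1):
    risk = false_positive_risk(prior)
    print(f"prior {prior:.0%} that the effect is real -> "
          f"{risk:.0%} of significant results are flukes")
```

In this toy setup, the less plausible the effect was beforehand, the larger the share of significant findings that are flukes, even though every one of them passed the same 1-in-20 test.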