The Second Version

27/10/07

Overanalysis

You're a researcher, firmly convinced that the hypothesis you formulated about some phenomenon - one you deeply care about - is true. But you also face a looming obstacle: the data you've been able to collect are inconclusive; they don't openly contradict your hypothesis, but neither do they confirm it. So what can you do?

The best option is to repeat experiments and observations until you either have enough data to positively prove your hypothesis, or become convinced that you were wrong in the first place. The best-known example is probably the effort of Michelson and Morley, who set out to prove the existence of the aether with a series of increasingly sophisticated experiments. They failed, but in the meantime they produced strong evidence for Einstein's theory of special relativity and perfected the interferometer, an important scientific instrument.

However, repetition is not always feasible or even possible. There may be time and resource constraints, or the system of interest may be observable but not modifiable. In epidemiology, for example, we find it morally unacceptable to hurt people or deprive them of their freedom in order to perform experiments, so we are stuck with a tangle of causes and effects that can be very hard to study separately. Another such field is climate science, where we obviously cannot change the experimental conditions and must make do with what we can observe, often through proxies of limited reliability.

In these conditions, the temptation to torture the data until they spill the beans can be very strong. You take a data series, exclude a couple of outliers for good measure, smooth it with a filter chosen to give the appropriate slope at the extremes, and finally fit a third-order polynomial to a selected region of the series, obtaining a correlation coefficient of 0.6... proof! Yeah, proof that you've read the manual of your software. The actual physical meaning of such a correlation is in fact rather obscure.
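To see how little such a correlation means, here is a minimal sketch of that recipe applied to pure random noise (the seed, window width, and fitted region are arbitrary choices of mine, not anything from a real analysis):

```python
import numpy as np

rng = np.random.default_rng(42)

# Start from pure noise: there is no underlying signal at all.
x = np.arange(100, dtype=float)
y = rng.normal(0.0, 1.0, size=100)

# Step 1: discard "outliers" beyond two standard deviations.
keep = np.abs(y - y.mean()) < 2 * y.std()
x, y = x[keep], y[keep]

# Step 2: smooth with a wide moving-average filter, which makes
# neighbouring points strongly correlated with each other.
window = 15
kernel = np.ones(window) / window
y_smooth = np.convolve(y, kernel, mode="valid")
x_smooth = x[: len(y_smooth)]

# Step 3: fit a third-order polynomial to a hand-picked region.
region = slice(20, 60)
coeffs = np.polyfit(x_smooth[region], y_smooth[region], deg=3)
fitted = np.polyval(coeffs, x_smooth[region])

# Correlation between the fit and the smoothed "signal".
r = np.corrcoef(fitted, y_smooth[region])[0, 1]
print(f"r = {r:.2f}")
```

Depending on the seed and the region chosen, the printed correlation can look quite respectable, even though the input contains no signal whatsoever: the smoothing filter manufactures the structure that the polynomial then "finds".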

The temptation to over-analyze data caught me too during my PhD work, but in the end I cut it short, stating in my thesis that some calculations I had made were more an exercise in curve fitting than anything else. Other scientists think differently, and seem to go beyond succumbing to temptation in good faith, into the dark territory of deliberate manipulation and intellectual dishonesty. The blog Climate Audit has a depressingly long list of such cases.

Intellectual dishonesty is always wrong, but when the results of bad science can impact whole countries, it becomes alarmingly wrong.


2 Comments:

  • "You take a data series, exclude a couple of outliers for good measure, smooth it with a filter chosen to give the appropriate slope at the extremes, and finally fit a third-order polynomial to a selected region of the series obtaining a correlation coefficient of 0.6... proof!"

    That's a nice definition of "scientific truth" for dummies. Until the next theory is proven, the same way.
    ciao, Abr

    By Blogger Abr, at 28/10/07 23:18

  • In fact my criticism is not aimed at science in general, but rather at those scientists who end up "torturing" data in an attempt to prove their hypotheses.

    I don't share the view that "scientific truth" is highly volatile. Some popular theories have been proven utterly wrong, but others have merely been shown to be restricted to certain particular situations - like Newtonian and quantum mechanics.

    However, it's definitely true that science is based on induction.

    By Blogger Fabio, at 29/10/07 20:29
