Wired’s End of Logic
I hope Chris Anderson gets a big cookie for his obviously provocative piece, The End of Theory. His latest cover feature for Wired asserts that, with a large enough data set, you can pull out correlations upon correlations that overshadow the need to pin down causation, test hypotheses, build theoretical models, and, ultimately, employ the scientific method.
There is now a better way. Petabytes [of data] allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.
This is pretty heavy stuff, particularly for generations of scientists, philosophers, economists, and anyone else who grew up thinking that a correlation is naught without knowing the causal link. In fact, I would argue that the desire to uncover a causal link, rather than simply to make an observation, is the engine of discovery.
Of course, Anderson’s correlation theory is actually true, insofar as all data can be known, processed, and analysed effectively. This is the Grail of science, for if we could know the state of every single variable in the universe, we could find out how they relate to each other, and then project those relationships forward in time. Anderson touches on that here:
Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves.
Sure, with enough data. But we’re not even close to that. “Unprecedented fidelity”? If Anderson means more fidelity than we’ve ever had, then yeah, okay. We also had unprecedented fidelity when we plotted African-Americans on a bell curve and thought they were stupid.
The idea that some Google-IBM cluster with its 1600 CPUs and terabytes of memory is somehow heralding the end of scientific theory is the kind of thing you’d find in a… magazine article. It might be able to figure out how to match advertising to user profiles, but we’re many, many decades from being able to collect the kind of data needed (outside a controlled test) for anything more meaningful. The last thing science needs is people running around finding correlations and then saying ‘that’ll do’.
I think Anderson’s incredibly insightful at times, but this is pretty tenuous at best, and simple behaviourism at worst. And as if to remove any remaining shreds of credibility, the article ends with this apparently thought-provoking question:
It’s time to ask: What can science learn from Google?
Frankly, I don’t even know what that means. What can a method for objectively discovering and understanding the world learn from a company that collects data? By asking this question, he seems to be suggesting that scientists don’t know the value of collecting and analysing as much data as possible. Quick, someone call Science and tell him!