NEW YORK (CNN/Money): Forget about the guesswork from the political pundits and ignore all those election polls. The real key to predicting the outcome of the presidential election is this year's face-off of the Halloween masks.

It's as unscientific as it gets, but the theory, according to some people in the costume business, is that the winner in every election since 1980 has been the candidate whose masks were most popular on Halloween. Click here to read the story. Click here for the Presidential Mask Election Predictor 2004.

Quite interesting. However, a note of caution is appropriate here ;-D:


HE WHO MINES DATA MAY STRIKE FOOL'S GOLD
{Business Week, June 16, 1997}

Michael Drosnin has performed a tremendous public service by writing The Bible Code, the fast-selling new book that claims to find hidden messages in the Bible about dinosaurs, Bill Clinton, and the Land of Magog. Not because Drosnin is correct, but because his methodology is so bad that it's a valuable example of how not to read data. The pitfall Drosnin tumbled into threatens to ensnare any unwary practitioner of "data mining," the popular technique for building predictive models of the real world by discerning patterns in masses of computer data. Done right, data mining can help discover drugs, forecast recessions, weed out credit card fraud, and pinpoint sales prospects. Done wrong, it produces bogus correlations that range from useless to dangerous.

The error Drosnin committed in The Bible Code was a data-mining classic. He wrote out the Hebrew Bible on a huge grid of letters and used a computer to look for words that appear across, up, down, or diagonally. The cryptic "messages" consist of seemingly related words that appear near each other, for instance, dinosaur and asteroid.
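The grid search described above is easy to imitate. The following is only a rough sketch, not Drosnin's actual procedure: it assumes a grid of random lowercase Latin letters (not the Hebrew text), reads "across, up, down, or diagonally" as all eight directions, and uses hypothetical helper names (`count_word`, `random_grid`).

```python
import random
import string

# All eight directions: across, up, down, and the four diagonals.
DIRECTIONS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
              if (dr, dc) != (0, 0)]


def count_word(grid, word):
    """Count how often `word` can be read off the grid in any direction."""
    rows, cols = len(grid), len(grid[0])
    hits = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in DIRECTIONS:
                # Skip starting points where the word would leave the grid.
                end_r = r + dr * (len(word) - 1)
                end_c = c + dc * (len(word) - 1)
                if not (0 <= end_r < rows and 0 <= end_c < cols):
                    continue
                if all(grid[r + dr * i][c + dc * i] == ch
                       for i, ch in enumerate(word)):
                    hits += 1
    return hits


def random_grid(rows, cols, seed=0):
    """Build a grid of uniformly random letters (a stand-in for real text)."""
    rng = random.Random(seed)
    return ["".join(rng.choice(string.ascii_lowercase) for _ in range(cols))
            for _ in range(rows)]


if __name__ == "__main__":
    grid = random_grid(200, 200)
    for word in ("cat", "dog", "sun"):
        print(word, count_word(grid, word))
```

A back-of-the-envelope count shows why such hits mean little: a 200-by-200 grid offers about 200 × 200 × 8 = 320,000 starting points and directions, and a given three-letter word matches any one of them with probability 1/26³, so roughly 18 chance appearances are expected in pure noise.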

GARBAGE IN. It's best not to spill too much ink on The Bible Code. Drosnin says that he used the code to foresee the assassination of Israeli Prime Minister Yitzhak Rabin, among other events. But his approach is immune to statistical verification, or rebuttal, for that matter. Eliyahu Rips, the Israeli mathematician whom Drosnin credits as the code's discoverer, says he doesn't support the book. Its main value, then, is to illustrate a principle enunciated by Andrew W. Lo, a finance professor at Massachusetts Institute of Technology: "Given enough time, enough attempts, and enough imagination, almost any pattern can be teased out of any data set." Experts from economists to epidemiologists have made similar mistakes. It was once common to mine health records in search of "hot spots" with above-average cancer rates. Epidemiologists would then develop hypotheses about what might have caused the apparent outbreak. This terrorized residents, usually for no good reason. Some places have above-average cancer rates by pure chance.
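The point about chance hot spots can be checked with a small simulation. This is an illustrative sketch only: the number of towns, the population, the underlying rate, and the 50% threshold are all made-up assumptions, and every town is given exactly the same true risk.

```python
import random


def count_chance_hot_spots(num_towns=500, population=1000, true_rate=0.01,
                           threshold=1.5, seed=0):
    """Count towns whose observed case count is at least `threshold` times
    the expected count, although every town has the same true rate."""
    rng = random.Random(seed)
    expected_cases = population * true_rate
    hot_spots = 0
    for _ in range(num_towns):
        # Each resident independently becomes a case with probability true_rate.
        cases = sum(1 for _ in range(population) if rng.random() < true_rate)
        if cases >= threshold * expected_cases:
            hot_spots += 1
    return hot_spots


if __name__ == "__main__":
    hot = count_chance_hot_spots()
    print(f"{hot} of 500 identical towns look like hot spots")
```

Under these assumptions a noticeable minority of towns clears the threshold even though nothing distinguishes them, which is exactly the situation in which a search for local causes would come up empty.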

Data mining can lead to costly misinterpretations. ProCyte Corp. of Kirkland, Wash., was dismayed in 1992 when a clinical trial found that its new drug, Iamin, didn't seem to promote general healing of diabetic ulcer wounds. So the company searched through subsets of the data and found that Iamin seemed to work on certain foot wounds. But that was a statistical fluke, as it turned out after another expensive and fruitless clinical trial. Not allowed drug status, Iamin is now sold as a wound dressing.
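Why searching through subsets invites flukes can be shown with simple arithmetic. The sketch below assumes independent subgroup tests, each run at the conventional 5% significance level, on a drug with no real effect; the article does not say how many subsets ProCyte examined, so the counts here are purely hypothetical.

```python
def prob_any_false_positive(num_subgroups, alpha=0.05):
    """Chance that at least one of `num_subgroups` independent tests comes
    out 'significant' at level `alpha` when the drug does nothing."""
    return 1.0 - (1.0 - alpha) ** num_subgroups


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(f"{k:2d} subgroups: {prob_any_false_positive(k):.2f}")
```

With a single test the risk of a false alarm is the nominal 5%, but it grows quickly with every additional subset that is examined, which is why a subgroup finding generally needs confirmation in a fresh trial.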

Finance is rife with wrong-headed data mining. David J. Leinweber, managing director of First Quadrant Corp. in Pasadena, Calif., which manages $20 billion in assets, likes to illustrate the problem with "Stupid Data-mining Tricks." For example, he sifted through a United Nations CD-ROM and discovered that the single best predictor of the Standard & Poor's 500 stock index was butter production in Bangladesh. The lesson: A formula that happens to fit the data of the past won't necessarily have any predictive value. That's true even of the Index of Leading Economic Indicators, which the Commerce Dept. turned over to the Conference Board in 1995.
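The butter-in-Bangladesh effect is easy to reproduce with pure noise. The sketch below is not Leinweber's procedure; it assumes a made-up target series and 1,000 made-up candidate series, all independent random numbers, picks the candidate that best fits the first 20 "years," and then checks the same candidate on 20 further years it was not chosen on.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mine_best_predictor(num_candidates=1000, num_years=20, seed=1997):
    """Return (in-sample, out-of-sample) correlation of the best-fitting
    candidate. All series are independent noise, so no true link exists."""
    rng = random.Random(seed)

    def noise(length):
        return [rng.gauss(0.0, 1.0) for _ in range(length)]

    target = noise(2 * num_years)
    candidates = [noise(2 * num_years) for _ in range(num_candidates)]
    # "Mine" the first half: keep the candidate with the largest |correlation|.
    best = max(candidates,
               key=lambda c: abs(pearson(c[:num_years], target[:num_years])))
    in_sample = pearson(best[:num_years], target[:num_years])
    out_of_sample = pearson(best[num_years:], target[num_years:])
    return in_sample, out_of_sample


if __name__ == "__main__":
    fit, forecast = mine_best_predictor()
    print(f"in-sample r = {fit:+.2f}, out-of-sample r = {forecast:+.2f}")
```

With this many candidates, the winner's in-sample correlation is typically strong even though it was selected from noise, while its out-of-sample correlation is just another random draw. That gap is the lesson of the paragraph above: fitting the past is not the same as forecasting.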

University of Pennsylvania economist Diebold says the Commerce Dept.'s periodic rejiggering of the index made it fit the historical data more closely but didn't improve it as a forecasting tool. The problem could get worse. With desktop computers becoming more powerful, data-mining tools are being used by people who are clueless about statistics. It's human nature to search for patterns, whether constellations in the stars or faces in the clouds. And computers allow that impulse to run wild. Says Alexis DePlanque, a senior research analyst at META Group in Stamford, Conn.: "We need to be sure we're not just empowering people to shoot themselves in the foot." That's true whether the data come from supermarket scanners or the Bible.