You get an anonymous letter on January 2nd informing you that the market will go up during the month. It proves to be true, but you disregard it, owing to the well-known January effect (stocks have gone up historically during January). Then you receive another one on Feb 1st telling you that the market will go down. Again, it proves to be true. Then you get another letter on March 1st – same story. By July you are intrigued by the prescience of the anonymous person and you are asked to invest in a special offshore fund. You pour all your savings into it. Two months later, your money is gone. You go spill your tears on your neighbour’s shoulder and he tells you that he remembers that he received two such mysterious letters. But the mailings stopped at the second letter. He recalls that the first one was correct in its prediction, the other incorrect.
What happened? The trick is as follows. The con-operator pulls 10,000 names out of a phone book. He mails a bullish letter to one half of the sample, and a bearish one to the other half. The following month he selects the names of the persons to whom he mailed the letter whose prediction turned out to be right, that is, 5,000 names. The next month he does the same with the remaining 2,500 names, and so on until the list narrows down to 500 people. Of these there will be 200 victims. An investment in a few thousand dollars' worth of postage stamps will turn into several million.
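The arithmetic of the trick can be sketched in a few lines. This is only an illustration under stated assumptions: the list is split exactly in half each month, the operator sends six mailings, and the function name `run_mailing` is hypothetical. Exact halving gives 625 names after four letters, so the round figures in the passage should be read as approximations.

```python
def run_mailing(initial_names: int, rounds: int) -> tuple[list[int], int]:
    """Return (survivors after each letter, total letters mailed).

    Assumption: each month exactly half of the current list receives the
    prediction that turns out to be right, and only that half is kept.
    """
    survivors = []
    mailed_total = 0
    remaining = initial_names
    for _ in range(rounds):
        mailed_total += remaining  # every name still on the list gets a letter
        remaining //= 2            # keep only those who saw a correct call
        survivors.append(remaining)
    return survivors, mailed_total


survivors, mailed_total = run_mailing(10_000, 6)
print(survivors)     # [5000, 2500, 1250, 625, 312, 156]
print(mailed_total)  # 19687
```

Every name left on the list at the end has seen nothing but correct predictions, although the operator never predicted anything. The total number of letters stays below 20,000, which fits the passage's "few thousand dollars" of postage.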
It is not uncommon for someone watching a tennis game on television to be bombarded by advertisements for funds that did (until that minute) outperform others by some percentage over some period. But, again, why would anybody advertise if he didn’t happen to outperform the market? There is a high probability of the investment coming to you if its success is caused entirely by randomness. This phenomenon is what economists and insurance people call adverse selection. Judging an investment that comes to you requires more stringent standards than judging an investment you seek, owing to such selection bias. For example, by going to a cohort composed of 10,000 managers, I have 2/100 chances of finding a spurious survivor. By staying home and answering my doorbell, the chance of the soliciting party being a spurious survivor is closer to 100%.
The same logic that applies to the spurious survivor also applies to the skilled person who has the odds markedly stacked in her favour, but who still ends up going to the cemetery. This effect is the exact opposite to the survivorship bias. Consider that all one needs is two bad years in the investment industry to terminate a risk-taking career and that, even with great odds in one’s favour, such an outcome is very possible. What do people do to survive? They maximise their odds of staying in the game by taking black-swan risks; those that fare well most of the time, but incur a risk of blowing up.
The most intuitive way to describe the data mining problem to a non-statistician is through what is called the birthday paradox, though it is not really a paradox, simply a perceptional oddity. If you meet someone randomly, there is a one in 365.25 chance of your sharing their birthday, and a considerably smaller one of having the exact birthday of the same year. So, sharing the same birthday would be a coincidental event that you would discuss at the dinner table. Now let us look at a situation where there are 23 people in a room. What is the chance of there being two people with the same birthday? About 50%. For we are not specifying which people need to share a birthday; any pair works.
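The 50% figure can be checked with a short calculation. The sketch below assumes 365 equally likely birthdays (leap years ignored) and independent guests; the function name `prob_shared_birthday` is only illustrative.

```python
def prob_shared_birthday(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday.

    Assumption: birthdays are independent and uniform over `days` days.
    """
    p_all_distinct = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


p_specific_person = 1 / 365               # you and one given stranger
p_any_pair = prob_shared_birthday(23)     # any two of 23 people

print(f"{p_specific_person:.4f}")  # 0.0027
print(f"{p_any_pair:.4f}")         # 0.5073
```

The gap between the two numbers comes from the 253 possible pairs among 23 people: the question "does any pair match?" runs 253 tests at once, not one.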
A similar misconception of probabilities arises from the random encounters one may have with relatives or friends in highly unexpected places. “It’s a small world!” is often uttered with surprise. But these are not improbable occurrences – the world is much larger than we think. It is just that we are not truly testing for the odds of having an encounter with one specific person, in a specific location at a specific time.
Rather, we are simply testing for any encounter, with any person we have ever met in the past, and in any place we will visit during the period concerned. The probability of the latter is considerably higher, perhaps several thousand times the magnitude of the former.
When the statistician looks at the data to test a given relationship, say to ferret out the correlation between the occurrence of a given event, like a political announcement, and stock market volatility, odds are that the results can be taken seriously. But when one throws the computer at data, looking for just about any relationship, it is certain that a spurious connection will emerge, such as the fate of the stock market being linked to the length of women’s skirts. And just like the birthday coincidences, it will amaze people.
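A small simulation makes the difference between testing one relationship and mining for any relationship concrete. It is a sketch under assumptions of my own choosing: 20 observations of a "market" series, 1,000 unrelated candidate series, all pure Gaussian noise, and a fixed seed so the run is repeatable. None of these numbers come from real data.

```python
import random
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed, chosen arbitrarily
n_obs = 20              # assumed number of observations
n_candidates = 1_000    # assumed number of unrelated series searched


def noise() -> list[float]:
    """One series of pure noise, unrelated to anything else."""
    return [rng.gauss(0, 1) for _ in range(n_obs)]


market = noise()

# One hypothesis chosen in advance: a single test against a single series.
single_test = abs(pearson(market, noise()))

# Data mining: keep the strongest of many unrelated candidates.
best_mined = max(abs(pearson(market, noise())) for _ in range(n_candidates))

print(f"single pre-specified test: |r| = {single_test:.2f}")
print(f"best of {n_candidates} mined series: |r| = {best_mined:.2f}")
```

Every series here is noise, so any correlation is spurious by construction. The best of the mined candidates nevertheless tends to look impressive, simply because it is the maximum over a thousand attempts.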
What is your probability of winning the New Jersey lottery twice? One in 17 trillion. Yet it happened to Evelyn Adams, who, the reader might guess, should feel particularly chosen by destiny. Using the method we developed above, researchers Persi Diaconis and Frederick Mosteller estimated at 30 to 1 the probability that someone, somewhere, in a totally unspecified way, gets so lucky!
Some people carry their data mining activities into theology – after all, ancient Mediterraneans used to read potent messages in the entrails of birds. Michael Drosnin provides an interesting extension of data mining into biblical exegesis in The Bible Code. Drosnin, a former journalist (seemingly innocent of any training in statistics), aided by the works of a “mathematician,” helped “predict” the former Israeli Prime Minister Yitzhak Rabin’s assassination by deciphering a bible code. He informed Rabin, who obviously did not take it too seriously. The Bible Code finds statistical irregularities in the Bible; these help predict some such events. Needless to say, the book sold well enough to warrant a sequel predicting with hindsight even more such events.
The same mechanism is behind the formation of conspiracy theories. Like The Bible Code, they can seem perfect in their logic and can cause otherwise intelligent people to fall for them. I can create a conspiracy theory by downloading hundreds of paintings from an artist or group of artists and finding a constant among all those paintings (among the hundreds of thousands of traits). I would then concoct a conspiratorial theory around a secret message shared by these paintings. This is seemingly what the author of the bestselling The Da Vinci Code did.
My favorite time is spent in bookstores, where I aimlessly move from book to book in an attempt to make a decision as to whether to invest the time in reading it. My buying is frequently made on impulse, based on superficial, but suggestive clues. Frequently, I have nothing but a book jacket as appendage to my decision making. Jackets often contain praise by someone, famous or not, or excerpts from a book review. Good praise by a famous and respected person or a well-known magazine would sway me into buying the book.
What is the problem? I tend to confuse a book review, which is supposed to be an assessment of the quality of the book, with the best book reviews, marred by the same survivorship biases. I mistake the distribution of the maximum of a variable for that of the variable itself. The publisher will never put on the jacket of the book anything but the best praise.
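The gap between a variable and its maximum can be illustrated with a toy model. The sketch assumes that each book receives 30 reviews and that each review is an independent score drawn uniformly between 0 and 10; both figures and the name `simulate_book` are invented for the illustration.

```python
import random

rng = random.Random(1)  # fixed seed, chosen arbitrarily


def simulate_book(n_reviews: int) -> tuple[float, float]:
    """Return (average review score, best review score) for one book.

    Assumption: reviews are independent and uniform on a 0-10 scale.
    """
    scores = [rng.uniform(0, 10) for _ in range(n_reviews)]
    return sum(scores) / n_reviews, max(scores)


n_books = 2_000
results = [simulate_book(30) for _ in range(n_books)]

avg_typical = sum(mean for mean, _ in results) / n_books
avg_jacket = sum(best for _, best in results) / n_books

print(f"typical review score: {avg_typical:.1f}")
print(f"score quoted on the jacket: {avg_jacket:.1f}")
```

In this model the books are all equally mediocre, with a typical score near 5, yet the jacket quote sits close to 10 for nearly every one of them. The best review says a great deal about the number of reviewers and very little about the book.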
Some authors go even a step beyond, taking a tepid or even unfavourable book review and selecting words in it that appear to praise the book. One such example came from one Paul Wilmott (an English financial mathematician of rare brilliance and irreverence) who managed to announce that I gave him his “first bad review,” yet used excerpts from it as praise on the book jacket (we later became friends, which allowed me to extract an endorsement from him for my book).
The first time I was fooled by this bias was upon buying, when I was 16, “Manhattan Transfer”, a book by the American writer John Dos Passos, based on praise on the jacket by the French writer and “philosopher” Jean-Paul Sartre, who claimed something to the effect that Dos Passos was the greatest writer of our time. This simple remark, possibly blurted out in a state of intoxication or extreme enthusiasm, caused Dos Passos to become required reading in European intellectual circles, as Sartre’s remark was mistaken for a consensus estimate of the quality of Dos Passos rather than what it was, the best remark. (In spite of such interest in his work, Dos Passos has reverted to obscurity.)
I am frequently asked the question: when is it truly not luck? There are professions in randomness for which performance is low in luck, like casinos, which manage to tame randomness. In finance? Perhaps. Not all traders are speculative traders: there exists a segment called market makers whose job is to derive, like bookmakers, or even like store owners, an income against a transaction. If they speculate, their dependence on the risks of such speculation remains small compared to their overall volume. They buy at a price and sell to the public at a more favorable one, performing large numbers of transactions. Such income provides them some insulation from randomness. This category includes floor traders on the exchanges, bank traders who “trade against order flow,” and moneychangers in the souks of the Levant. The skills involved are sometimes rare to find: fast thinking, alertness, a high level of energy, an ability to guess from the voice of the seller her level of nervousness; those who have them make a long career (that is, perhaps a decade). They never make it big, as their income is constrained by the number of customers, but they do well probabilistically. They are, in a way, the dentists of the profession.
Nassim Nicholas Taleb
for Markets and Money