Testing Madness: What the odds of picking a perfect NCAA Tournament bracket can teach us about running valid tests
Several companies are offering multi-million dollar rewards to anyone who can pick a perfect bracket in the NCAA Tournament. Sounds like a good deal, doesn’t it? You can enter for free, and the chances must be better than the lottery, right?
Ask yourself…what do you think the odds are? Maybe one in a million. Perhaps one in 50 million.
Or maybe you put a little more effort into it and do a few basic calculations. You figure that with two possible outcomes per game, a perfect prediction of the first round's 32 games has a probability of one in 2^32, or roughly one in four billion.
So if you're prone to extrapolate, you might think that for all 63 games in the tournament, the overall chance would be something like one in eight billion. Logically speaking, twice as many games means half the probability, right?
I was wondering myself, so I actually ran the numbers. But probabilities multiply rather than add: every additional game doubles the difficulty, so 63 games means one chance in 2^63. The chance of predicting a perfect bracket for March Madness is one in 9.22 quintillion. That's one in nine billion billion. In other words, you have a better chance of getting struck by lightning, being hit by a meteor, and winning the Mega Millions lottery.
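If you want to check the arithmetic yourself, the random-result model is a one-liner: two possible outcomes per game, multiplied across every game. A quick Python sanity check:

```python
# Random-result model: every game is a coin flip, so a perfect bracket
# means getting all n independent 50/50 calls right: 1 chance in 2**n.
first_round = 2 ** 32    # 32 first-round games
full_bracket = 2 ** 63   # all 63 games in the tournament

print(f"First round:  1 in {first_round:,}")    # 1 in 4,294,967,296
print(f"Full bracket: 1 in {full_bracket:,}")   # 1 in 9,223,372,036,854,775,808
```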
But wait – I know my college basketball
“But wait,” you say. “I’ve been following college basketball and I know which teams have a better chance of winning. No way Arkansas-Pine Bluff has any chance of beating Duke.”
Fair enough. In the above example, I used a random-result probability model (a 50/50 chance for every game). So I also created an informed-result probability model.
In this model, I assumed that the higher-seeded team had a two-thirds chance of winning each game in the first two rounds, 48 games in all (after that, it's still anybody's game, a coin flip per game). The chance now is one in 9.29 trillion. Much improved, but still amazingly long odds.
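Here's a minimal sketch of that calculation, assuming the two-thirds edge applies to each of the 48 first- and second-round games and every later game is a coin flip:

```python
from fractions import Fraction

# Informed-result model: higher-seeded team wins with probability 2/3 in the
# first two rounds (32 + 16 = 48 games); the remaining 15 games are 50/50.
p_perfect = Fraction(2, 3) ** 48 * Fraction(1, 2) ** 15

print(f"1 in {1 / float(p_perfect):,.0f}")   # about 1 in 9.29 trillion
```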
But wait – I know my customers
There's a greater lesson to be learned here for testing: chance is an intuitive concept, but estimating chance is not.
When running an online test, chance has to be accounted for. We can't rely on intuition or a feeling that we know what our customers want. And we can't assume that a result is significant, or that a test has run long enough, just because we got the outcome we hoped for.
We need to implement the appropriate statistical validation to ensure that what we are seeing is not just random chance, but is likely representative of our market as a whole.
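What does that validation look like in practice? One standard approach (not the only one) is a two-proportion z-test comparing the control's conversion rate against the treatment's. A minimal sketch, with hypothetical traffic numbers:

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test: p-value for seeing a conversion-rate gap this
    large (or larger) if both pages actually convert at the same rate."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)            # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    return 1 - math.erf(abs(z) / math.sqrt(2))          # = 2 * (1 - normal CDF)

# Hypothetical test: control converts 100 of 2,000, treatment 130 of 2,000.
p_value = two_proportion_z_test(100, 2000, 130, 2000)
print(f"p = {p_value:.4f}")   # ~0.04: below 0.05, unlikely to be chance alone
```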
Bad data equals bad decisions
Here is an extreme example to show you what I'm talking about. Let's assume we run a test and our treatment page gets four visitors. If three visitors buy and one bounces, we cannot assume that 75% of our traffic will buy, because once we get to 10 visitors, we may find that six have bounced.
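A confidence interval makes this concrete. Using a Wilson score interval (one standard way to bound a conversion rate from a small sample), 3 buyers out of 4 visitors tells us almost nothing:

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score confidence interval for a conversion rate."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2))
    return center - margin, center + margin

low, high = wilson_interval(3, 4)
print(f"3 of 4 bought: true rate plausibly {low:.0%} to {high:.0%}")  # ~30% to 95%
```

At four visitors, the plausible range runs from under a third to over nine in ten. No business decision survives that much uncertainty.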
While the above example is obvious, not every testing scenario is. Perhaps you've run the test for a week and feel that's long enough. Or the sample size seems quite large. Or, most dangerous of all (the trap I referenced above), you feel you know your customers well enough that when a result you were hoping for comes along, you're prepared to stop testing.
Making a business decision based on any of these scenarios is a dangerous thing. And therein lies the power of true statistical validity: it gives us an assurance (95% confidence, for example) that the results of the test we just ran can be relied upon rather than attributed to chance. That way, when we roll the change out to the entire population (assuming we tested a representative sample), we can reasonably expect similar results.
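That assurance can also be planned for up front. A rough sample-size sketch, using the standard two-sided z-approximation and made-up inputs for the baseline rate and expected lift, shows why a week or a large-seeming sample may not be enough:

```python
import math

def visitors_per_variant(base_rate, rel_lift, z_alpha=1.96, z_beta=0.84):
    """Rough sample size per variant to detect a relative lift at 95%
    confidence and 80% power (two-sided z-approximation)."""
    p1, p2 = base_rate, base_rate * (1 + rel_lift)
    p_bar = (p1 + p2) / 2
    n = ((z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
          + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
         / (p2 - p1) ** 2)
    return math.ceil(n)

# Hypothetical: 5% baseline conversion, hoping to detect a 10% relative lift.
print(visitors_per_variant(0.05, 0.10))   # ~31,000 visitors per variant
```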
None of this completely eliminates the chance of an unpredictable event (one in 9.29 trillion is still possible), but it gives us a strong enough understanding of chance to confidently make bottom-line decisions based on quality data.