Conversion Rate Optimization: 3 factors to ensure your results are reliable
So, you want to increase conversion rates on your website …
How do you do that?
You test, test and test again until you learn more about how you can better serve your customers … and then you test some more.
Sounds simple enough, right?
Sure, just test a lot.
When testing begins, there are a few things to take into account that can have a significant impact on the validity of your results and the analysis of your data.
Fluctuations in customer behavior are bound to happen over time due to random variation, seasonal trends or other factors. Or, perhaps your website could convert more prospects by optimizing a specific aspect like hidden friction, which is not necessarily obvious at first.
These are just a few reasons why continuous testing can help you meet your goals.
In today’s MarketingExperiments blog post, we will touch on three common factors of testing and analysis that will hopefully keep your testing efforts reliable and valid.
What jalapeños can teach us about sample sizes
First, let’s talk about sample sizes.
You’re going to need enough Web traffic in your test treatment groups before you can be statistically confident the customer behavior you’re testing is representative of the entire population of people who would normally visit your site.
And, maybe this is a no-brainer for some, but the impact of sample sizes on test validity is too important to overlook. Let’s take a look at an example to understand why.
Recently, my girlfriend and I stopped by a popular restaurant here in Jacksonville Beach and ordered a pizza.
She is not a fan of spicy food, so we decided to split our pizza, half with jalapeños and half without. My first bite with a jalapeño did not feel spicy at all. Confident the pizza was harmless, I asked my girlfriend if she would like to try a slice. But, she declined.
However, when I took another bite, a small fire ignited in my mouth and beads of sweat began to trickle down my forehead.
This was a particularly angry jalapeño that taught me (and luckily, not my girlfriend) an important lesson about why adequate sample sizes are important before drawing conclusions. A sample size of one jalapeño was not sufficient to draw the conclusion that all of the other jalapeños on my half of the pizza were mild.
When testing your website, remember the smaller the sample size, the lower the statistical power of your test, meaning the likelihood of correctly rejecting a false null hypothesis. A small sample also raises the risk of drawing a false conclusion, such as deciding that all the jalapeños on your pizza are mild.
While I did have enough jalapeños to measure, with a certain level of confidence, the spiciness of my individual pizza, I certainly didn’t have enough to measure the spiciness of all jalapeños in the world. If you have a small email list or a limited amount of website traffic, this blog post about small sample size testing by Lauren Maki, Research Manager, MECLABS, can help ensure you learn from your tests.
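To put a rough number on "large enough," here is a minimal sketch of a sample size estimate for a simple A/B test. It assumes a two-sided comparison of two conversion rates using the standard normal approximation. The function name, the 5% significance level and the 80% power target are illustrative assumptions, not figures from this post.

```python
import math
from statistics import NormalDist


def visitors_per_group(baseline_rate, treatment_rate, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH group of a two-group test.

    Hypothetical helper for illustration. Uses the normal-approximation
    formula for comparing two proportions; alpha and power defaults are
    common conventions, not values taken from the article.
    """
    # Critical value for a two-sided test at significance level alpha
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Value corresponding to the desired power (1 - beta)
    z_beta = NormalDist().inv_cdf(power)

    # Sum of the two binomial variances
    variance = (baseline_rate * (1 - baseline_rate)
                + treatment_rate * (1 - treatment_rate))
    # Smallest difference in conversion rate we want to detect
    effect = treatment_rate - baseline_rate

    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Example: control converts at 5%, and we hope the treatment reaches 6%
print(visitors_per_group(0.05, 0.06))
```

Under these assumptions, detecting a lift from a 5% to a 6% conversion rate takes roughly 8,000 visitors per group. Smaller lifts take considerably more, which is why a single jalapeño tells you so little.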
Remember, Monday is no substitute for Friday
Another factor to consider when testing continuously is how the behavior of your sample may fluctuate over time, or the history effect, as we call it. Understanding how time impacts your Web traffic will help you accurately capture the true nature of your customers.
For example, let’s say you start a test on a Monday.
By the end of the day, you collect an adequate sample size to statistically validate your test results.
You high-five your team in excitement, but there’s one problem …
Collecting an adequate sample size in less than 24 hours does not necessarily mean you can generalize the findings from Monday’s sample as a reliable insight for other days of the week, let alone a month.
Nor is it enough insight for the year or more you might spend using those results to inform your marketing decisions after optimizing a landing page based on them.
It is still very possible traffic during other days of the work week, and especially the weekend, can behave radically differently from your Monday traffic. So as a general rule, it is ideal to keep a test running for at least seven days to catch daily fluctuations in traffic behavior that would otherwise go unnoticed.
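One way to check for this is to break your test data down by day of the week before trusting the overall number. The sketch below assumes you can export one record per day as a (date, visitors, conversions) tuple. The function names and the sample figures are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def covers_full_week(daily_records):
    """True if the records include every day of the week at least once."""
    return len({day.weekday() for day, _, _ in daily_records}) == 7


def rate_by_weekday(daily_records):
    """Conversion rate per weekday, pooled across all matching dates."""
    totals = defaultdict(lambda: [0, 0])  # weekday name -> [visitors, conversions]
    for day, visitors, conversions in daily_records:
        name = WEEKDAYS[day.weekday()]
        totals[name][0] += visitors
        totals[name][1] += conversions
    return {name: conv / vis for name, (vis, conv) in totals.items() if vis > 0}


# Made-up data for illustration: one week starting Monday, Jan. 1, 2024
records = [(date(2024, 1, 1) + timedelta(days=i), 100, 5 + i) for i in range(7)]

if not covers_full_week(records):
    print("Test has not yet run through every day of the week.")
for name, rate in rate_by_weekday(records).items():
    print(f"{name}: {rate:.1%}")
```

If Monday's rate looks very different from Saturday's, that is a sign a one-day test would have misled you.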
With age comes wisdom … and historical data
Ideally, you will have at least a year of historical data to compare against, which will increase the accuracy and precision of your testing decisions and conclusions.
But, wait! What if you don’t have years upon years of data?
If you’re a new company starting out or simply don’t have historical data as a baseline for testing, remember you have to start somewhere – and now is as good a time as any to start testing and building your customer theory.
So in the meantime, the caveat here is to be careful of putting all your eggs in one basket and assuming the rest of the year will perform in the same manner as the slice of time you’ve tested so far.
Keep this important factor in mind as you build testing benchmarks that will hopefully help you reach greater heights of true knowledge about your customers.
After all, that’s the real benefit of valid testing. Getting to know what your customers really want. That helps them, and helps you increase your conversion rate.