When testing, the validity of your data depends on two things: the size of the difference between your results, and the sample size.

Simply put, if you have a larger variance between two results, then you will need a smaller sample size to achieve a strong degree of confidence.
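As a rough sketch of this relationship, the standard two-proportion sample size approximation shows the required traffic per variation shrinking as the gap between conversion rates grows. The function name and example rates below are hypothetical, and the defaults assume a 5% significance level and 80% power:

```python
import math

def required_sample_size(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Approximate visitors needed per variation to detect the
    difference between conversion rates p1 and p2 with a
    two-proportion test (5% significance, 80% power by default)."""
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)

# Detecting a small lift (5% -> 6%) takes far more traffic
# than detecting a big one (5% -> 10%)
small_lift_n = required_sample_size(0.05, 0.06)
big_lift_n = required_sample_size(0.05, 0.10)
```

With these illustrative numbers, the big-lift test needs only a few hundred visitors per variation, while the small-lift test needs thousands.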

For example, if we run a landing page optimization test and receive the following results:

| Treatment      | Unique Visits | Leads | Conversion |
|----------------|---------------|-------|------------|
| Landing Page A | 4,203         | 32    | 0.76%      |
| Landing Page B | 3,454         | 534   | 15.46%     |

To determine the statistical significance of a data set, we need to look at both the sample size and the difference in our results. In this particular example, the difference is large; however, the number of leads for Landing Page A is relatively small, so there is considerable room for error caused by sampling.
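One common way to weigh both factors at once is a two-proportion z-test. This is a minimal sketch (the variable names are mine, and the pooled-proportion standard error is the textbook formulation), applied to the figures in the table above:

```python
import math

def two_proportion_z(conversions_a, visits_a, conversions_b, visits_b):
    """z-statistic for the difference between two conversion rates,
    using the pooled proportion to estimate the standard error."""
    rate_a = conversions_a / visits_a
    rate_b = conversions_b / visits_b
    pooled = (conversions_a + conversions_b) / (visits_a + visits_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    return (rate_b - rate_a) / se

# Figures from the landing page example above
z = two_proportion_z(32, 4203, 534, 3454)
# |z| > 1.96 corresponds to significance at the 95% confidence level
```

Here the difference is so large that the z-statistic clears the 1.96 threshold by a wide margin, despite the small lead count for Landing Page A.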

There are obviously more rigorous methods for calculating the statistical significance of a given data sample.
1 Comment
1. Dave Morgan says

I think you’ve got the following statement backwards:

“Simply put, if you have a larger variance between two results, then you will need a smaller sample size to achieve a strong degree of confidence.”

Actually, two test groups with smaller variances are different (statistically speaking) at smaller samples than two test groups with larger variances. Also, to be totally accurate, it’s important to realize that the two test groups with small variances must not overlap in order to refute the null hypothesis.

Thanks, and keep up the great work!

Dave, thanks for the comment. I am not entirely sure what you are asking. What I am referring to is the scenario where we have 2 sets of results from the same sample…

Jalali
