Split Testing – Trial by Fire

Testing - I find your lack of tests disturbing

Split testing is a cheap and reliable way to test two or more versions of a design against each other and see how they perform under live conditions. When split testing, you focus on one or a few quantitative metrics (such as revenue, number of completed sales or sign-ups) and use them to judge how each design performs. It is a great method if you want to try out an advertising campaign, a set of rewritten purchase instructions or a new sign-up process. Sure, the method has its flaws (which I will get to later), but it is still a great way to fine-tune a web page, and one that every web developer should have in her toolbox.

In this post I will outline the basic concepts and list the pros and cons of the two main methods for split testing: A/B Testing and Multivariate Testing.

A/B Testing

The idea behind A/B Testing is very simple: do business as usual but provide half of your customers with one version (usually the existing version) and half with the other (usually the new version to be tested) and see which version does better. A script (typically PHP, ASP or JavaScript) usually feeds the versions to the users evenly, but the split can also be uneven so that a risky, experimental design does not hurt sales or conversions too much. For example, 80% of the users get to see the current, proven design and 20% the new design.
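Here is a minimal Python sketch of such an uneven split. It is only an illustration, not how any particular tool does it: the function name `assign_variant`, the `salt` label and the visitor IDs are made up for the example, and hashing the ID is just one possible way to keep a returning visitor in the same group.

```python
import hashlib


def assign_variant(user_id: str, weights: dict[str, float], salt: str = "signup-test") -> str:
    """Deterministically map a visitor to a variant according to the weights.

    `weights` maps variant names to relative shares, e.g. {"current": 80, "new": 20}.
    `salt` is a hypothetical per-test label so that different tests split differently.
    """
    if not weights:
        raise ValueError("at least one variant is required")

    # Hash the salted ID and scale the first 32 bits to a number in [0, 1).
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    point = int(digest[:8], 16) / 0x100000000

    # Walk through the variants until the cumulative share passes the point.
    total = sum(weights.values())
    cumulative = 0.0
    for name, weight in weights.items():
        cumulative += weight / total
        if point < cumulative:
            return name
    return name  # guards against floating-point rounding on the last variant
```

Calling `assign_variant("visitor-42", {"current": 80, "new": 20})` returns the same variant every time for that visitor, and over many visitors roughly 80% should land on `current`.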

Pros:

  • Cheap.
  • Real users evaluate real features.
  • Can detect very small differences in performance.
  • No usability knowledge needed to interpret the results.
  • Cost for usability testing is disguised as web development or marketing. This is a bit sneaky, but the truth is that it is hard to get a budget for dedicated usability testing approved (even though testing is usually the easiest usability activity to get approved). For your first A/B Test, it can be easier to tag along with established activities like development and marketing. Once you have some good results to show, you have a much stronger argument for getting money for dedicated usability activities.

Cons:

  • Can only track change to one factor at a time. If more than one factor (such as link color AND ad placement) changes between the designs, it becomes very hard to pinpoint which change improved the result. Running sequential tests (first testing link color, switching to the winner, then testing text, then ad placement and so on) doesn’t allow you to see any synergy effects. For example, red links with the ad at the bottom might produce fantastic results, but blue links with the ad at the top might be even better. This kind of complex relationship is very hard to identify with A/B Testing.
  • Can only collect quantitative, computer-collectable data. For a broader perspective and visionary input, you have to use qualitative methods (such as observed user testing and focus group interviews with users).
  • Only works on fully designed pages. Early on in a process, you get much better and less restricted results with rapid prototyping (such as paper prototyping).
  • Focus is on what you gain; it is hard to measure what you lose. For example, sales might go up but overall brand credibility may go down. Long-term effects are very hard to measure with split testing.
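On the point about detecting small differences: a small gap between two versions only means something if it is unlikely to be pure chance. One common way to check this is a two-proportion z-test, sketched below in Python. This is a standard statistical test that I am bringing in for illustration, not something a specific split testing tool is known to use, and the function and parameter names are made up.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Compare the conversion rates of versions A and B.

    conv_a / conv_b: number of conversions (e.g. sign-ups) per version.
    n_a / n_b: number of visitors who saw each version.
    Returns the z statistic and the two-sided p-value.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the assumption that both versions perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A small p-value (0.05 is a common, somewhat arbitrary threshold) suggests the difference is probably real. The sketch assumes each visitor counts once and that both versions have had at least some conversions.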

Read Jakob Nielsen’s excellent “Putting A/B Testing in Its Place” for more pros and cons of A/B Testing.

Multivariate Testing

Multivariate Testing has an advantage over A/B Testing: it can test multiple factors at once. Instead of just alternating between two versions, Multivariate Testing can test several versions of several features at the same time.

Multivariate Testing is made possible by the modular design of modern web pages. This allows the designer to create modules with differing content or design while keeping everything else the same, thus allowing her to focus the testing on a specific set of features. For example, a page may be shown with one of six different headers, one of ten different background colors and one of eight different ad placements. This gives 480 different page versions (6*10*8) that are created dynamically by the script and fed to the visitors of the page. This allows you to identify synergy effects between different features, a clear advantage over A/B Testing. By using cookies, you can ensure that every user consistently sees the same version of the page each time she visits your site.
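To make the arithmetic concrete, here is a small Python sketch that builds every combination from the example above and picks one per visitor. The module names (`header_1`, `color_1` and so on) are placeholders, and the visitor ID is assumed to be a value you have already stored in a cookie; hashing it is one possible way to serve the same version on every visit.

```python
import hashlib
from itertools import product

# Placeholder modules matching the example: 6 headers, 10 colors, 8 ad placements.
FACTORS = {
    "header": [f"header_{i}" for i in range(1, 7)],
    "background": [f"color_{i}" for i in range(1, 11)],
    "ad_placement": [f"placement_{i}" for i in range(1, 9)],
}


def all_versions(factors: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return every combination of one option per factor."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def version_for_visitor(visitor_id: str, versions: list[dict[str, str]]) -> dict[str, str]:
    """Pick a page version for a visitor; the same ID always gets the same version."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).hexdigest()
    return versions[int(digest, 16) % len(versions)]


versions = all_versions(FACTORS)
print(len(versions))  # 6 * 10 * 8 = 480
```

Each version is a plain dictionary such as `{"header": ..., "background": ..., "ad_placement": ...}`, which a page template could use to choose the modules to render.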

Multivariate Testing has the same advantages as A/B Testing but is usually more expensive, since it needs more extensive development.

Arranging the factors of the test in a matrix gives you a good overview of what you want tested. Taguchi methods or a morphological box can make constructing the matrix more effective.

The hard part to implement is usually not the split but actually the tracking of the results from the different split pages. Getting this right might be the subject of a follow-up article.

Final Thoughts

Early on in the design process, split tests are too narrow to be of much use. I recommend using qualitative methods (such as focus group interviews and observed user testing) to get an approximate idea of what you should develop, and then fine-tuning it with split tests.

Used correctly, split testing (especially multivariate testing) is an effective and cheap way of fine-tuning a design that has a clear goal that can be measured by a computer (such as sales or sign-ups).
