The Two Safeguards to Stopping an AB Test

I feel like I can spend only a few paragraphs and a graph motivating why waiting for statistical significance is important in AB testing.

Here is a test we ran for an ecommerce client:

AB testing statistical significance

The goal we are measuring here is successful checkouts (visitors reaching the receipt page). The orange variation is beating the blue by 30.4% (2.96% vs. 2.27%), and the test shows 95% “chance to beat” or “statistical significance”!

This client has an 8 figure ecommerce business. Let’s say it’s $30 million a year in revenue (close enough…me hiding who they are and their revenue should give you confidence that if you work with us, I won’t go announcing your financials all over the internet).

So a 30% lift in orders is worth $9,000,000 in extra revenue a year!

(By the way, this is the power of AB testing).

This lure of extra revenue is enticing. Very much so. So much so that it can lead otherwise well meaning people to make a grave mistake: stopping a test too early.

In this case, you might be tempted to think: well, the test has reached 95% significance, which everyone says is the cutoff. It had 3000 visitors to each variation. This is convincing. Let’s stop the test and run the orange variation at 100%!

But here’s a secret: both variations were the same.

This was, in CRO parlance, an A/A test. We ran it not to “test significance”, which Craig Sullivan, in that preceding link, argues is a waste of time (and I agree), and not to give me an opportunity to look smart by writing this article, but just to check if revenue goals were measuring properly (they were, thank you for asking).

So how in the world are you supposed to know that this test is not worth stopping? Or as CRO people say, how do you know when you’re supposed to “call” the test?

You actually need two safeguards to make sure you don’t get duped by random fluctuations of data, i.e. statistical noise.

Safeguard 1: Statistical Significance

Safeguard 2: Sample Size

In this case we saw it satisfied the first safeguard, statistical significance. But that’s why the 2nd safeguard, sample size, exists.
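For the curious, here is a rough Python sketch of both safeguards applied to the numbers above. It is not how Optimizely or VWO compute “chance to beat” (each tool has its own statistics engine). It is a textbook pooled z-test plus the standard normal-approximation sample size formula. The function names, the 80% power setting, and the rounding of 2.27% and 2.96% of 3,000 visitors to 68 and 89 checkouts are our assumptions for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def chance_to_beat(conv_a, n_a, conv_b, n_b):
    """One-sided confidence that B's true rate exceeds A's (pooled z-test)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return NormalDist().cdf((p_b - p_a) / se)


def visitors_needed(base_rate, relative_lift, alpha=0.05, power=0.8):
    """Visitors per variation to detect a lift (two-sided, normal approximation)."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)  # power is an assumed setting
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Assumed counts: ~2.27% and ~2.96% of 3,000 visitors, rounded to whole checkouts
blue, orange, visitors = 68, 89, 3000

print(f"Safeguard 1, chance to beat: {chance_to_beat(blue, visitors, orange, visitors):.1%}")
print(f"Safeguard 2, visitors per variation for a true 10% lift: {visitors_needed(0.0227, 0.10):,}")
```

Under these assumptions the first number lands right around the 95% the test showed, while the second says you would want far more than 3,000 visitors per variation before trusting a lift of that size.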

We’ll discuss what both mean, in English, not math — well maybe a little math, but mostly English — in future articles (you can join our newsletter here). But for now, take the above example as a warning to not just stop a test the moment it reaches 95% significance.
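If you want to see the warning for yourself, a small simulation makes the point. The sketch below runs a batch of simulated A/A tests (both arms share the same true conversion rate) and counts how often a test would have been called at 95% if someone checked after every 100 visitors, versus only once at the end. The 2.5% rate, the peek interval, the number of runs, and the two-sided z-test are all our assumptions; real testing tools calculate significance differently.

```python
import random
from math import sqrt
from statistics import NormalDist


def confidence(conv_a, conv_b, n):
    """Two-sided confidence that two equally sized arms differ (pooled z-test)."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0, 1):
        return 0.0  # no conversions yet (or all converted): nothing to compare
    z = abs(conv_a - conv_b) / n / sqrt(pooled * (1 - pooled) * 2 / n)
    return 2 * NormalDist().cdf(z) - 1


def simulate_aa_tests(runs=200, visitors=3000, rate=0.025, peek_every=100, seed=1):
    """Share of A/A tests 'significant' at any peek vs. only at the final look."""
    rng = random.Random(seed)
    any_peek = at_end = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        hit = False
        for n in range(1, visitors + 1):
            # Both arms convert at the same true rate: any "winner" is noise
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if n % peek_every == 0 and confidence(conv_a, conv_b, n) >= 0.95:
                hit = True
        any_peek += hit
        at_end += confidence(conv_a, conv_b, visitors) >= 0.95
    return any_peek / runs, at_end / runs


peeking, waiting = simulate_aa_tests()
print(f"significant at some peek: {peeking:.0%}, significant at the end: {waiting:.0%}")
```

Because the final check is also one of the peeks, the first number can never be smaller than the second. In practice you should expect it to be noticeably larger, which is exactly the trap of stopping the moment a test crosses 95%.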

Auxiliary Safeguards

Finally, in addition to the two safeguards above, there are also a few details you should pay attention to when deciding whether to stop a test:

First, are you tracking multiple goals through your purchase or signup funnel? 

You should be paying attention to supporting goals. Do the supporting goals show a consistent result? (They don’t have to, and they may not, but it’s good to know.) In the A/A test example above, most actually did, but some did not. For example, here is the goal that tracks clicks on the size dropdown on the PDP. The results show largely no difference at all (note the 45% and 55% chances of beating each other, which is close to even):

clicks on size

Tracking multiple goals will help give you a more holistic picture of what’s happening. I’m not saying all goals have to show the same result for a test result to be valid. The world is a complex place. But if just one or two goals are showing a big difference but the others aren’t, you should ask why that could be.

Second, pay attention to purchase cycles

Again, I have to word this in vague terms like “pay attention to”. If you’re thinking “OMG, just tell me what I should do”, your frustration is noted.

“Purchase cycles”, for most online businesses, means calendar weeks. Conversion rates and audience types vary by the day, so if you start a test on Tuesday and end it on Friday, you’re risking seeing different results than if you ran it for a full week. As with anything, it’s good to get multiple weeks in so you know the week wasn’t an anomaly.

Third, pay attention to seasonality. For a swimsuit business, things that work in July and things that work in December may be different things. (They may not, but they may be.) Again, when you run test after test after test, you learn to talk in slightly vague terms because you’ve seen so many different results.
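The purchase cycle point is the easiest of these to turn into a quick check. Here is a minimal Python sketch that asks whether a test ran for a whole number of calendar weeks. The function names and the two-week minimum are our assumptions (a stand-in for “multiple weeks”), not a rule from any testing tool.

```python
from datetime import date


def full_weeks_covered(start, end):
    """Return (whole_weeks, leftover_days) for a test running start..end inclusive."""
    days = (end - start).days + 1
    return divmod(days, 7)


def covers_whole_weeks(start, end, min_weeks=2):
    """True if the test spans at least min_weeks and no partial week is left over."""
    weeks, leftover = full_weeks_covered(start, end)
    return weeks >= min_weeks and leftover == 0


# Tuesday to Friday of the same week: 4 days, not even one purchase cycle
print(covers_whole_weeks(date(2024, 1, 2), date(2024, 1, 5)))   # False
# Tuesday through the Monday 13 days later: exactly two calendar weeks
print(covers_whole_weeks(date(2024, 1, 2), date(2024, 1, 15)))  # True
```

A check like this says nothing about seasonality or supporting goals, so treat it as one item on the list rather than a verdict.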

Ecommerce Goals for Optimizely or VWO: What goals should you use in AB tests?

Optimizely goals for ecommerce

I see a lot of clients use just these goals on ecommerce AB tests in Optimizely or VWO (Visual Website Optimizer):

  • Engagement (included by default by Optimizely)
  • Revenue
  • Checkouts

That’s a good start. (Except engagement. Have you ever made an actionable decision based on the engagement goal?)

But you could get a lot more information by adding upstream goals. Here are the most common that you should add:

  • Add to cart button clicks
  • Proceed to checkout page
  • Pageview goals for your entire funnel
    • All PDP pageviews
    • Cart Pageviews
    • Checkout pageviews (this can be multiple different pageview goals if your checkout flow is separated into different pages like /checkout/shipping, /checkout/payment/ etc.)
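To make “pageview goals for your entire funnel” concrete, here is a small Python sketch of the idea: one URL pattern per funnel step. In Optimizely or VWO you would set these up as pageview goals in the tool itself, so this is only an illustration. The /products/ and /cart paths are hypothetical; only the /checkout/... paths come from the list above.

```python
import re

# One pattern per funnel step. The /products/ and /cart paths are hypothetical.
FUNNEL_GOALS = [
    ("PDP pageview", re.compile(r"^/products/[^/]+/?$")),
    ("Cart pageview", re.compile(r"^/cart/?$")),
    ("Checkout pageview: shipping", re.compile(r"^/checkout/shipping/?$")),
    ("Checkout pageview: payment", re.compile(r"^/checkout/payment/?$")),
]


def goal_for(path):
    """Return the funnel goal a URL path counts toward, or None if it is not tracked."""
    for name, pattern in FUNNEL_GOALS:
        if pattern.match(path):
            return name
    return None


for path in ["/products/blue-shirt", "/cart", "/checkout/payment/", "/about"]:
    print(path, "->", goal_for(path))
```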

Here are some common scenarios for why the funnel goals are important:

Scenario 1: You run a test and checkouts are up 10% with 92% significance

You are not tracking upstream goals:  You are only measuring checkouts and not the upstream goals. You decide the test has been running for a while and is probably significant (this is dangerous, but that’s another topic). You stop the test and declare the variation a winner and implement it.

You are tracking upstream goals: You are measuring the funnel goals and notice that pageviews of the checkout page, pageviews of the cart page, and clicks on the proceed to checkout and add to cart buttons are down. So why are checkouts up? Something doesn’t look right so you run the test for another 2 weeks. After 2 weeks the checkout goal regresses to the other goals and is also down a few percent versus original. Phew, good thing you didn’t call the test early when checkouts were looking up.

Scenario 2: You implement a variation that everyone “just knows” is going to win.

You are not tracking upstream goals: Checkouts tank and are down 30% with 99% significance after 1 week. Uh oh. Why?! You sit around the room and debate a bunch of reasons why but in reality no one really knows.

You are tracking upstream goals: You notice in your upstream goals that add to cart and proceed to checkout goals are up 20% as expected but everyone drops off at the checkout page. Coincidentally your variation made some changes to that page as well. You look further. When looking at session recordings of the checkout page, you notice there was some unintended behavior on mobile from your variation. You didn’t realize your variation pushed some important information far down the page. Culprit found! Now you can re-do the test with that mobile issue fixed.

There are more scenarios than this, but these are two common situations (one with a positive checkout result and another with a negative) where tracking upstream goals can make a huge difference (possibly costing or making the company millions).
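Here is a rough Python sketch of the kind of funnel read-out that makes Scenario 2 solvable. It compares the step-to-step conversion rate of each variation and points at the step where the variation falls furthest behind. The goal names and every count below are made up for illustration (shaped so add to cart is up 20% and checkouts are down 30%); they are not data from a real test.

```python
# Funnel steps in order (names are our own shorthand for the goals above)
FUNNEL = ["pdp_views", "add_to_cart", "cart_views", "checkout_views", "checkouts"]


def step_rates(counts):
    """Conversion rate from each funnel step to the next."""
    return {f"{a} -> {b}": counts[b] / counts[a] for a, b in zip(FUNNEL, FUNNEL[1:])}


def biggest_drop(original, variation):
    """Step where the variation's rate falls furthest below the original's (relative)."""
    base, var = step_rates(original), step_rates(variation)
    changes = {step: var[step] / base[step] - 1 for step in base}
    return min(changes.items(), key=lambda item: item[1])


# Made-up counts shaped like Scenario 2
original = {"pdp_views": 10000, "add_to_cart": 1000, "cart_views": 800,
            "checkout_views": 500, "checkouts": 300}
variation = {"pdp_views": 10000, "add_to_cart": 1200, "cart_views": 960,
             "checkout_views": 600, "checkouts": 210}

step, change = biggest_drop(original, variation)
print(f"Biggest drop: {step} ({change:+.0%})")
```

With only the checkout goal you would see “down 30%” and nothing else. With the funnel goals you can see that everything upstream held up and the damage is concentrated between viewing the checkout page and completing the order.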

In our experience when a variation is a clear winner, all or most goals will show an upward trend. That is: more add to cart clicks, more views of the checkout page, more successful checkouts, more revenue.

Not all tests will show that and if you have a test that runs for a fixed period of time, reaches 99% significance, with hundreds or thousands of checkouts per variation over multiple purchase cycles (calendar weeks), it’s hard to argue with that, and that does happen.

But most AB tests (as anyone who has done a lot of testing will attest to) are not so textbook clean. Like anything in life, the majority of scenarios are in the grey area. When that happens, tracking more goals than just the end goal (checkouts and revenue) will help interpretation a lot.

Want help setting up goals for your ecommerce site in Optimizely? Contact us.

eCommerce Copywriting Case Study: 3 Advanced Tactics We Used to Increase Product Page Conversion Rates 13.9%

We partnered with copywriter Brian Speronello of Accelerated Conversions on this project. This case study was written by Brian and edited by Devesh and Brian.

Writing powerful eCommerce copy that improves conversion rates is one of the most effective ways to generate more revenue for your online business.  Just by changing the words your visitors read when they come to your site, you can generate significant increases in sales for your company.

For example, online direct-to-consumer mattress brand Amerisleep was able to achieve a multi-million dollar increase in annual revenues just by rewriting their product details page.  A/B testing showed that checkouts increased by 13.9% with 98% confidence.

ecommerce product page conversion rate optimization

Note: Our agreement with Amerisleep prevents us from publishing their actual product page traffic and conversion rates but for the integrity of the test we’re showing the confidence level and hundreds of conversion events per variation. The test was also run for multiple buying cycles.

GrowthRock performed the conversion testing for this project, and Brian Speronello from Accelerated Conversions (that’s me) wrote the copy.

In this eCommerce case study, I’m going to share three advanced copywriting techniques that helped drive the 13.9% increase in orders on the new product page for Amerisleep.

eCommerce Conversion Copy Tactic 1: Go Beyond Bush-League Benefits

bush league benefits

If you’re reading this post, I’m going to assume that you’re at least moderately interested in conversion and copywriting strategy.  You should already know why it’s important to promote benefits ahead of features.

(If not, before you invest time going over this case study, you should read more about the basics of copywriting.  That way you’ll be able to get the most out of the strategies presented here).

This section is NOT about benefits versus features.

It’s pretty easy to tell the difference between a benefit and a feature.  The former describes a result for the reader.  The latter describes a function of the product or service.

Detecting Bush-League Benefits is more difficult, because it’s a matter of tone and perspective.

Bush-League Benefits are actual benefits that the target audience will get from your product or service.  But they’re benefits that the end user doesn’t care about.

Effective copy doesn’t just focus on benefits before features.  Sometimes it has to go through several layers of benefits to reach the deepest desires of the audience.

As an example, let’s look at a classic marketing and copywriting analogy: the electric drill.

A New Version of a Classic Marketing Tale

As the old saying about marketing a drill goes, the benefit of a drill is a hole in the wall.  That’s the immediate result (benefit) that the user gets from the features the drill has.  If you wrote copy for a drill that promoted how it puts holes in your wall, you would technically be promoting its benefits, since you’re not talking about features like torque or battery life.

But is “a hole in the wall” really what the reader wants?

Does a drill customer wake up in the morning and say “man, I really need some holes in my wall?”  (And if so, why not just buy a few kegs and invite over the local frat?  Problem solved.)

So while “a hole in the wall” is certainly a benefit of owning a drill, it’s a Bush-League Benefit because it doesn’t tap into the real desires of the audience.  It doesn’t connect with a problem or desire that the audience actively thinks about on a regular basis.

So what’s the real benefit of having the drill?

Is it having holes that are a precise size?  Is it the drill’s ability to make holes in a wall with little physical effort?  To find out, you have to ask yourself why the customer would want the holes in the wall.

You’d probably come up with an answer like “to hang family portraits.” This response is far more emotional than “holes in the wall.”  And it makes for a more compelling benefit to the reader.

For example, you can imagine a family moving into a new home and dreaming about hanging their family portraits above the fireplace.  That’s how you know you’ve gone deeper than a Bush-League Benefit.  From there you could write your copy targeting “hanging family portraits” as the real benefit of your drill.

But you’d be wrong. (Sort of…)

Because while the benefit of “hanging your family portraits” is definitely more emotional and compelling to the reader than holes in their wall, you can still go deeper.

What would happen if you asked “why?” again, researching the reason a family would want to hang their family portraits?  They might say something like “We just moved into a new house, and we want to make it feel like home.”

Now you’re getting somewhere.

Which drill would you rather buy?

  • Drill A: Our high-powered electric drill will put precision-sized holes in your walls.
  • Drill B: A new drill from our company will let you hang your family portraits so your new house finally feels like home.

No contest, you’d take Drill B.  Even though Drill A mentions a benefit (holes in your walls) it’s a Bush-League Benefit because it doesn’t connect to the emotions and desires of the audience.

The Challenge with Bush-League Benefits

The challenge with going beyond Bush-League Benefits is deciding when you’ve discovered a benefit that connects with the reader’s emotions enough to make them want to purchase.  It’s also why I said Bush-League Benefits are a matter of tone and perspective, so you’d only be “sort of” wrong to use “hanging family portraits” as the main benefit in copy for the drill.

Maybe your audience wants their house to feel like home because they’re part of an elite social circle and care about their status or impressing their neighbors.  Or maybe it’s because they’re looking to be featured in Better Homes & Gardens magazine or on HGTV.

Those desires are even more compelling than “making the house feel like home.” You could write your copy to focus on them, if you found they applied to the majority of your audience, and probably get even better conversions.

So it’s up to you to decide how deep you need to go based on your product or service and your audience.  As a rule of thumb, the more expensive, complicated, or important your offer is for the reader, the more your copy needs to connect with their deepest hopes and fears.

But if you want your copy to convert well, it must connect the benefits of your product or service with needs and desires that are emotional and top-of-mind for the reader.

You can’t settle for Bush-League Benefits and expect to get a lot of buyers.

Going Beyond Bush-League Benefits with Amerisleep

If you asked novice copywriters to tell you the benefit of a new mattress, they’d probably say “getting more sleep.”  And they’d be getting tricked by a Bush-League Benefit.

Don’t get me wrong, sleep is great.  And because most people understand that, copy that uses sleep as the main benefit can get decent results all by itself.

However even though sleep can be an effective benefit on its own — in fact because it’s so easy to settle for using sleep as your main benefit — it’s actually a Bush-League Benefit that has tricked many copywriters.  That’s because probing further into the reader’s desire for sleep reveals even more powerful motivators.

One of the biggest changes that we made on the Amerisleep product page was adding the following section.  It’s short, but it plays a key role in connecting the Bush-League Benefit of “more sleep” with deeper aspirations of the audience.  It relates “more sleep” with two highly emotional and nearly universal desires: An improvement in physical health and better mental performance at work.

Amerisleep non bush league benefit copywriting example

This section positions sleep as the solution to better health, better job performance, and a higher quality of life.  And it’s a mistake to trust the reader to make these connections on their own.

You have to remind readers about the important role sleep plays in how we feel each and every day before they start to care about getting more of it.

Only once the reader is actively craving sleep will “more sleep” move from a Bush-League Benefit to a real one.  This section helps create that desire for sleep in the reader, and in turn it makes the other parts of the page where we talk about increasing sleep more persuasive.

So when it comes to your own copy, make sure that you connect the immediate benefit of your product or service to deeper and more compelling desires the audience has.

Otherwise you’ll just be peddling Bush-League Benefits.

eCommerce Copy Tactic 2: How to Shut Down the One Competitor Every Business Has in Common

biggest competitor inaction

There is one competitor that every single business in the entire world has in common.  One competitor who is taking money out of the pockets of every company on the planet.

And my guess is that this competitor is not on your radar, even if you spend a massive amount of your time and effort analyzing your place in the market.

Can you name it?

It’s the status quo.  Otherwise known as inaction or doing nothing.

If you analyzed the behavior of prospects who ultimately do not buy from you, which of the following situations do you think is more common?

  1. They spend their money with one of your competitors instead of you.
  2. They decide to save their money and not to buy from anyone.

Let’s say you run a website and you convert 10% of your visitors to customers.  That’s an outstanding conversion rate, and yet 90% of your visitors still do not purchase.

Does that mean 90% of your visitors wind up buying from your competitors?

Obviously not!  Otherwise their own conversion rates would be off the charts.

The market of people who explore their options but ultimately decide to do nothing is much larger than the market of people that choose your competition over you.

Rather than spending the majority of your time butting heads with your competitors, you should spend more of your time working to convert prospects who would otherwise do nothing at all.  Not only is this a bigger market, but because your competition is likely focusing on trying to beat you, it’s also relatively uncontested.

Copy Techniques for Overcoming Inaction

No matter who you are or what you sell, your clients and customers are always going to prefer doing nothing to buying from you.  Doing nothing is cheaper, easier, less uncertain, less mentally demanding, less socially and politically risky, and less stressful.

The only way to compete with the status quo is to vividly show the reader how doing nothing will lead to him or her being worse off.  This is usually a great place to leverage loss aversion, where you show readers what they will lose or miss if they don’t take action.  Another option is to ask the reader a question that follows the general outline “If you don’t do this, what will you do that is better?”

Here are some hypothetical examples from a cross-section of industries that show how you could use loss aversion and questions about alternative choices to make readers aware that doing nothing is actually worse for them:

  • Appliances: If you don’t upgrade to a new, energy-efficient washing machine, it will cost you up to $1,000 more every year on your water and electricity bill. (Notice I didn’t say our energy-efficient washing machine. This is any of them. We’ll get to your company’s specific offer later.)
  • Conferences: Last year people who attended our conference on average were able to add $300,000 to their bottom line by the end of the year using the techniques we teach.  It’s your choice if you decide to join us or not — we’re not going to give you the hard sell.  But if you don’t attend, do you have a better plan for adding $300,000 to your profits in the next 12 months?
  • Dating Coach: I know the idea of working with a dating coach can make some people feel embarrassed — after all, shouldn’t we just naturally know what to do?  The hard truth is that it’s not natural, and the social systems that used to guide us have gone away.  So ask yourself what’s more embarrassing, working with someone to help you improve an important (maybe the most important) part of your life?  Or doing nothing and continuing to get rejected by the people you’re attracted to and waking up next to an empty pillow every morning?

Overcoming Inaction with Amerisleep’s Copy

We applied this principle on the Amerisleep product page in the same segment of copy from the last example.  We talked about the benefits of sleep, waking up without pain, not feeling groggy at work, and an overall better lifestyle.  Then we said:

The number of daily problems that go away with a good night’s sleep is astounding — but that only happens if you actually do something to improve your sleep.

If you just keep things the same, you’ll keep getting the same disappointing results. And don’t you deserve better?

By clearly addressing how doing nothing makes readers worse off, it takes away a lot of the mental excuses they can use to justify maintaining the status quo.

It’s only after you’ve convinced readers that they need to take action to address a problem or desire they have — period — that positioning yourself as the best choice among your competitors pays off.

How to do that effectively is what we’ll be discussing next.

eCommerce Copy Tactic 3: Position Yourself as the Market Leader Using Comparisons


Once your prospects decide to take action on a problem or desire they have, their mindset changes.  Before that, their number one question is “Should I even worry about this right now?”

After your prospects make up their mind to move forward though, they begin to ask “Who should I hire/How should I solve this problem?” instead.

When your prospects reach this stage in the decision-making process, it finally becomes effective for you to spend time proving how your offer is superior to your competition.

Here are a few techniques for making prospects see your company as their best choice, even if you operate in a crowded market.

Absolute Comparative Statements

Absolute comparative statements say that one thing is absolutely better than the other.  Claims like better, bigger, and the best are absolute comparatives.

(If you’re wondering “doesn’t that make all comparative statements absolute?” I’ll show you how that’s not true in the next section on Faux Comparatives.)

The biggest mistake copywriters make when using comparatives is not being specific.  If you’re going to use a comparative statement, you need to explain compared to what.

Too many companies will say “Our product is the best!” But that doesn’t answer the question “the best compared to what?” And without the “compared to what” piece, the comparison is ineffective.

Here are two examples of absolute comparatives from the Amerisleep product display page.  I’ve bolded the comparative term and underlined the “compared to what” part of the sentence.

  • On top of that, our foam is also the most environmentally friendly. Our patented foam-making process uses plants to replace some petroleum, and is the only manufacturing method that exceeds the standards of the Clean Air Act.
  • Our foam is also better than traditional memory foam because it recovers its shape faster.

Absolute comparative statements are the most powerful way to make your offer appear superior to the competition.  But what if you can’t say your product or service is absolutely better than the competition in any measurable way?

In that case, some copy trickery can help you give the impression you’re better than your competition, when really you’re only claiming to be equal.

I call these copy tricks “Faux Comparatives.”

How “Faux Comparatives” Let You Turn Equality into Superiority

Tell me what this sentence means:

  • No other mattress is more supportive than Amerisleep’s Revere bed.

Most people would believe it says the Amerisleep Revere bed is more supportive than any other mattress…

But in reality it only says that no one is better — which means there could be many others that are equal.

Take another look at the sentence, this time with the implied meaning in parentheses:

  • No other mattress is more supportive than Amerisleep’s Revere bed (but there are several others that are equally supportive).

If you’re in a market where the offers are all relatively equal, you can claim that no one is better and be factually accurate.  No one can say you’re making false advertising claims.

If the audience interprets “no one is better” to mean that “we are the best,” that’s their misunderstanding.  Your job is to make the case for your product or service in the most compelling — and truthful — manner possible. If that misunderstanding works out in your favor, lucky you.

In some respects all marketing and copywriting is presenting the truth in the most flattering light.  As long as the statements you make are true, choosing the words that make your copy the most convincing to the audience is just doing your job as a writer.

Here’s an example of how we used this Faux Comparative strategy on the Amerisleep product display page in the headline before describing the materials inside each mattress:

faux comparatives

This headline suggests that Amerisleep’s mattresses are more carefully engineered than any other brand.  It’s a great message to introduce before explaining what goes inside each mattress.  However in reality all it says is that, while there aren’t any mattresses more carefully engineered, there may be others that are of equal quality.

Another Faux Comparative method that gives the impression of superiority, when you are actually only claiming equality, is adding “one of” to an otherwise absolute comparison.

Instead of saying that you are the best, the most, or the biggest, you can say you are one of the best, the most, or the biggest in your category.

Like with the last Faux Comparative, the reader will focus on the comparison that you are the best, most, or biggest in your category.  They will think you are one of a select few who are at the front of the market, but in reality you could be one of many who are all equal.

We used this approach in the sub-headline for the copy from the previous Faux Comparative:

  • Our innovative and proprietary materials let us build one of the most comfortable mattresses ever

We also used it in the eco-friendly section:

faux comparison copywriting example

In both cases, it makes Amerisleep seem like they’re at the top of their market.  But in reality, it could actually mean they are just one of many companies with similar characteristics.

Increasing Conversions from Your eCommerce Copy

You’ve just learned three eCommerce copy techniques that can increase your sales by double-digit percentages.  But your conversions only go up if you invest the time to rewrite your copy.  If you simply move on with your day now, nothing changes and your sales will stay the same.

So before you close this page, set aside time in your calendar or add an item to your to-do list to incorporate the three copy strategies from this article.  Because if you’re not going to do anything with these ideas, why did you spend 15 minutes reading this case study?  Without applying the tactics it covered, then all you just did was 15 minutes of mental masturbation.

(And if this section feels familiar…remember what I said about your biggest competitor being inaction?)

You can use the following checklist to apply the lessons from this case study:

  1. Highlight the primary benefits in your copy. Ask if they are emotionally engaging. If not, come up with benefits that your audience relates to more deeply — and if you can ask actual customers for feedback, even better.
  2. Add a section to your copy that discusses the costs, risks, and problems that your prospects will experience if they don’t buy from you.
  3. Find comparison terms, like more, better, and the best.  Make sure that each one includes a “compared to what?” part of the statement.
  4. Where possible, add “no one is more…” and “one of the…” Faux Comparative statements so readers think aspects of your business that are only equal to the competition are actually superior.

But what if you’re legitimately focused on other priorities and can’t make the time to rewrite your copy — even though you want to?  Or what if you want to go beyond these three tactics and get the maximum increase in your site’s sales and conversion rate?

In that case, you can contact me about working together on your site’s copy.  And for conversion optimization, you can contact GrowthRock if you manage a 7 or 8-figure annual revenue site.

Should you AB test large site redesigns, and if so how?

An issue that’s come up with more than one client for us recently is when a large site redesign is “in the pipeline” and people in the company disagree about whether or not they should test the redesign or feature release.

The argument for not testing large site redesigns is usually some flavor of the following:

These changes have been things we’ve wanted to do for a long time, we know we’re going to implement them, so why test them?

Sometimes I’ve heard it veiled in phrases like “This change is too important.” or “This will be too difficult to test.” But in the end the real reason is that some people in the company just don’t want to test it.

Don’t buy the “this is too hard to test” argument. As I explain below, almost any site change can be AB tested with even the most basic AB testing tool.

You shouldn’t just agree blindly to this idea of not testing large redesigns, though, because if your site is bringing in enough revenue, the ROI of continuous AB testing can be significant.

Why large or inevitable site changes should still be AB tested

I’ll list my arguments for why you should still test large site changes in response to the various objections I listed above that we often hear in our work.

My hope is, if you’re the person in the organization arguing for testing, you can use these arguments to help with your battle.

“This change is going to happen regardless of the test outcome”

I have a couple of responses to this; one is nicer than the other.

The nicer response: That’s okay if the change will be implemented regardless, but don’t we want to know what the effect on revenue will be?

For example (and this is common): Say an 8-figure ecommerce company is finally updating their entire site design. Their site was made 6 years ago, parts were added piecemeal over time, sales have grown tremendously, but the site hasn’t caught up. It doesn’t have modern design, the checkout process is definitely not optimal, and it’s not mobile friendly. The company knows it needs to update the site. So they hired an expensive agency to totally redesign it.

Anyone who has AB tested elements of an ecommerce site knows that seemingly simple changes to single pages can swing orders by 10%. A 10% change in orders is worth millions for an 8-figure revenue company.

Even if you’re going to implement the large change regardless of the outcome, don’t you want to know if it will reduce revenues by over a million dollars?

If it does hurt sales, you can delay the launch a bit, hypothesize what parts of the new design could be causing the decrease, retest just those elements, and isolate the problem.

The not as nice response: Rolling out a large change without testing it first is irresponsible. See arguments above for why.

Warning: Be careful if the web design agency is arguing that you should just roll out the changes and not test them. It’s not in their interest to see if their new design performs better or not. It’s in their interest to just tell you their design is fantastic and have you love it. Unfortunately customers and their wallets are the true (and ruthless) judge of whether the site is “better”.

“The change is too big to test. AB testing is for front end changes, and this changes a lot more.”

You don’t have to code the test in Optimizely or whatever testing platform you use. In fact, it’s not recommended that you do this for large tests. Instead, you should code and deploy the new design on your own servers with slightly modified URLs (site.com/home, site.com/page-1, etc) and have the AB testing program simply redirect users to both versions of the site. Most programs can do this.

This method can handle large changes as well.
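To make the redirect approach concrete, here is a minimal Python sketch of how traffic could be split between the two deployed versions. This is not how Optimizely or any other platform implements it; the URLs, the function names, and the hash-based bucketing are all assumptions for illustration. Hashing a visitor ID (the kind of value a testing platform would keep in a cookie) means the same visitor is sent to the same version on every visit.

```python
import hashlib

# Hypothetical URLs: the current site and the redesign deployed alongside it.
VARIANT_URLS = {
    "control": "https://site.com/home",
    "redesign": "https://site.com/v2/home",
}


def assign_variant(visitor_id: str, redesign_share: float = 0.5) -> str:
    """Deterministically map a visitor ID to a variation."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).hexdigest()
    # Turn the first 8 hex digits into a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "redesign" if bucket < redesign_share else "control"


def redirect_url(visitor_id: str) -> str:
    """Return the URL this visitor should be redirected to."""
    return VARIANT_URLS[assign_variant(visitor_id)]


# The same (hypothetical) visitor ID always gets the same URL.
print(redirect_url("visitor-123"))
```

Because the assignment depends only on the visitor ID, nothing about the redesign itself has to live inside the testing tool; it only decides where to send people.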

“It will be confusing if customers see two different experiences at the same time”

First, every AB testing platform I’ve heard of cookies users so they’ll always see the same variation unless they clear cookies or use a different browser or device.

Second, to protect against confusion if they use separate devices (checking something at work vs. at home, etc.), on the new variation, you can always install a soft popup or bottom of page slider that says “Hey, we’re testing a new site design and would love your feedback. Let us know how you like it by…”.

Lastly, it’s just not that big of a deal. Modern companies test sites all the time now. Sites get updated, they change. Frankly, the idea that testing a new site for a limited amount of time, with a fraction of customers seeing an inconsistent site experience for a while, will “hurt your brand” is old fashioned marketing thinking.

This thinking relies more on “gut instinct” and opinions of people in conference rooms rather than data from the only opinions that matter: your customers.

Heard any other objections to testing large redesigns or features? Ask away in the comments.

If your business brings in 7 to 8-figures of annual revenue online and you’re interested in getting a conversion audit of your site to begin increasing its conversion rate, email us or fill out the form on our homepage.

The ROI of AB Testing: When is AB testing worth it?


I have a simple 2-step criteria for determining if AB testing has a high enough ROI to be seriously considered for your marketing team:

  1. Criteria #1: Your business makes $2,000,000 or more in revenue
  2. Criteria #2: You get 100,000 monthly unique visitors or more to your site

Yes, like any numerical “cutoff”, these numbers are somewhat arbitrary (e.g. In the U.S. at 15.5 years old, you are too young to drive, but at 16.1 years, you’re not).

But, when you zoom out, cutoffs have reasoning behind them: You wouldn’t trust an 8 year old to drive, and waiting until people are 30 to drive is also ridiculous.

Similarly, instead of debating the numbers themselves, let’s discuss the reasoning so you can adjust the above numbers as needed for your business. (Fortunately, unlike a driver’s license, our 2 step criteria above aren’t hard rules).

Finally, I’ll also discuss a 3rd corollary rule that falls out of the reasoning behind the first two: If you fulfill the first two criteria and begin investing in AB testing but aren’t getting routine lifts in orders or sales of 5% – 10% or more, the ROI on your testing investment may also not be worth it.

Why a Revenue Cutoff? Because AB testing isn’t free.

AB testing isn’t free. Even if you include software costs, the majority of AB testing costs are for the people needed to run the operation. This can come in two forms:

  1. An outside agency, that charges you a monthly rate. From my un-scientific survey of other CRO agencies, they typically charge between $2,000/m – $15,000/m, with most experienced ones being typically towards the middle/high end of that range: $5,000 – $10,000/month.
  2. Using internal employees, which also burn cash (almost always more than an agency).

I’ll soon have an entire article that dives into these costs and discusses how to choose between using in-house resources vs. outside. But for now, as an example, say your monthly AB testing costs are $5,000.

Deciding if a dedicated CRO program is “worth it” at $5,000/month is a matter of estimating:

  1. How much of a revenue lift you’re likely to achieve
  2. How long you’ll have to spend to achieve it

Before you get worked up about this, let me make this clear: it’s impossible to know the answers to these questions a priori.

But luckily, you’re not the first company to do AB testing, so we can look at what’s typical, conservative, aggressive, etc., based on past experience.

Because this is all speculation, however, I’m going to just boil things down to this general rule: assume you can get (1) a 10% increase in revenue in (2) 6 months of testing.

So, that means you’d spend $30,000 ($5,000/m for 6 months) and, if you were making $2 million in revenue before (bare minimum of the criteria above), you’ve increased annual revenues by $200,000/year.

That’s a 567% ROI from the first year of revenues alone. Pretty good.
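For anyone who wants to plug in their own numbers, here is a minimal Python sketch of that arithmetic. The function name and parameters are mine, not a standard formula from any tool, and it follows the same simplification as the example above: the "return" is one year of extra revenue (not profit), compared against the total testing spend.

```python
def first_year_roi(annual_revenue: float, lift: float,
                   monthly_cost: float, months: int) -> float:
    """ROI of a testing program, counting one year of extra revenue.

    Assumes the lift applies to a full year of revenue and ignores
    margins, software costs, and anything beyond the first year.
    """
    extra_revenue = annual_revenue * lift   # e.g. $2,000,000 * 10% = $200,000
    total_cost = monthly_cost * months      # e.g. $5,000 * 6 = $30,000
    return (extra_revenue - total_cost) / total_cost


roi = first_year_roi(annual_revenue=2_000_000, lift=0.10,
                     monthly_cost=5_000, months=6)
print(f"{roi:.0%}")  # prints 567%
```

Swap in your own revenue, expected lift, and monthly cost to see how quickly the picture changes.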

At this point, let me re-iterate a point I made above about arbitrariness in different words: You may not get a 10% lift in 6 months, or, you may get way more than a 10% lift in revenue in 6 months, or you may get the lift in 2 months.

I can’t say.

You can’t say.

No one can predict this.

But, I know that 10% revenue lift in 6 months is reasonable and achievable — our agency has achieved this multiple times for multiple businesses.

If you want to discuss whether this is achievable for your business and possibly get a few initial ideas to test from us, click here.

Now, similar to the driver’s license example, instead of arguing about whether you should get a 10% or 25% or 7.72% lift in this time period, let’s zoom out and look at some extremes.

If you told me “For a 6 month AB testing campaign, we’ll need to see a 150% increase in sitewide revenue for it to be worth it.” I’d respond with “Maybe you should focus on something else.”

I’m not saying that’s impossible.

Anything is possible.

It’s possible you could do some user research for a month or two, discover some immense barrier to purchasing that was due to on-site elements, UI/UX, or product related issues, fix them, and see an increase in revenue of 150%.

That is possible.

But it’s just not common, and I’d go as far as to say it’s not even reasonable to expect 150% increase in revenue in a few months.

But 10%? That’s reasonable.

Examining Different Revenue Ranges

Revenue of $500,000 – $1,000,000

Let’s look at lower revenue numbers to see why I put our arbitrary criteria at $2 million.

At $500,000, 10% is $50,000 a year. That’s awfully close to the $30,000 you’d spend doing this for 6 months. Sure maybe you’d see that 10% lift in 3 months, but would you then stop your testing spend just to make sure your ROI was good? No, you’d keep going.

Or, you may only see 5% lift, which is only $25,000.

Either way, it’s not likely that at $500,000 sitewide revenue you’re going to get an outstanding, no-brainer ROI on AB testing.

After months of testing, you may only see an increase in revenue of between $50,000 – $150,000, and spend about $60,000 a year in AB testing costs. So although the ROI could be positive, it’s not a no brainer.

More importantly, for businesses in this revenue range, there are often bigger wins.

In my experience, these bigger wins are usually SEO or paid media optimization. Those two drivers of traffic can move the needle significantly.

I’ve seen, in multiple clients’ analytics, traffic double from one year to the next. If that fruit is still hanging for your business, pick it first. Don’t worry about trying to eke out 10% or even 35% lifts via CRO.

(Yes, it’d be ideal to do both, but resources are finite.)

Or, your business may be hitting the $500,000 or $1,000,000 revenue mark with some initial paid ad spend that is by no means saturated. If that’s the case, can you double paid ad spend and still maintain sufficient profitability?

The answer may be yes, the answer may be no, but that question should be asked and explored in great detail before starting up a CRO program from scratch.

Revenue Less than $500,000

Startups that are just starting to make money often ask us about AB testing.

If your company is making less than $500,000 in revenue, regardless of how much traffic you have, it’s hard to make the math work for CRO. If you make say $250,000/year, you could easily spend 6 months and $30,000 to get a 10% lift and you wouldn’t be making your money back.
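A quick way to see this is to flip the question around and ask what annual revenue you would need for a given lift to pay back the testing spend in its first year. The sketch below is only an illustration under the same assumptions as my examples (10% lift, $5,000/month, 6 months); the function name is hypothetical.

```python
def break_even_revenue(lift: float, monthly_cost: float, months: int) -> float:
    """Annual revenue at which one year of extra revenue equals the testing spend."""
    return (monthly_cost * months) / lift


threshold = break_even_revenue(lift=0.10, monthly_cost=5_000, months=6)
extra_at_250k = 250_000 * 0.10  # one year of extra revenue from a 10% lift

print(f"Break-even annual revenue: ${threshold:,.0f}")
print(f"Extra revenue at $250,000/year: ${extra_at_250k:,.0f} vs. $30,000 spent")
```

Under these assumptions the break-even point is $300,000 a year in revenue, so a $250,000 business gets back $25,000 on $30,000 spent.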

Aside: If you’re thinking, “I’m the founder and I’m barely paying myself,” stop. Your time is the most valuable, and almost always at this stage your focus should be on getting better product market fit.

More importantly, even if you did make your money back (or had a positive ROI) from CRO, it’s not likely to produce a step change in growth. The increase will be incremental, and your business is likely at a stage where you’re looking to find significant growth — so focusing there makes more sense. (Typically that opportunity is better product/market fit or traffic generation.)

Revenues of More than $10,000,000

Now, let’s look at 8 figure businesses and above with the same criteria.

A 10% lift, achieved in 3 – 6 months (reasonable), would yield over $1,000,000 in annual revenue.

You’re not likely going to spend anywhere near $1,000,000 for 3 – 6 months of testing. Even if you paid $10,000 a month to an agency for 6 months, your ROI is significant: 1567% on a $60,000 investment.

Alternatively, even if you increased revenues by just 5%, that’s $500,000 of extra annual revenue — a 733% ROI.
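If you want to check those two figures or try other lifts, here is a small Python sketch. It assumes exactly $10,000,000 in annual revenue and $10,000/month for 6 months, and, like the rest of this article, it treats one year of extra revenue as the return. The function name is mine.

```python
def first_year_roi(annual_revenue: float, lift: float, total_cost: float) -> float:
    """(extra first-year revenue - testing spend) / testing spend."""
    return (annual_revenue * lift - total_cost) / total_cost


total_cost = 10_000 * 6  # $10,000/month agency fee for 6 months

# ROI for a conservative (5%) and a reasonable (10%) lift.
results = {lift: first_year_roi(10_000_000, lift, total_cost)
           for lift in (0.05, 0.10)}

for lift, roi in results.items():
    print(f"{lift:.0%} lift: {roi:.0%} ROI")
# prints:
# 5% lift: 733% ROI
# 10% lift: 1567% ROI
```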

This math, combined with our observation that 8 figure businesses have more often than not spent years optimizing traffic and paid advertising (and thus either don’t have super low hanging fruit or already have a healthy operation focused on cranking out consistent wins on those two fronts), means that starting up a CRO program for an 8-figure+ business is a no brainer.

Note: Even understanding “what is reasonable?” is easier for an established company. They have a traffic history and a paid media history that they can look to. For example: Has your SEO traffic grown at 20% – 30% for the last 3 years? Great, you can probably expect the same this year. 

The word “starting” in the paragraph above that note is particularly important because, like any marketing initiative, the easiest wins are there for the taking at the beginning, so a site that has never before been formally “optimized” likely has some easy wins that can be realized by CRO (user research + hypothesis generation + AB testing) in the first 3 – 6 months.

Corollary Rule #3: You also need to be seeing consistent conversion lifts to get a good ROI on testing

So we see from the above analysis that a certain revenue range is required for reasonable lifts in conversion rate to yield increases in revenue that make testing “worth it”. But that means the inverse is also true: you need to achieve reasonable lifts in conversion rate for testing to be worth it! 

If you check the boxes on the first two criteria:

  1. We have more than $2MM in revenue
  2. We have more than 100,000 monthly uniques

And you start investing in AB testing, great. That means the potential for seeing a great ROI on testing is there. But it doesn’t mean you will achieve that potential.

Continuing with the numbers in my examples above, you’d need to achieve a 10% lift in orders or sales on a consistent basis to realize your ROI potential.

If you are testing with a typical ad hoc or “conference room” approach (everyone sits around a conference room and throws out their favorite pet ideas to test), then it’s quite likely you won’t achieve that result. Feel free to reach out to us at the link below if you want to discuss ways to solve this problem in your organization.

Final thoughts

Again, before you start drafting your comment arguing that my numbers are arbitrary, I want to emphasize (for the 15th time) that I’m fully aware they are arbitrary. Don’t use them to the exact digit. Adjust them for your business, and use the guiding principle, which is:

Before investing in CRO, first compare reasonably achievable revenue increases that CRO could produce versus the equivalent increases you could possibly get from other channels or initiatives (e.g. Improving the product, increasing traffic via SEO, paid ads, content marketing, etc.)

Have a 7 or 8-figure business and want an evaluation of conversion optimization opportunities for your business? Contact us.

Want updates when articles like this come out? Join our email list (at the top of this page).
