How to run A/B tests correctly without wasting traffic
Most A/B tests are invalidated before they start: insufficient volume, too many variables, premature stopping. Here's the methodology that produces real results.

Why Most A/B Tests Are Useless
80% of tests we see in client reports ran for 3 days with 200 sessions and declared a winner at +2%. That's not A/B testing - it's statistical noise. An incorrect test is worse than no test: it gives false certainty and leads to decisions that hurt conversion rates.
Minimum Requirements for a Valid Test
- 95% statistical significance - if the variants truly performed the same, a difference this large would show up by chance at most 5% of the time.
- Minimum 100 conversions per variant - if your site does 10 conversions/week, split across two variants that is about 20 weeks per test, not 3 days.
- One variable at a time - headline, image, or CTA button. Not all three simultaneously.
Use a sample size calculator before starting. Tools like Evan Miller's calculator tell you exactly how many visitors you need per variant. Most Romanian e-commerce sites with under 5,000 sessions/month can only validly test 1-2 elements per quarter.
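For a rough sense of what such a calculator does, here is a minimal sketch of the standard normal-approximation formula for comparing two conversion rates. The function name, the 80% power default, and the example inputs (2% baseline rate, 20% relative lift) are assumptions for illustration, and any given online calculator may use a slightly different formula.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-sided two-proportion test.

    alpha=0.05 corresponds to 95% significance; power=0.80 is an assumed default.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical inputs: 2% baseline conversion rate, 20% relative lift (2.0% -> 2.4%).
n = visitors_per_variant(baseline_rate=0.02, relative_lift=0.20)
print(n)
```

With these assumed inputs the result is roughly 21,000 visitors per variant, which shows why small lifts on low-traffic sites take months to confirm.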
Types of Tests and When to Use Each
Classic A/B test
50% of traffic sees variant A, 50% sees variant B. Best for major changes: new headline, new offer, completely different page structure. Requires the most traffic but gives the clearest signal.
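Testing tools normally handle the split for you. As an illustration only, one common way to get a stable 50/50 split is to hash a visitor identifier, so the same visitor always lands in the same variant. The function name, experiment label, and visitor ID format below are hypothetical.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str = "hero-headline") -> str:
    """Deterministically map a visitor to 'A' or 'B' with a roughly 50/50 split."""
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 100  # bucket in 0..99
    return "A" if bucket < 50 else "B"


# Sanity check on made-up visitor IDs: the split should be close to even.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_variant(f"visitor-{i}")] += 1
print(counts)
```

Including the experiment name in the hash key means a visitor's bucket in one test does not determine their bucket in the next one.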
Multivariate testing
Tests combinations of variables simultaneously (for example, 2 headlines × 2 images × 2 CTAs = 8 combinations). Needs 4-8x more traffic than a classic A/B test. Only makes sense for sites with 50,000+ sessions/month.
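The traffic arithmetic can be sketched in a few lines, assuming two versions of each element and an even split. The variant names are placeholders.

```python
from itertools import product

# Assumed setup: two versions of each of three elements.
headlines = ["headline_1", "headline_2"]
images = ["image_1", "image_2"]
ctas = ["cta_1", "cta_2"]

combinations = list(product(headlines, images, ctas))
print(len(combinations))  # 8

# At 50,000 sessions/month, an even split leaves each combination with:
monthly_sessions = 50_000
sessions_per_combination = monthly_sessions // len(combinations)
print(sessions_per_combination)  # 6250
```

A classic A/B test on the same site would give each variant 25,000 sessions a month, four times as many as each multivariate combination receives.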
Split URL testing
Traffic split between two different URLs, not variants of the same page. Right for completely redesigned pages or new landing pages where you want a clean comparison.
What's Worth Testing
Focus on high-impact elements: hero headline, main offer, primary CTA, contact form length, social proof positioning, price/pricing structure. Don't waste traffic volume testing button colour, font, or spacing - the effect size is too small to measure reliably.
Classic Mistakes
- Stopping early - if after 2 days variant B shows 15% more conversions, the temptation is to declare it a winner. With low volume, a 15% difference can easily be statistical noise. Set the test duration before starting and don't stop early.
- Running multiple tests simultaneously - each active test consumes the same traffic pool and can interfere with others.
- Ignoring seasonality - a test that runs only Monday to Thursday is not valid, because it misses weekend behaviour. Run tests for at least 1-2 complete weeks to cover weekend variation.
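Fixing the duration in advance can be reduced to simple arithmetic. The sketch below assumes the 100-conversions-per-variant minimum, a two-week floor, and conversions split evenly across variants. The function name and example figures are hypothetical.

```python
from math import ceil


def planned_weeks(weekly_conversions, variants=2,
                  min_conversions_per_variant=100, min_weeks=2):
    """Whole weeks to run a test, decided before the test starts."""
    # Total site conversions are shared between all variants.
    conversions_needed = variants * min_conversions_per_variant
    weeks = ceil(conversions_needed / weekly_conversions)
    # Never go below the floor of complete weeks, even on high-volume sites.
    return max(weeks, min_weeks)


print(planned_weeks(50))   # 4
print(planned_weeks(400))  # 2
```

Rounding up to whole weeks keeps every weekday and weekend equally represented in both variants.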
Why A/B Testing Is Harder to Implement Than to Understand
The logic is simple. Execution isn't. To run a valid test you need sufficient traffic, a correctly configured tool, a clear hypothesis, and the discipline not to stop the test before it reaches the sample size and duration you planned. On a site with 3,000 sessions per month you can validate one element per quarter. On a site with 500 sessions per month, you practically can't test anything with certainty.
And if the test turns out negative, the interpretation is just as hard. Did variant B lose because the hypothesis was wrong, because the test ran too short, because seasonality distorted the data, or because traffic segments behaved differently? The answer to that question determines whether you test again or draw the wrong conclusion and revert to variant A without understanding why.
Frequently asked questions
What is A/B testing and how does it work?
A/B testing (or split testing) is the method of comparing two versions of an element (page, ad, email, CTA) to determine which performs better. Traffic is split randomly between version A (control) and version B (new variant). The version with the higher conversion rate wins, if the difference is statistically significant.
How long should an A/B test run?
Minimum 2 weeks, ideally 4 weeks. Don't stop a test after 3 days and 100 sessions - results are not statistically significant. You need at least 100 conversions per variant to draw valid conclusions. If your traffic is low (under 500 sessions/day), A/B tests take much longer and may be inefficient.
What element should I test first: CTA, headline, or design?
Start with elements that have the highest potential impact: 1) Page headline or main heading - influences first impression. 2) CTA (call-to-action button) - text, color, and position. 3) Lead capture form or checkout flow. Test one element at a time; testing multiple elements simultaneously makes it impossible to identify the cause.
What does statistical significance mean in A/B testing?
Statistical significance (usually set at 95%) means that, if there were no real difference between the variants, a result at least this extreme would occur by chance no more than 5% of the time. Free online significance calculators tell you whether a finished test clears that threshold. Avoid declaring a winner before the test has reached both its planned sample size and statistical significance - you might implement a change that doesn't actually work.
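As a rough illustration of what a significance calculator computes, here is a minimal two-proportion z-test using only the Python standard library. The function name and the session and conversion counts are made up, and real tools may apply different tests or corrections.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates.

    Assumes at least one conversion overall; otherwise the standard error is zero.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Made-up small test: 100 sessions per variant, 4 vs 6 conversions.
p_small = two_proportion_p_value(4, 100, 6, 100)
# Made-up large test: 10,000 sessions per variant, 400 vs 480 conversions.
p_large = two_proportion_p_value(400, 10_000, 480, 10_000)
print(p_small < 0.05, p_large < 0.05)  # False True
```

The small test shows a 50% relative lift and still fails the 95% threshold, while the large test confirms a 20% lift. Volume, not the size of the headline number, is what makes a result trustworthy.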
Can I do A/B testing on ads in Google Ads or Meta Ads?
Yes. Google Ads has the Experiments feature that lets you test entire campaigns or ad groups. Meta Ads has an integrated A/B Test tool in Ads Manager. For landing page tests, dedicated tools such as Optimizely or VWO are the usual choice (Google Optimize was discontinued in September 2023). Ad variants should run at the same time, not one after the other, so that seasonality affects both equally.
At DAFE Digital we test systematically for you. Well-grounded hypotheses, sufficient volume, clear conclusions without statistical noise.
Incorrect A/B testing produces false conclusions and bad decisions. We know how to build tests so you get real answers about what works - not confirmation of prior beliefs.

Adela Mincea
Performance Marketer · Founder of DAFE Digital · ANC Trainer
Adela is a Performance Marketer with 10+ years of paid media across Europe, the US and Asia. She founded DAFE Digital in 2023 after agency roles in London and Hong Kong, in-house work inside client organisations, and independent consulting across 27+ industries.


