Statistical Significance
Statistical significance in PPC is the discipline of distinguishing signal from noise: refusing to make bid or budget decisions on samples too small to be meaningful. As a practical threshold, a keyword needs roughly 30 clicks or 3 orders before any keyword-level inference is defensible. It is the structural defence against the Whack-a-Mole Effect.
Why it matters
A keyword with 8 clicks and 0 sales has an ACOS of ∞, but the result is fully consistent with a 5% true CVR: the probability of 0 conversions in 8 trials at 5% CVR is 0.95^8, or about 66%. The "high ACOS" is essentially random, and bidding the keyword down risks throwing away a keyword that is probably decent.
A keyword with 200 clicks and 0 sales has the same nominal ACOS, but the result is inconsistent with a 5% CVR (the probability of 0 conversions in 200 trials is under 0.01%) and caps the plausible true CVR at roughly 1.5%. This keyword genuinely doesn't convert at a useful rate.
The two situations look identical in a CSV. They are not the same situation.
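The difference between the two cases can be checked with a few lines of arithmetic. The sketch below assumes each click is an independent trial with a fixed true CVR; the function names are illustrative, not part of any PPC tool.

```python
def prob_zero_orders(clicks: int, true_cvr: float) -> float:
    """Probability of 0 orders in `clicks` independent clicks at `true_cvr`."""
    return (1.0 - true_cvr) ** clicks


def max_plausible_cvr(clicks: int, alpha: float = 0.05) -> float:
    """Largest true CVR still consistent with 0 orders at level `alpha`.

    Solves (1 - p) ** clicks = alpha for p (a one-sided upper bound).
    """
    return 1.0 - alpha ** (1.0 / clicks)


for clicks in (8, 200):
    print(
        f"{clicks} clicks, 0 orders: "
        f"P(0 orders | 5% CVR) = {prob_zero_orders(clicks, 0.05):.4%}, "
        f"CVR upper bound = {max_plausible_cvr(clicks):.1%}"
    )
```

With 8 clicks the upper bound on CVR is still above 30%, so almost nothing has been ruled out; with 200 clicks it drops to about 1.5%.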
Practical thresholds
Most accounts don't need formal hypothesis testing — they need defensible heuristics:
| Decision | Minimum sample |
|---|---|
| Promote search term to exact-match (harvest) | ≥3 orders AND ≥15 clicks |
| Add search term as negative | ≥15 clicks AND 0 orders |
| Bid down a keyword | ≥30 clicks in the analysis window |
| Bid up a keyword | ≥3 orders AND ACOS demonstrably below target |
| Pause a campaign | ≥100 clicks across the campaign with 0 orders |
| Declare a creative test winner | ≥1,000 impressions per variant AND ≥30 conversions |
These are conservative defaults. Higher-velocity SKUs can use shorter windows; lower-velocity SKUs need longer.
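As a rough illustration, the keyword-level and search-term-level rows of the table could be encoded as a gate that runs before any bid logic. This is a minimal sketch: `KeywordStats` and `sample_supports` are hypothetical names, one record type stands in for both keywords and search terms, and "ACOS demonstrably below target" is simplified to a plain comparison.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeywordStats:
    clicks: int
    orders: int
    acos: Optional[float] = None  # spend / sales; None when there are no sales


def sample_supports(stats: KeywordStats, target_acos: float) -> list[str]:
    """Return the decisions this sample is large enough to support.

    Thresholds mirror the table above. A listed action is permitted by the
    sample size, not necessarily the right call.
    """
    actions = []
    if stats.orders >= 3 and stats.clicks >= 15:
        actions.append("harvest")
    if stats.clicks >= 15 and stats.orders == 0:
        actions.append("negate")
    if stats.clicks >= 30:
        actions.append("bid_down")
    # Simplification: "demonstrably below target" is reduced to acos < target.
    if stats.orders >= 3 and stats.acos is not None and stats.acos < target_acos:
        actions.append("bid_up")
    return actions


print(sample_supports(KeywordStats(clicks=8, orders=0), target_acos=0.30))
print(sample_supports(KeywordStats(clicks=40, orders=4, acos=0.20), target_acos=0.30))
```

The first call returns an empty list: at 8 clicks no decision in the table is supported, whatever the ACOS column says.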
CVR confidence intervals
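For a more rigorous take, the 95% confidence interval on observed CVR is roughly p ± 1.96 × √(p(1-p)/n), where p is the observed CVR and n is the number of clicks, so its width shrinks in proportion to 1/√n. This is the normal approximation; it is unreliable when conversions are in the single digits (at 3 conversions in 30 clicks it yields a negative lower bound), so the small-sample figure below is closer to an exact binomial interval than to this formula. Practical implications: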
- At 30 clicks with 3 conversions (10% observed CVR), the 95% CI is roughly 2%–27%. Wide enough that bid changes based on the point estimate are mostly noise-chasing.
- At 300 clicks with 30 conversions (10% observed CVR), the CI tightens to roughly 7%–14%. Now actionable.
- At 3,000 clicks with 300 conversions, CI is roughly 9%–11%. Precise enough to bid math against.
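To reproduce intervals of this kind, the sketch below computes the normal-approximation (Wald) interval from the formula above alongside the Wilson score interval, a standard alternative that behaves better at small samples. Wilson is an illustrative choice here, not necessarily the method behind the figures quoted above, so its bounds differ slightly at 30 clicks.

```python
import math

Z_95 = 1.96  # two-sided 95% critical value of the standard normal


def wald_interval(conversions: int, clicks: int, z: float = Z_95) -> tuple[float, float]:
    """Normal-approximation interval: p +/- z * sqrt(p(1-p)/n)."""
    p = conversions / clicks
    half = z * math.sqrt(p * (1 - p) / clicks)
    return p - half, p + half


def wilson_interval(conversions: int, clicks: int, z: float = Z_95) -> tuple[float, float]:
    """Wilson score interval; stays inside [0, 1] even for small samples."""
    p = conversions / clicks
    denom = 1 + z**2 / clicks
    centre = (p + z**2 / (2 * clicks)) / denom
    half = z * math.sqrt(p * (1 - p) / clicks + z**2 / (4 * clicks**2)) / denom
    return centre - half, centre + half


for conversions, clicks in [(3, 30), (30, 300), (300, 3000)]:
    wald_lo, wald_hi = wald_interval(conversions, clicks)
    wil_lo, wil_hi = wilson_interval(conversions, clicks)
    print(
        f"{conversions}/{clicks}: "
        f"Wald {wald_lo:.1%} to {wald_hi:.1%}, "
        f"Wilson {wil_lo:.1%} to {wil_hi:.1%}"
    )
```

At 30 clicks the Wald lower bound is negative, which is a sign the approximation has broken down; by 3,000 clicks the two methods agree to within a fraction of a percentage point.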
Time-windowed vs. event-windowed
Significance is about events, not time. A keyword with 100 clicks in a week and a keyword with 100 clicks in a quarter carry the same statistical weight (modulo external shifts in CVR over the period). Don't measure significance by "the last 7 days" — measure by accumulated events since the last bid change.
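One way to apply this is to count events from the last bid change rather than from a fixed calendar window. The sketch below assumes daily rows of `(day, clicks, orders)` and a known bid-change date; the data, the function name, and the choice to exclude the change day itself are all illustrative assumptions.

```python
from datetime import date

MIN_CLICKS = 30  # bid-down threshold from the table above


def events_since(rows: list[tuple[date, int, int]], last_bid_change: date) -> tuple[int, int]:
    """Sum clicks and orders on days strictly after the last bid change.

    The change day is excluded on the assumption that it mixes traffic
    from the old and new bids.
    """
    recent = [(c, o) for day, c, o in rows if day > last_bid_change]
    clicks = sum(c for c, _ in recent)
    orders = sum(o for _, o in recent)
    return clicks, orders


# Hypothetical daily report rows: (day, clicks, orders)
rows = [
    (date(2025, 1, 1), 12, 1),
    (date(2025, 1, 2), 9, 0),   # bid changed on this day
    (date(2025, 1, 3), 11, 1),
    (date(2025, 1, 4), 8, 0),
    (date(2025, 1, 5), 14, 2),
]

clicks, orders = events_since(rows, last_bid_change=date(2025, 1, 2))
ready = clicks >= MIN_CLICKS
print(f"{clicks} clicks, {orders} orders since last change; ready to re-evaluate: {ready}")
```

A low-volume keyword simply takes more calendar days to pass the same check; the threshold itself does not change.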
Common mistakes
- Reading 7-day or daily ACOS on low-volume keywords. At these volumes the volatility is almost entirely sampling noise.
- Declaring an A/B test winner on the first day. Almost never significant.
- Combining heterogeneous data to "get more clicks." Aggregating across SKUs or match types to reach significance often pools data with structurally different CVRs and produces a meaningless average.
- Ignoring significance for "obvious" decisions. Even obvious decisions can fail at small samples; discipline applies uniformly.
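The pooling mistake is easy to see with numbers. The sketch below uses made-up figures for two match types of the same keyword; nothing here comes from a real account.

```python
# Hypothetical segments with structurally different conversion rates.
segments = {
    "exact": {"clicks": 20, "orders": 4},
    "broad": {"clicks": 180, "orders": 2},
}

# CVR of each segment on its own.
segment_cvr = {name: s["orders"] / s["clicks"] for name, s in segments.items()}

# CVR after pooling all clicks and orders into one bucket.
pooled_clicks = sum(s["clicks"] for s in segments.values())
pooled_orders = sum(s["orders"] for s in segments.values())
pooled_cvr = pooled_orders / pooled_clicks

for name, cvr in segment_cvr.items():
    print(f"{name}: {cvr:.1%} CVR on {segments[name]['clicks']} clicks")
print(f"pooled: {pooled_cvr:.1%} CVR on {pooled_clicks} clicks")
```

The pooled bucket reaches 200 clicks, but its 3% CVR describes neither segment: it is far below the exact-match rate and well above the broad-match rate, so a bid computed from it would be wrong for both.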