Glossary

Statistical Significance

Statistical significance is the discipline of not making bid decisions on samples too small to be meaningful. In PPC, the practical threshold is roughly 30+ clicks or 3+ orders before any keyword-level inference is defensible.


Statistical significance in PPC is the discipline of distinguishing signal from noise — refusing to make bid or budget decisions on samples too small to be meaningful. It is the structural defence against the Whack-a-Mole Effect.

Why it matters

A keyword with 8 clicks and 0 sales has an ACOS of ∞ — but the result is consistent with a 5% true CVR (probability of 0 conversions in 8 trials at 5% CVR is ~66%). The "high ACOS" is essentially random; bidding the keyword down throws away a probably-decent keyword.

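A keyword with 200 clicks and 0 sales has the same nominal ACOS, but the result is inconsistent with a CVR anywhere near 5%: the probability of 0 conversions in 200 trials at 5% CVR is about 0.004%. Even a 1% true CVR would produce this result only about 13% of the time. This keyword almost certainly doesn't convert at a useful rate.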

The two situations look identical in a CSV. They are not the same situation.
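
The difference can be checked with a few lines of arithmetic. The sketch below assumes each click is an independent trial with a fixed true CVR (a simplification), and the function name is illustrative only:

```python
def prob_zero_orders(clicks: int, true_cvr: float) -> float:
    """Chance of seeing 0 orders in `clicks` independent clicks at `true_cvr`."""
    return (1.0 - true_cvr) ** clicks


# Compare the two keywords from the text against an assumed 5% true CVR.
for clicks in (8, 200):
    p = prob_zero_orders(clicks, 0.05)
    print(f"{clicks} clicks at 5% CVR: P(0 orders) = {p:.4%}")
```

At 8 clicks a zero-order result is the most likely outcome; at 200 clicks it is vanishingly rare unless the true CVR is very low.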

Practical thresholds

Most accounts don't need formal hypothesis testing — they need defensible heuristics:

Decision                                      Minimum sample
Promote search term to exact-match (harvest)  ≥3 orders AND ≥15 clicks
Add search term as negative                   ≥15 clicks AND 0 orders
Bid down a keyword                            ≥30 clicks in the analysis window
Bid up a keyword                              ≥3 orders AND ACOS demonstrably below target
Pause a campaign                              ≥100 clicks across the campaign with 0 orders
Declare a creative test winner                ≥1,000 impressions per variant AND ≥30 conversions

These are conservative defaults. Higher-velocity SKUs can use shorter windows; lower-velocity SKUs need longer.
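
As a rough illustration, the keyword-level rows of the table can be encoded as a sample-size gate. This is a minimal sketch, not a bidding tool: `KeywordStats`, `defensible_decisions`, and the 30% target ACOS are hypothetical, and "demonstrably below target" is reduced to a plain comparison.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KeywordStats:
    """Accumulated stats for one keyword or search term (hypothetical shape)."""
    clicks: int
    orders: int
    acos: Optional[float] = None  # None when there are no sales to divide by


def defensible_decisions(stats: KeywordStats, target_acos: float) -> List[str]:
    """Return the keyword-level decisions whose minimum sample is met."""
    decisions = []
    if stats.orders >= 3 and stats.clicks >= 15:
        decisions.append("harvest")
    if stats.clicks >= 15 and stats.orders == 0:
        decisions.append("negative")
    if stats.clicks >= 30:
        decisions.append("bid_down")
    # "Demonstrably below target" is simplified here to a plain comparison.
    if stats.orders >= 3 and stats.acos is not None and stats.acos < target_acos:
        decisions.append("bid_up")
    return decisions


print(defensible_decisions(KeywordStats(clicks=8, orders=0), target_acos=0.30))
print(defensible_decisions(KeywordStats(clicks=40, orders=4, acos=0.22), target_acos=0.30))
```

The function only reports which decisions have enough data behind them; it does not say which decision is the right one. The 8-click keyword from earlier returns an empty list, meaning no action is defensible yet.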

CVR confidence intervals

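For a more rigorous take, the 95% confidence interval on an observed CVR p from n clicks is approximately p ± 1.96 × √(p(1-p)/n), so its width shrinks in proportion to 1/√n. This normal approximation is unreliable at small n (it can even dip below 0%), so the small-sample range below is closer to what an exact or Wilson interval gives. Practical implications: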

  • At 30 clicks with 3 conversions (10% observed CVR), the 95% CI is roughly 2%–27%. Wide enough that bid changes based on the point estimate are mostly noise-chasing.
  • At 300 clicks with 30 conversions (10% observed CVR), the CI tightens to roughly 7%–14%. Now actionable.
  • At 3,000 clicks with 300 conversions, the CI is roughly 9%–11%. Precise enough to base bid math on.
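
The intervals above can be reproduced approximately with the standard library. The sketch below computes both the normal-approximation interval from the formula and the Wilson score interval, a common alternative that behaves better at small samples; the function names are illustrative, and the choice of Wilson is an assumption rather than something the thresholds above prescribe.

```python
import math

Z_95 = 1.96  # z-score for a two-sided 95% interval


def normal_interval(conversions: int, clicks: int, z: float = Z_95):
    """p +/- z * sqrt(p(1-p)/n): the textbook normal approximation."""
    p = conversions / clicks
    half = z * math.sqrt(p * (1 - p) / clicks)
    return p - half, p + half


def wilson_interval(conversions: int, clicks: int, z: float = Z_95):
    """Wilson score interval; stays inside [0, 1] even for small samples."""
    p = conversions / clicks
    denom = 1 + z**2 / clicks
    centre = (p + z**2 / (2 * clicks)) / denom
    half = z * math.sqrt(p * (1 - p) / clicks + z**2 / (4 * clicks**2)) / denom
    return centre - half, centre + half


for conversions, clicks in ((3, 30), (30, 300), (300, 3000)):
    n_lo, n_hi = normal_interval(conversions, clicks)
    w_lo, w_hi = wilson_interval(conversions, clicks)
    print(f"{clicks:>5} clicks: normal {n_lo:.1%} to {n_hi:.1%}, Wilson {w_lo:.1%} to {w_hi:.1%}")
```

At 30 clicks the normal approximation produces a lower bound below zero, which is impossible for a conversion rate, while the Wilson bounds stay in range. By 3,000 clicks the two methods agree closely.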

Time-windowed vs. event-windowed

Significance is about events, not time. A keyword with 100 clicks in a week and a keyword with 100 clicks in a quarter carry the same statistical weight (modulo external shifts in CVR over the period). Don't measure significance by "the last 7 days" — measure by accumulated events since the last bid change.
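
One way to apply this is to count events from the date of the last bid change instead of over a fixed calendar window. The sketch below assumes daily rows of (day, clicks, orders) and counts the change date itself; the row layout, the dates, and the function name are all hypothetical.

```python
from datetime import date
from typing import Iterable, Tuple


def events_since(rows: Iterable[Tuple[date, int, int]], last_change: date) -> Tuple[int, int]:
    """Sum clicks and orders from `last_change` onward (inclusive, by assumption)."""
    clicks = orders = 0
    for day, day_clicks, day_orders in rows:
        if day >= last_change:
            clicks += day_clicks
            orders += day_orders
    return clicks, orders


# Hypothetical daily rows: (day, clicks, orders).
rows = [
    (date(2024, 3, 1), 12, 1),
    (date(2024, 3, 8), 9, 0),
    (date(2024, 3, 15), 14, 2),
    (date(2024, 3, 22), 11, 1),
]
clicks, orders = events_since(rows, last_change=date(2024, 3, 8))
print(clicks, orders, clicks >= 30)  # is the 30-click bid-down threshold met?
```

No single row here comes close to 30 clicks, but the accumulated total since the bid change does, which is the quantity the thresholds refer to.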

Common mistakes

  • Reading 7-day or daily ACOS on low-volume keywords. At that volume the volatility is almost entirely sampling noise.
  • Declaring an A/B test winner on the first day. Almost never significant.
  • Combining heterogeneous data to "get more clicks." Aggregating across SKUs or match types to reach significance often pools data with structurally different CVRs and produces a meaningless average.
  • Ignoring significance for "obvious" decisions. Even obvious decisions can fail at small samples; discipline applies uniformly.
