Interactive model

ED: Sample Size Calculator

Estimate per-variant sample size from baseline rate, lift, and power.

This interactive page estimates the per-variant sample size for a two-arm conversion experiment. Set a baseline rate, minimum detectable lift, and target power to see the rough sample requirement for each variant. The live distribution figure inside the model updates with those selections, showing how the control and target variant rates separate under the current assumptions.
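One common way to turn these three inputs into a per-variant number is the normal-approximation formula for comparing two proportions. The sketch below is a minimal illustration, not necessarily the method this calculator uses. It assumes the lift is relative to the baseline, a two-sided significance level of 0.05 (the page does not state one), and equal allocation between the two arms. The function name and its parameters are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def per_variant_sample_size(baseline, relative_lift, power=0.80, alpha=0.05):
    """Approximate users needed in EACH arm of a two-arm conversion test.

    Assumptions (not taken from the page): relative lift, two-sided alpha,
    unpooled normal approximation, equal allocation.
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)  # target variant rate
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must lie strictly between 0 and 1 and differ")
    if not (0 < power < 1 and 0 < alpha < 1):
        raise ValueError("power and alpha must lie strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # power requirement
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of per-user variances
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return ceil(n)


n = per_variant_sample_size(baseline=0.08, relative_lift=0.10, power=0.80)
print(f"Per variant: {n:,}  Total: {2 * n:,}")
```

Under these assumptions, an 8% baseline with a 10% relative lift and 80% power needs roughly 19,000 users per variant. A different alpha, a one-sided test, or an absolute lift definition would change the result.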

Use the estimate only after defining the product surface, primary signal, and guardrail metrics. A sample-size number is only useful when the target metric and baseline rate describe the same eligible population.

Planning Figures

The figures below show how the same calculation changes as assumptions move. They are static companions to the calculator: use them to see the shape of the tradeoff before tuning exact values.

Per-variant sample size falls as the minimum detectable lift gets larger for an 8% baseline conversion rate and 80% power.
Per-variant sample size changes across baseline conversion rates for 8%, 12%, and 20% minimum detectable lift assumptions.
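The shape of these tradeoffs can be reproduced numerically. The sketch below sweeps the same kind of inputs the figures describe, using a normal-approximation formula for two proportions. It is an illustration under stated assumptions (relative lift, two-sided alpha of 0.05, 80% power by default), and the helper name, lift values, and baseline values are hypothetical rather than the ones used to draw the figures.

```python
from math import ceil
from statistics import NormalDist


def sample_size(baseline, relative_lift, power=0.80, alpha=0.05):
    """Approximate per-variant sample size (assumed normal approximation)."""
    p1, p2 = baseline, baseline * (1 + relative_lift)
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2)


# Sweep 1: fixed 8% baseline, growing minimum detectable lift.
lifts = [0.05, 0.08, 0.12, 0.20]
sizes_by_lift = [sample_size(0.08, lift) for lift in lifts]
for lift, n in zip(lifts, sizes_by_lift):
    print(f"baseline 8%, lift {lift:.0%}: {n:,} per variant")

# Sweep 2: several baselines for each lift assumption.
baselines = [0.04, 0.08, 0.16]
sizes_by_baseline = {
    lift: [sample_size(b, lift) for b in baselines] for lift in (0.08, 0.12, 0.20)
}
for lift, row in sizes_by_baseline.items():
    cells = ", ".join(f"{b:.0%} -> {n:,}" for b, n in zip(baselines, row))
    print(f"lift {lift:.0%}: {cells}")
```

Because the required sample scales with the inverse square of the absolute difference between the two rates, halving the detectable lift roughly quadruples the per-variant requirement.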

Sample size calculator

Interactive calculator: reports the per-variant and total sample size for the selected inputs.

Live figure: Sampling distributions

Control and variant sampling distributions. The curves update when baseline rate, minimum detectable lift, or power changes, so you can compare control and variant uncertainty.
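The separation between the two sampling distributions can be approximated with a short calculation. The sketch below treats each arm's observed conversion rate as approximately normal, with a standard error that shrinks as the per-variant sample grows. It assumes a relative lift and a normal approximation; the function name and example sample sizes are hypothetical and may differ from how the live figure is drawn.

```python
from math import sqrt


def sampling_distributions(baseline, relative_lift, n_per_variant):
    """Return (mean, standard error) for the control and variant rates."""
    control_rate = baseline
    variant_rate = baseline * (1 + relative_lift)
    control_se = sqrt(control_rate * (1 - control_rate) / n_per_variant)
    variant_se = sqrt(variant_rate * (1 - variant_rate) / n_per_variant)
    return (control_rate, control_se), (variant_rate, variant_se)


for n in (1_000, 20_000):
    (c_mean, c_se), (v_mean, v_se) = sampling_distributions(0.08, 0.10, n)
    # Gap between the two means, measured in standard errors of the difference.
    gap = (v_mean - c_mean) / sqrt(c_se ** 2 + v_se ** 2)
    print(f"n={n:,}: control {c_mean:.3f} +/- {c_se:.4f}, "
          f"variant {v_mean:.3f} +/- {v_se:.4f}, gap {gap:.2f} SE")
```

With a small sample the two curves overlap almost entirely. As the sample grows, each curve narrows and the gap, expressed in standard errors, widens.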

Continuous quality monitoring

The calculator is a planning model, not a final statistical review. Before acting on the output, check that event logging is stable, assignment units are clear, and guardrail metrics cover quality, accessibility, and privacy-sensitive flows.

The baseline rate should come from recent measurements for the same audience and surface. If the baseline is unstable, the next step is often instrumentation repair or a broader study design rather than a larger launch test.