How the Significance Calculation Works
Uniform's significance calculation engine helps identify which A/B test variant is outperforming the rest—based on data, not guesswork.
The process uses a two-sided Z-test for proportions to assess whether differences in conversion rates between a control group and one or more test variants are statistically significant.
How it works
Setup and Validation
The system ensures that:
- A "default" variant is present — usually the fallback experience (no criteria) or the last variant in the test.
- All variants (including default) meet a minimum traffic threshold (e.g. 200 views) to avoid unreliable results.
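The validation step above can be sketched as follows. This is a minimal illustration, not Uniform's actual code: the variant field names, the `validate` function, and the 200-view floor are all assumptions for the example.

```python
# Illustrative sketch of the setup/validation step. Field names and the
# 200-view threshold are assumptions, not Uniform's actual implementation.
MIN_VIEWS = 200

def validate(variants):
    """Find the default (control) variant and flag any under-powered variants."""
    default = next((v for v in variants if v.get("isDefault")), None)
    if default is None:
        raise ValueError("no default (control) variant present")
    needs_data = [v["id"] for v in variants if v["views"] < MIN_VIEWS]
    return default, needs_data

variants = [
    {"id": "control", "isDefault": True, "views": 1200, "conversions": 96},
    {"id": "B", "isDefault": False, "views": 150, "conversions": 18},
]
default, needs_data = validate(variants)  # needs_data == ["B"]
```

Variant B is flagged because 150 views is below the example threshold; its results would be reported as needing more data rather than compared for significance.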
Conversion Rate and Variability
For each variant, we calculate:
- The conversion rate (conversions divided by views)
- The statistical variability (standard error) in that rate
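For a proportion, the standard error follows directly from the rate and the sample size. A quick sketch (the numbers are illustrative):

```python
import math

def rate_and_se(conversions, views):
    """Conversion rate p and its standard error, sqrt(p * (1 - p) / views)."""
    p = conversions / views
    return p, math.sqrt(p * (1 - p) / views)

# e.g. 96 conversions out of 1200 views
rate, se = rate_and_se(96, 1200)  # rate == 0.08
```

The standard error shrinks as views grow, which is why low-traffic variants produce unreliable comparisons.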
Z-Score and P-Value Calculation
Each test variant is compared to the control. We compute the Z-score, which measures how many standard errors the variant's conversion rate lies from the control's, and translate it into a p-value: the probability of seeing a difference at least this large if the two rates were actually the same.
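One standard way to compute this is the pooled two-proportion Z-test, sketched below with the standard normal CDF expressed via `math.erf`. Uniform's exact formulation may differ; the function name and inputs here are assumptions.

```python
import math

def z_and_p(conv_c, views_c, conv_v, views_v):
    """Two-sided two-proportion Z-test of a variant against the control."""
    p_c, p_v = conv_c / views_c, conv_v / views_v
    # Pooled rate under the null hypothesis that both rates are equal
    p_pool = (conv_c + conv_v) / (views_c + views_v)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / views_c + 1 / views_v))
    z = (p_v - p_c) / se
    # Two-sided p-value from the standard normal CDF (via erf)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Control: 96/1200 (8%); variant: 130/1200 (~10.8%)
z, p = z_and_p(96, 1200, 130, 1200)
```

With identical rates the Z-score is 0 and the p-value is 1; as the rates diverge relative to their pooled standard error, the p-value shrinks.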
Statistical Significance Check
If a variant’s p-value is below the predefined threshold (e.g., 0.05 for 95% confidence), the difference is considered statistically significant—regardless of whether it’s better or worse than the control.
Role of the Default (Control)
The default variant serves as the baseline for all comparisons.
- It is never considered a winner itself
- A winning variant must show a statistically significant improvement over this default
- Without a valid default, significance cannot be determined
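Combining the significance check with the direction of the difference, the winner rule above reduces to a single predicate. This is a hypothetical sketch; the threshold and signature are illustrative.

```python
ALPHA = 0.05  # 95% confidence; assumed threshold for this example

def is_winner(variant_rate, control_rate, p_value, is_default):
    """Winner = non-default, statistically significant, and better than control."""
    return (not is_default) and (p_value < ALPHA) and (variant_rate > control_rate)
```

Note that a variant can be statistically significant yet still lose: a significantly *worse* rate fails the third condition, and the default itself always fails the first.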
Final Output
Each variant is tagged with:
- Its conversion rate
- P-value
- Whether the result is statistically significant
- Whether more data is needed
- Whether it is the default
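The fields above could be assembled into a per-variant record like the following. The shape and key names are assumptions for illustration, not Uniform's actual output schema.

```python
def tag_variant(variant, p_value, alpha=0.05, min_views=200):
    """Illustrative shape of the per-variant result record."""
    rate = variant["conversions"] / variant["views"]
    return {
        "conversionRate": rate,
        "pValue": p_value,
        "isSignificant": p_value < alpha,
        "needsMoreData": variant["views"] < min_views,
        "isDefault": variant.get("isDefault", False),
    }

result = tag_variant({"views": 150, "conversions": 18}, p_value=0.03)
```

Here the variant is flagged both as significant and as needing more data; a consumer of the results would typically withhold a winner call until the traffic threshold is met.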
This structure ensures we identify not just differences, but meaningful improvements, with the default variant acting as the anchor for decision-making.
A winner, if present, is any non-default variant that outperforms the control with statistical confidence.