How the Significance Calculation Works

Uniform's significance calculation engine identifies which A/B test variant is outperforming the rest, based on data rather than guesswork.

The process uses a two-sided Z-test for proportions to assess whether differences in conversion rates between a control group and one or more test variants are statistically significant.

  1. Setup and Validation

    The system ensures that:

    • A "default" variant is present — usually the fallback experience (no criteria) or the last variant in the test.
    • Every variant, including the default, meets a minimum traffic threshold (e.g., 200 views), because results from small samples are unreliable.
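
The two checks above can be sketched as follows. This is a minimal illustration, not Uniform's actual code: the `Variant` class, the `validate` function, and the `MIN_VIEWS` constant are hypothetical names, and 200 is only the example threshold mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_VIEWS = 200  # assumed value, taken from the "e.g., 200 views" example


@dataclass
class Variant:
    # Hypothetical shape of one variant's raw counts.
    name: str
    views: int
    conversions: int
    is_default: bool = False


def validate(variants: List[Variant], min_views: int = MIN_VIEWS) -> dict:
    """Find the default variant and list the variants that are short on traffic."""
    default: Optional[Variant] = next((v for v in variants if v.is_default), None)
    needs_more_data = [v.name for v in variants if v.views < min_views]
    return {"default": default, "needs_more_data": needs_more_data}
```

If `default` comes back as `None`, no comparison is possible; any name in `needs_more_data` has not yet reached the threshold.
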
  2. Conversion Rate and Variability

    For each variant, we calculate:

    • The conversion rate (conversions divided by views)
    • The statistical variability (standard error) in that rate
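
As a rough sketch of these two quantities, assuming the standard error is the usual binomial one for a proportion (the function names are illustrative only):

```python
import math


def conversion_rate(conversions: int, views: int) -> float:
    """Conversions divided by views."""
    return conversions / views


def standard_error(conversions: int, views: int) -> float:
    """Binomial standard error of the conversion rate: sqrt(p * (1 - p) / n)."""
    p = conversions / views
    return math.sqrt(p * (1 - p) / views)
```

For example, 50 conversions from 500 views gives a conversion rate of 0.10 with a standard error of roughly 0.0134.
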
  3. Z-Score and P-Value Calculation

    Each test variant is compared to the control. We compute the Z-score (the difference in conversion rates, measured in standard errors) and translate it into a p-value: the probability of seeing a difference at least this large if the variant and the control truly converted at the same rate.
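
A minimal sketch of this step is shown below. It assumes the unpooled form of the two-proportion Z-test, where the standard errors of the variant and the control are combined; a pooled standard error is an equally common choice, and the text does not say which one the engine uses. The function names are illustrative.

```python
import math


def z_score(conv_v: int, views_v: int, conv_c: int, views_c: int) -> float:
    """Difference in conversion rates divided by its combined standard error."""
    p_v = conv_v / views_v
    p_c = conv_c / views_c
    # Unpooled standard error of the difference (an assumption, see above).
    se = math.sqrt(p_v * (1 - p_v) / views_v + p_c * (1 - p_c) / views_c)
    if se == 0:
        return 0.0  # no measurable variability, so treat as no difference
    return (p_v - p_c) / se


def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) for a standard normal Z, computed via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))
```

With 60 conversions from 500 views for the variant and 40 from 500 for the control, the Z-score is about 2.11 and the two-sided p-value is about 0.035.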

  4. Statistical Significance Check

    If a variant's p-value is below the predefined significance threshold (e.g., 0.05, which corresponds to 95% confidence), the difference is considered statistically significant. Because the test is two-sided, this holds whether the variant is better or worse than the control.
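
The check itself is a single comparison. The sketch below also reports the direction of the difference, to show that a significant result can go either way; `ALPHA` and `classify` are hypothetical names, and 0.05 is only the example threshold.

```python
ALPHA = 0.05  # assumed threshold, matching the 95% confidence example


def classify(p_value: float, variant_rate: float, control_rate: float,
             alpha: float = ALPHA) -> str:
    """Label a comparison as not significant, significantly better, or significantly worse."""
    if p_value >= alpha:
        return "not significant"
    if variant_rate > control_rate:
        return "significantly better"
    return "significantly worse"
```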

  5. Role of the Default (Control)

    The default variant serves as the baseline for all comparisons.

    • It is never considered a winner itself
    • A winning variant must show a statistically significant improvement over this default
    • Without a valid default, significance cannot be determined

  6. Final Output

    Each variant is tagged with:

    • Its conversion rate
    • Its p-value
    • Whether the result is statistically significant
    • Whether more data is needed
    • Whether it is the default
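
One way to picture the per-variant output is as a small record. The class and field names below are hypothetical and chosen only to mirror the list above; the real output format may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariantResult:
    name: str
    conversion_rate: float
    # Assumed to be None when no comparison was made (for example, for the default).
    p_value: Optional[float]
    is_significant: bool
    needs_more_data: bool
    is_default: bool


example = VariantResult(
    name="variant-a",
    conversion_rate=0.12,
    p_value=0.035,
    is_significant=True,
    needs_more_data=False,
    is_default=False,
)
```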

This structure ensures we identify not just differences, but statistically supported improvements, with the default variant acting as the anchor for decision-making.

A winner, if present, is any non-default variant whose conversion rate is higher than the default's and whose difference from the default is statistically significant.
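
The winner rule can be sketched as a filter over per-variant results. This is an illustration under assumptions: each result is a plain dictionary with hypothetical keys (`name`, `conversion_rate`, `is_significant`, `needs_more_data`, `is_default`), and variants that still need more data are excluded.

```python
from typing import Dict, List


def find_winners(results: List[Dict]) -> List[str]:
    """Return names of non-default variants that significantly beat the default."""
    default = next((r for r in results if r["is_default"]), None)
    if default is None:
        return []  # without a valid default, significance cannot be determined
    return [
        r["name"]
        for r in results
        if not r["is_default"]
        and not r["needs_more_data"]
        and r["is_significant"]
        and r["conversion_rate"] > default["conversion_rate"]
    ]
```

A variant that is significantly worse than the default passes the significance check but fails the final comparison, so it is never reported as a winner.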