Basic Statistics to Bayes: a really simple step-by-step (with SVG formulas)

All mathematical expressions below are rendered as SVG images (via CodeCogs) so they work on any WordPress site without plugins.

Step 1 — Population, sample, and variable

Population: the full set you want to study (e.g., all website users).
Sample: an observed subset of the population (e.g., the last 300 visits).
Variable: what you measure (e.g., “clicked the button?” yes/no; “time on page” in seconds).

Why does it matter? If you do not distinguish the population from the sample, you overgeneralize and draw weak conclusions.

Microexercise: pick a problem (e.g., click-through rate). Define population, sample, and variable.

Readings: Statistics · Population and sample

Step 2 — Simple measures: mean, median, and proportion

Mean: sum ÷ count (good when there are no strong outliers).
Median: the middle value (robust to outliers).
Proportion: fraction of “successes” (e.g., 18 clicks in 300 visits → 18/300 = 6%).
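
The three measures can be sketched in a few lines of Python using only the standard library. The `times_on_page` values and the variable names are made up for illustration; only the 18-clicks-in-300-visits figure comes from the text.

```python
from statistics import mean, median

# Hypothetical "time on page" values in seconds; the last one is an outlier.
times_on_page = [12, 15, 14, 13, 16, 300]

avg = mean(times_on_page)     # pulled upward by the outlier
mid = median(times_on_page)   # barely affected by the outlier

# Proportion of successes: 18 clicks in 300 visits (from the text).
clicks, visits = 18, 300
proportion = clicks / visits

print(f"mean = {avg:.1f} s, median = {mid:.1f} s")
print(f"proportion = {proportion:.1%}")
```

With one extreme value in the list, the mean lands far from the typical visit while the median stays close to it, which is why the median is preferred when outliers are present.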

Microexercise: calculate the proportion of successes in your own example.

Readings: Mean · Median

Step 3 — Variability in one sentence

Standard deviation: measures how spread out the data is around the mean.

Intuition: the same mean can hide very different behaviors if variability changes.
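
As a rough illustration of that intuition, the sketch below compares two invented days that share the same mean but differ in spread. The visit counts are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical hourly visit counts for two days with the same average.
steady_day = [48, 50, 52, 50, 50]
erratic_day = [10, 90, 30, 70, 50]

for name, visits in [("steady", steady_day), ("erratic", erratic_day)]:
    # pstdev treats the list as the whole population; use stdev for a sample.
    print(f"{name}: mean = {mean(visits)}, std dev = {pstdev(visits):.1f}")
```

Both days average 50 visits, but the erratic day has a much larger standard deviation, so a forecast of "about 50" is far less reliable for it.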

Microexercise: imagine two days with the same average visits but different spread; note why “regularity” (low variability) improves predictability.

Reading: Standard deviation

Step 4 — Probability without pain

Probability = chance between 0 and 1 (or 0% and 100%).

  • Sum rule (mutually exclusive events):
    $P(A \cup B) = P(A) + P(B)$
  • Product rule (independent events):
    $P(A \cap B) = P(A) \cdot P(B)$
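
Both rules can be tried with exact fractions. The die example below is an assumption chosen for illustration, not part of the original text.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
p_one = Fraction(1, 6)   # P(roll is 1)
p_two = Fraction(1, 6)   # P(roll is 2)

# Sum rule: "1" and "2" cannot both happen on one roll (mutually exclusive).
p_one_or_two = p_one + p_two

# Product rule: two separate rolls do not influence each other (independent).
p_one_then_two = p_one * p_two

print(p_one_or_two)     # 1/3
print(p_one_then_two)   # 1/36
```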

Microexercise: if 30% visit page A and 20% visit B, with no overlap, what’s the probability of visiting A or B?

Reading: Probability

Step 5 — Conditional probability (the key piece)

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
  and
$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$

Attention:
$P(A \mid B) \neq P(B \mid A)$ in general.
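
The difference between the two conditionals shows up clearly with simple counts. The numbers and event definitions below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
# Hypothetical counts for 1,000 visits (illustrative numbers only).
# A = "visitor clicked the button", B = "visitor came from the newsletter".
n_total = 1000
n_a = 60          # clicked
n_b = 200         # came from the newsletter
n_a_and_b = 30    # did both

p_a = n_a / n_total
p_b = n_b / n_total
p_a_and_b = n_a_and_b / n_total

p_a_given_b = p_a_and_b / p_b   # P(A|B): clickers among newsletter visitors
p_b_given_a = p_a_and_b / p_a   # P(B|A): newsletter visitors among clickers

print(f"P(A|B) = {p_a_given_b:.2f}")
print(f"P(B|A) = {p_b_given_a:.2f}")
```

The same 30 visitors sit in the numerator both times; only the denominator changes, and that is enough to move the answer from 0.15 to 0.50.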

Microexercise: “P(has disease | test positive)” is not equal to “P(test positive | has disease)”. Explain why in one sentence.

Reading: Conditional probability

Step 6 — Bayes’ theorem (pocket version)

Bayes updates your prior belief with new evidence (likelihood) and produces a posterior belief.

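$P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}$

Here H is the hypothesis and D is the observed data.

Summary:
$\text{posterior} \propto \text{likelihood} \times \text{prior}$
 (normalized to become a probability).

  • $P(H)$
    Prior: your belief before data.
  • $P(D \mid H)$
    Likelihood: how compatible data is with the hypothesis.
  • $P(H \mid D)$
    Posterior: belief after seeing data.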
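
A minimal sketch of the "likelihood × prior, then normalize" recipe, for a small set of competing hypotheses. The function name `bayes_update` and the coin example are assumptions made for illustration.

```python
def bayes_update(priors, likelihoods):
    """Return posterior probabilities for a set of competing hypotheses.

    priors:      dict mapping hypothesis -> P(H)
    likelihoods: dict mapping hypothesis -> P(D | H) for the observed data D
    """
    # Unnormalized posterior: likelihood x prior for each hypothesis.
    unnormalized = {h: likelihoods[h] * priors[h] for h in priors}
    # Evidence P(D): the sum over all hypotheses, used to normalize.
    evidence = sum(unnormalized.values())
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical example: is a coin fair or biased toward heads?
# We observe a single toss that comes up heads.
priors = {"fair": 0.5, "biased": 0.5}
likelihoods = {"fair": 0.5, "biased": 0.8}   # P(heads | hypothesis)
posterior = bayes_update(priors, likelihoods)
print(posterior)
```

Dividing by the evidence is what turns the products into probabilities that sum to 1.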

Microexercise: think about why a rare event (low prior) remains unlikely even with a “good” test.

Reading: Bayes’ theorem

Step 7 — A numerical example (very simple)

Scenario: rare disease (1%). Test sensitivity 99%, false positive rate 5%.

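  • $P(\text{disease}) = 0.01$
  • $P(\text{test+} \mid \text{disease}) = 0.99$
  • $P(\text{test+} \mid \text{no disease}) = 0.05$

Evidence (overall positive):

$P(\text{test+}) = 0.99 \times 0.01 + 0.05 \times 0.99 = 0.0594$

Posterior (probability of having the disease given a positive test):

$P(\text{disease} \mid \text{test+}) = \frac{0.99 \times 0.01}{0.0594} = \frac{0.0099}{0.0594} \approx 0.167$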

Takeaway: because the disease is rare, having it remains unlikely (about 17%) even after a positive test (base rate effect).
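
The scenario can be checked in a few lines of Python. The inputs are the three numbers given above; the group of 10,000 people is a hypothetical device for redoing the calculation with counts instead of probabilities.

```python
# Inputs from the scenario in the text.
p_disease = 0.01             # prevalence (prior)
p_pos_given_disease = 0.99   # sensitivity
p_pos_given_healthy = 0.05   # false positive rate

# Evidence: overall probability of a positive test.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior: probability of disease given a positive test.
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

# Same result with counts in a hypothetical group of 10,000 people.
people = 10_000
sick = people * p_disease                           # 100 people
true_pos = sick * p_pos_given_disease               # 99 people
false_pos = (people - sick) * p_pos_given_healthy   # 495 people
by_counts = true_pos / (true_pos + false_pos)

print(f"P(test+) = {p_pos:.4f}")
print(f"P(disease | test+) = {p_disease_given_pos:.3f}")
print(f"by counts = {by_counts:.3f}")
```

The count version makes the base rate effect visible: 495 healthy people test positive against only 99 sick ones, so most positives are false alarms.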

Step 8 — Proportions with “pseudo-counts” (practical rule)

For rates/proportions (e.g., click-through rate), a prior Beta(α, β) acts as “pseudo-counts”.

  • Start simple with α=1, β=1 (uniform prior: all values equally plausible).
  • Observed s successes and f failures? Then the posterior is
    $\text{Beta}(\alpha + s,\ \beta + f)$.
  • Posterior mean is
    $\frac{\alpha + s}{\alpha + \beta + s + f}$.

Example: 18 clicks in 300 visits (s = 18, f = 282), uniform prior → $\text{Beta}(19, 283)$ → mean = 19/302 (≈ 6.3%).
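
The pseudo-count rule is simple enough to sketch directly. The helper names `beta_posterior` and `beta_mean` are hypothetical; the data are the 18 clicks in 300 visits from the example.

```python
from fractions import Fraction


def beta_posterior(alpha, beta, successes, failures):
    """Add observed counts to the prior pseudo-counts of a Beta(alpha, beta)."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return Fraction(alpha, alpha + beta)


# Example from the text: 18 clicks in 300 visits, uniform prior Beta(1, 1).
clicks, visits = 18, 300
post_alpha, post_beta = beta_posterior(1, 1, clicks, visits - clicks)
post_mean = beta_mean(post_alpha, post_beta)

print(f"posterior: Beta({post_alpha}, {post_beta})")
print(f"posterior mean: {post_mean} = {float(post_mean):.3f}")
```

The uniform prior contributes one pseudo-success and one pseudo-failure, which is why the posterior mean (about 6.3%) sits slightly above the raw rate of 6%.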

Reading: Beta distribution

Step 9 — What to report (without heavy formulas)

  • Point: posterior mean (or median).
  • Uncertainty: credible interval (e.g., 95%).
  • Decision: in A/B testing, ask “what’s the probability that A is better than B?”.
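
Before moving to a dedicated library, both the credible interval and the "A better than B" probability can be approximated by drawing from the Beta posteriors with Python's standard `random.betavariate`. This is a rough Monte Carlo sketch under a uniform prior; the counts for variant B, the helper names, and the number of draws are assumptions made for illustration.

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same draws
N_DRAWS = 20_000


def posterior_draws(successes, failures, alpha=1, beta=1, n=N_DRAWS):
    """Draw from the Beta(alpha + successes, beta + failures) posterior."""
    return [random.betavariate(alpha + successes, beta + failures)
            for _ in range(n)]


def credible_interval(draws, level=0.95):
    """Equal-tailed interval read off the sorted posterior draws."""
    ordered = sorted(draws)
    tail = (1 - level) / 2
    low = ordered[int(tail * len(ordered))]
    high = ordered[int((1 - tail) * len(ordered)) - 1]
    return low, high


# Variant A: 18 clicks in 300 visits (from the text).
# Variant B: 30 clicks in 300 visits (hypothetical numbers).
draws_a = posterior_draws(18, 282)
draws_b = posterior_draws(30, 270)

low, high = credible_interval(draws_a)
p_a_better = sum(a > b for a, b in zip(draws_a, draws_b)) / N_DRAWS

print(f"A: 95% credible interval = ({low:.3f}, {high:.3f})")
print(f"P(A better than B) = {p_a_better:.3f}")
```

The fraction of paired draws in which A's rate exceeds B's is a direct estimate of the probability that A is better, which is usually the number a decision-maker actually wants.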

Step 10 — Common pitfalls

  • Ignoring the base rate → overestimates risk after a positive test.
  • Overly strong prior → dominates the data. Use weak priors when unsure.
  • Confusing $P(A \mid B)$ with $P(B \mid A)$ → conditional inversion.
  • Forgetting uncertainty → always report estimate and uncertainty.
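
To see how an overly strong prior can dominate the data, the sketch below compares two priors on the same 18-clicks-in-300-visits data. The strong prior Beta(200, 800) is a hypothetical choice, equivalent to 1,000 pseudo-visits at a 20% click rate.

```python
def posterior_mean(alpha, beta, successes, failures):
    """Posterior mean of a Beta(alpha, beta) prior after observing counts."""
    return (alpha + successes) / (alpha + beta + successes + failures)


# Data from the text: 18 clicks in 300 visits.
clicks, misses = 18, 282

weak = posterior_mean(1, 1, clicks, misses)          # uniform prior
# Hypothetical strong prior: 1,000 pseudo-visits at a 20% click rate.
strong = posterior_mean(200, 800, clicks, misses)

print(f"observed rate:        {clicks / (clicks + misses):.3f}")
print(f"weak prior Beta(1,1): {weak:.3f}")
print(f"strong Beta(200,800): {strong:.3f}")
```

With the weak prior the estimate stays near the observed 6%; with the strong prior it is pulled to roughly 17%, because 1,000 pseudo-visits outweigh 300 real ones.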

Next steps

  1. Repeat the test example with your own numbers.
  2. Repeat the proportion (Beta-Binomial) example with your own data.
  3. Later, move to Python (PyMC/ArviZ) to compute credible intervals and Bayesian A/B tests.

Summary table

| Step | What to know | Example | Practical outcome |
|------|--------------|---------|-------------------|
| 1 | Population × sample | All users × last 300 | Avoids wrong generalizations |
| 2 | Mean, median, proportion | 18/300 = 6% | Simple measures already help |
| 3 | Variability | Same mean, different variance | Regularity improves prediction |
| 4 | Basic rules | $P(A \cup B) = P(A) + P(B)$; $P(A \cap B) = P(A) \cdot P(B)$ | Combine probabilities correctly |
| 5 | Conditional | $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ | Avoids critical confusion |
| 6 | Bayes (idea) | $\text{posterior} \propto \text{likelihood} \times \text{prior}$ | Updates belief with data |
| 7 | Numerical example | Rare disease + test | After positive: ~16.7% |
| 8 | Bayesian proportion | $\text{Beta}(\alpha + s,\ \beta + f)$ | Practical pseudo-count rule |
| 9 | What to report | Mean + interval | Estimate with uncertainty |
| 10 | Pitfalls | Base rate, strong prior, conditional | More reliable diagnostics & A/B |

Quick references

Statistics · Probability · Conditional probability · Bayes’ theorem · Beta distribution

Published by Edvaldo Guimrães Filho