Basic Statistics to Bayes: a really simple step-by-step (with SVG formulas)

All mathematical expressions below are rendered as SVG images (via CodeCogs) so they work in any WordPress without plugins.

Step 1 — Population, sample, and variable

Population: the full set you want to study (e.g., all website users).
Sample: an observed subset of the population (e.g., the last 300 visits).
Variable: what you measure (e.g., “clicked the button?” yes/no; “time on page” in seconds).

Why does it matter? If you do not distinguish the population from the sample, you may generalize beyond what the data supports and end up with weak conclusions.
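A minimal sketch of the population/sample distinction. The population here is simulated (100,000 hypothetical users with a 6% click rate), and the sample is drawn at random, which is an assumption: "the last 300 visits" in a real log may not behave like a random sample.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 users, 6% of whom click (1 = clicked, 0 = did not).
population = [1] * 6_000 + [0] * 94_000

# Sample: 300 users drawn at random, without replacement.
sample = random.sample(population, 300)

population_rate = sum(population) / len(population)
sample_rate = sum(sample) / len(sample)

print(f"population rate: {population_rate:.3f}")
print(f"sample rate:     {sample_rate:.3f}")
```

The sample rate will usually land near the population rate, but not exactly on it. That gap is why the later steps care about uncertainty.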

Microexercise: pick a problem (e.g., click-through rate). Define population, sample, and variable.

Readings: Statistics · Population and sample

Step 2 — Simple measures: mean, median, and proportion

Mean: sum ÷ count (good when there are no strong outliers).
Median: the middle value (robust to outliers).
Proportion: fraction of “successes” (e.g., 18 clicks in 300 visits → 18/300 = 6%).
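The three measures can be sketched with Python's standard `statistics` module. The time-on-page values are made up for illustration; the proportion uses the 18/300 figure from the text.

```python
from statistics import mean, median

# Hypothetical time-on-page values in seconds; 600 is a deliberate outlier.
times = [12, 15, 18, 20, 22, 25, 600]

print(mean(times))    # pulled upward by the outlier
print(median(times))  # middle value, unaffected by the outlier

# Proportion from the text: 18 clicks in 300 visits.
clicks, visits = 18, 300
proportion = clicks / visits
print(f"{proportion:.1%}")
```

With one extreme value in the list, the mean ends up far above every typical observation while the median stays at 20.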

Microexercise: calculate the proportion of successes in your own example.

Readings: Mean · Median

Step 3 — Variability in one sentence

Standard deviation: measures how spread out the data is around the mean.

Intuition: the same mean can hide very different behaviors if variability changes.
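A small sketch of that intuition, using two invented series of daily visits with the same mean. It uses `pstdev` (the population standard deviation); `stdev` would be the sample version, and which one fits depends on your data.

```python
from statistics import mean, pstdev

# Hypothetical daily visit counts for two five-day periods.
steady = [98, 100, 102, 99, 101]
erratic = [40, 160, 100, 20, 180]

for name, visits in [("steady", steady), ("erratic", erratic)]:
    print(name, mean(visits), round(pstdev(visits), 1))
```

Both series average 100 visits, but the second one swings far more from day to day.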

Microexercise: imagine two days with the same average visits but different spread; note why “regularity” (low variability) improves predictability.

Reading: Standard deviation

Step 4 — Probability without pain

Probability = chance between 0 and 1 (or 0% and 100%).

Sum rule (mutually exclusive events):
$P(A \cup B) = P(A) + P(B)$
Product rule (independent events):
$P(A \cap B) = P(A) \cdot P(B)$
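The two rules can be written as tiny helper functions. The function names and the die/coin numbers are illustrative assumptions, and each rule only holds under the condition stated in its docstring.

```python
def p_either(p_a: float, p_b: float) -> float:
    """Sum rule: P(A or B) when A and B cannot happen together."""
    return p_a + p_b


def p_both(p_a: float, p_b: float) -> float:
    """Product rule: P(A and B) when A and B are independent."""
    return p_a * p_b


# Hypothetical numbers: one die roll, and two independent coin flips.
p_one_or_two = p_either(1 / 6, 1 / 6)  # rolling a 1 or a 2
p_two_heads = p_both(0.5, 0.5)         # heads on both flips

print(round(p_one_or_two, 3), p_two_heads)
```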

Microexercise: if 30% visit page A and 20% visit B, with no overlap, what’s the probability of visiting A or B?

Reading: Probability

Step 5 — Conditional probability (the key piece)

$P(A|B) = \frac{P(A \cap B)}{P(B)}$
and
$P(B|A) = \frac{P(A \cap B)}{P(A)}$

Attention:
$P(A|B) \neq P(B|A)$ in general.
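To see how different the two conditionals can be, here is a sketch that computes both from a table of counts. The counts (1,000 visits split by device and click) are invented for illustration.

```python
# Hypothetical counts for 1,000 visits, split by device and outcome.
mobile_and_click = 30
mobile_no_click = 570
desktop_and_click = 50
desktop_no_click = 350

total = mobile_and_click + mobile_no_click + desktop_and_click + desktop_no_click
p_mobile = (mobile_and_click + mobile_no_click) / total
p_click = (mobile_and_click + desktop_and_click) / total
p_mobile_and_click = mobile_and_click / total

# P(A|B) = P(A and B) / P(B), applied in both directions.
p_click_given_mobile = p_mobile_and_click / p_mobile
p_mobile_given_click = p_mobile_and_click / p_click

print(f"P(click | mobile) = {p_click_given_mobile:.3f}")
print(f"P(mobile | click) = {p_mobile_given_click:.3f}")
```

The numerator is the same in both lines; only the denominator changes, and that is enough to move the answer from 5% to 37.5%.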

Microexercise: “P(has disease | test positive)” is not equal to “P(test positive | has disease)”. Explain why in one sentence.

Reading: Conditional probability

Step 6 — Bayes’ theorem (pocket version)

Bayes updates your prior belief with new evidence (likelihood) and produces a posterior belief.

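$P(H|D) = \frac{P(D|H) \, P(H)}{P(D)}$
where H is the hypothesis and D is the observed data.

Summary:
$\text{posterior} \propto \text{likelihood} \times \text{prior}$
(normalized to become a probability).

Prior $P(H)$: your belief before seeing the data.
Likelihood $P(D|H)$: how compatible the data is with the hypothesis.
Posterior $P(H|D)$: your belief after seeing the data.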
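The "multiply, then normalize" recipe can be sketched for a small set of competing hypotheses. The function name `bayes_update` and the fair/biased coin setup are assumptions made up for this illustration.

```python
def bayes_update(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Return the posterior over hypotheses: prior x likelihood, then normalized."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(D), the normalizing constant
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical setup: a coin is either fair or biased toward heads (P(heads) = 0.8).
prior = {"fair": 0.9, "biased": 0.1}
# Data D: one flip came up heads, so the likelihood is P(heads | H) for each H.
likelihood = {"fair": 0.5, "biased": 0.8}

posterior = bayes_update(prior, likelihood)
for hypothesis, probability in posterior.items():
    print(hypothesis, round(probability, 3))
```

One head nudges belief toward "biased", but the strong prior for "fair" still dominates after a single flip.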

Microexercise: think about why a rare event (low prior) remains unlikely even with a “good” test.

Reading: Bayes’ theorem

Step 7 — A numerical example (very simple)

Scenario: rare disease (1%). Test sensitivity 99%, false positive rate 5%.

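$P(\text{disease}) = 0.01$
$P(\text{test+} | \text{disease}) = 0.99$
$P(\text{test+} | \text{no disease}) = 0.05$

Evidence (overall positive):

$P(\text{test+}) = 0.99 \times 0.01 + 0.05 \times 0.99 = 0.0099 + 0.0495 = 0.0594$

Posterior (probability of having the disease given a positive test):

$P(\text{disease} | \text{test+}) = \frac{0.99 \times 0.01}{0.0594} = \frac{0.0099}{0.0594} \approx 0.167$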

Takeaway: because the disease is rare, having it is still unlikely (about 1 in 6) even after a positive test. This is the base rate effect.
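The same result can be checked by counting people instead of multiplying probabilities. This sketch uses the three numbers from the scenario; the group size of 10,000 is an arbitrary choice for illustration.

```python
# Numbers from the scenario above.
prevalence = 0.01           # P(disease)
sensitivity = 0.99          # P(test+ | disease)
false_positive_rate = 0.05  # P(test+ | no disease)

people = 10_000  # illustrative group size

sick = people * prevalence                               # about 100 people
true_positives = sick * sensitivity                      # about 99 of them test positive
false_positives = (people - sick) * false_positive_rate  # about 495 healthy people test positive

p_disease_given_positive = true_positives / (true_positives + false_positives)
print(f"{p_disease_given_positive:.4f}")
```

Healthy people with a false positive outnumber sick people with a true positive by about five to one, which is where the 1-in-6 figure comes from.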

Step 8 — Proportions with “pseudo-counts” (practical rule)

For rates/proportions (e.g., click-through rate), a prior Beta(α, β) acts as “pseudo-counts”.

Start simple with α=1, β=1 (uniform prior: all values equally plausible).
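Observed s successes and f failures? Then the posterior is
$\text{Beta}(\alpha + s, \beta + f)$.
The posterior mean is
$\frac{\alpha + s}{\alpha + \beta + s + f}$.

Example: 18 clicks in 300 visits (so 282 non-clicks), uniform prior →
$\text{Beta}(1 + 18, 1 + 282) = \text{Beta}(19, 283)$ → mean =
$19/302$ (≈ 6.3%).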
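The pseudo-count rule is short enough to sketch in plain Python. The helper names `beta_posterior` and `beta_mean` are hypothetical; the numbers are the 18-in-300 example from the text.

```python
def beta_posterior(alpha: float, beta: float, successes: int, failures: int) -> tuple[float, float]:
    """Add the observed counts to the prior pseudo-counts."""
    return alpha + successes, beta + failures


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Example from the text: uniform prior Beta(1, 1), 18 clicks in 300 visits.
clicks, visits = 18, 300
post_alpha, post_beta = beta_posterior(1, 1, clicks, visits - clicks)
post_mean = beta_mean(post_alpha, post_beta)

print(post_alpha, post_beta, f"{post_mean:.3f}")
```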

Reading: Beta distribution

Step 9 — What to report (without heavy formulas)

Point: posterior mean (or median).
Uncertainty: credible interval (e.g., 95%).
Decision: in A/B testing, ask “what’s the probability that A is better than B?”.
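One way to get these three numbers without heavy formulas is to draw random values from the Beta posterior and summarize the draws. This is a rough Monte Carlo sketch using only the standard library; variant A uses the 18/300 example, while variant B (9 clicks in 300 visits) is invented, and the helper names are hypothetical.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible
N_DRAWS = 20_000


def sample_posterior(clicks: int, visits: int, n: int = N_DRAWS) -> list[float]:
    """Draw from the Beta(1 + clicks, 1 + non-clicks) posterior (uniform prior)."""
    return [random.betavariate(1 + clicks, 1 + visits - clicks) for _ in range(n)]


def credible_interval(draws: list[float], level: float = 0.95) -> tuple[float, float]:
    """Equal-tailed interval read off the sorted draws."""
    ordered = sorted(draws)
    tail = (1 - level) / 2
    lower = ordered[int(tail * len(ordered))]
    upper = ordered[int((1 - tail) * len(ordered)) - 1]
    return lower, upper


# Variant A uses the numbers from the text; variant B is hypothetical.
draws_a = sample_posterior(18, 300)
draws_b = sample_posterior(9, 300)

point_a = sum(draws_a) / N_DRAWS
low_a, high_a = credible_interval(draws_a)
p_a_better = sum(a > b for a, b in zip(draws_a, draws_b)) / N_DRAWS

print(f"A: mean {point_a:.3f}, 95% interval ({low_a:.3f}, {high_a:.3f})")
print(f"P(A better than B) = {p_a_better:.2f}")
```

More draws give smoother estimates. Libraries such as PyMC and ArviZ do this kind of summary for more complex models.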

Step 10 — Common pitfalls

Ignoring the base rate → overestimates risk after a positive test.
Overly strong prior → dominates the data. Use weak priors when unsure.
Confusing $P(A|B)$ with $P(B|A)$ → conditional inversion.
Forgetting uncertainty → always report estimate and uncertainty.
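The "overly strong prior" pitfall is easy to see with the pseudo-count formula. In this sketch the data is the 18-in-300 example, and Beta(500, 500) is a made-up prior that behaves like 1,000 earlier observations at a 50% rate.

```python
def posterior_mean(alpha: float, beta: float, successes: int, failures: int) -> float:
    """Posterior mean of a Beta(alpha, beta) prior after the observed counts."""
    return (alpha + successes) / (alpha + beta + successes + failures)


clicks, misses = 18, 282

weak = posterior_mean(1, 1, clicks, misses)        # uniform prior
strong = posterior_mean(500, 500, clicks, misses)  # hypothetical prior insisting on about 50%

print(f"weak prior:   {weak:.3f}")
print(f"strong prior: {strong:.3f}")
```

With the weak prior the estimate stays close to the observed 6%. With the strong prior it is dragged to roughly 40%, because 1,000 pseudo-counts outweigh 300 real visits.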

Next steps

Repeat the test example with your own numbers.
Repeat the proportion (Beta-Binomial) example with your own data.
Later, move to Python (PyMC/ArviZ) to compute credible intervals and Bayesian A/B tests.

Summary table

Step	What to know	Example	Practical outcome
1	Population × sample	All users × last 300	Avoids wrong generalizations
2	Mean, median, proportion	18/300 = 6%	Simple measures already help
3	Variability	Same mean, different variance	Regularity improves prediction
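4	Basic rules	$P(A \cup B) = P(A) + P(B)$; $P(A \cap B) = P(A) \cdot P(B)$	Combine probabilities correctly
5	Conditional	$P(A|B) = \frac{P(A \cap B)}{P(B)}$	Avoids critical confusion
6	Bayes (idea)	$\text{posterior} \propto \text{likelihood} \times \text{prior}$	Updates belief with data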
7	Numerical example	Rare disease + test	After positive: ~16.7%
8	Bayesian proportion	$\text{Beta}(\alpha + s, \beta + f)$	Practical pseudo-count rule
9	What to report	Mean + interval	Estimate with uncertainty
10	Pitfalls	Base rate, strong prior, conditional	More reliable diagnostics & A/B

Quick references

Statistics · Probability · Conditional probability · Bayes’ theorem · Beta distribution