Confidence Intervals

Confidence Interval vs Level

If you’ve got a bunch of data and you’re trying to approximate the average value, you’re going to have to bake in some wiggle room for your prediction.

Assuming that you’ve already cleared our usual sampling conditions, we essentially want to come up with an expression of the form

$ourEstimate \pm marginOfError$

How we calculate this “marginOfError” depends on whether we’re looking at a sample mean or a sample proportion (more below), but it’s basically the product of two parts:

  • The standard error, an estimate of the standard deviation of the sampling distribution
  • Our critical value, which depends on what distribution we’re using (mean vs proportion) and our confidence level
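Spelled out in code, those two parts multiply into a margin of error. A minimal sketch for a sample mean, using a made-up handful of values:

```python
import numpy as np
from scipy import stats

# Hypothetical sample, just for illustration
X = np.array([2.1, 2.5, 1.9, 2.3, 2.7, 2.2])

std_err = X.std(ddof=1) / np.sqrt(len(X))   # standard error of the mean
t_star = stats.t.ppf(0.975, df=len(X) - 1)  # critical value at a 95% confidence level
margin_of_error = t_star * std_err

print(X.mean() - margin_of_error, X.mean() + margin_of_error)
```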

Confidence Level

The confidence level is expressed as a percentage and essentially says

In an infinite number of trials, X% of these error bands would contain the true population parameter

Confidence Interval

By extension of that, these “error bands” are your confidence interval.
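A quick simulation makes that “X% of these error bands” framing concrete: draw many samples from a known population (the parameters below are arbitrary) and check how often the 95% interval actually covers the true mean.

```python
import numpy as np
from scipy import stats

# Arbitrary "true" population: mean 50, sd 10; samples of size 100
rng = np.random.default_rng(0)
mu, sigma, n, trials = 50, 10, 100, 1000

covered = 0
for _ in range(trials):
    sample = rng.normal(mu, sigma, n)
    lo, hi = stats.t.interval(0.95, df=n - 1,
                              loc=sample.mean(), scale=stats.sem(sample))
    covered += lo <= mu <= hi

print(covered / trials)  # should land near 0.95
```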


Calculating the confidence interval for a sample proportion follows the equation

$ \hat{p} \pm Z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Arriving at a $Z^*$ value for a given confidence level has traditionally involved using a lookup table much more robust than the abbreviated one below

CL    Z*
80%   1.28
90%   1.645
95%   1.96
98%   2.33
99%   2.58
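These days you can skip the table and ask scipy directly: $Z^*$ is the $(1 + CL) / 2$ quantile of the standard normal, since the leftover $1 - CL$ probability is split evenly between the two tails.

```python
from scipy import stats

# Z* is the (1 + CL) / 2 quantile of the standard normal, because the
# leftover (1 - CL) probability splits evenly between the two tails
for cl in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{cl:.0%}  {stats.norm.ppf((1 + cl) / 2):.3f}")
```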


Similarly, calculating the confidence interval for a sample mean looks like the following

$ \bar{X} \pm t^* \frac{s_x}{\sqrt{n}}$

The lookup table for $t^*$ is a bit more complicated than its Z counterpart, as it also involves another dimension for degrees of freedom.
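Again, scipy can stand in for the table: $t^*$ comes from the t distribution’s quantile function, parameterized by degrees of freedom ($n - 1$).

```python
from scipy import stats

# t* at a 95% confidence level for a few degrees of freedom; as df grows,
# t* converges toward the Z* value of 1.96
for df in (5, 30, 1000):
    print(df, round(stats.t.ppf(0.975, df=df), 3))
```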

In Python

So how do we calculate these in Python?

import numpy as np
from scipy import stats


# 1000 values of either 0 or 1

X = np.random.randint(0, 2, (1000))
p_hat = X.mean()
std_err = np.sqrt((p_hat * (1 - p_hat)) / len(X))
print(p_hat - 1.96 * std_err,
      p_hat + 1.96 * std_err)
0.477013645945 0.538986354055

scipy has a ROBUST stats API with a ton of functionality, including calculating confidence intervals

stats.norm.interval(0.95, loc=np.mean(X), scale=stats.sem(X))
(0.47699871080518186, 0.53900128919481816)

But I found that for my purposes, I was seldom reaching for the customization they provide. So instead, I threw together stats101 as a thin wrapper with more of the arguments handled automagically.

Now we just apply the confidence level and our data

from stats101 import confidence_interval

confidence_interval.proportion(0.90, X)
(0.48198289796420185, 0.53401710203579811)
confidence_interval.proportion(0.95, X)
(0.47699871080518186, 0.53900128919481816)
confidence_interval.proportion(0.99, X)
(0.46725739973505387, 0.54874260026494615)


Works just as easily for a sample mean

X = np.random.randint(0, 100, (1000))
confidence_interval.mean(0.8, X)
(47.149856666859471, 49.460143333140529)
confidence_interval.mean(0.95, X)
(46.537387386009932, 50.072612613990067)
confidence_interval.mean(0.99, X)
(45.980336643055075, 50.629663356944924)
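For comparison, the mean intervals above can also be reproduced straight from scipy’s t distribution (the exact numbers won’t match since X is re-drawn randomly):

```python
import numpy as np
from scipy import stats

X = np.random.randint(0, 100, 1000)
lo, hi = stats.t.interval(0.95, df=len(X) - 1, loc=X.mean(), scale=stats.sem(X))
print(lo, hi)
```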