Geometric Distribution

Let’s assume we perform Bernoulli trials, i.e. independent random experiments where each experiment has the outcome 1 (success) with some fixed probability or 0 (failure) otherwise. Let \(X\) be a random variable which denotes the number of trials we need to perform until an experiment returns success for the first time. Then, the probability distribution of \(X\) is called a geometric distribution.

Let \(p\) denote the success probability of a single trial. The probability that the \(n\)th trial is the first one that returns success is given by:

\[P(X=n) = (1-p)^{n-1}p\]
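
As a quick numerical sanity check, here is a minimal sketch in plain Python (the function name geom_pmf is my own) that evaluates this formula for a few values of \(n\) and confirms that the probabilities sum to approximately 1:

```python
def geom_pmf(n: int, p: float) -> float:
    """P(X = n): probability that the first success occurs on trial n."""
    return (1 - p) ** (n - 1) * p

p = 0.3
for n in range(1, 6):
    print(f"P(X={n}) = {geom_pmf(n, p):.4f}")

# The probabilities over all n sum to 1; a truncated sum gets close.
print(sum(geom_pmf(n, p) for n in range(1, 200)))  # ~1.0
```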

The number of trials we need to perform on average until we get a success, i.e. the expected value of \(X\), is

\[E[X] = \frac{1}{p}\]
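
If SciPy is available, both formulas can be cross-checked against scipy.stats.geom, which uses the same convention as above (support \(n = 1, 2, \dots\)):

```python
from scipy.stats import geom

p = 0.3
# PMF of the first success occurring on trial n; matches (1-p)**(n-1) * p.
print(geom.pmf(3, p))   # 0.147 = 0.7**2 * 0.3
# Expected number of trials; matches 1/p.
print(geom.mean(p))     # 3.3333...
```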

Proof:

Let’s assume we perform Bernoulli trials until one of them returns success, and let \(X_1\) denote the number of trials this took. Then we repeat the whole procedure and store the number of trials needed for the next success in \(X_2\). We do this \(n\) times in total, so that \(X_i\) denotes the number of trials required to get the first success in the \(i\)th run.

The expected value of \(X\), i.e. \(E[X]\), is the number of trials we need on average to get a success. For large \(n\), it is well approximated by the sample mean of our runs:

\[E[X] \approx \frac{1}{n} \sum_{i=1}^nX_i\]
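
For a quick empirical test of this approximation, here is a small simulation sketch (plain Python; trials_until_success is a name I chose for illustration):

```python
import random

def trials_until_success(p: float) -> int:
    """Run Bernoulli trials until the first success; return the count."""
    n = 1
    while random.random() >= p:  # random() < p counts as success
        n += 1
    return n

p, runs = 0.3, 100_000
sample_mean = sum(trials_until_success(p) for _ in range(runs)) / runs
print(f"sample mean: {sample_mean:.3f}, 1/p = {1/p:.3f}")
```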

Let \(k_i\) denote how often, across our \(n\) runs, the first success occurred in the \(i\)th trial. Formally,

\[k_i = \vert\{j\ \vert\ X_j = i\}\vert\]

Then,

\[\begin{split} E[X] &\approx \frac{1}{n}\left(1k_1 + 2k_2 + 3k_3 + \dots \right) \\[2ex] &= 1\frac{k_1}{n} + 2\frac{k_2}{n} + 3\frac{k_3}{n} + \dots \end{split}\]

Now, as \(n\to \infty\), the relative frequency \(k_i/n\) converges to the probability that we see the first success in the \(i\)th trial (this is the law of large numbers), and the approximation becomes exact:

\[\begin{split} E[X] &= 1\cdot P(X=1) + 2\cdot P(X=2) + 3\cdot P(X=3) + \dots \\[2ex] &= \sum_{i=1}^\infty i \cdot P(X=i) \\[2ex] &= \sum_{i=1}^\infty i \cdot (1-p)^{i-1}p \\[2ex] &= p\sum_{i=1}^\infty i \cdot (1-p)^{i-1} \end{split}\]
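
Both steps can be observed numerically: the relative frequencies \(k_i/n\) approach the geometric probabilities, and the truncated series approaches \(1/p\). Here is a self-contained sketch (same simulation helper as above):

```python
import random
from collections import Counter

def trials_until_success(p: float) -> int:
    """Run Bernoulli trials until the first success; return the count."""
    n = 1
    while random.random() >= p:
        n += 1
    return n

p, runs = 0.3, 100_000
counts = Counter(trials_until_success(p) for _ in range(runs))  # counts[i] = k_i

for i in range(1, 6):
    print(f"i={i}: k_i/n = {counts[i] / runs:.4f}, "
          f"P(X=i) = {(1 - p) ** (i - 1) * p:.4f}")

# Truncated version of the series sum_i i * (1-p)**(i-1) * p:
approx = sum(i * (1 - p) ** (i - 1) * p for i in range(1, 500))
print(f"truncated series: {approx:.6f}, 1/p = {1/p:.6f}")
```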

With \(\frac{d}{dx}x^i = ix^{i-1}\) and the chain rule (the inner derivative of \(1-p\) with respect to \(p\) is \(-1\)), we can easily verify that \(-\frac{d}{dp}(1-p)^i = i(1-p)^{i-1}\). It follows

\[\begin{split} E[X] &= p\sum_{i=1}^\infty -\frac{d}{dp}(1-p)^i \\[2ex] &= -p\frac{d}{dp}\sum_{i=1}^\infty (1-p)^i \\[2ex] &= -p\frac{d}{dp}\left(-1 + \sum_{i=0}^\infty (1-p)^i\right) \end{split}\]

The term \(\sum_{i=0}^\infty (1-p)^i\) is a geometric series, which converges to \(\frac{1}{1-(1-p)}\) for \(0 < p \le 1\) (see here for the details). Since we are differentiating a convergent power series term by term inside its radius of convergence, swapping the derivative and the sum in the previous step is justified, and we get

\[\begin{split} E[X] &= -p\frac{d}{dp}\left(\frac{1}{1-(1-p)} -1\right) \\[2ex] &= -p\frac{d}{dp}\left(\frac{1}{p} -1\right) = -p\left(-\frac{1}{p^2}-0\right) \\[2ex] &= \frac{1}{p} \end{split}\]
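
As a closing check, the final derivative step and the identity used above can be verified symbolically, assuming SymPy is available:

```python
import sympy as sp

p = sp.symbols("p", positive=True)

# The final step of the derivation: E[X] = -p * d/dp (1/p - 1).
E_X = -p * sp.diff(1 / p - 1, p)
print(sp.simplify(E_X))  # 1/p

# The derivative identity used above: -d/dp (1-p)**i = i*(1-p)**(i-1).
i = sp.symbols("i", positive=True, integer=True)
lhs = -sp.diff((1 - p) ** i, p)
rhs = i * (1 - p) ** (i - 1)
print(sp.simplify(lhs - rhs))  # 0
```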