The different features of the Poisson distribution are well known but
the 0-truncated Poisson is perhaps less known. 0-truncated means that
there are no 0:s in the data. There can be several reasons why the 0:s
are omitted. In one factory the 0:s were considered unnecessary and thus
incorrectly omitted. In another study the number of items bought by
people in a shopping queue was recorded. Obviously, no 0-values.
There are several methods for estimating the value of the
λ-parameter. In 1912 Sir R Fisher coined the expression
Maximum Likelihood estimator (discussed below).
Below
some important expressions are listed for the two distributions.
Ordinary Poisson distribution
The probability distribution is a discrete distribution for X > 0 and contains only one parameter, λ:
\[
\begin{align}
\Large P(X=x) =\frac{\lambda^x e^{-\lambda}}{x!}
\end{align}
\]
The Likelihood function is simply the product of the probabilities, one for each of the measurements:
\[
\begin{align}
\Large L(\lambda) =\prod_{i = 1}^{n} \frac{\lambda^{x_i}
e^{-\lambda}}{x_i!}
\end{align}
\]
The mathematical problem is now to find the λ-value that maximizes the L(λ)-function. Usually, this function is rewritten by using logarithms:
\[ \begin{align} \Large l(\lambda) = -n\lambda + ln(\lambda)\sum_{i = 1}^{n} x_i-\sum_{i = 1}^{n} ln(x_i!) \end{align} \]The next mathematical step is to find the derivative of the L(λ)-function:
\[ \begin{align} \Large \frac{\partial l(\lambda)}{\partial \lambda} = -n + \frac{1}{\lambda}\sum_{i = 1}^{n} x_i \end{align} \]By setting the derivative above equal to 0, the MLE can be found. (The ‘hat’ on the λ-value indicates that it is an estimate based on the current X-values):
\[ \begin{align} \Large \hat{\lambda}=\frac{\sum_{}^{} x_i}{n} \end{align} \]
Expected value E(X) and the variance E(X) are as follows:
\[ \begin{align} \Large E(X)=\lambda \end{align} \]
\[ \begin{align} \Large V(X)=\lambda \end{align} \]
The 0-truncated Poisson distribution
When the probability of X = 0 is removed the rest of the probabilities must be adjusted (raised) otherwise it would not be a proper probability distribution (must add to 1). The followin constant is used:
\[ \begin{align} \Large constant=\frac{1{} }{1-P(X=0)}=\frac{1{} }{1-e^{-\lambda}} \end{align} \]
Expected value E(X) and the variance E(X) for the 0-truncated Poisson distribution are as follows:
\[ \begin{align} \Large E(X)=\left(\frac{1{} }{1-e^{-\lambda}}\right)\cdot\lambda \end{align} \]
\[
\begin{align}
\Large V(X)=\left(\frac{1{} }{1-e^{-\lambda}}\right)^2\cdot\lambda
\end{align}
\]
Deriving the MLE using R-commands and some graphs
1. MLE for an ordinary Poisson distribution
library("ggplot2") # Loads the ggplot2-package for graphics
n <- 40 # Number of simulated X-values.
lam <- 2.2 # The true papametwr value.
Xdata <- rpois(n, lam) # Creating the data.
MLE <- mean(Xdata) # MLE-estimate for an oprdinary Poisson distribution.
library(ggplot2)
undre <- 1 # Limits for a loop and X-axis of graph.
ovre <- 4
step <- (ovre - undre)/200 # Step size.
lamlista <- c() # To store lambda-values.
likeli <- c() # To store X-values.
raknare <- 1 # A counter.
for (lamvarde in seq(undre, ovre, by = step)) { # A loop for calculating L(lambda) for various lambda-values.
lamlista[raknare] <- lamvarde
likeli[raknare] <- prod((lamvarde^Xdata * exp(-lamvarde))/factorial(Xdata))
raknare <- raknare + 1
}
toppMax <- max(likeli) # Lowering the top of the curve.
likeli <- likeli*0.95
resultat <- data.frame(lamlista, likeli) # Creating a data frame for the plotting.
grafordPo <- ggplot(resultat, aes(x = lamlista, y = likeli)) + geom_line() + xlim(undre, ovre) +
ylab('Likelihood function') + xlab('lambda values') +
geom_vline(xintercept = MLE, linetype = "dotted", color = "grey", linewidth = 1) +
theme(axis.text.y=element_blank()) +
geom_vline(xintercept = lam, linetype = "dotted", color = "red", linewidth = 1) +
theme(axis.text.y=element_blank()) + ylim(0, toppMax) +
geom_point(x = MLE, y = 0, shape = 17, colour = "grey", size = 4.5) + geom_point(x = lam, y = 0,
shape = 17, colour = "red", size = 4.5) +
annotate("text", x = -Inf, y = Inf, label="Likelihood function ordinary Poisson", vjust = 1.8,
hjust = -0.05, size = 6, colour = "blue" ) +
annotate("text", x = -Inf, y = Inf, label="Red: true lambda", vjust = 3.8, hjust = -0.1,
size = 6, colour = "red" ) +
annotate("text", x = -Inf, y = Inf, label="Grey: MLE", vjust = 5.8, hjust = -0.15,
size = 6, colour = "grey" )
grafordPo
Comments. The curve above takes its maximum at the MLE, i.e. the ‘best’ value of the λ-value.
2. Logarithm of MLE for an ordinary Poisson distribution
undre <- 1
ovre <- 4
basicPoLn <- function(lam) {-n*lam + log(lam) * sum(Xdata) - sum(factorial(Xdata))}
graf <- ggplot() + xlim(undre, ovre) + geom_vline(xintercept = MLE, linetype = "dotted", color = "grey",
linewidth = 1) + # vertical line at MLE
geom_vline(xintercept = lam, linetype = "dotted", color = "red",
linewidth = 1) + # vertical line at true lambda.
theme(axis.text.y=element_blank()) +
annotate("text", x = -Inf, y = Inf, label="Log likelihood function ordinary Poisson", vjust = 1.8,
hjust = -0.05, size = 6, colour = "blue" ) +
annotate("text", x = -Inf, y = Inf, label="Red: true lambda", vjust = 3.8, hjust = -0.1, size = 6,
colour = "red" ) +
annotate("text", x = -Inf, y = Inf, label="Grey: MLE", vjust = 5.8, hjust = -0.15, size = 6,
colour = "grey" ) +
geom_function(fun = basicPoLn, colour = 'blue', linewidth = 1, linetype = 1) + ylab('Log likelihood function') +
xlab('lambda values') + theme(axis.text.y=element_blank())
graf
Comments. The curve above is the log of the likelihood. The reason for using the log (instead of the original expession) is that it is easier to use the tool of derivative on a log-function, see next step.
3. Derivative of logarithm of MLE for an ordinary Poisson distribution
derivatabasicPoLn <- function(lam) {-n + 1/lam * sum(Xdata) }
graf <- ggplot() + xlim(undre, ovre) + geom_hline(yintercept = 0, linetype = "dotted", color="black",
linewidth=1) +
geom_vline(xintercept = MLE, linetype = "dotted", color = "grey", linewidth = 1) +
theme(axis.text.y=element_blank()) +
geom_vline(xintercept = lam, linetype = "dotted", color = "red", linewidth = 1) +
theme(axis.text.y=element_blank()) +
annotate("text", x = -Inf, y = Inf, label="Derivative of log likelihood function ordinary Poisson",
vjust = 1.8, hjust = -0.05, size = 6, colour = "blue" ) +
annotate("text", x = -Inf, y = Inf, label="Red: true lambda", vjust = 3.8, hjust = -0.1, size = 6,
colour = "red" ) +
annotate("text", x = -Inf, y = Inf, label="Grey: MLE", vjust = 5.8, hjust = -0.15, size = 6, colour = "grey" ) +
annotate("text", x = -Inf, y = 0, label="0", vjust = -0.15, hjust = -0.2, size = 6, colour = "black" )
graf <- graf + geom_function(fun = derivatabasicPoLn, colour = 'red', linewidth = 1, linetype = 1) +
ylab('') + xlab('lambda values')
graf
MLEest <- mean(Xdata) # The MLE of a Poisson is equal to the mean of the X-data.
paste0("Mean of X-data (= MLE for an ordinary Poisson distribution = ", MLEest)
## [1] "Mean of X-data (= MLE for an ordinary Poisson distribution = 1.95"
Comments. The curve above (the derivative) crosses the 0-line at MLE. This value is also equal to the mean of the data.
The 0-truncated Poisson distribution
The probability distribution is here multiplied by a constant. The maximum of L(λ) is obtained by R-commands.
\[ \begin{align} \Large L(\lambda) =\prod_{i = 1}^{n} \frac{1{} }{1-e^{-\lambda}}\cdot\frac{\lambda^{x_i} e^{-\lambda}}{x_i!} \end{align} \]4. MLE-function for a 0-truncated Poisson distribution
n <- 50
lam <- 1.2
Xdata <- rpois(n, lam)
Xdata <- Xdata[Xdata > 0] # Removes 0-values from the data.
undre <- 0.1
ovre <- 3
lamlista <- c()
likeli <- c()
raknare <- 1
for (lamvarde in seq(undre, ovre, by = 0.01)) {
lamlista[raknare] <- lamvarde
likeli[raknare] <- prod( 1/(1-exp(-lamvarde)) * lamvarde^Xdata * exp(-lamvarde)/factorial(Xdata))
raknare <- raknare + 1
}
resultat <- data.frame(lamlista, likeli)
maxrad <- which.max(resultat$likeli)
maxlam <- resultat$lamlista[maxrad] # lam-value giving highest of 'max likelihood function'.
# ----- Finds maximum och 'Likelihood function' by optimize-function in R.
likelihood <- function(lamvarde) { prod( 1/(1-exp(-lamvarde)) * lamvarde^Xdata * exp(-lamvarde)/factorial(Xdata)) }
answer <- optimize(likelihood, interval = c(undre, ovre), maximum = TRUE)
x_max <- answer$maximum
# ----------------------
toppMax <- max(likeli) # Lowering the top of the curve.
likeli <- likeli*0.95
resultat <- data.frame(lamlista, likeli)
graftruncPo <- ggplot(resultat, aes(x = lamlista, y = likeli)) + geom_line() + xlim(undre, ovre) +
ylab('Likelihood function') + xlab('lambda values') +
geom_vline(xintercept = x_max, linetype = "dotted", color = "grey", linewidth = 1) +
theme(axis.text.y=element_blank()) +
geom_vline(xintercept = lam, linetype = "dotted", color = "red", linewidth = 1) +
theme(axis.text.y=element_blank()) + ylim(0, toppMax) +
geom_point(x = x_max, y = 0, shape = 17, colour = "grey", size=4.5) + geom_point(x = lam, y = 0,
shape = 17, colour = "red", size = 4.5) +
annotate("text", x = -Inf, y = Inf, label="Likelihood function 0-truncated Poisson", vjust = 1.8,
hjust = -0.05, size = 6, colour = "blue" ) +
annotate("text", x = -Inf, y = Inf, label="Red: true lambda", vjust = 3.8, hjust = -0.1, size = 6,
colour = "red" ) +
annotate("text", x = -Inf, y = Inf, label="Grey: MLE", vjust = 5.8, hjust = -0.15, size = 6, colour = "grey" )
graftruncPo
Comments. The maximum of the curve above is obtained by using an ‘optimize’-function of R.
Final remarks. This document tries to illuminate the idea of Maximum Likelihood as a method of estimating a parameter. This method can of course also be used for other probability distributions.