The Poisson distribution is commonly used to analyze ‘event’ data such as ‘number of customers per week’, ‘number of customer complaints per month’, ‘number of bacteria in a millilitre’, ‘number of flaws per square’, etc.
The lowest numerical value is of course 0 but there is no upper limit. The probability distribution contains only one parameter, often designated by λ. The expressions for the mean and the variance (sigma-squared) are showed below, simply as λ (The derivation is quite simple but is not shown here).
(See https://ovn.ing-stat.se [7] and [20] for more details about the Poisson distribution.)


\[ \begin{align} \large f(x) = \frac{\lambda e^{-\lambda x}}{x!} & \phantom{litet tomt ut} \large \mu = \lambda & \phantom{litet tomt ut} \large \sigma^{2} = \lambda \end{align} \]


As the mean and variance are equal in a Poisson, this can be used as a crude test of data at hand. The calculated value mean/variance should be at least equal to 1. If the value is less than 1, this means that the variance of the data is exessive, indication that the data contains some unknown sources of variation.
In this document regression analysis is used to investigate the data which consists of the results from two different machines. However, the ordinary theories of regression analysis - based on normal probability theory - is extended to incorporate the Poisson distribution using so-called ‘general linearized models’ (which in fact includes the normal distribution).


Four diffent simulated examples:


   1. Both groups having identical Po-intensities, no change over time

   2. Different Po-intensities, no change over time

   3. Different Po-intensities changing over time

   4. Different Po-intensities changing in different directions


Note that for each example a curve is created that gives the lambda values (see the first six lines of code in each example).



 1. Both groups having identical Po-intensities, no change over time

The example shows data from two groups/machines/processes ‘A’ and ‘B’, each from a Poisson distribution with the same lambda-value (here a straight line):


library(ggplot2)                               # Contains functions for graphing.
library(dplyr)
library(patchwork)                             # Different placing av graphs.

y1A <- 2.5; Xstart <- 0                        # Start point for curve of Po-intensity grp 'A'.
y2A <- 2.5; Xstopp <- 200                      # Stop point for curve of Po-intensity grp 'A'.
lamA <- -log(y2A/y1A)/Xstopp

y1B <- 2.5                                     # Start point for curve of Po-intensity grp 'B'.
y2B <- 2.5                                     # Stop point for curve of Po-intensity grp 'B'.
lamB <- -log(y2B/y1B)/Xstopp

Xaxel <- c(0:(Xstopp-1))                       # X-axis.

lambdaChangeGrpA <- y1A * exp(-lamA*Xaxel)     # Column of intensity values, grp A.
lambdaChangeGrpB <- y1B * exp(-lamB*Xaxel)     # Column of intensity values, grp B.

dataA   <- rpois(Xstopp, lambdaChangeGrpA)     # Simulated Po-data grp A.
dataB   <- rpois(Xstopp, lambdaChangeGrpB)     # Simulated Po-data grp B.
allData <- c(dataA, dataB)                     # All data in one column.

# ------- Start graphs Example 1

result <- data.frame(allData) %>% group_by(allData) %>% summarise(number = n())

hist <- ggplot(result, aes(x = allData, y = number, fill = 'red')) +
        geom_bar(stat = 'identity', width = 0.4, colour = 'black') +
        scale_x_continuous(breaks = allData) +
        labs(x = "Simulated Poisson data, Group A and B", y = "") +
        theme(legend.position="none")

average  <- mean(allData)                  # Average and variance of simulated data.
variance <- var(allData)

func1 <- function(Xaxel) {y1A * exp(-lamA*Xaxel)}   # Theoretical intensity curves.
func2 <- function(Xaxel) {y1B * exp(-lamB*Xaxel)}

dia <- ggplot() + xlim(0, Xstopp) +                 # Diagram of the two intensity curves.
   geom_function(fun = func1, colour='red', linewidth=1, linetype=2) +
   geom_function(fun = func2, colour='black', linewidth=1, linetype=2) +
   annotate("text", x = mean(Xaxel), y = Inf, label="Intensity curves\n(red: grp A, black: grp B)\n(Example 1)",
            vjust = 1.5, size = 5, colour = "blue") +
   ylab("Po-intensity values") + xlab("")

 dia + hist                                 # Side by side.

Example 1. The left diagram above shows the two lambda values (here identical) för the two groups A and B. The histogram shows all data, i.e. both the A- and the B-data.
Below the very analysis is performed using the ‘glm()’-function (glm stands for ‘General linearized models’, the ‘family’-option indicates tha Poisson-data is analyzed):


Ttot    <- c(Xaxel, Xaxel)
allData <- c(dataA, dataB)

process <- c(rep("A", length(Xaxel)), rep("B", length(Xaxel)))

analysAB <- glm(allData ~ process + Ttot, family=poisson)
summary(analysAB)
## 
## Call:
## glm(formula = allData ~ process + Ttot, family = poisson)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  9.401e-01  7.035e-02  13.362   <2e-16 ***
## processB    -4.643e-02  6.355e-02  -0.731    0.465    
## Ttot        -9.945e-05  5.502e-04  -0.181    0.857    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 499.23  on 399  degrees of freedom
## Residual deviance: 498.67  on 397  degrees of freedom
## AIC: 1500.1
## 
## Number of Fisher Scoring iterations: 5

Exampel 1. The analysis above shows a common printout for regression analyses. The ‘Estmate’-column contains the estimated parameters of the linear model. As the simulated data contains no difference between the two groups and no trend in the lambda parameter the analysis is expected. (Note that the parameter of ‘processB’ is shown as a comparison to ‘processA’.)

The model of lambda. Suppose that the three ‘Estimate’ are close to 0.92, 0, and 0 (varies between runs):

\[ \begin{align} \Large f(x) = e^{(0.92+0\cdot processB + 0\cdot Ttot)}= e^{0.92}=2.5 \phantom{invisible}\text{Note:} & \phantom{litet tomt ut} \large e^{0}=1 \end{align} \]


The result corresponds to the simulated value in ‘Example 1’.


Below the two estimated curves and all data is shown:


b0 <- analysAB$coef[1]
b1 <- analysAB$coef[2]
b2 <- analysAB$coef[3]

funcA <- function(Xaxel) {exp(b0 +  0 + b2 * Xaxel)}
funcB <- function(Xaxel) {exp(b0 + b1 + b2 * Xaxel)}

farg<- recode(process, `A` = 2, `B` = 1)

graf <- ggplot() +
  geom_point(aes(c(Xaxel, Xaxel), allData), size=2, col=farg) +
  geom_function(fun = funcA, colour='red', linewidth=1) +
  geom_function(fun = funcB, colour='black', linewidth=1, linetype=2) +
  ylab("Po-data") + xlab("") +
  annotate("text", x = mean(Xaxel), y = Inf, 
           label="Po-data and estimated intensity curves\n(red: grp A, black: grp B, Example 1)",
           vjust = 1.1, size = 5, colour = "blue")
graf

Example 1. The diagram above contains all data and the two estimated curves. These curves were simulated as two identical lines. Any difference between them as well as any slope is caused by the randomness in the data.






 2. Different Po-intensities, no change over time

The example shows again data from the two groups ‘A’ and ‘B’, but here each group has its own constant parameter value:


library(ggplot2)
library(dplyr)
library(patchwork)

y1A <- 1.5; Xstart <- 0                        # Start point for curve of Po-intensity grp 'A'.
y2A <- 1.5; Xstopp <- 200                      # Stop point for curve of Po-intensity grp 'A'.
lamA <- -log(y2A/y1A)/Xstopp

y1B <- 2.5                                     # Start point for curve of Po-intensity grp 'B'.
y2B <- 2.5                                     # Stop point for curve of Po-intensity grp 'B'.
lamB <- -log(y2B/y1B)/Xstopp

Xaxel <- c(0:(Xstopp-1))                       # X-axis.

lambdaChangeGrpA <- y1A * exp(-lamA*Xaxel)     # Column of intensity values, grp A.
lambdaChangeGrpB <- y1B * exp(-lamB*Xaxel)     # Column of intensity values, grp B.

dataA   <- rpois(Xstopp, lambdaChangeGrpA)     # Simulated Po-data grp A.
dataB   <- rpois(Xstopp, lambdaChangeGrpB)     # Simulated Po-data grp B.
allData <- c(dataA, dataB)                     # All data in one column.

# ------- Start graphs Example 2

result <- data.frame(allData) %>% group_by(allData) %>% summarise(number = n())

hist <- ggplot(result, aes(x = allData, y = number, fill = 'red')) +
  geom_bar(stat = 'identity', width = 0.4, colour = 'black') +
  scale_x_continuous(breaks = allData) +
  labs(x = "Simulated Posson data, Group A and B", y = "") +
  theme(legend.position="none")

average  <- mean(allData)                           # Average and variance of simulated data.
variance <- var(allData)

func1 <- function(Xaxel) {y1A * exp(-lamA*Xaxel)}   # Theoretical intensity curves.
func2 <- function(Xaxel) {y1B * exp(-lamB*Xaxel)}

dia <- ggplot() + xlim(0, Xstopp) +                 # Diagram of the two intensity curves.
  geom_function(fun = func1, colour='red', linewidth=1, linetype=2) +
  geom_function(fun = func2, colour='black', linewidth=1, linetype=2) +
  annotate("text", x = mean(Xaxel), y = Inf, label="Intensity curves\n(red: grp A, black: grp B)\n(Example 2)",
           vjust = 1.5, size = 5, colour = "blue") +
  ylab("Po-intensity values") + xlab("")

dia + hist                                 # Side by side.

Example 2. The left diagram above shows the two lambda values for the two groups A and B. The histogram shows the mixture of all data.
Below is the analysis:


Ttot    <- c(Xaxel, Xaxel)

process <- c(rep("A", length(Xaxel)), rep("B", length(Xaxel)))

analysAB <- glm(allData ~ process + Ttot, family=poisson)
summary(analysAB)
## 
## Call:
## glm(formula = allData ~ process + Ttot, family = poisson)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) 0.2957950  0.0854953   3.460 0.000541 ***
## processB    0.5187938  0.0729209   7.114 1.12e-12 ***
## Ttot        0.0010826  0.0006116   1.770 0.076698 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 505.92  on 399  degrees of freedom
## Residual deviance: 450.46  on 397  degrees of freedom
## AIC: 1350.8
## 
## Number of Fisher Scoring iterations: 5

Example 2. The analysis above shows that the term for ‘process’ is significant (< 0.05), but the term ‘Ttot’ is not, as exepcted (see the column far right).

The model of lambda. Using the three ‘Estimate’ from a simulation (rounded to two decimals), the following can be calculate (note that the software codes ‘A’ as 0 and ‘B’ as 1 unless otherwise is stated):


\[ \begin{align} & \large exp(0.40+0.57\cdot processA + 0\cdot Ttot)=1.49 \phantom{invisible} \Large (process A = 0) \\ \\ & \large exp(0.40+0.57\cdot processB + 0\cdot Ttot)=2.64 \phantom{invisible} \Large (process B = 1) \end{align} \]


The result corresponds to the simulated value in ‘Example 2’.


Below the two estimated curves and all data are shown:


b0 <- analysAB$coef[1]
b1 <- analysAB$coef[2]
b2 <- analysAB$coef[3]

funcA <- function(Xaxel) {exp(b0 +  0 + b2 * Xaxel)}
funcB <- function(Xaxel) {exp(b0 + b1 + b2 * Xaxel)}

farg<- recode(process, `A` = 2, `B` = 1)

graf <- ggplot() +
  geom_point(aes(c(Xaxel, Xaxel), allData), size=2, col=farg) +
  geom_function(fun = funcA, colour='red', linewidth=1) +
  geom_function(fun = funcB, colour='black', linewidth=1, linetype=2) +
  ylab("Po-data") + xlab("") +
  annotate("text", x = mean(Xaxel), y = Inf, 
           label="Po-data and estimated intensity curves\n(red: grp A, black: grp B, Example 2)",
           vjust = 1.1, size = 5, colour = "blue")
graf

Example 2. The diagram above contains all data and the two estimated curves. These curves were simulated as two different horizontal lines and any slope is caused by the randomness in the data. Note that it is impossible to spot the diffence between A and B by just using the diagram. A more or less formal analysis is needed.





 3. Different Po-intensities changing over time

The example shows again data from the two groups ‘A’ and ‘B’, but here the lambda-parameter changes over time for each group:


library(ggplot2)
library(dplyr)
library(patchwork)

y1A <- 1.5; Xstart <- 0                         # Start point for curve of Po-intensity grp 'A'.
y2A <- 2.5; Xstopp <- 200                       # Stop point for curve of Po-intensity grp 'A'.
lamA <- -log(y2A/y1A)/Xstopp

y1B <- 2.5                                      # Start point for curve of Po-intensity grp 'B'.
y2B <- 3.5                                      # Stop point for curve of Po-intensity grp 'B'.
lamB <- -log(y2B/y1B)/Xstopp

Xaxel <- c(0:(Xstopp-1))                        # X-axis.

lambdaChangeGrpA <- y1A * exp(-lamA*Xaxel)      # Column of intensity values, grp A.
lambdaChangeGrpB <- y1B * exp(-lamB*Xaxel)      # Column of intensity values, grp B.

dataA   <- rpois(Xstopp, lambdaChangeGrpA)      # Simulated Po-data grp A.
dataB   <- rpois(Xstopp, lambdaChangeGrpB)      # Simulated Po-data grp B.
allData <- c(dataA, dataB)                      # All data in one column.

# ------- Start graphs Example 3

result <- data.frame(allData) %>% group_by(allData) %>% summarise(number = n())

hist <- ggplot(result, aes(x = allData, y = number, fill = 'red')) +
  geom_bar(stat = 'identity', width = 0.4, colour = 'black') +
  scale_x_continuous(breaks = allData) +
  labs(x = "Simulated Posson data, Group A and B", y = "") +
  theme(legend.position="none")

average  <- mean(allData)                  # Average and variance of simulated data.
variance <- var(allData)

func1 <- function(Xaxel) {y1A * exp(-lamA*Xaxel)}   # Theoretical intensity curves.
func2 <- function(Xaxel) {y1B * exp(-lamB*Xaxel)}

dia <- ggplot() + xlim(0, Xstopp) +                 # Diagram of the two intensity curves.
  geom_function(fun = func1, colour='red', linewidth=1, linetype=2) +
  geom_function(fun = func2, colour='black', linewidth=1, linetype=2) +
  annotate("text", x = mean(Xaxel), y = Inf, label="Intensity curves\n(red: grp A, black: grp B\nExample 3)",
           vjust = 1.5, size = 5, colour = "blue") +
  ylab("Po-intensity values") + xlab("")

dia + hist  

Example 3. The left diagram above shows the lambda functions for the two groups A and B. The histogram shows the mixture of all data.
Below is the analysis:


Ttot    <- c(Xaxel, Xaxel)

process <- c(rep("A", length(Xaxel)), rep("B", length(Xaxel)))

analysAB <- glm(allData ~ process + Ttot, family=poisson)
summary(analysAB)
## 
## Call:
## glm(formula = allData ~ process + Ttot, family = poisson)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) 0.5275797  0.0759887   6.943 3.84e-12 ***
## processB    0.4690441  0.0644790   7.274 3.48e-13 ***
## Ttot        0.0014024  0.0005445   2.576     0.01 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 490.62  on 399  degrees of freedom
## Residual deviance: 429.60  on 397  degrees of freedom
## AIC: 1448.5
## 
## Number of Fisher Scoring iterations: 5

Example 3. The analysis above shows that terms for ‘process’ and ‘Ttot’ are significant (< 0.05), and should be included in the model.

The model of lambda. Using the three ‘Estimate’ from a simulation (rounded to three decimals), the following functions can be calculate (note that the software codes ‘A’ as 0 and ‘B’ as 1 unless otherwise is stated):


\[ \begin{align} & \lambda =\large exp(0.532+0.372\cdot processA + 0.002\cdot Ttot) \phantom{invi} \large (processA = 0)\\ \\ & \lambda =\large exp(0.532+0.372\cdot processB + 0.002\cdot Ttot) \phantom{invi} \large (processB = 1) \end{align} \]


The result corresponds to the two functions simulated in ‘Example 3’.


Below the two estimated curves and all data is shown:


b0 <- analysAB$coef[1]
b1 <- analysAB$coef[2]
b2 <- analysAB$coef[3]

funcA <- function(Xaxel) {exp(b0 +  0 + b2 * Xaxel)}
funcB <- function(Xaxel) {exp(b0 + b1 + b2 * Xaxel)}

farg<- recode(process, `A` = 2, `B` = 1)

graf <- ggplot() +
  geom_point(aes(c(Xaxel, Xaxel), allData), size=2, col=farg) +
  geom_function(fun = funcA, colour='red', linewidth=1) +
  geom_function(fun = funcB, colour='black', linewidth=1, linetype=2) +
  ylab("Po-data") + xlab("") +
  annotate("text", x = mean(Xaxel), y = Inf, 
           label="Po-data and estimated intensity curves\n(red: grp A, black: grp B, Example 3)",
           vjust = 1.1, size = 5, colour = "blue")
graf

Example 3. The diagram above contains all data and the two estimated curves. These curves were simulated as two different lines. Note that it is impossible to spot the diffence between A and B by just using the diagram. A more or less formal analysis is needed.





 4. Different Po-intensities changing in different dirctions

The example shows again data from the two groups ‘A’ and ‘B’, but here the lambda-parameter changes over time but in different directions (increasing or decreasing lambda).
This is called an interaction between variables and can be described with a question and its answer: ‘How does the lambda intensity change over time? It depends on which group is considered’. This dependence is modelled with an inteaction term in the model (‘cross product’).


library(ggplot2)
library(dplyr)
library(patchwork)

y1A <- 1.5; Xstart <- 0                         # Start point for curve of Po-intensity grp 'A'.
y2A <- 2.5; Xstopp <- 200                       # Stop point for curve of Po-intensity grp 'A'.
lamA <- -log(y2A/y1A)/Xstopp

y1B <- 2.5                                      # Start point for curve of Po-intensity grp 'B'.
y2B <- 1.5                                      # Stop point for curve of Po-intensity grp 'B'.
lamB <- -log(y2B/y1B)/Xstopp

Xaxel <- c(0:(Xstopp-1))                        # X-axis.

lambdaChangeGrpA <- y1A * exp(-lamA*Xaxel)      # Column of intensity values, grp A.
lambdaChangeGrpB <- y1B * exp(-lamB*Xaxel)      # Column of intensity values, grp B.

dataA   <- rpois(Xstopp, lambdaChangeGrpA)      # Simulated Po-data grp A.
dataB   <- rpois(Xstopp, lambdaChangeGrpB)      # Simulated Po-data grp B.
allData <- c(dataA, dataB)                      # All data in one column.

# ------- Start graphs Example 4

result <- data.frame(allData) %>% group_by(allData) %>% summarise(number = n())

hist <- ggplot(result, aes(x = allData, y = number, fill = 'red')) +
  geom_bar(stat = 'identity', width = 0.4, colour = 'black') +
  scale_x_continuous(breaks = allData) +
  labs(x = "Simulated Posson data, Group A and B", y = "") +
  theme(legend.position="none")

average  <- mean(allData)                  # Average and variance of simulated data.
variance <- var(allData)

func1 <- function(Xaxel) {y1A * exp(-lamA*Xaxel)}   # Theoretical intensity curves.
func2 <- function(Xaxel) {y1B * exp(-lamB*Xaxel)}

dia <- ggplot() + xlim(0, Xstopp) +                 # Diagram of the two intensity curves.
  geom_function(fun = func1, colour='red', linewidth=1, linetype=2) +
  geom_function(fun = func2, colour='black', linewidth=1, linetype=2) +
  annotate("text", x = mean(Xaxel), y = Inf, label="Intensity curves\n(red: grp A, black: grp B\nExample 4)",
           vjust = 1.5, size = 5, colour = "blue") +
  ylab("Po-intensity values") + xlab("")

dia + hist                   

Example 4. The left diagram above shows the lambda functions for the two groups A and B. The histogram shows the mixture of all data.
Below is the analysis:


Ttot    <- c(Xaxel, Xaxel)

process <- c(rep("A", length(Xaxel)), rep("B", length(Xaxel)))

analysAB <- glm(allData ~ process + Ttot + process*Ttot, family=poisson)
summary(analysAB)
## 
## Call:
## glm(formula = allData ~ process + Ttot + process * Ttot, family = poisson)
## 
## Coefficients:
##                 Estimate Std. Error z value Pr(>|z|)    
## (Intercept)    0.5538858  0.1047863   5.286 1.25e-07 ***
## processB       0.4728098  0.1384168   3.416 0.000636 ***
## Ttot           0.0007679  0.0008938   0.859 0.390265    
## processB:Ttot -0.0037020  0.0012330  -3.002 0.002678 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 494.51  on 399  degrees of freedom
## Residual deviance: 478.97  on 396  degrees of freedom
## AIC: 1381.1
## 
## Number of Fisher Scoring iterations: 5

Example 4. The analysis above shows that all terms in the model are significant and should be included in the model. The interaction term is here designated as ‘processB:Ttot’

The model of lambda. Using the four ‘Estimate’ from a simulation (rounded to three decimals), the following functions can be calculate (note that the software codes ‘A’ as 0 and ‘B’ as 1 unless otherwise is stated):


\[ \begin{align} & \lambda =\large exp(0.288+0.587\cdot processA + 0.004\cdot Ttot-0.005\cdot processA \cdot Ttot) \phantom{inve} \large (processA = 0)\\ \\ & \lambda =\large exp(0.288+0.587\cdot processB + 0.004\cdot Ttot-0.005\cdot processA \cdot Ttot) \phantom{inve} \large (processB = 1) \end{align} \]


b0 <- analysAB$coef[1]
b1 <- analysAB$coef[2]
b2 <- analysAB$coef[3]
b3 <- analysAB$coef[4]

farg<- factor(recode(process, `A` = 2, `B` = 1))

funcA <- function(Xaxel) {exp(b0 + b1*0 + b2*Xaxel + b3*Xaxel*0)}
funcB <- function(Xaxel) {exp(b0 + b1*1 + b2*Xaxel + b3*Xaxel*1)}

graf <- ggplot() +
  geom_point(aes(c(Xaxel, Xaxel), allData), size=2, col=farg) +
  geom_function(fun = funcA, colour='red', linewidth=1) +
  geom_function(fun = funcB, colour='black', linewidth=1, linetype=2) +
  ylab("Po-data") + xlab("") +
  annotate("text", x = mean(Xaxel), y = Inf, 
           label="Po-data and estimated intensity curves\n(red: grp A, black: grp B, Example 4)",
           vjust = 1.1, size = 5, colour = "blue")
graf

Example 4. The diagram above contains all data and the two estimated curves with very different slopes. Note that it is impossible to spot the diffence between A and B by just using the diagram. A more or less formal analysis is needed.


Final remarks
The results in the examples can perhaps be positive or negative. One process owner perhaps likes a decreasing λ (decrease in number of customer complaints?). Another process owner perhaps likes an increasing λ-trend (increased number of customers?), etc.
The examples above are by no means exhausting the possibilities and perhaps necessary diagnostics. But here the simplicity has been emphazised.
There are many books about the subject of ‘Po-regression’. Here the book ‘Negative binomial Regression’ (2nd ed, Joseph M. Hilbe) has been used, especially chapter 7.


(The analyses have been performed by the ‘R’-software and with the graphical interface ‘R’-studio. Both these ara available on the net. See https://www.indstat.se and the button [Statistikprogram - R] for installation.)

(See https://www.indstat.se for many more simulations and animations.)