
Life length analysis – some exercises with Weibull



The top four inputs contain values that control the simulation. They are also recorded in the commands used together with the Minitab-macro.

**a-parameter (shape):** This parameter controls the shape of the distribution. Values around 3.6 give an approximately
symmetrical distribution.

**b-parameter (scale):** This parameter controls how far the distribution stretches along the (positive) X-axis.

**Number of values:** The number of simulated values, shown below the distribution. Values to the right of
the censoring value (vertical red line) are grey. These values are treated as unknown and are replaced by the censoring value in the analysis.

**Censoring time:** The time on the X-axis at which an item still on test is stopped; this time is then recorded as its 'life time'.

**Expected value:** The theoretical mean, see the formulas below.

**Standard deviation:** The theoretical standard deviation, see the formulas below.

**"%Life 'fig 1'...":** An example of the 'fig 1' graph created by the Minitab macro.

**"%Life 'fig 2'...":** An example of the 'fig 2' graph created by the Minitab macro.

Calculation of mu (*μ*) and sigma (*σ*):

The following two expressions are valid for all distributions:

$$\mu =E\left(X\right)$$
$${\sigma}^{2}=E\left[{\left(X-\mu \right)}^{2}\right]$$

For the Weibull distribution the following is valid:
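
The page's own rendering of these expressions is not reproduced in this text. As a hedged reconstruction, assuming the standard two-parameter Weibull with shape *a* and scale *b* (the parameterization that matches the hazard function $h(x)=\frac{a}{b^{a}}\cdot x^{a-1}$ given in the hazard-function info), the mean and variance are as follows, where $\Gamma$ is the gamma function:

$$\mu =b\cdot \Gamma \left(1+\frac{1}{a}\right)$$
$${\sigma}^{2}={b}^{2}\left[\Gamma \left(1+\frac{2}{a}\right)-{\Gamma \left(1+\frac{1}{a}\right)}^{2}\right]$$

For *a* = 1 both expressions reduce to $\mu =\sigma =b$, since $\Gamma(2)=1$ and $\Gamma(3)=2$.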

••••

**The distribution** shown is a Weibull
distribution based on the values of the *a*- and *b*-parameters.
A Weibull distribution starts at 0 on the X-axis and stretches towards positive infinity. It can be negatively skewed,
symmetrical or positively skewed, depending on the value of the *a*-parameter.

**The expected value** is indicated by a short vertical, red line on the X-axis. The standard deviation is shown as vertical
grey dotted lines. The censor time is indicated by a vertical red line across the distribution.

Every time a parameter is changed, the distribution is redrawn and a new sample is simulated.
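
The simulation step can be imitated outside the page. The following is a minimal Python sketch, not the page's own code: it assumes the standard Weibull distribution function F(x) = 1 - exp(-(x/b)^a) and draws values by inverting it. The function names (`weibull_mean`, `weibull_sd`, `simulate_weibull`) and the seed are hypothetical, chosen only for illustration; the parameter values 1.3 and 4.1 are the ones mentioned in the '%Life (fig 2)' info.

```python
import math
import random


def weibull_mean(a, b):
    """Theoretical mean of a Weibull with shape a and scale b."""
    return b * math.gamma(1 + 1 / a)


def weibull_sd(a, b):
    """Theoretical standard deviation of a Weibull with shape a and scale b."""
    g1 = math.gamma(1 + 1 / a)
    g2 = math.gamma(1 + 2 / a)
    return b * math.sqrt(g2 - g1 ** 2)


def simulate_weibull(a, b, n, seed=None):
    """Draw n values by inverting F(x) = 1 - exp(-(x/b)**a) (assumed form)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [b * (-math.log(1.0 - rng.random())) ** (1 / a) for _ in range(n)]


if __name__ == "__main__":
    a, b, n = 1.3, 4.1, 5000
    sample = simulate_weibull(a, b, n, seed=1)
    average = sum(sample) / n
    print(f"theoretical mean {weibull_mean(a, b):.3f}, sample average {average:.3f}")
    print(f"theoretical std dev {weibull_sd(a, b):.3f}")
```

With a large number of values the sample average should land close to the theoretical mean, which mirrors the comparison between 'Average' and 'Expected value' on the page.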

••••

Copy and paste the two rows into the 'Session window' in Minitab at the "MTB >" prompt and press the [Enter] key. This launches the Minitab macro '%LifeWeb'.

The first row contains the call of the macro. The second row contains:

- the *a*-parameter

- the *b*-parameter

- number of values

- the censor value

- an indication of 'Weibull'/'Exponential' (0/1; this value is controlled by the *a*-parameter)

Any change of an input value will be stored in the command area.

••••

**Exercise 1** – a first run

Click the different 'info' buttons to get an idea of this program. (The Minitab macro involved ("%HistWeb") can be downloaded from the front page
(ovn.ing-stat.se) via the button at the upper right.) These exercises have points in common with the pages http://ovn.ing-stat.se/multfx/flerafx5.php and
http://ovn.ing-stat.se/fordelningar/WeibSlid1.php. Examples of the graphs created can be found under "%Life (fig 1)" and "%Life (fig 2)".

**Exercise 2** – an exponential distribution

The exponential distribution (a special case of both the Weibull distribution and the gamma distribution) is of utmost importance in many statistical
analyses. Here the *a*-parameter = 1, and the *b*-parameter is then both the expected value and the standard deviation. The exponential distribution is commonly described as
having 'lack of memory', 'no aging' or similar. This refers to its hazard function being a straight horizontal line, which means that the hazard
neither increases nor decreases over time (thus 'no aging' or 'no memory', as it does not remember where it is on the X-axis).

**Exercise 3** – a Rayleigh distribution

Here the *a*-parameter = 2, which gives a hazard function that is a straight, increasing line. The Rayleigh distribution is common in telecommunication. It
is also a model for radial distances when the X- and Y-deviations are normally distributed.
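
The radial-distance statement can be checked with a small simulation. This is a sketch under stated assumptions that the page does not spell out: the X- and Y-deviations are taken to be independent, with mean 0 and the same standard deviation `sigma`. Under those assumptions the radial distance follows a Weibull distribution with *a* = 2 and *b* = `sigma`·√2. The function names and the seed are hypothetical.

```python
import math
import random


def radial_distances(sigma, n, seed=None):
    """Distances from the origin for n points with independent N(0, sigma) coordinates."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(n)]


def weibull_mean(a, b):
    """Theoretical mean of a Weibull with shape a and scale b."""
    return b * math.gamma(1 + 1 / a)


if __name__ == "__main__":
    sigma = 1.0
    r = radial_distances(sigma, 20000, seed=2)
    b = sigma * math.sqrt(2)  # scale of the matching Weibull with a = 2 (assumption above)
    print(f"average radial distance: {sum(r) / len(r):.3f}")
    print(f"Weibull mean (a = 2):    {weibull_mean(2, b):.3f}")
```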

**Exercise 4** – a decreasing hazard function

When the *a*-parameter < 1 the hazard function is decreasing. This means that the quality of the product or process improves over time, perhaps because faults and flaws in a design
are removed. This is typical of computer programs: although they are tested before release, many faults may appear when the design is launched and used
by many people. (Perhaps there will be an *increase* in the hazard function much later when e.g. the platform is redesigned, supporting software is changed, etc.)

••••

**The concept** of *Life length analysis* contains all the ideas and theories connected to
any statistical analysis – data, distributions, hypotheses, analysis, inference, graphs, report, etc.

However, life length analysis contains a few extra topics such as *censoring of data*.
This means e.g. that an item under test is run up to a certain time, at which the test is aborted even if the item has not failed.

The information lost by censoring is usually small compared to the practical gain in e.g. planning the whole test, use of resources, etc.

**The analysis** also involves some extra features such as *hazard function*, *hazard graph*,
*survival function*, *survival graph*, etc. These graphs are shown in the Minitab analysis of the input.

**Regression analysis** is a powerful statistical tool that also can be used to understand relationships between variables
and the results.

(There is a rich literature about *life length analysis*, *survival analysis*, *reliability analysis*,
and other connected areas.)

**%Life (fig 1)...** The three diagrams illustrate the concept of censoring. The histogram of the measured result (C)
has a tall bar to the right because of the number of censored values.

**%Life (fig 2)...** The upper right probability plot estimates the Weibull distribution, and the other three diagrams
are drawn from this estimate. The estimated *a*- and *b*-parameters (1.32 and 4.7) are fairly close to the true values (1.3 and 4.1).

••••

**The hazard function** models how the 'death rate' changes over time. (Similar rates are 'customers/day',
'phone calls/hour', 'accidents/week', etc.) Note that the hazard function is not a probability. It is defined by the expression below, and its derivation is
fairly easy to follow with the help of a good textbook:

$$h\left(x\right)=\frac{f\left(x\right)}{1-F\left(x\right)}$$

**The shape** and appearance of the hazard function must of course agree with engineering knowledge. If e.g. an analysis indicates that
the hazard rate is *decreasing* over time (i.e. the product becomes better over time) while the engineers think that the
product ages and its quality deteriorates, the data and assumptions must be reanalysed.
For a Weibull distribution the general formula above simplifies to the following:

$$h\left(x\right)=\frac{a}{{b}^{a}}\cdot {x}^{a-1}$$
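
The simplification can be followed step by step. The derivation below is a sketch that assumes the standard two-parameter Weibull distribution function; this form is not written out on the page, but it leads to exactly the hazard function above:

$$F\left(x\right)=1-{e}^{-{\left(x/b\right)}^{a}}$$
$$f\left(x\right)={F}^{\prime}\left(x\right)=\frac{a}{b}\cdot {\left(\frac{x}{b}\right)}^{a-1}\cdot {e}^{-{\left(x/b\right)}^{a}}$$
$$h\left(x\right)=\frac{f\left(x\right)}{1-F\left(x\right)}=\frac{a}{b}\cdot {\left(\frac{x}{b}\right)}^{a-1}=\frac{a}{{b}^{a}}\cdot {x}^{a-1}$$

The exponential factor appears in both the numerator and the denominator and cancels, which is why the Weibull hazard function is a simple power of *x*.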

**Different parameter settings** (*a* and *b*) give rather different hazard rates. For example, if *a* = 2 the
hazard function becomes a straight increasing line (when *a* = 2 the distribution is sometimes called the
*Rayleigh distribution*):

$$h\left(x\right)=\frac{2}{{b}^{2}}\cdot x$$

**If a = 1,** the hazard function becomes a straight horizontal line (when *a* = 1 the distribution is the
*exponential distribution*):

$$h\left(x\right)=\frac{1}{b}$$
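
The three cases (decreasing, constant and increasing hazard) can be compared numerically. The following is a minimal Python sketch of the Weibull hazard formula given above; the function name `weibull_hazard`, the scale 4.1 and the chosen x-values are assumptions made only for illustration.

```python
def weibull_hazard(x, a, b):
    """Weibull hazard h(x) = (a / b**a) * x**(a - 1) for x > 0."""
    if x <= 0 or a <= 0 or b <= 0:
        raise ValueError("x, a and b must all be positive")
    return (a / b ** a) * x ** (a - 1)


if __name__ == "__main__":
    b = 4.1
    times = (1.0, 2.0, 4.0)
    # a < 1: decreasing hazard, a = 1: constant (exponential), a = 2: linear (Rayleigh)
    for a in (0.5, 1.0, 2.0):
        hazards = [weibull_hazard(x, a, b) for x in times]
        print(f"a = {a}: " + ", ".join(f"h({x}) = {h:.4f}" for x, h in zip(times, hazards)))
```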

**The 'bathtub'-curve** is a hazard function that is composed of a decreasing, a constant and an increasing hazard function. It
is typically used for larger products or systems that pass through these stages during their life length. (This gives the curve the form
of a bathtub.)

**A final remark.** In certain situations the original time measurements might suggest two possible distributions, either
a *lognormal distribution* or some other distribution (run e.g. http://ovn.ing-stat.se/multfx/flerafx5.php). However, the hazard
function of a lognormal distribution first increases and then decreases, and it can be difficult to justify this type of behaviour of the
hazard function.

••••

The values under '**All simulated data**' are based on *all* the simulated data,
as if there were no censoring.

**Average:** The average is calculated in the ordinary way, i.e. sum all data and divide by the number of values.
This result should be at least approximately equal to the *Expected value* shown to the left. (This is calculated using the *a*- and *b*-parameters.
The formulas are shown in the 'info' of the parameter frame to the left.)

**Std dev:** The standard deviation is calculated in the ordinary way (see the literature for details). This result should be at least
approximately equal to the *Standard deviation* shown to the left. (The formulas are shown in the 'info' of the parameter frame to the left.)

**Min:** Shows the minimum value. (The theoretical minimum of a Weibull distribution is 0.)

**Max:** Shows the maximum value. (The theoretical maximum of a Weibull distribution is positive infinity.)

The values under '**Censored data**' are based on the censored data, i.e.
every simulated value greater than the censoring value is replaced by the censoring value.

When this is done, the average and standard deviation no longer represent the true values of the process. As seen in the table,
these values are smaller than the corresponding values for 'All simulated data'.

(Of course, these differences become smaller when the censoring value is set to higher and higher values.)
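
The effect of replacing censored values by the censoring value can be reproduced with a few lines of code. This is a minimal Python sketch, not the page's own implementation: it assumes the standard Weibull distribution function for the simulation, and the function names, the seed and the three censoring values are hypothetical.

```python
import math
import random
import statistics


def simulate_weibull(a, b, n, seed=None):
    """Draw n values by inverting F(x) = 1 - exp(-(x/b)**a) (assumed form)."""
    rng = random.Random(seed)
    return [b * (-math.log(1.0 - rng.random())) ** (1 / a) for _ in range(n)]


def censor(values, limit):
    """Replace every value greater than the censoring value by the censoring value."""
    return [min(v, limit) for v in values]


if __name__ == "__main__":
    data = simulate_weibull(1.3, 4.1, 1000, seed=3)
    print(f"all data: average {statistics.mean(data):.3f}, "
          f"std dev {statistics.stdev(data):.3f}")
    # Higher censoring values leave more of the data untouched.
    for limit in (3.0, 6.0, 12.0):
        censored = censor(data, limit)
        print(f"censored at {limit}: average {statistics.mean(censored):.3f}, "
              f"std dev {statistics.stdev(censored):.3f}")
```

Both the average and the standard deviation of the censored data stay below the values for all data, and the gap narrows as the censoring value grows.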

The mistake above, i.e. treating censored values as if they were true measurements, is unfortunately very common when analysing any type of time measurement, e.g. waiting times.

••••