Hosmer-Lemeshow goodness of fit test

A Stata program that implements the Hosmer-Lemeshow goodness of fit test, including for predicted probabilities supplied from an external source

By Gareth Ambler

The Hosmer-Lemeshow goodness of fit test can be used to test whether observed binary responses, Y, conditional on a vector of p covariates (risk factors and confounding variables) x, are consistent with predictions, π. In other words, it is a test of the hypothesis

H0: Pr(Y=1|x) = π

The predictions, π, often come from a recently fitted logistic regression model, so that:

logit(π) = β0 + β1x1 + β2x2 + ... + βpxp

where βj are the regression parameters and logit(π) = log(π / (1 - π)). See Lemeshow and Hosmer's American Journal of Epidemiology article for more details.
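As a minimal sketch of this situation, the commands below fit a logistic model, store its predicted probabilities, and test them. The dataset (Stata's lbw example data, loaded with webuse) and the covariates are illustrative choices, not part of the original description, and the last line assumes hl has already been installed.

```stata
* Sketch: predictions from a fitted logistic model, then the test.
* Uses Stata's lbw example data (webuse needs internet access).
webuse lbw, clear
logistic low age lwt i.race smoke ptl ht ui
* Predicted probability of low==1 for each observation
predict double phat, pr
* Built-in Hosmer-Lemeshow test on ten groups
estat gof, group(10)
* The same predictions passed to hl (hl must be installed first)
hl low phat
```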

Although the Hosmer-Lemeshow test is already implemented in Stata (see lfit, or estat gof in more recent versions), hl can be used to assess predictions not just from the last regression model, but also from an external source (such as a published risk score). In addition, the plot option shows how the observed and expected proportions compare within the groups formed by the Hosmer-Lemeshow test. This is commonly referred to as a calibration plot:

[Figure: Hosmer-Lemeshow calibration plot]
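For predictions from an external source, one possible workflow is to compute the published score's predicted probability for each subject and pass it to hl. The sketch below uses simulated data; the coefficients and the variable names age and smoke are invented purely for illustration and do not correspond to any real risk score.

```stata
* Hypothetical published risk score applied to simulated validation data.
* Coefficients and variable names are invented for illustration only.
clear
set seed 12345
set obs 500
gen age   = 40 + floor(30*runiform())
gen smoke = runiform() < 0.3
* External prediction: inverse logit of the published linear predictor
gen double phat = invlogit(-4 + 0.05*age + 0.8*smoke)
* Simulated binary outcome, drawn here from the same probabilities
gen y = runiform() < phat
* Test the calibration of the external predictions (hl must be installed)
hl y phat
```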

hl allows calculation of both the usual C statistic (based on equally sized groups) and the H statistic (based on fixed cut-points on the predictions). To calculate H use the q() option with your own grouping variable.
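For reference, both statistics share the standard Hosmer-Lemeshow form and differ only in how the groups are defined. The notation here is an assumption introduced for this note: g is the number of groups, n_k the number of observations in group k, O_k the observed number of events, and \bar{\pi}_k the mean prediction in the group.

```latex
\hat{C} \;=\; \sum_{k=1}^{g}
  \frac{\left(O_k - n_k \bar{\pi}_k\right)^2}
       {n_k \bar{\pi}_k \left(1 - \bar{\pi}_k\right)},
\qquad
O_k = \sum_{i \in \text{group } k} y_i,
\qquad
\bar{\pi}_k = \frac{1}{n_k} \sum_{i \in \text{group } k} \pi_i
```

When the predictions come from a model fitted to the same data, the statistic is conventionally compared with a chi-squared distribution on g - 2 degrees of freedom.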

Example

Predictions have already been obtained and are stored in the variable phat. The binary response variable is y. To calculate C we type:

hl y phat

This uses the default of ten equally sized groups (decile groups) to construct the test statistic, C. To calculate H using the risk groups 0 - 0.1, 0.1 - 0.2, ..., 0.9 - 1, we type:

egen dec=cut(phat), at(0(0.1)1)
hl y phat, q(dec) plot

The calibration plot produced by this command is shown below. Larger circles indicate groups that contain more observations. There are fewer than 10 groups because no predictions fell below 0.10 or above 0.72.

[Figure: Hosmer-Lemeshow calibration plot]
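To make the calculation behind C concrete, the following sketch rebuilds a decile-based statistic by hand on simulated data. It is an illustration under stated assumptions rather than the code hl itself runs: the data are simulated, the groups come from xtile (which may treat tied predictions differently from hl), and g - 2 degrees of freedom are used, as is conventional for predictions from a model fitted to the same data.

```stata
* Hand calculation of a decile-based Hosmer-Lemeshow statistic.
* Simulated data for illustration only.
clear
set seed 2024
set obs 1000
gen x = rnormal()
gen double phat = invlogit(-1 + x)
gen y = runiform() < phat

preserve
keep if !missing(y, phat)
* Ten groups of roughly equal size, ordered by predicted risk
xtile grp = phat, nq(10)
* Per group: observed events, expected events, group size
collapse (sum) obs=y exp=phat (count) n=y, by(grp)
* Contribution of each group: (O - E)^2 / (n * pbar * (1 - pbar))
gen double term = (obs - exp)^2 / (exp * (1 - exp/n))
quietly summarize term
scalar C  = r(sum)
scalar df = r(N) - 2
display "C = " %6.3f C "   df = " df "   p = " %5.3f chi2tail(df, C)
restore
```

The figure displayed for C can then be compared with the output of hl y phat on the same data.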

Installation

To obtain hl, type the following into Stata:

net from https://www.sealedenvelope.com/

and follow the instructions on screen. This will ensure the files are installed in the right place and you can easily uninstall the command later if you wish.
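The on-screen steps typically amount to something like the sketch below. It assumes the package is listed at that site under the name hl, which is not confirmed by the text above; if the listing shows a different name, use that instead.

```stata
* Assumes the package appears under the name hl at the site above
net from https://www.sealedenvelope.com/
net describe hl
net install hl
* Confirm that Stata can find the installed command
which hl
* To remove the command later, run:
* ado uninstall hl
```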