A Stata program to fill in values within clusters

By Tony Brady

If you haven't already done so, you may find it useful to read the article on xtab because it discusses what we mean by longitudinal data and static variables.

xfill is a utility that 'fills in' static variables. It replaces missing values in a cluster with the unique non-missing value within that cluster. It's easiest to see what's meant by this with an example.


To follow this example in Stata, type:

use http://www.sealedenvelope.com/stata/long.dta

in the Stata command window.

Look at the variable sex in our example dataset (long.dta):

[Figure: data listing]

sex is a static variable since it does not change within a cluster (in this case the cluster is the patient). As is commonly the case with static variables, it has been recorded only in the first record within each cluster. To fill in the missing values we type

. xfill sex, i(idnum)

So now our data looks like this:

[Figure: data listing after xfill]
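
The effect shown above can be approximated with built-in commands. What follows is only a minimal sketch of the idea, not xfill's actual implementation. It uses a small made-up dataset rather than long.dta (the variable visit and all the values are hypothetical), and it assumes that sex is numeric and that each idnum has at most one distinct non-missing value.

```stata
* Toy data (made up for illustration, not long.dta):
* sex is recorded only on the first record of each patient
clear
input idnum visit sex
1 1 1
1 2 .
1 3 .
2 1 0
2 2 .
3 1 1
end

* Remember the original record order so it can be restored afterwards
generate long obs_order = _n

* Within each idnum, sort so the non-missing sex comes first
* (numeric missing values sort last), then copy it into the missing records
bysort idnum (sex): replace sex = sex[1] if missing(sex)

* Restore the original order and show the filled-in data
sort obs_order
drop obs_order
list, sepby(idnum)
```

The sketch relies on numeric missing values sorting after all non-missing values. For a string variable the empty string sorts first, so the same line would not work unchanged.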

Why would we want to do this? The reason is that it makes the output from other longitudinal commands easier to interpret when they are run in subgroups. For instance, look at the results of xcount run by sex before we filled in sex:

[Figure: xcount by sex]

This suggests that we have 14 males, 1 female and 14 patients whose sex is unknown. In fact we know the sex of all 15 patients in this dataset. The 14 phantom patients of unknown sex appear because each patient is recorded as male or female in the first record and as missing in the second and subsequent records. One patient has only a first record and no subsequent records, which is why the number of patients of unknown sex is 14 rather than 15.

The simple solution to this problem is to replace the missing values with their true values, in this case the patient's sex:

[Figure: xcount by sex after xfill]

Now the 14 patients of apparently unknown sex are eliminated from our results. Similar problems occur when summarising static variables using if and in clauses. xfill prevents such confusion arising.
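
The phantom patients can be reproduced on a small scale with built-in commands. This is a rough sketch under stated assumptions: the data are made up (not long.dta), the variable names visit and first are hypothetical, and the counting only mimics the kind of per-patient count discussed above; it is not how xcount itself is written.

```stata
* Toy data (made up for illustration, not long.dta)
clear
input idnum visit sex
1 1 1
1 2 .
1 3 .
2 1 0
2 2 .
3 1 1
end

* Flag one record per distinct (idnum, sex) pair; missing counts as its own value
bysort idnum sex: generate byte first = (_n == 1)

* Before filling: patients 1 and 2 also show up under missing sex
tabulate sex if first, missing

* Fill in sex within idnum (numeric missing sorts last), then recount
bysort idnum (sex): replace sex = sex[1] if missing(sex)
drop first
bysort idnum sex: generate byte first = (_n == 1)
tabulate sex if first, missing
```

In the toy data, patient 3 has only one record and so never appears under missing sex, which mirrors the single-record patient in the example above.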


To obtain xfill, type the following into Stata:

net from http://www.sealedenvelope.com/

and follow the instructions on screen. This will ensure the files are installed in the right place and you can easily uninstall the command later if you wish.