By Tony Brady

xtab is a generalization of the standard Stata tabulate command that performs one-way tabulations of longitudinal data.

Longitudinal data refers to information on clusters that is contained in multiple records. Examples are:

Cluster    Record
Family     Person (mother, father, child etc)
Country    GDP by year
Patient    Follow-up appointment

The records in two of these examples are ordered within cluster (follow-up appointment and GDP by year), but in the family example they are not. In Stata, longitudinal data that is ordered by time is called cross-sectional time-series (xt) data. xtab is suited to both ordered and unordered longitudinal data.
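As a rough illustration of ordering within clusters, the sketch below builds a small hypothetical country-by-year dataset and numbers the records within each cluster by time using Stata's built-in bysort prefix. The variable names (id, year, gdp, recno, nrec) and all values are assumptions made up for this example.

```stata
* Hypothetical country-by-year data; names and values are illustrative only
clear
input str2 id year gdp
"AA" 2001 10
"AA" 2000 9
"BB" 2000 20
"BB" 2002 22
"BB" 2001 21
end

* Sort by cluster, then by year within cluster, and number the records
bysort id (year): generate recno = _n

* Count the records in each cluster
by id: generate nrec = _N

list, sepby(id)
```

For unordered data such as the family example there is no natural sort variable, so only the cluster identifier would be used.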

Example

To follow this example in Stata type:

use http://www.sealedenvelope.com/stata/long.dta

in the Stata command window.

Patients in a clinical trial were regularly monitored. Systolic blood pressure (sbp) was measured at each visit and patients were asked whether they were currently taking beta-blockers (beta). Here's an extract of the data (long.dta):

idnum  date       sex   sbp  beta  region
109    27 Feb 92  Male  180  No    London
109    24 Sep 92  .     140  .     London
109    25 Mar 93  .     156  Yes   London
109    23 Sep 93  .     150  .     London
110    27 Feb 92  Male  160  No    Scotland
110    22 Oct 92  .     120  .     Scotland
110    22 Apr 93  .     130  .     Scotland
110    28 Oct 93  .     130  .     Scotland
110    28 Apr 94  .     130  .     Scotland
110    27 Oct 94  .     152  .     Scotland
110     5 Jan 95  .     132  .     Scotland
110    27 Apr 95  .     164  .     Scotland
111    27 Feb 92  Male  130  Yes   Scotland
112    27 Feb 92  Male  148  No    Scotland
112    17 Dec 92  .     146  No    Scotland

Longitudinal datasets must always contain a variable that identifies the clusters. In this example the variable is idnum, which contains a unique patient identifying number. All records with the same idnum belong to the same patient. This is the variable you should name in the i() option of xtab and other xt commands. Alternatively you can declare the unique cluster identifier to Stata upfront using the iis command. This is recommended because it means you don't have to keep typing the i() option every time you use xtab.

. iis idnum
. xtab sex

is equivalent to:

. xtab sex, i(idnum)

Either way, we get the following output:

[Image: xtab output]

The tabulation is at the cluster level rather than the individual record level. It tells us there are 15 clusters in this dataset: 14 men and one woman. It turns out that we get the same output from the usual Stata command:

. tab sex

because the sex variable is missing for all records except the first record within each cluster. This is not the case for the region variable, and using Stata's tabulate command gives very different results to xtab:

[Image: tabulate and xtab output for region]

The xtab results tell us 11 patients are from Scotland, 3 are from London and 1 is from Leicester.
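A similar cluster-level count can be approximated with built-in commands, which may help show why the two tabulations differ. This sketch is not how xtab itself is implemented. It uses a small hypothetical dataset loosely modelled on the extract above, and the variable name onepercluster is an assumption made for illustration.

```stata
* Hypothetical mini-dataset: one row per visit, region repeated on every row
clear
input idnum str8 region
109 "London"
109 "London"
109 "London"
109 "London"
110 "Scotland"
110 "Scotland"
111 "Scotland"
112 "Scotland"
112 "Scotland"
end

* Record-level tabulation: counts rows, so patients with many visits dominate
tabulate region

* Tag one record per distinct idnum-region pair, then tabulate tagged rows only
egen byte onepercluster = tag(idnum region)
tabulate region if onepercluster
```

With this mini-dataset the record-level table counts 4 London and 5 Scotland rows, whereas the tagged table counts 1 London and 3 Scotland patients.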

To obtain xtab download this zip file and unzip the contents into your personal ado directory (c:\ado\personal or similar). Alternatively (and better) type the following into Stata:

net from http://www.sealedenvelope.com/

and follow the instructions on screen. This will ensure the files are installed in the right place and you can easily uninstall the command later if you wish.