At the upcoming Conference on Innovations in Trauma Research Methods (CITRM; November 3-4, 2006, West Hollywood, CA), there will be a session on the use of propensity scores in trauma research. As a prelude to that session, Dr. Jeffrey Sonis, program co-chair of CITRM, offers this brief introduction to the use of propensity scores.

In observational (i.e., non-randomized) studies, factors that predict the outcome are frequently distributed unequally across exposure or treatment groups. Those confounding factors will lead to biased estimates of the effect of exposures or treatments if they are not accounted for in the analysis. For years, regression modeling techniques have been used to assess the effect of an exposure or treatment on an outcome, “controlling for” confounding factors.

For example, logistic regression might be used to assess whether persons given beta-blocking medications by their physicians are less likely to develop PTSD after exposure to trauma than persons who aren’t given beta-blockers, after controlling for demographic characteristics, severity of trauma exposure, past medical history, past psychiatric history, current medications, and other factors.

However, regression techniques will not work particularly well at eliminating confounding bias if: 1) the treatment groups are very dissimilar; 2) the number of confounding factors is large compared to the number of subjects in the study; or 3) the form of the association between a confounding factor and the outcome (e.g., linear) is different from the form imposed by regression assumptions (e.g., exponential in logistic regression).

Moreover, regression models that have a large number of potential confounding factors as independent variables will have imprecise estimates of the main effect of the treatment, resulting in wide confidence intervals and high p values for the main treatment effect.

Propensity scores are a relatively new (circa 1980s) analytic method that can overcome some of the limitations of regression models in reducing the effects of confounding. A propensity score is the probability of treatment (or exposure), given a set of confounding factors. For example, in the study described above, a propensity score would be the probability (between 0 and 1) of a person being on a beta-blocker, based on their demographic characteristics, severity of trauma exposure, and past medical and psychiatric history.

The propensity score is itself derived through regression modeling, but regression modeling is used to predict the probability of treatment, not the probability of the outcome, i.e., PTSD. Each person in the study has a propensity score that is the probability of treatment (or exposure), conditional on his or her values on the factors that predict treatment (or exposure).
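As a rough illustration of the step just described, the sketch below estimates propensity scores by fitting a logistic regression of treatment on confounders. Everything in it is an assumption made for illustration: the data are simulated, the variable names (severity, prior_psych, treated) are hypothetical, and the small Newton-Raphson fitting routine stands in for whatever statistical package a real analysis would use.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.

    X must already contain an intercept column. Illustrative helper only.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # current fitted probabilities
        grad = X.T @ (y - p)                  # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated (hypothetical) study: treatment depends on two confounders.
rng = np.random.default_rng(0)
n = 2000
severity = rng.normal(size=n)            # standardized trauma severity
prior_psych = rng.binomial(1, 0.3, n)    # past psychiatric history (0/1)
true_logit = -0.5 + 0.8 * severity + 0.6 * prior_psych
treated = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

# The model predicts TREATMENT, not the outcome (PTSD).
X = np.column_stack([np.ones(n), severity, prior_psych])
beta = fit_logistic(X, treated)

# Each subject's propensity score: fitted probability of being treated.
propensity = 1.0 / (1.0 + np.exp(-X @ beta))
```

Note that the outcome never appears in this step; the propensity array holds one value per subject and is carried forward into whichever adjustment method is chosen.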

Propensity scores can then be used in one of several different ways to achieve an unbiased estimate of the treatment effect. First, propensity scores can be included in a regression model as a summary covariate, instead of each of the individual covariates. For example, a logistic regression model would be fitted with PTSD as the dependent variable, and use of beta-blockers and the propensity score as the only independent variables.
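The first approach can be sketched as follows, under stated assumptions: the data are simulated with a single hypothetical confounder (severity), the true treatment effect is set to a log-odds of -1.0, and the fit_logistic helper is an illustrative Newton-Raphson routine rather than any particular package's method. The sketch compares a crude model with one that adds the propensity score as a covariate.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson logistic regression; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

# Simulated (hypothetical) data: severity raises both the chance of
# treatment and the chance of PTSD, so it confounds the comparison.
rng = np.random.default_rng(1)
n = 5000
severity = rng.normal(size=n)
treated = rng.binomial(1, expit(1.0 * severity))
ptsd = rng.binomial(1, expit(-0.5 - 1.0 * treated + 1.5 * severity))

ones = np.ones(n)

# Step 1: propensity score from a model of treatment on the confounder.
X_ps = np.column_stack([ones, severity])
propensity = expit(X_ps @ fit_logistic(X_ps, treated))

# Step 2: outcome models. Index 1 is the treatment coefficient (log-odds).
crude = fit_logistic(np.column_stack([ones, treated]), ptsd)[1]
adjusted = fit_logistic(np.column_stack([ones, treated, propensity]), ptsd)[1]

print(f"crude log-odds ratio:    {crude:.2f}")
print(f"adjusted log-odds ratio: {adjusted:.2f}")
```

In this simulation the crude estimate is pulled toward zero or beyond because sicker subjects are more likely to be treated, while the adjusted estimate should move back toward the simulated protective effect. Entering the score as a single linear term is itself a modeling assumption, so the adjusted value need not match the true value exactly.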

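A second approach is to match pairs of treated and untreated subjects by propensity score, and then conduct a matched analysis. A third approach is to weight each observation by the inverse of the probability of the treatment that subject actually received: the inverse of the propensity score for treated subjects, and the inverse of one minus the propensity score for untreated subjects.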
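The second and third approaches can be illustrated on a tiny, made-up data set in which the propensity scores are taken as already estimated. The ten subjects, the greedy nearest-neighbor rule, the caliper of 0.1, and the function names below are all assumptions chosen for illustration; real analyses often use more refined matching algorithms and a formal matched-pairs test.

```python
# Hypothetical subjects: (treated, propensity score, developed PTSD).
subjects = [
    (1, 0.80, 0), (1, 0.60, 1), (1, 0.50, 0), (1, 0.30, 0),
    (0, 0.75, 1), (0, 0.55, 1), (0, 0.45, 0), (0, 0.25, 0),
    (0, 0.20, 1), (0, 0.10, 0),
]

def greedy_match(subjects, caliper=0.1):
    """Pair each treated subject with the nearest unused untreated subject.

    A pair is kept only if the two scores differ by at most `caliper`.
    """
    treated = [i for i, s in enumerate(subjects) if s[0] == 1]
    available = [i for i, s in enumerate(subjects) if s[0] == 0]
    pairs = []
    for t in treated:
        if not available:
            break
        c = min(available, key=lambda j: abs(subjects[j][1] - subjects[t][1]))
        if abs(subjects[c][1] - subjects[t][1]) <= caliper:
            pairs.append((t, c))
            available.remove(c)  # match without replacement
    return pairs

def weighted_risk(subjects, arm):
    """Inverse-probability-weighted PTSD risk within one treatment arm."""
    num = den = 0.0
    for treated, ps, ptsd in subjects:
        if treated != arm:
            continue
        # Weight by 1 / P(treatment actually received).
        w = 1.0 / ps if treated == 1 else 1.0 / (1.0 - ps)
        num += w * ptsd
        den += w
    return num / den

# Matching: average within-pair difference in PTSD (treated minus untreated).
pairs = greedy_match(subjects)
matched_diff = sum(subjects[t][2] - subjects[c][2] for t, c in pairs) / len(pairs)

# Weighting: difference in weighted risks between the two arms.
ipw_diff = weighted_risk(subjects, 1) - weighted_risk(subjects, 0)

print("matched pairs (treated index, untreated index):", pairs)
print("matched risk difference:", matched_diff)
print("weighted risk difference:", round(ipw_diff, 3))
```

Matching discards untreated subjects who have no close treated counterpart, whereas weighting keeps every subject but gives more influence to those whose treatment status was unlikely given their characteristics. The two estimates therefore need not agree, especially in a sample this small.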

Use of propensity scores to control confounding offers important advantages over traditional regression methods in some circumstances. Each of the three approaches to propensity scores described above can be used, but additional methodological work is needed to determine which approach is most useful in specific circumstances.

In addition, it is important to remember that, even at their best, propensity scores can only control for confounding factors that were measured in the study. This means that in observational studies, researchers should measure a wide range of potentially confounding factors. It also means that randomized trials will remain the gold standard for estimates of treatment effects, since randomization reduces the likelihood of confounding by both known and unknown confounders.