In this appendix, technical issues raised at the committee meeting and summarized in the letter report are discussed in greater detail. The committee offers these technical points and suggestions for improvement in recognition of the complexity of the task, particularly regarding the difficulty of measuring and isolating the confounding variables and the data limitations. The comments are all related to the paper by Charles Kahane.
INDUCED EXPOSURE DATA BASE
The validity of the induced-exposure data base--the source of the key control variables of age and gender--is a major area of concern. Measures of exposure to risk, or vehicle use, are critical to an analysis of the likelihood of being in a fatal crash. Yet, such commonly used measures of exposure as vehicle miles driven are not available for individual vehicles--let alone the driver and environmental characteristics of this travel (e.g., driver age and gender, time of day, type of highway) (Kahane, p. 20). Vehicle registration years--another exposure measure--are available by vehicle make-model; however, no comparable information is available about the age or gender of vehicle drivers (Kahane, p. 21). Thus, Kahane uses an indirect measure of exposure--induced exposure--for which the desired variables can be obtained, as a surrogate for vehicle miles or years. [See Appendix A (p. A-3) for the definition of induced-exposure crashes used by Kahane.]
One of the key problems with the induced-exposure approach, which was introduced as a highway safety analysis technique more than 25 years ago, is the representativeness of the no-fault exposure surrogate group. The Kahane report does not provide sufficient evidence that the induced exposure group of stopped-vehicle crashes is a suitable surrogate for the vehicle fleet and driving population on the same highways as the fatal crashes.
Ordinarily in analyses of this type, a risk assessment is performed using fatalities or injuries from among the exposed population group (Pendleton 1996). In the Kahane study, the fatality and induced-exposure groups describe two very different populations: one defined exclusively by crashes in which fatalities occur, and the other defined almost exclusively by nonfatal crashes between two vehicles, one of which must be standing still. Thus, the ratio of fatalities to induced exposures is calculated from a base group that may not be related in a meaningful way to the fatality group.
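The ratio at the core of the induced-exposure approach can be sketched with hypothetical counts for a single exposure cell; the numbers below are illustrative, not taken from the study.

```python
import math

# Hypothetical counts for one cell (e.g., one driver age/gender group in
# one curb-weight class); the real study tabulates these from FARS and
# state crash files.
fatal_involvements = 45          # fatal-crash involvements in the cell
induced_exposure_crashes = 9000  # induced-exposure ("stopped vehicle") crashes

# Kahane's dependent variable is the log of the ratio of fatal
# involvements to induced-exposure crashes, a surrogate for fatality
# risk per unit of exposure.
log_odds = math.log(fatal_involvements / induced_exposure_crashes)
rate_per_1000 = 1000 * fatal_involvements / induced_exposure_crashes
print(round(rate_per_1000, 2))   # fatal involvements per 1,000 exposure crashes -> 5.0
```

The committee's concern is that the denominator counts come from a population (stopped-vehicle crashes) that may not represent the fleet and drivers generating the numerator counts.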
Furthermore, the induced-exposure group is subject to constraints that are not applied to the fatality group, most importantly, state-to-state reporting differences based in part on the reporting thresholds established by each state. These state-to-state reporting thresholds affect the likelihood that car and light truck crashes are reported; car crashes are more likely to be reported because the lighter cars sustain greater damage than the heavier trucks. In addition, the induced-exposure group may reflect an urban bias: the incidence of crashes involving a standing vehicle is likely to be greater in congested urban areas than on lightly traveled rural highways, especially rural Interstates. Finally, driver differences can also be expected; young, aggressive drivers might be expected to have fewer induced-exposure crashes because they are less inclined to wait patiently at intersections and stop lights for traffic to clear.
The Kahane report acknowledges these limitations (pp. 22-23), but could go further and provide descriptive statistics to compare the data sets used. For example, for each of the data sets, it would be helpful to see the distribution of the percentage of crashes for each highway functional class and for driver age, gender, and each of the other independent variables used in the regression models. It would also be helpful to compare the data sets by vehicle registration group to see how representative the vehicles involved in induced-exposure crashes are of the total vehicle fleet.
TREATMENT OF CONTROL VARIABLES
One of the major contributions of the Kahane study is the attempt to estimate the effects of vehicle weight reductions on fatality risk, controlling for key confounding factors, such as driver age and gender. A central challenge is controlling fully for these variables. For example, it has been well documented that younger drivers are more likely to be involved in crashes1; the accident data show a nonlinear relationship between driver age and fatality risk. But many young drivers also choose less expensive, lighter vehicles. The difficulty is holding constant the age (and other characteristics) of the driver. For example, one of Kahane's regressions that estimates fatality risk in passenger car rollover crashes found that driver age was highly statistically significant: the risk of a fatal rollover crash decreases by 7.7 percent for each year of driver age up to age 35 (Kahane, p. 68). By comparison, each 100-lb. increase in car weight reduces the risk of a fatal rollover crash by 2.5 percent, approximately one-third the effect of one year of driver age (Kahane, p. 68). Because the influence of age is so strong, a slight misspecification of the nonlinear relation between age and risk could result in substantial changes in the estimated effect of age. In addition, driver age does not satisfactorily capture all of the driver-related factors that influence rollover crash risk, such as drinking, aggressive driving, and safety-belt usage. To the extent these characteristics are also related to car choice (including the size and mass of the car), incomplete control for these variables could introduce bias into the estimated effects of weight.
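The arithmetic behind a per-year percentage effect can be sketched as follows. The 7.7 percent figure is from the report; the translation into a coefficient assumes the standard log-linear (logistic) form, in which per-year effects compound multiplicatively.

```python
import math

# A 7.7 percent decrease in risk per year of driver age (up to 35)
# corresponds, in a log-linear model, to a coefficient of
# ln(1 - 0.077) per year of age.
per_year_reduction = 0.077
age_coefficient = math.log(1 - per_year_reduction)      # about -0.0801

# Because the per-year effects compound multiplicatively, ten years of
# age reduce risk by about 55 percent, not 77 percent:
relative_risk_10yr = math.exp(10 * age_coefficient)     # about 0.449
print(round(age_coefficient, 4), round(1 - relative_risk_10yr, 3))  # -0.0801 0.551
```

The strength of this compounding effect is why even a modest misspecification of the age-risk curve could swamp the comparatively small weight effect.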
Similarly, the Kahane regressions may not control for all of the important vehicle-related factors that could also confound the effect of vehicle weight. For example, increased vehicle horsepower for some models could explain part of the fatality risk that is currently being attributed to vehicle weight. Controlling for this variable is not simple, however, as described in the committee's letter report. Other vehicle variables, such as track width and wheelbase, could have been included with vehicle weight in the Chapter 5 regressions. To the extent serious collinearity problems are introduced, they can be documented and the inability to distinguish the effects of the variables can be acknowledged.
Treatment of collinearity, generally, is problematic in the study. Research on collinearities in the statistical literature over the last 15 years suggests that correlations among the predictors up to 0.66 (as found by Kahane and described on p. 169) are usually not sufficient to cause the problems stated in the report2. Yet, the forcing of the age and gender coefficients in Chapter 5 was necessitated, according to the author, by the reportedly spurious results from the high intercorrelation among the key independent variables of curb weight, driver age, and gender (Kahane, p. 171).
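The standard diagnostics cited in footnote 2 can be illustrated for the two-predictor case. The formulas below assume standardized predictors; definitions of the condition number vary somewhat across authors, and the one used here is the square root of the eigenvalue ratio of the 2x2 correlation matrix.

```python
import math

# Variance inflation factor for two standardized predictors with
# pairwise correlation r: VIF = 1 / (1 - r**2).
def vif_two_predictors(r):
    return 1.0 / (1.0 - r * r)

# Condition number of the 2x2 correlation matrix [[1, r], [r, 1]],
# whose eigenvalues are 1 + r and 1 - r; here taken as the square root
# of the eigenvalue ratio.
def condition_number(r):
    return math.sqrt((1 + abs(r)) / (1 - abs(r)))

# The r = 0.66 correlation reported by Kahane is well below the usual
# trouble thresholds (VIF > 10, condition number > 30):
print(round(vif_two_predictors(0.66), 2))    # 1.77
print(round(condition_number(0.66), 2))      # 2.21
```

By these conventional yardsticks, a correlation of 0.66 among the predictors would not normally be considered severe enough to justify forcing the age and gender coefficients.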
MODEL VALIDATION

Model validation refers to statistical methods that are used to ensure the reasonableness of assumptions needed for the validity of various statistical analyses. For example, one can postulate and fit a linear regression model to the logarithm of the odds of a fatality. Whether in fact the model should be linear in each of the predictor variables is a question that should be addressed. Plotting residuals from a fit to the data is just one technique that can be used to assess the validity of the linearity assumption.
No formal statistical model validation techniques are mentioned in the Kahane report. At a minimum, the linearity and normality assumptions should be checked. Influence diagnostics for outliers should also be examined because severe enough outliers can seriously distort the fit. The possibility of serious outlier effects is suggested in the discussion of Figure 5-3 (Kahane, p. 144), notably the outlier for the 3800 lb. curb weight class. The presence of extreme predictor values (leverage points) should also be assessed, although the grouping of the data into cells may eliminate this as a potential problem.
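A minimal version of such an outlier check can be sketched as follows, using hypothetical cell data in which one curb-weight class departs from an otherwise linear trend (the numbers are illustrative, not Kahane's; in practice the check would use the full set of regression diagnostics).

```python
# Hypothetical (curb weight class, log fatality rate) cell data; the
# 3800-lb entry is constructed to depart from the linear trend.
weights = [2000, 2200, 2400, 2600, 2800, 3000, 3200, 3400, 3600, 3800]
log_rates = [1.60, 1.52, 1.45, 1.37, 1.30, 1.21, 1.15, 1.08, 1.00, 1.40]

# Ordinary least-squares fit of log rate on weight.
n = len(weights)
mx = sum(weights) / n
my = sum(log_rates) / n
sxx = sum((x - mx) ** 2 for x in weights)
slope = sum((x - mx) * (y - my) for x, y in zip(weights, log_rates)) / sxx
intercept = my - slope * mx

# Standardize the residuals by the residual standard error.
residuals = [y - (intercept + slope * x) for x, y in zip(weights, log_rates)]
rse = (sum(e * e for e in residuals) / (n - 2)) ** 0.5
standardized = [e / rse for e in residuals]

# Flag cells whose standardized residual exceeds 2 in absolute value.
flagged = [w for w, z in zip(weights, standardized) if abs(z) > 2]
print(flagged)   # only the 3800-lb cell is flagged
```

A single flagged cell of this kind, like the 3800-lb. outlier in Figure 5-3, can pull the fitted slope noticeably and should at least be examined before the coefficients are interpreted.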
Because of the complexity of the model fitting methods used in Chapter 5, a statistical method known as cross validation could be used to test the validity of the model results. In this method, a randomly selected portion of the available data is set aside and is not used in the model fitting. Once the final model is fit, the fitted model is used to predict the data that have been set aside. How well the fitted model estimates the correct fatality rates can then be assessed directly using data with known rates. One might choose data from each state or portions of the entire data file to perform cross validation. The cross-validation could be done longitudinally by comparing the model-predicted results with the actual results of another year, or it could be done cross-sectionally by randomly selecting half of the data base for model building and then seeing how well the model predicts the other half. Use of this technique allows a quantitative assessment of the uncertainty in the fitted models. Other methods of model validation, such as the jackknife or bootstrap methods, might also be considered3.
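A cross-sectional version of this procedure can be sketched as follows; the data are synthetic stand-ins for Kahane's cells, and the model is deliberately simplified to a single predictor.

```python
import random

# Synthetic (weight, log fatality rate) cells with a known downward
# trend plus noise; stand-ins for the study's aggregated data.
random.seed(0)
data = [(w, 2.0 - 0.00025 * w + random.gauss(0, 0.05))
        for w in range(2000, 4000, 50)]

# Randomly split the cells: fit on one half, hold out the other.
random.shuffle(data)
half = len(data) // 2
train, test = data[:half], data[half:]

def fit_line(points):
    """Simple OLS fit of y on x, returning (slope, intercept)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx
    return slope, my - slope * mx

slope, intercept = fit_line(train)

# Root-mean-square prediction error on the held-out half quantifies
# how well the fitted model generalizes beyond the fitting sample.
rmse = (sum((y - (intercept + slope * x)) ** 2 for x, y in test)
        / len(test)) ** 0.5
print(round(rmse, 3))
```

Applied to the Chapter 5 models, the held-out error would give a direct, data-based measure of how much trust to place in the fitted fatality rates.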
TWO-STEP MODELING PROCEDURE

The Kahane study uses a two-step modeling procedure in Chapter 5 to address the issue of small cell size because of the large number of independent variables. The two-stage adjustment, however, requires more careful theoretical justification. Ordinarily, one can fit a regression model in steps. However, to do so correctly, the first step should involve regressing the logarithms of the fatality odds, Kahane's chosen dependent variable, on the Step 1 predictors, and, in a separate regression fit, the Step 2 predictors on the Step 1 predictors. Then, the fatality residuals from the first fit in Step 1 should be regressed on the Step 2 predictor residuals from the second fit in Step 1.
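The residual-on-residual procedure described above is the standard (Frisch-Waugh-Lovell) result for fitting a regression in steps, and its equivalence to the full regression can be verified numerically on synthetic data.

```python
import numpy as np

# Synthetic data: x1 plays the role of the Step 1 predictors (e.g.,
# driver age) and x2 the Step 2 predictor (e.g., curb weight),
# correlated with x1 as in the study.
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(scale=0.1, size=n)

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
# Full regression: y on [1, x1, x2].
beta_full = ols(np.column_stack([ones, x1, x2]), y)

# Correct two-step version: residualize BOTH y and x2 on the Step 1
# predictors, then regress residual on residual.
Z = np.column_stack([ones, x1])
ry = y - Z @ ols(Z, y)
rx2 = x2 - Z @ ols(Z, x2)
beta_two_step = ols(rx2.reshape(-1, 1), ry)[0]

print(np.isclose(beta_full[2], beta_two_step))   # True: the two agree
```

The key point is that both the dependent variable and the Step 2 predictors must be residualized on the Step 1 predictors; skipping the second residualization, or mixing aggregation levels between the steps, breaks the equivalence.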
Furthermore, if the models are fit to data in cells, the residuals are only valid for the celled (aggregated) data, not the unaggregated data. Typically, regression coefficients cannot be estimated using aggregated data and then used to form residuals using the unaggregated data. Nor should a new aggregation be formed and averages of the residuals used in each new cell, as was done in the Step 2 regressions in Chapter 5. Disaggregating the data to form residuals and then reaggregating the data using new cells will ordinarily bias the coefficient estimators.
A valid theoretical justification should be provided to demonstrate whether the unusual series of adjustments to the data in Chapter 5 produces unbiased estimates of the Step 2 regression coefficients. It would also be useful to see frequency distributions for the key classification variables of vehicle make, body style, and model year for the three data sets used in the Chapter 5 regressions: fatal crash involvements, induced-exposure crashes, and vehicle registrations in all states.
DATA DESCRIPTION

The Kahane analysis could benefit from a clearer description of the data and their treatment in the analysis. For example, some crash cases were deleted from the analysis because of missing data. The amount and nature of the missing data need to be specified at the appropriate points in the report to provide the reader a sense of how large a problem this may be.
The report should also specify more clearly what constitutes a data record for each type of analysis. In particular, the report needs to be clearer on the use of double records for multiple vehicle crashes (i.e., car-to-car, car-to-light truck, light truck-to-car, and light truck-to-light truck crashes), specifying why and for which analyses this was done. As it is now, the multiple counting of vehicle-to-vehicle crashes affects the analyses performed in Chapter 5 in the following ways. First, regarding car-to-light truck crashes, if one or more fatalities occurs in a crash between a car and a light truck, a record is placed in each of two files: one in which the car is listed as the "case" vehicle, and one in which the truck is listed as the "case" vehicle. When light trucks are modeled using the logistic regression methods of Chapter 5, every crash in which a light truck and a car are involved is counted as a "failure" in computing the logarithm of the odds of a fatality in the cells of the table, regardless of whether the fatalities occurred in the car or in the truck. However, 80 percent of the fatalities in such crashes are the car occupants. With the objective of estimating the risk of a fatality, justification should be provided for including such a large number of fatalities in the light duty truck file when most of the fatalities occurred in cars. It would seem preferable to have one car-light truck data file and to model the risk of a fatality using the characteristics of both vehicles. It would also seem preferable to include variables in the model that identify the vehicles in which the fatality or fatalities occurred. Whichever approach is taken--the approach taken by Kahane or the alternate approach suggested by the committee--its validity for estimating fatality risk should be justified.
Second, regarding car-to-car crashes, the car-to-car fatality file includes two records for each crash in which a fatality occurs. In the first record, car no. 1 is the "case" vehicle; in the second record, car no. 2 is the "case" vehicle (Kahane, p. 139). Consequently, there are likely to be a number of dual record sets for the same fatal crash in which one of the records indicates that the "case" vehicle is a light car and the other, that the "case" vehicle is a much heavier car. The logic of this double counting is not clear and, in any case, would tend to weaken any effects of car weight in the final regression model fit.
MEASURES OF UNCERTAINTY

Appropriate measures of uncertainty should accompany any statistical model fitting procedure. In a regression setting, the key measures of uncertainty are confidence intervals. Unlike point estimates (e.g., an estimated reduction of 322 fatalities), confidence intervals provide an interval of reasonable estimates, where the width of the interval is determined by the uncertainty in the point estimate. The greater the uncertainty, the wider the confidence interval. Because any value in the confidence interval is deemed a reasonable estimate of the quantity of interest, a wide confidence interval indicates that there are many values above and below the point estimate that are reasonable estimates of the fatality rate.
Despite the complexity of the regression procedures in Chapter 5, the analysis does not provide any direct measure of the uncertainty of the final fatality estimates. The conclusions drawn from the estimation of the effect of a 100-lb weight reduction for cars might be drastically different depending on the width of the confidence intervals. For example, if the point estimate of 322 fatalities in the Kahane analysis were associated with a 95 percent confidence interval of between 318 and 326 fatalities, this might lead to a different conclusion about the effect of downweighting than a confidence interval of -88 to 722 fatalities. Uncertainty estimates should be derived for the entire two-step procedure, not just for the Step 2 regression fits.
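The role of the interval width can be sketched with simple arithmetic. The point estimate below is the report's; the standard error is invented for illustration and chosen so the interval straddles zero.

```python
# Point estimate from the report; the standard error is hypothetical,
# for illustration only.
point_estimate = 322.0        # estimated added fatalities per 100-lb reduction
standard_error = 206.0        # hypothetical standard error of that estimate

# A 95 percent confidence interval under a normal approximation.
z = 1.96
lower = point_estimate - z * standard_error
upper = point_estimate + z * standard_error
print(round(lower), round(upper))   # -82 726
```

An interval this wide would include zero, meaning the data would be consistent with downweighting having no effect on fatalities at all; a narrow interval around 322 would support a much stronger conclusion. Without a reported standard error for the full two-step procedure, the reader cannot tell which situation applies.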
Confidence intervals, however, only measure uncertainties associated with modeled estimates. There are also uncertainties associated with possible misspecification of the model or from uncertainties associated with variables introduced from other data bases (i.e., the exogenously "forced" coefficients for driver age and gender from the induced-exposure crash data base). It may not be possible to quantify these model-related uncertainties precisely, but it is important to provide a sense of their magnitude.
Because of the limitations of the data and the analytic procedures previously discussed, the executive summary of the Kahane report should be modified to reflect the uncertainty that these limitations bring to the results. The current language and the precision of the estimates in the summary and elsewhere in the report imply a degree of statistical certainty that is not supported by the analyses.
SENSITIVITY ANALYSES

A key issue that was raised with respect to the National Highway Traffic Safety Administration's (NHTSA's) earlier analyses of the effects of vehicle weight on safety concerned the sensitivity of the estimates to assumptions about how the weight reduction would be distributed across the fleet of passenger cars and light trucks. The 1992 NRC study pointed out that estimates of societal risk from vehicle downsizing depend on changes in the mix of vehicle weights in the fleet (p. 57 and Appendix D)4.
The Kahane report assumes equal weight reductions of 100-lb. per vehicle across classes of vehicles. To meet CAFE standards in the past, however, automobile manufacturers made substantial reductions in the weight of the heaviest models. To reflect this or other possible downweighting scenarios, the analysis could examine the effect of equivalent percentage weight reductions across classes of vehicles, or look at reductions for certain fleet segments (e.g., light trucks only, or light trucks and heavy cars). Many of these approaches would come closer to how CAFE standards are implemented; automobiles and light trucks must meet fuel economy standards averaged across the new car fleet. Thus, not every model must meet the same standard5.
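The difference between a flat 100-lb cut and an equal-percentage cut can be sketched with hypothetical fleet weights; the vehicle classes and weights below are invented for illustration.

```python
# Hypothetical fleet of four vehicle classes (curb weight in lb).
fleet = {"subcompact": 2300, "midsize": 3100, "full-size": 3900, "light truck": 4400}

# The report's scenario: a flat 100-lb reduction per vehicle.
flat = {k: w - 100 for k, w in fleet.items()}

# An equal-percentage scenario removing the same total weight (400 lb)
# from this fleet, but taking more from the heaviest models.
pct = 400 / sum(fleet.values())
proportional = {k: w * (1 - pct) for k, w in fleet.items()}

print(round(fleet["light truck"] - proportional["light truck"], 1))  # 128.5 lb
print(round(fleet["subcompact"] - proportional["subcompact"], 1))    # 67.2 lb
```

Because the estimated fatality effects of weight differ by vehicle class and crash type, the two scenarios would generally yield different fleet-wide fatality estimates even though they remove the same total weight.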
Another sensitivity test could attempt to separate, at least partially, the effects of driver aggressiveness from vehicle weight on fatality risk by removing from the data base cars known to be associated with risk-taking driver behavior and high fatality rates, such as certain sports cars and sport utility vehicles, and then rerunning the regressions.
These are just some of the scenarios that could be tested. The analyses would help provide a better sense of the robustness of the estimates to changes in model assumptions and control variables.
1. Kahane lists four age- and gender-related factors that affect fatality risk: vehicle use (annual mileage), which is highest for drivers in the 20- to 50-year-old group, and is higher for males; vulnerability to fatal injury, which increases with age, and is greater for women than for men; driving errors, which are greatest for young and old drivers; and driving aggressiveness, which, on average, is highest for young drivers and for males (pp. 15-16).
2. For example, Mason et al. (1989) suggest that a correlation exceeding 0.95 indicates serious collinearity problems. Belsley (1991) and Belsley et al. (1980) advocate the use of condition numbers for diagnosing collinearities, suggesting that a condition number greater than 30 signals a problem. For a model with only two predictors, a condition number greater than 30 roughly corresponds to a correlation greater than 0.96. Neter et al. (1990) advocate the use of variance inflation factors, suggesting that a variance inflation factor exceeding 10 indicates severe collinearity problems. For a model with two predictors, a variance inflation factor greater than 10 is equivalent to a correlation greater than about 0.95, since VIF = 1/(1 - r2).
3. The jackknife method is a cross-validation procedure in which each observation is left out in turn and the other observations are used to fit the model and then predict the one that was left out. Thus, it is included under the general heading of cross validation. The bootstrap method is a computer simulation method that uses the actual data in the simulation.
4. Based on two-car collisions and hypothetical estimates of accident variables, Appendix D of the report considers the change in fatalities from various changes in the distribution of vehicle sizes in the fleet. It suggests that, in principle, downsizing could increase, decrease, or leave unchanged total deaths and injuries in two-car collisions, depending on the changed size distribution of cars in the fleet (NRC 1992, 57).
5. In fact, fuel economy standards differ for new passenger cars and light trucks. Passenger cars must achieve 27.5 mpg (Davis 1995, 3-47), and light trucks, 20.7 mpg (NHTSA 1996).
ABBREVIATIONS

NHTSA - National Highway Traffic Safety Administration
NRC - National Research Council
REFERENCES

Belsley, D.A., E. Kuh, and R.E. Welsch. 1980. Regression Diagnostics. John Wiley and Sons, Inc.
Belsley, D.A. 1991. Conditioning Diagnostics, John Wiley and Sons, Inc.
Davis, S.C. 1995. Transportation Energy Data Book: Edition 15. ORNL-6856, Oak Ridge National Laboratory, Oak Ridge, Tenn., May.
Kahane, C. J. 1995. Relationships Between Vehicle Size and Fatality Risk in Model Year 1985-93 Passenger Cars and Light Trucks. Unpublished manuscript. National Highway Traffic Safety Administration, Washington, D.C., Oct.
Mason, R.L., R.F. Gunst, and J.L. Hess. 1989. Statistical Design and Analysis of Experiments, John Wiley and Sons, Inc.
Neter, J., W. Wasserman, and M.H. Kutner. 1990. Applied Linear Statistical Models (third edition), Richard D. Irwin, Inc.
NHTSA. 1996. NHTSA Sets Model Year 1998 Light Truck Fuel Economy Standard. NHTSA 17-96, U.S. Department of Transportation, March 29.
NRC. 1992. Automotive Fuel Economy: How Far Should We Go? National Academy Press, Washington, D.C.
Pendleton, O.J. 1996 (in press). Manual for Indirect Exposure Methodologies. DTFH61-93-00123. Federal Highway Administration, Washington, D.C., June.