Comparative Analysis of Regression Output Summary Statistics in Common Statistical Packages

Several writers have recently reviewed statistical software for microcomputers and offered comments useful to both users and vendors (Goldstein 1991; Searle 1989). Some of these reviews are comprehensive in scope (Hoffman 1991). Others analyze specific program features or identify problem areas. For example, Uyar and Erdem (1990) found that SAS regression procedures such as REG and GLM can produce distorted summary statistics (such as R² and F) for the same equation specified in slightly different but algebraically equivalent ways. This contradiction appears to arise because SAS cannot check for and validate the internal consistency of all the regression options listed in the MODEL statement with respect to either the model specification or the input data.
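To make the kind of discrepancy at issue concrete, the following minimal sketch shows one familiar way in which the same fitted equation can yield different R² and F values: the total sum of squares may be taken about the mean of the dependent variable (centered) or about zero (uncentered), as typically happens when a no-intercept option is combined with an explicit column of ones. This mechanism is an assumption chosen for illustration; the data are synthetic, the function name `ols_summary` is hypothetical, and the example does not reproduce the Uyar-Erdem models, their data, or the internals of any package.

```python
import numpy as np


def ols_summary(X, y, centered):
    """Fit y = X b by least squares and return (R2, F).

    centered=True  -> total SS about the mean of y, model df = k - 1
    centered=False -> total SS about zero,          model df = k
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = float(resid @ resid)  # identical under both conventions
    if centered:
        sst = float(((y - y.mean()) ** 2).sum())
        df_model = k - 1
    else:
        sst = float((y ** 2).sum())
        df_model = k
    r2 = 1.0 - sse / sst
    f = ((sst - sse) / df_model) / (sse / (n - k))
    return r2, f


# Synthetic data, made up for illustration only (not the UE data).
rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=n)
y = 10.0 + 0.5 * x + rng.normal(size=n)

# One design matrix: a column of ones plus x. The coefficients and
# residuals are the same in both calls; only the summary convention differs.
X = np.column_stack([np.ones(n), x])

r2_c, f_c = ols_summary(X, y, centered=True)   # ones treated as an intercept
r2_u, f_u = ols_summary(X, y, centered=False)  # ones treated as a regressor

print(f"centered:   R2 = {r2_c:.4f}, F = {f_c:.2f}")
print(f"uncentered: R2 = {r2_u:.4f}, F = {f_u:.2f}")
```

Because the residual sum of squares is the same in both calls while the uncentered total sum of squares exceeds the centered one whenever the mean of y is nonzero, the uncentered R² is never smaller than the centered R², even though the fitted equation is unchanged.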

This article extends the Uyar-Erdem (UE) experiment in two directions (using their data and models): first, to PROCedure REG's STEPWISE and RSQUARE methods for regression model selection in SAS (SAS Institute Inc. 1988); and second, to regression methods in other widely used, general-purpose statistical software programs [SPSS (SPSS Inc. 1990), SYSTAT (Wilkinson 1990), BMDP (Dixon 1990), and STATPAC (Walonick 1990)] and econometric software programs [LIMDEP (Greene 1991), SHAZAM (White, Wong, Whistler, and Haun 1993), and Micro-TSP (Hall and Lilien 1988)]. Selection of the software listed in Table 1 and included in the analysis was guided by: (1) wide cross-disciplinary usage, (2) general availability, and (3) capacity to implement the regression options needed to accomplish the research objectives on an IBM-PC. Furthermore, as pointed out by one anonymous referee, (4) BMDP, SPSS, and SAS statistical programs are the classics and should be included in any meaningful comparative analysis; (5) SYSTAT qualifies for inclusion since it is one of the first serious packages available on the PC; and (6) the other packages tested (LIMDEP, SHAZAM, STATPAC GOLD, and Micro-TSP) are all popular econometric packages. Our results generally show that the regression problems UE identified also extend to all but four of the statistical and econometric software programs examined in this study.

Table 1. Software Package Information on Version Number,
Vendor, and 1993 List Price

SAS SYSTEM (Version 6.04)
  Available from SAS Institute, Inc., SAS Circle, Box 8000, Cary,
  NC 27512-8000. Tel. (919)677-8000. 1993 Academic Prices:
  $238 Base SAS (1 user), $190 Statistics Module (1 user). Eight
  other Modules available at $190 each. Site license on yearly
  basis. New Version 6.07 (PC SAS for Windows) is scheduled for
  release soon.
BMDP (Version PC90 Release)
  Available from BMDP Statistical Software, Inc., 1440 Sepulveda
  Blvd., Los Angeles, CA 90025. Tel. (310)479-7799. 1993
  Academic (Commercial) Prices: $495 ($2195) for 44 Modules.
  Minor upgrades free to current users, $295 charge for major
SPSS/PC + (Version 4.1)
  Available from SPSS, Inc., 444 N. Michigan Avenue, Chicago, IL
  60611. Tel. 1(800)543-5831 or (312)329-3500. 1993 Academic
  (Commercial) Prices: $195 ($195) for Base, $195 ($395) for
  Intermediate, $195 ($355) for Advanced Modules, $295
  Graphics, $395 MapInfo. Site license available. SPSS/PC +
  System upgrades 5.0 recently released.
SHAZAM: The Econometrics Computer Program (Version 7.0)
  Available from SHAZAM, Department of Economics, University
  of British Columbia, Vancouver, B.C., Canada V6R1Z1. Tel.
  (604)822-5062. 1993 Academic (Commercial) Prices: $295
  ($295) for the regular DOS version and $395 for the 386
  version. Site License available.
STATPAC GOLD (Version 4.2)
  Available from Walonick Associates, Inc., 3814 Lyndale Avenue
  S., Minneapolis, MN 55409. Tel. (612)822-8252. 1993 Prices:
  $795 Base, $495 Advanced Statistics, $50 Upgrades. 15%
  Academic Discount.
LIMDEP (Version 6.0)
  Available from Econometric Software, Inc. …