Journal of the American Statistical Association

Journal covers statistical science and its applications, theory, and methods in economic, social, physical, engineering and health sciences.

Articles from Vol. 95, No. 449, March

A Class of Weighted Log-Rank Tests for Survival Data When the Event Is Rare
In many epidemiological and medical follow-up studies, a majority of study subjects do not experience the event of interest during the follow-up period. An important example is the ongoing prostate, lung, colorectal, and ovarian cancer screening trial...
Additive Hazards Regression with Covariate Measurement Error
The additive hazards model specifies that the hazard function conditional on a set of covariates is the sum of an arbitrary baseline hazard function and a regression function of covariates. This article deals with the analysis of this semiparametric...
A Minimum Description Length-Based Image Segmentation Procedure, and Its Comparison with a Cross-Validation-Based Segmentation Procedure
Image segmentation is a very important problem in image analysis, as quite often it is a key component of a good practical solution to a real-life imaging problem. It aims to partition a digital image into a set of nonoverlapping homogeneous regions....
Analyzing a Randomized Cancer Prevention Trial with a Missing Binary Outcome, an Auxiliary Variable, and All-or-None Compliance
The Prostate Cancer Prevention Trial is a randomized chemoprevention trial designed to compare the effect of daily finasteride versus placebo on prostate cancer determined by biopsy. Investigators have scheduled a biopsy at the end of the trial in...
A Nonparametric "Trim and Fill" Method of Accounting for Publication Bias in Meta-Analysis
Meta-analysis collects and synthesizes results from individual studies to estimate an overall effect size. If published studies are chosen, say through a literature review, then an inherent selection bias may arise, because, for example, studies may...
Asymptotics for Analysis of Variance When the Number of Levels Is Large
We study asymptotic results for F tests in analysis of variance models as the number of factor levels goes to infinity but the number of observations for each factor combination is fixed. Asymptotic derivations of the type discussed in this article...
Capture-Recapture Models
1. INTRODUCTION Here I briefly review capture-recapture models as they apply to estimation of demographic parameters (e.g., population size, survival, recruitment, emigration, and immigration) for wild animal populations. These models are now also...
Causal Analysis in the Health Sciences
1. INTRODUCTION The final quarter of the twentieth century witnessed a burgeoning of formal methods for the analysis of causal effects. Of the methods that appeared in the health sciences, most can be identified with approaches to causal analysis...
Challenges Facing Statistical Genetics
1. INTRODUCTION The fields of statistics and genetics grew together during the twentieth century, and each faced a period of tremendous growth as the century ended. At the beginning of the twenty-first century, the need for statistical interpretation...
Computational Molecular Biology
1. INTRODUCTION Molecular biology is one of the most important scientific frontiers in the second half of the twentieth century. During this period, the basic principles of how genetic information is encoded in the DNA and how this information is...
Efficient Monte Carlo Methods for Conditional Logistic Regression
Exact inference for the logistic regression model is based on generating the permutation distribution of the sufficient statistics for the regression parameters of interest conditional on the sufficient statistics for the remaining (nuisance) parameters....
Environmental Statistics
1. INTRODUCTION The field of environmental statistics is relatively young. The term "environmetrics" was apparently introduced in a National Science Foundation proposal by Philip Cox in 1971 (Hunter 1994). During the last decade, the field has achieved...
Estimation and Inference for Logistic Regression with Covariate Misclassification and Measurement Error in Main Study/Validation Study Designs
In epidemiological studies, continuous covariates often are measured with error and categorical covariates often are misclassified. Using the logistic regression model to represent the relationship between the binary outcome and the perfectly measured...
Estimation of Discrete Distributions with a Class of Simplex Constraints
Simplex constraints, such as monotonicity and convexity or concavity on the probabilities of a set of discrete distributions, are useful for modeling and analyzing discrete data. This article considers both maximum likelihood estimation and Bayesian...
Extending the Scope of Wavelet Regression Methods by Coefficient-Dependent Thresholding
Various aspects of the wavelet approach to nonparametric regression are considered, with the overall aim of extending the scope of wavelet techniques to irregularly spaced data, to regularly spaced datasets of arbitrary size, to heteroscedastic and...
Functional Components of Variation in Handwriting
Functional data analysis techniques are used to analyze a sample of handwriting in Chinese. The goals are (a) to identify a differential equation that satisfactorily models the data's dynamics, and (b) to use the model to classify handwriting samples...
Genetic Susceptibility and Survival: Application to Breast Cancer
Inherited mutations of the BRCA1 and BRCA2 genes are known to confer an elevated risk of both breast and ovarian cancers. The effect of carrying such a mutation on survival after developing breast or ovarian cancer is less well understood. We investigate...
Inference from Dual Frame Surveys
In a dual frame survey, samples are drawn independently from two overlapping frames that together cover the population of interest. Several estimators for population totals in dual frame surveys are discussed and compared under a unified setup. We...
Inference with Imputed Conditional Means
In this article we present analytic techniques for inference from a dataset in which missing values have been replaced by predictive means derived from an imputation model. The derivations are based on asymptotic expansions of point estimators and...
Model Selection and Semiparametric Inference for Bivariate Failure-Time Data
We propose model selection procedures for bivariate survival models for censored data generated by the Archimedean copula family. En route to constructing the selection methodology, we develop estimates of some time-dependent association measures,...
Nonparametric Analysis of Randomized Experiments with Missing Covariate and Outcome Data
Analysis of randomized experiments with missing covariate and outcome data is problematic, because the population parameters of interest are not identified unless one makes untestable assumptions about the distribution of the missing data. This article...
REACT Scatterplot Smoothers: Superefficiency through Basis Economy
REACT estimators for the mean of a linear model involve three steps: transforming the model to a canonical form that provides an economical representation of the unknown mean vector, estimating the risks of a class of candidate linear shrinkage estimators,...
Receiver Operating Characteristic Methodology
1. INTRODUCTION Diagnostic medicine has progressed tremendously in the last several decades, and the trend promises to continue well into the next millennium. Advances in technology provide new methods for detecting disease or physical impairment....
Reference Bayesian Methods for Generalized Linear Mixed Models
Bayesian methods furnish an attractive approach to inference in generalized linear mixed models. In the absence of subjective prior information for the random-effect variance components, these analyses are typically conducted using either the standard...
Safe and Effective Importance Sampling
We present two improvements on the technique of importance sampling. First, we show that importance sampling from a mixture of densities, using those densities as control variates, results in a useful upper bound on the asymptotic variance. That bound...
Some Contributions of Statistics to Environmental Epidemiology
1. INTRODUCTION The field of epidemiology has come to rely particularly heavily on statistical methods because of its observational nature and the widespread acceptance of a complex "web of causation" as its conceptual basis. As much of modern chronic...
Some Issues in Assessing Human Fertility
1. INTRODUCTION While the human population continues to grow, depleting natural resources and reducing biodiversity, scientists have become concerned about our continued capacity to reproduce (perhaps a testament to the enduring value that we place...
Statistical Issues in Toxicology
1. INTRODUCTION Toxicology is "the study of the nature and mechanism of toxic effects of substances on living organisms and other biologic systems" (Lu 1996). Sometimes, data from human populations serve as the sentinel event indicating adverse...
Statistical Properties and Uses of the Wavelet Variance Estimator for the Scale Analysis of Time Series
Many physical processes are an amalgam of components operating on different scales, and scientific questions about observed data are often inherently linked to understanding the behavior at different scales. We explore time-scale properties of time...
Statisticians' Significance
Statisticians are often not recognized at a level appropriate to their contributions. Moreover, statistical methods of analysis may be receiving more recognition than statisticians themselves. This article considers several measures of the recognition...
Statistics in Animal Breeding
1. INTRODUCTION Genetic improvement programs for livestock aim to maximize the rate of increase of some merit function expected to have a genetic basis. Animals producing progeny with the highest expected merit are kept as parents of subsequent...
Statistics in the Life and Medical Sciences
One of the pleasures of working as an applied statistician is the awareness it brings of the wide diversity of scientific fields to which our profession contributes critical concepts and methods. My own awareness was enhanced by accepting the invitation...
Survival Analysis
1. INTRODUCTION Survival analysis concerns data on times T to some event; for example, death, relapse into active disease after a period of remission, failure of a machine component, or time to secure a job after a period of unemployment. Such data...
The Multiple-Try Method and Local Optimization in Metropolis Sampling
This article describes a new Metropolis-like transition rule, the multiple-try Metropolis, for Markov chain Monte Carlo (MCMC) simulations. By using this transition rule together with adaptive direction sampling, we propose a novel method for incorporating...
The Randomized Clinical Trial
1. INTRODUCTION The randomized clinical trial is among the most important methodological tools in biostatistics. Some have conjectured that it could be the most significant advance in scientific medicine in the twentieth century (Smith 1998). In...
Transitional Regression Models, with Application to Environmental Time Series
Environmental epidemiologists often encounter time series data in the form of discrete or other nonnormal outcomes; for example, in modeling the relationship between air pollution and hospital admissions or mortality rates. We present a case study...
Window Subsampling of Estimating Functions with Application to Regression Models
We propose a subsampling method for estimating the asymptotic standard error of a statistic $\beta_n$ that is the solution to an estimating equation $\frac{1}{n} \sum_{j=1}^{n} U_j(Y_j, X_j, \beta) = 0$ where the data $Y_j$...