A Probability Sample of Gay Urban Males: The Use of Two-Phase Adaptive Sampling
Blair, Johnny, The Journal of Sex Research
The advantages of probability sample designs over convenience samples have been known since the 1930s (Neyman, 1934). Strictly speaking, in a probability sample every member of the target population has a known, nonzero chance of selection. This provides a statistical basis for projecting sample estimates back to the target population.
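As an illustration of why known selection probabilities matter, the sketch below weights each sampled observation by the inverse of its selection probability (the standard Horvitz-Thompson estimator of a population total). It is not taken from the study itself; the function name and the example figures are hypothetical.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total from a probability sample.

    Each sampled value is weighted by 1 / (its selection probability),
    which is only possible when every probability is known and nonzero.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    total = 0.0
    for y, pi in zip(values, inclusion_probs):
        if not 0 < pi <= 1:
            raise ValueError("each selection probability must be in (0, 1]")
        total += y / pi
    return total


# Hypothetical data: 1 = respondent belongs to the target group, 0 = does not.
# The first two respondents were sampled at a rate of 1 in 100, the third at 1 in 20.
sampled_values = [1, 0, 1]
selection_probs = [0.01, 0.01, 0.05]

estimated_total = horvitz_thompson_total(sampled_values, selection_probs)
print(estimated_total)
```

A convenience sample has no such selection probabilities, so there is nothing to place in the denominator and no weight that projects the sample back to the population.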
In studies of relatively rare populations that are also not easily identifiable, such as gay men, there are serious obstacles to the implementation of probability designs. The population's rarity means that large numbers of households must be contacted to locate the target sample, which greatly increases the survey's cost. The reluctance of many gay men to report their homosexuality in a survey, particularly at the very start of an interview, can cause severe undercoverage and potential bias. That is, many members of the population are never identified, and those who are identified may differ substantially in their characteristics from the target population as a whole.
These obstacles have often led to the use of extremely loose survey methods, such as sampling patrons of gay bars or members of gay rights organizations. These convenience samples provide no statistical basis for projecting back to the population of interest.
The major goal of the present study was to provide reliable (i.e., replicable) population estimates for the gay male population of four major cities. In order to implement the probability sample necessary to achieve this objective, a survey sample design was selected that took into account the likelihood of flaws in the data used for planning.
Typically, in a sample survey there is sufficient initial information available about the target population size and the sampling frame to specify completely such sample design parameters as stratification definitions, within-stratum sample sizes, sampling rates, and the geographic distribution of the sample. However, when the initial information is incomplete or may be unreliable, it can be useful to make these specifications only provisionally, with their final form dependent on information obtained during sample selection and data collection. When, in addition to these uncertainties, the target population is a small fraction of the general population, the risks to the study costs and the sample size ultimately achieved may be high. That is, errors in the initial assumptions may make the survey much more costly, resulting in fewer interviews being possible for a given fixed budget.
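The cost risk described above can be made concrete with a little arithmetic. The sketch below is a simplified cost model, not the one used in the study: it assumes every screening contact has a fixed cost, that a fraction of contacts equal to the prevalence yields an eligible respondent, and that each eligible respondent is interviewed at an additional fixed cost. All names and figures are hypothetical.

```python
def expected_interviews(budget, cost_per_screen, cost_per_interview, prevalence):
    """Expected number of completed interviews under a fixed budget.

    Simplifying assumptions (hypothetical, for illustration only):
    - every screening contact costs cost_per_screen;
    - a share `prevalence` of contacts is eligible and is interviewed;
    - each interview adds cost_per_interview on top of the screening cost.
    """
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    # Average cost of one contact, including the interview when one occurs.
    cost_per_contact = cost_per_screen + prevalence * cost_per_interview
    contacts_affordable = budget / cost_per_contact
    return contacts_affordable * prevalence


budget = 100_000          # hypothetical fixed budget
cost_per_screen = 5       # hypothetical cost of one screening call
cost_per_interview = 50   # hypothetical cost of one completed interview

planned = expected_interviews(budget, cost_per_screen, cost_per_interview, 0.10)
actual = expected_interviews(budget, cost_per_screen, cost_per_interview, 0.05)

print(round(planned))  # about 1000 interviews if prevalence really is 10%
print(round(actual))   # about 667 interviews if prevalence is only 5%
```

Under these assumed figures, planning on a prevalence twice the true value costs roughly a third of the expected interviews, which is the kind of shortfall that a provisional, adaptive specification is meant to guard against.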
Overestimation of the population prevalence from secondary sources occurs for two reasons. First, such sources usually only approximate the definition (or are indicators of the presence) of the target population; for example, the number of single males in a particular age range or the number of reported AIDS cases. Second, even when a source is a direct estimate of the target group, it is often not derived from a survey. Sources that are not based on survey data do not account for underreporting, that is, for the fact that some members of the target population will deny their eligibility.
Multiple secondary sources are often used, and these sources differ in their accuracy. Moreover, data from these sources must be combined into a single estimate of prevalence. This estimation requires some judgment by the researcher, who is, in effect, modeling prevalence. Because of these problems, an important area of sample design focuses on efficient methods for locating rare (or moderately rare) or elusive populations. Sudman, Sirken, and Cowan (1988) provide an excellent overview of many of these designs. Sudman (1985) gives a detailed analysis of the balance between costs and variances. One class of such sample designs takes advantage of the natural geographic clustering of the target population.
In this paper, the key aspects of a two-phase, adaptive sampling approach are described for a telephone survey of self-identified gay urban males. …