Sample Size and the Detection of Means: A Signal Detection Account
Anderson, Richard B., Doherty, Michael E., Memory & Cognition
Using statistical theory as a basis, Kareev (e.g., 1995) claimed that people's ability to correctly infer the existence of a population correlation should be greater for small than for large samples. Simulations by R. B. Anderson, Doherty, Berg, and Friedrich (2005) identified conditions favoring small samples but could not determine whether such an advantage was due to sampling skew, variance, or central tendency displacement. In the present study, we investigated theoretical effects of sample size (n) on the detection of population means under circumstances in which sampling variance is unconfounded with skew or central tendency displacement. The results demonstrate an extremely limited, criterion-specific, small-sample advantage that was attributable to n-related sampling variance and that occurred only with highly conservative, suboptimal criterion placement.
Numerous theorists have argued that a crucial first step in investigating the psychology of an organism is to gain insight into the environment in which the organism has evolved and to which it has adapted. Brunswik (1956, p. 119) used strong phraseology in describing the analysis of "textural ecology as a propaedeutic to functional psychology." He had previously exemplified this prescription with his work on the ecological validity of the Gestalt principle of proximity (Brunswik & Kamiya, 1953). Brunswik's call for the study of the ecology has been characterized as "psychology without a subject," emphasizing the importance of studying the environment (see, e.g., Doherty, 2001). An example of this notion was introduced by Simon (1969), who described the complex behavior of an ant on a beach as a "reflection of the complexity of the environment in which it finds itself" (p. 24). Further calls for the necessity of modeling the environment have been made by J. R. Anderson (1990), Shepard (1990), and Gigerenzer and Todd (1999). As Shepard (1990, p. 213) succinctly put it, "what we are beginning to discern about the mind looks very much like a reflection of the world."
QUANTIFYING DECISION ACCURACY
It is well known that for any given level of tolerance (α) for false positives, the power for detecting an effect increases monotonically with sample size. Thus, it is clear that large samples are better than small ones when decision accuracy is measured according to the prescriptions of null hypothesis testing. Within this prescriptive framework, it is assumed that there is a default and, thus, privileged hypothesis about some aspect of the environment. Moreover, this "null" hypothesis is rejected when α, the probability of incorrectly rejecting the null (which can be conceptualized as a false alarm rate), is less than or equal to some agreed-upon standard. The logic of null hypothesis testing typically assumes that the decision maker should be unwilling to accept a greater-than-α error rate, even though doing so would substantially increase statistical power (that is, the probability of rejecting the null hypothesis given that the hypothesis is false).
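The monotonic relation between power and sample size at a fixed α can be illustrated with a minimal sketch. It assumes a one-sided z test on a sample mean with known unit variance; the function name z_test_power, the effect size of 0.5 standard deviations, and the sample sizes are illustrative choices, not values taken from the present study.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided z test of H0: mu = 0 against mu = effect_size.

    Assumes (for illustration only) a known population SD of 1, so the
    test statistic is the sample mean multiplied by sqrt(n).
    """
    std_normal = NormalDist()
    # Critical z value that holds the false alarm rate at alpha.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the statistic is N(effect_size * sqrt(n), 1).
    return 1 - std_normal.cdf(z_crit - effect_size * n ** 0.5)


sample_sizes = (5, 10, 20, 40)
powers = [z_test_power(0.5, n) for n in sample_sizes]

for n, power in zip(sample_sizes, powers):
    print(f"n = {n:2d}  power = {power:.3f}")
```

With α held fixed, each increase in n raises the computed power, which is the sense in which large samples are preferable under the null hypothesis testing framework.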
In contrast to null hypothesis testing, signal detection theory (see Macmillan & Creelman, 1991) assumes that data have been sampled from either a signal population or a noise population. It is also assumed that each datum is characterized by a particular value on an evidence scale and that the level of the evidence variable is diagnostic as to whether the datum was drawn from a signal population or a noise population. The decision maker's task is to pick a particular critical value on the evidence scale. When the evidence variable exceeds the critical value, the stimulus is judged to be a signal; otherwise, it is judged to be noise.
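The decision rule just described can be sketched as follows, under the common equal-variance Gaussian assumption, which is adopted here only for illustration. The function name detection_rates, the population means, and the criterion values are hypothetical and are not taken from the present study.

```python
from statistics import NormalDist


def detection_rates(criterion, signal_mean=1.0, noise_mean=0.0, sd=1.0):
    """Return (hit rate, false alarm rate) for a given critical value.

    Assumes, for illustration, that evidence values are normally
    distributed with the same SD in the signal and noise populations.
    """
    signal = NormalDist(signal_mean, sd)
    noise = NormalDist(noise_mean, sd)
    # A datum is judged "signal" when its evidence exceeds the criterion.
    hit_rate = 1 - signal.cdf(criterion)
    false_alarm_rate = 1 - noise.cdf(criterion)
    return hit_rate, false_alarm_rate


for criterion in (0.0, 0.5, 1.5):
    hit_rate, false_alarm_rate = detection_rates(criterion)
    print(f"criterion = {criterion:.1f}  "
          f"hits = {hit_rate:.3f}  false alarms = {false_alarm_rate:.3f}")
```

Moving the criterion upward (a more conservative placement) lowers the hit rate and the false alarm rate together, so the criterion trades one kind of error against the other rather than fixing one of them in advance.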
For example, in research on recognition memory (for a review, see Yonelinas, 2002), subjects are typically presented first with a study list and then with a recognition list that contains both studied and nonstudied items. The task is to respond "yes" or "no" to each recognition item to indicate whether that item was or was not studied, respectively. …