Academic journal article Memory & Cognition

Ratio and Difference Comparisons of Expected Reward in Decision-Making Tasks

Article excerpt

Several models of choice compute the probability of selecting a given option by comparing the expected value (EV) of each option. However, a subtle but important difference between two common rules used to compute the action probability is often ignored. Specifically, one common rule type, the exponential rule, compares EVs via a difference operation, whereas another rule type, the power rule, uses a ratio operation. We tested the empirical validity of each rule type by having human participants perform a choice task in which either the difference or the ratio between the reward values was altered relative to a control condition. Results indicated that participants can compare expected rewards by either ratio or difference operations but that altering the ratio between EVs produces the most dramatic changes in behavior. We discuss implications for several related research fields.

(ProQuest: ... denotes formulae omitted.)

In experience-based choice tasks, such as the n-armed bandit task, a decision maker selects an option in order to maximize benefit and minimize cost. In the n-armed bandit task, the participant makes repeated choices from n different options, with the objective of maximizing total reward by selecting the options with the highest payoffs (Sutton & Barto, 1998). On each trial, the decision maker must decide whether to exploit the option that has been giving a good reward or to explore other options in order to gain new information about the environment. This task is analogous to many real-life decisions, such as whether to dine at a new restaurant or to stick with an option that has been successful in the past. Several of the learning models developed for this task assume that decision makers compare the expected value (EV) of each option when determining the response to select on the next trial (Busemeyer & Stout, 2002; Daw, O'Doherty, Dayan, Seymour, & Dolan, 2006; Sutton & Barto, 1998; Worthy, Maddox, & Markman, 2007). These models use a form of the biased choice rule (Luce, 1959, 1963; Shepard, 1957) to compute the probability of selecting each option. The probability of selecting option a is equal to the EV for option a divided by the sum of the EVs for all possible options.
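The ratio form of the choice rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `luce_choice_probabilities` is hypothetical, and the sketch assumes that all EVs are non-negative and that at least one is positive.

```python
def luce_choice_probabilities(expected_values):
    """Return P(option) = EV(option) / sum of all EVs.

    Assumption (not stated in the excerpt): EVs are non-negative
    and their sum is positive, so the result is a valid distribution.
    """
    if any(ev < 0 for ev in expected_values):
        raise ValueError("this sketch assumes non-negative expected values")
    total = sum(expected_values)
    if total <= 0:
        raise ValueError("at least one expected value must be positive")
    return [ev / total for ev in expected_values]


# Two options whose EVs stand in a 3:1 ratio.
probabilities = luce_choice_probabilities([3.0, 1.0])
print(probabilities)
```

With EVs of 3 and 1, the first option is chosen with probability 3 / (3 + 1), so the rule depends only on the relative sizes of the EVs.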

Action selection rules are ubiquitous in models of choice, and so it is important to understand their behavioral predictions and to determine whether human behavior conforms to the predictions of one selection rule or another. Action selection rules have been incorporated into models of choice behavior in animals (often in the form of the matching law; Corrado, Sugrue, Seung, & Newsome, 2005; Herrnstein, 1961; Lau & Glimcher, 2005; Sugrue, Corrado, & Newsome, 2004), similarity-based models of category learning (e.g., Maddox & Ashby, 1993; Medin & Schaffer, 1978; Nosofsky, 1986; Reed, 1972; Rodrigues & Murre, 2007), and many connectionist models of action selection (e.g., Kruschke, 1992; Minsky & Papert, 1968; Rumelhart, McClelland, & the PDP Research Group, 1986).

In this article, we focused on a subtle difference in the implementation of action selection rules in computational models that is largely ignored in the literature. Some models compare the ratio between the values representing each alternative via a power function, whereas other models compare the difference between the values representing each alternative via an exponential function. Despite this distinction between the models, decision rules are typically incorporated into models without explicitly choosing one rule over the other on the basis of its psychological properties. Although many researchers may understand the differences between a power rule and an exponential rule, there have been few systematic examinations of the effects of using either action selection rule in modeling applications (however, see Rodrigues & Murre, 2007). Furthermore, there have been few empirical tests of theoretical predictions derived from using either rule. …
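The distinction between the two rule types can be made concrete with a short sketch. Because the formulae are omitted from this excerpt, the code below assumes commonly used forms of each rule: a power rule in which each EV is raised to an exponent before normalizing, and an exponential (softmax) rule in which each EV is scaled and exponentiated. The function names and the parameters `gamma` and `theta` are hypothetical and may not match the article's notation.

```python
import math


def power_rule(expected_values, gamma=1.0):
    """Assumed power rule: P(a) = EV_a**gamma / sum_j EV_j**gamma.

    Requires positive EVs. Depends only on the ratios between EVs.
    """
    weights = [ev ** gamma for ev in expected_values]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_rule(expected_values, theta=1.0):
    """Assumed exponential rule: P(a) = exp(theta*EV_a) / sum_j exp(theta*EV_j).

    Depends only on the differences between EVs. Subtracting the
    maximum EV before exponentiating leaves the probabilities
    unchanged and avoids overflow.
    """
    largest = max(expected_values)
    weights = [math.exp(theta * (ev - largest)) for ev in expected_values]
    total = sum(weights)
    return [w / total for w in weights]


baseline = [2.0, 1.0]      # ratio 2:1, difference 1
same_ratio = [20.0, 10.0]  # ratio 2:1, difference 10
same_diff = [12.0, 11.0]   # ratio 12:11, difference 1

for label, evs in [("baseline", baseline),
                   ("same ratio", same_ratio),
                   ("same difference", same_diff)]:
    print(label, power_rule(evs), exponential_rule(evs))
```

Under these assumed forms, multiplying every EV by a constant leaves the power rule's probabilities unchanged but shifts the exponential rule's, whereas adding a constant to every EV does the opposite. This is the kind of contrast that manipulating either the ratio or the difference between reward values is meant to probe.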
