Psychophysical Assessments of Image-Sensor Fused Imagery

INTRODUCTION

Sensor fusion combines images from multiple sensors into a single display, with the aim of enhancing operators' target detection and situational awareness in high-workload environments. Military and civilian applications include enhancing commercial airline pilots' ability to land in low visibility (Nordwall, 1997), increasing observers' ability to detect targets across weather and terrain conditions using multispectral sensors (McDaniel et al., 1998), enhancing helicopter pilots' night tactical terrain flight performance (Ryan & Tinkler, 1995), and improving airborne detection of vehicles (Bergman, 1996).

The results of human performance tests of sensor-fused imagery, however, have been equivocal. Some studies have found improved performance with fused imagery (e.g., Essock, Sinai, McCarley, Krebs, & DeFord, 1999; McCarley & Krebs, 2000; Toet, Ijspeert, Waxman, & Aguilar, 1997; Waxman et al., 1996), whereas others have not (Krebs, Scribner, Miller, Ogawa, & Schuler, 1998; Steele & Perconti, 1997). Although these discrepancies may be attributable to methodological inconsistencies among the studies, the potential benefit of image-sensor fusion is not overwhelmingly apparent.

Currently two main varieties of night-vision imaging sensors are in widespread use: image intensifiers (I²s), such as the military's night vision goggle, which amplify available light and near-infrared energy in a nighttime scene; and long-wave infrared (IR) sensors, also in common military use, which convert invisible thermal energy into a visual display. Objects viewed by I² and IR sensors generally have the same spatial characteristics but appear with dramatically different contrast levels (Krebs et al., 1998). Consequently, each may offer certain advantages and disadvantages that can be exacerbated or minimized according to environmental conditions.

For example, the resolution of the first-generation infrared sensor, the most common infrared sensor in military use, is generally poorer than that of I² sensors, with the result that background details of a visual scene are generally more visible in I² than in thermal images (Steele & Perconti, 1997). However, the thermal contrast between heat-emitting objects and their cooler surroundings is typically much greater than their luminance contrast, allowing such objects to be seen more clearly in a thermal image than in a visible-band image (O'Kane, Crenshaw, D'Agostino, & Tomkinson, 1992). Likewise, atmospheric conditions can affect these two sensors differently. For example, clouds that obscure moonlight and starlight will weaken the signal reaching an I² sensor and, in turn, reduce contrast within the I² image, but they will leave the thermal contrast between objects unaffected. Conversely, changes in ambient temperature may alter the distribution of thermal contrast across a scene while producing no concurrent change in illumination. Thus the quality and information content of the imagery produced by IR and I² sensors are constrained in materially different ways.

By combining the output of two or more sensors within a composite image, sensor fusion offers a potential means of overcoming the limitations inherent in single-band imagery (e.g., Aguilar et al., 1998; Das & Krebs, 2000; Scribner, Warren, Schuler, Satyshur, & Kruer, 1998; Therrien, Scrofani, & Krebs, 1997; Toet & Walraven, 1996; Waxman et al., 1997). Such processing could enhance the quality of electronically sensed images in at least two ways. First, fusion could simply combine the information conveyed by multiple input sources, allowing users to view multiple "images" within a single display and perhaps obviating the need to alternate between displays, whether electronically or via eye movements.
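As a rough illustration of this first, simpler form of fusion, the sketch below combines a co-registered image-intensifier frame and an IR frame into a single display by pixel-wise weighted averaging. This is a minimal sketch rather than the algorithm of any study cited above; the function name, the equal default weights, and the assumption that both inputs are grayscale arrays scaled to [0, 1] are illustrative choices.

```python
import numpy as np

def fuse_weighted(i2_img, ir_img, w_i2=0.5, w_ir=0.5):
    """Pixel-wise weighted average of two co-registered single-band images.

    Both inputs are assumed to be grayscale arrays of identical shape with
    values in [0, 1]. The output presents information from both bands in one
    frame, so a viewer need not shift gaze between separate I2 and IR displays.
    """
    if i2_img.shape != ir_img.shape:
        raise ValueError("input images must be co-registered to the same shape")
    fused = w_i2 * i2_img + w_ir * ir_img
    return np.clip(fused, 0.0, 1.0)

# Example with simulated 256 x 256 frames and equal band weights.
rng = np.random.default_rng(0)
i2 = rng.random((256, 256))  # stand-in for an image-intensifier frame
ir = rng.random((256, 256))  # stand-in for a long-wave IR frame
display = fuse_weighted(i2, ir)
```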

More intriguingly, fusion could augment the information conveyed by single sensors with emergent information not available in any of the input images singly but derived from the contrast between input images. …
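One way such emergent, between-band information can be made visible is to assign the two inputs and their signed difference to separate channels of a false-color display, so that regions where the bands disagree, such as warm but dimly lit objects, take on a hue that neither input could produce alone. The channel assignment below is an illustrative assumption, not the specific color-fusion scheme of any study cited here.

```python
import numpy as np

def fuse_false_color(i2_img, ir_img):
    """Map two co-registered bands and their contrast into one RGB image.

    Channel assignment (an illustrative assumption): R <- IR, G <- I2,
    B <- the IR - I2 difference rescaled from [-1, 1] to [0, 1]. Pixels
    where the sensors disagree acquire a distinctive hue, exposing
    between-band contrast that no single-band image contains.
    """
    diff = (ir_img - i2_img + 1.0) / 2.0  # signed difference -> [0, 1]
    rgb = np.stack([ir_img, i2_img, diff], axis=-1)
    return np.clip(rgb, 0.0, 1.0)
```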