Geographic Information Systems: Their Use in Environmental Epidemiological Research

While the mapping of health data is not new to epidemiologists, advances in geographic information system (GIS) technology provide new opportunities for epidemiologists to study associations between environmental exposures and the spatial distribution of disease. In addition to supporting ecologic studies, in which environmental exposure information is compared with disease rates across regions at the group level, GIS technology can be used to estimate the exposures of individuals in cross-sectional, case-control, and cohort studies. Often the most difficult, costly, and time-consuming aspect of environmental health studies is obtaining accurate exposure information. A GIS can combine information contained in existing databases with data that can be computerized to estimate, for example, the levels of agricultural pesticide exposure among individuals residing or working within defined geographic regions. The computerized estimates of exposure, together with information on the location and occurrence of disease among individuals within the regions, can then be used to suggest and support hypotheses regarding environmental causes of disease. So far, only a few studies incorporating GIS technology have been published in the epidemiologic literature. This is partly due to a lack of familiarity with the technology and partly due to limitations in its use for epidemiologic research.

The purpose of this paper is to provide an overview of some of the capabilities and limitations of GIS technology with regard to its use in environmental epidemiologic research; to illustrate, through practical examples, the use of several functions of a GIS, including automated address matching, distance functions, buffer analysis, spatial query, and polygon overlay; to discuss methods and limitations of address geocoding, often central to the use of a GIS in environmental epidemiologic research; and to emphasize the need for collaborative efforts among epidemiologists, biostatisticians, environmental scientists, GIS specialists, and medical geographers to realize the full potential of GIS technology in future epidemiologic studies.

What is a Geographic Information System?

Essentially, a GIS is a powerful computer mapping and analysis technology that allows large quantities of information to be viewed and analyzed within a geographic context. According to Antenucci et al., a GIS "links nongraphic attributes or geographically referenced data with graphic map features to allow a wide range of information processing and display operations as well as map production, analysis and modelling" (1). These techniques allow the health researcher to go beyond the simple mapping of disease rates within predetermined political boundaries (e.g., county, state).

GISs are used to input, store, manage, analyze, and display data. Many GIS experts believe that a true GIS differs from desktop mapping systems in that it contains a data structure that stores information about topology (i.e., the relationships among geographic features) (2). Certain methods of spatial analysis require a topological data structure, which allows concepts such as adjacency and connectivity, easily visible to humans, to be recognized by a GIS.
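As a rough illustration of what storing topology can mean, the following Python sketch derives adjacency among polygons by recording which regions share a boundary edge. The region names, coordinates, and function names are hypothetical and chosen only for illustration; this is not how any particular GIS implements its topological data structure.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical regions: each polygon is a list of (x, y) vertices.
# A and B share the edge from (1, 0) to (1, 1); C is isolated.
regions = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(3, 0), (4, 0), (4, 1), (3, 1)],
}


def edges(ring):
    """Return the undirected boundary edges of a polygon ring."""
    closed = ring + ring[:1]  # repeat the first vertex to close the ring
    return {frozenset(pair) for pair in zip(closed, closed[1:])}


def build_adjacency(polygons):
    """Map each region to the set of regions that share a boundary edge."""
    edge_owners = defaultdict(set)
    for name, ring in polygons.items():
        for edge in edges(ring):
            edge_owners[edge].add(name)

    adjacency = {name: set() for name in polygons}
    for owners in edge_owners.values():
        for a, b in combinations(sorted(owners), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


adjacency = build_adjacency(regions)
```

Once the adjacency table exists, a question such as "which regions border region A?" becomes a simple lookup rather than a geometric computation. The sketch assumes that neighboring polygons share identical vertices along their common boundary.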

Data Storage Formats

Data can be stored in a GIS in two ways: in raster format and in vector format. The raster format stores geographic data or graphic images as a matrix of evenly divided grid cells that contain values for an attribute. The position of the cell in the matrix provides information about location. Additional information about attributes is stored within each grid cell. Raster data can be scanned from maps or obtained from photographs or remote-sensing satellites. Satellite images and digital photos are examples of digital data stored in raster format.
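The way cell position encodes location can be sketched in a few lines of Python. The grid values, origin coordinates, cell size, and function names below are all assumptions made for illustration; they do not come from any real data set or GIS package.

```python
# Hypothetical raster of an attribute (e.g., an exposure index).
# Rows run north to south, columns run west to east.
grid = [
    [0.0, 1.2, 0.4],
    [2.5, 3.1, 0.0],
]
ORIGIN_X, ORIGIN_Y = 500000.0, 4200000.0  # assumed upper-left corner, map units
CELL_SIZE = 100.0                         # assumed cell width and height


def cell_for_point(x, y):
    """Return the (row, col) of the grid cell containing map point (x, y)."""
    col = int((x - ORIGIN_X) // CELL_SIZE)
    row = int((ORIGIN_Y - y) // CELL_SIZE)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("point lies outside the raster extent")
    return row, col


def value_at(x, y):
    """Look up the attribute value stored in the cell covering (x, y)."""
    row, col = cell_for_point(x, y)
    return grid[row][col]
```

No coordinates are stored per cell: the location of each value follows from the row and column indices together with the origin and cell size.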

Vector data consist of strings of coordinates and usually are represented in a GIS by three types of features: points, lines, or polygons (areas). …
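A minimal Python sketch of the three vector feature types, each represented as a string of coordinates, might look like the following. The class names and methods are hypothetical; the polygon area uses the standard shoelace formula, and planar coordinates are assumed.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float]


@dataclass
class Point:
    """A single location, such as a residence."""
    coord: Coord


@dataclass
class Line:
    """An ordered string of coordinates, such as a road segment."""
    coords: List[Coord]

    def length(self) -> float:
        # Sum the straight-line distances between consecutive vertices.
        return sum(math.dist(a, b) for a, b in zip(self.coords, self.coords[1:]))


@dataclass
class Polygon:
    """A closed ring of coordinates, such as a county boundary."""
    ring: List[Coord]

    def area(self) -> float:
        # Shoelace formula over the closed ring (planar coordinates assumed).
        closed = self.ring + self.ring[:1]
        twice_area = sum(
            x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(closed, closed[1:])
        )
        return abs(twice_area) / 2.0
```

For instance, `Line([(0, 0), (3, 4)]).length()` measures a single segment, and `Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]).area()` measures a square two units on a side.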