Trends in Police Physical Ability Selection Testing
Hoover, Larry T., Public Personnel Management
Currently police agencies are employing variants of three basic forms of physical ability testing: job simulation exercises, physical agility and/or stamina tests, and norm referenced physical fitness or "wellness" tests. Although job simulation exercises superficially appear most defensible, they lack benchmark standards of minimal performance levels. Physical agility tests can be administered more economically, safely, and conveniently, but generally have substantial adverse impact. Norm referenced wellness tests are gaining in popularity because they solve some of the problems of both simulation exercises and physical agility tests, but are probably least defensible as directly job related. A dominant methodology has yet to emerge from either usage or court decisions.
Issues Relating to Physical Standards Validation
Adverse impact easily occurs with regard to the imposition of physical selection standards. Although such impact is not normally noted among racial groups, it does exist with regard to an important protected class, females. All that is necessary is that a prima facie case of discrimination be established. This can be done by evidence of statistical disparities alone. See Vanguard Justice Society, Inc. v. Hughes, 592 F.Supp. 245,255 (D.C. MD. 1984). Once adverse impact is established, it is up to the employer to demonstrate by a preponderance of the evidence that "any given requirement (has) a manifest relationship to the employment in question" in order to avert a finding of discrimination. There are various ways that plaintiffs can demonstrate that a particular requirement has an adverse effect. See Eison v. City of Knoxville, 570 F.Supp. 11, 33 FEP cases 1141 (D.C. Tenn., 1983); Pina v. City of East Providence, 492 F.Supp. 1240, 31 FEP 230 (D.C.R., 1980).
It is possible to prove that physical tests are job related to the police occupation. In Eison v. City of Knoxville, the police academy did demonstrate that the test, consisting of sit-ups, leg lifts, squat thrusts, pull-ups, and a two-mile run, was related to physical traits deemed necessary in police officers. However, similar cases have been lost. Most important is Burney v. City of Pawtucket, 559 F.Supp. 1089, 34 FEP 1274 BCR 1, (1983). The U.S. District Court examined a sex discrimination charge stemming from minimum physical training standards employed by the Rhode Island police training academy. A commission on standards and training fixed standards for admission to and graduation from the academy. The commission was also empowered and directed to establish minimum police training requirements. Rhode Island law requires the successful completion of the basic police training course before a recruit can be certified as a police officer. The district court examined both the pretest and post test minimum physical training requirements. It concluded that the standards had an adverse impact on women.
The pretest involved seven subparts, of which recruits were required to pass five. Recognizing that the physical abilities of women are different from men's, Rhode Island utilized a standard deviation below the mean for men as a cut off point for women. The defendants conceded that the cut off scores established for the pretest were fairly arbitrary, but defended them on the basis that they were also quite lenient.
The court noted that it was not blind to the obvious, the fact that physical decrepitude at some point must inhibit the proper performance of police duties. However, it found that there had been no effort to correlate performance on the physical test used at the academy and occupational performance. The academy defended its post test and graduation requirements on pragmatic arguments, such as "human experience" and "experience of police officers on the job." The rationale that physical training requirements are appropriate because they appear appropriate was found "convenient" by the court, but "totally lacking in legal merit."
It is certainly reasonable to assert that physical agility and stamina are rationally related to required on-the-job behaviors of police officers. However, there is an enormous gap between making that assertion and upholding a particular set of physical selection standards. In other words, there is a leap of faith between noting that physical exertion is a required part of a police officer's role and requiring that a specific number of sit-ups be completed within one minute. Thus, standard job task analysis data is insufficient to allow specific physical selection criteria to be developed.
This set of issues has significant import with regard to a decision to select a given type of physical ability examination. While everyone's desire is to screen out candidates who may be unfit for the police service, while having as little adverse impact as possible, there is great diversity of opinion with regard to the technique which best accomplishes this purpose. Schofield (1989) suggests that courts have been relatively lenient in this regard, recognizing the difficulties involved, and thus allowing law enforcement administrators considerable latitude. Nevertheless, court decisions do not provide clear guidance, and are actually frequently contradictory. It is not an overstatement to say that no other aspect of the personnel selection process is fraught with more difficulty or clouded with more ambiguity than physical ability testing. Specification of what physical abilities are actually required for a police patrol officer is difficult. Replication of actual required on-the-job behaviors, such as subduing a subject resisting arrest, is extraordinarily difficult.
There are in essence three fundamental forms of physical ability testing: job simulation exercises, physical agility and/or stamina tests, and physical fitness or "wellness" tests. Combinations of these may also, of course, be employed. Each has particular strengths and weaknesses with regard to the validation issues previously discussed.
Job Simulation Exercises
At first, job simulation exercises appear to be the best from a validation perspective. However, there are serious difficulties with regard to such examinations for the role of police officer. A job simulation exercise is a physical agility or stamina test which either replicates or simulates as closely as possible actual required on-the-job behaviors. However, tasks which are extremely important in performing the role of police may not be usable pragmatically or safely as a personnel testing operation. Tasks chosen for inclusion in job simulation tests must be safely performed by untrained applicants. Further, they must be practical to administer in terms of time, personnel, and equipment necessary. Finally, it is important that they be reasonably close to actual field conditions in the simulation setting. Given these constraints, agencies have not found it easy to develop reasonable simulation of actual on-the-job required police physical exercises.
Some success in this regard has been realized by governmental agencies testing for the role of fire fighter. The simulations include such exercises as ladder handling, a roof climb, a balance walk, a hose advance, and a dummy carry. Even for this role, however, problems are encountered. For example, the City of Dallas personnel division found that they were able to test for only fourteen of twenty-eight physical tasks included in the role of fire fighter, or 50 percent of the required on-the-job behaviors (Considine, 1976). In a study addressing the role of correctional officers, six exercises were developed, but several involved generalized physical abilities, such as simple stamina (Hughes, et al., 1989). Specific simulation of "interactive" physical tasks is very difficult.
One of the most extensive efforts to develop job simulation physical exercises was undertaken by the Michigan Law Enforcement Officers' Training Council. Between 1979 and 1983 the Michigan Council engaged in a multi-year, $200,000 funded effort to validate job simulation physical selection exercises. Documentation regarding this effort is available in a series of reports from the Michigan Council. Some success was achieved, but with serious limitations.
The Michigan effort involved a job analysis of police physical skill requirements independent of a general job task analysis. The approach used there involved the distribution of a physical activity questionnaire to officers working in the field. Whenever officers engaged in an incident requiring physical activity, they completed the questionnaire form with regard to that individual incident.
The physical skills required of law enforcement officers were grouped under the auspices of that project into two categories: athletic and defensive. Athletic skills included: lifting/carrying, dragging/pulling, pushing, climbing, running, jumping, and crawling. The defensive skills did not fit into as neat a categorical listing, but were grouped generally under force used by officer (handcuffed, wrestled, applied restraining holds, hit/kicked, displayed firearm, unspecified action, nightstick/blackjack, chemical agent, discharged firearm) and evasive maneuvers (push/shove, pull, block, duck/dodge, unspecified). The Michigan POST Commission was able to develop job simulations for certain of the athletic skills, including running, lifting/carrying, dragging/pulling, pushing, climbing, and jumping. They were less successful in developing simulations of the defensive skills.
Unfortunately, it is the defensive skills which are the most critical. The notion that training in judo, karate, or variations of these martial arts can be substituted for physical strength and stamina is a myth (Guyor, 1974). Certainly elements of these techniques are applicable to police defensive tactics training. However, certain minimal physical abilities are required to subdue a subject resisting arrest.
An earlier effort by the California Highway Patrol had yielded a job simulation test battery which included portions of the physical abilities relevant to defensive skills. The job related physical tasks included a barrier surmount and arrest simulation. The barriers replicated a freeway center traffic divider and the chain link fence bordering California freeways. The test simulated pursuing a suspect over two walls (4'10" and 6'), apprehending and handcuffing the suspect, and returning over both walls to the patrol car. Rather than actually subdue a suspect, applicants had to manipulate an arrest resistor device, involving pulling together and holding two "arms" exerting 40 pounds of resistance left and 65 pounds right. Interestingly, the simulation exercises correlated better with supervisor ratings of job performance than general fitness tests which had been administered earlier (Wilmore and Davis, 1979).
It is thus possible to develop certain job related simulation tests which are defensible. However, they are difficult to administer. More importantly, they tend to measure only a portion of the physical ability demands of the role of police patrol officer. One may well be able to drag a 200 pound dummy for twenty feet, but be incapable of subduing a subject resisting arrest. Indeed, Summers (1985) makes a clear distinction between physical fitness and physical agility: "Physical fitness is a status measuring physical health. It may be measured by cardiovascular endurance, dynamic strength, or muscular endurance, flexibility, and body composition. It is not physical agility, which is a learned ability to perform a specific motor-related task" (p. 13). The degree to which job simulation exercises represent skills which might be taught in an academy is open to considerable debate.
An additional problem is that certain simulations may be conducted which replicate in essence the required job behavior, but lack a benchmark standard of performance. For example, the data gathered by the author in a New York statewide study indicates that a police officer ought to be able to run at least a quarter of a mile. However, there is no standard of performance set with regard to how rapidly the distance needs to be covered. At one extreme a standard of one minute would certainly be seen as unreasonable; we have very few four minute milers as police applicants. At the other extreme a three minute standard is only a brisk walk, clearly not sufficient to capture a fleeing suspect. Establishing a specified figure within what most individuals would regard as a reasonable range - one to three minutes - is, however, problematic. There is absolutely nothing in job task analysis methodology which will give us guidance in this respect. At some point one must simply fall back upon the concept of "reasonableness", which is generally premised upon population norms. That is, it is reasonable to assert that a police officer ought to be able to keep up with an average person fleeing on foot. If that average pace is an eight minute mile, which data from the Aerobics Institute in Dallas indicates that it is, then it can be regarded as reasonable to require police applicants to run a quarter of a mile in two minutes. Nevertheless, one must recognize that there is some departure from the job task data itself in establishing that standard.
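The pace arithmetic behind these figures can be sketched briefly. The following is only an illustration, assuming a constant running speed over the whole distance; the function names are hypothetical and do not come from any of the studies cited.

```python
def quarter_mile_seconds(mile_pace_minutes: float) -> float:
    """Seconds needed to cover a quarter mile at a constant mile pace."""
    return mile_pace_minutes * 60.0 / 4.0


def mile_pace_minutes(quarter_mile_secs: float) -> float:
    """Mile pace (in minutes) implied by a quarter-mile time at constant speed."""
    return quarter_mile_secs * 4.0 / 60.0


# The three quarter-mile standards discussed in the text, in seconds.
standards = [("one minute", 60), ("two minutes", 120), ("three minutes", 180)]
for label, secs in standards:
    pace = mile_pace_minutes(secs)
    print(f"quarter mile in {label}: {pace:.0f}-minute mile pace")
```

On this arithmetic the one minute standard corresponds to a four minute mile, the three minute standard to a twelve minute mile, and an eight minute mile to a quarter mile in two minutes. The calculation only converts units; it says nothing about which point in the range is the defensible cut-off.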
The standards ultimately implemented by the Michigan POST Commission are premised upon job simulation exercises but employ a relative standard of performance (see Table 1). The percentile performance levels represent those exhibited by all applicants to date since the physical agility screening program was implemented. Percentile scores are converted to stanines. The minimum passing score of 30 represents "average" performance, i.e., the 50th percentile score averaged across six exercises. The cut-off is thus set relative to the average performance of all police applicants, differentiated by male/female categories, rather than some absolute standard. Michigan simply found it impossible to identify an absolute standard which could be defended in litigation. [Tabular Data I Omitted]
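Because Table 1 is omitted here, the exact scoring procedure cannot be reproduced. One reading consistent with the figures in the text is that each of the six exercises yields a stanine and the six stanines are summed, so that an average stanine of 5 on every exercise gives 6 x 5 = 30. The sketch below illustrates that reading only. The stanine boundaries are the conventional ones (cumulative 4, 11, 23, 40, 60, 77, 89, and 96 percent); the summing rule, the treatment of boundary values, and all names are assumptions.

```python
from bisect import bisect_right

# Conventional stanine boundaries expressed as cumulative percentile ranks.
# A percentile falling exactly on a boundary is placed in the higher stanine
# (an assumption; conventions differ at the boundaries).
STANINE_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]

# Assumed reading of the text: six stanines summed, minimum total of 30.
PASSING_SCORE = 30


def percentile_to_stanine(percentile: float) -> int:
    """Map a percentile rank (0-100) to a stanine (1-9)."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    return bisect_right(STANINE_BOUNDS, percentile) + 1


def composite_score(percentiles: list[float]) -> int:
    """Sum of stanines across exercises (hypothetical scoring rule)."""
    return sum(percentile_to_stanine(p) for p in percentiles)


def passes(percentiles: list[float]) -> bool:
    """True when the summed stanines reach the assumed passing score."""
    return composite_score(percentiles) >= PASSING_SCORE


# Hypothetical applicants, each scored against his or her own gender norms.
average_applicant = [50, 50, 50, 50, 50, 50]
uneven_applicant = [80, 20, 50, 50, 50, 50]
print(composite_score(average_applicant), passes(average_applicant))
print(composite_score(uneven_applicant), passes(uneven_applicant))
```

If the scores are in fact summed in this way, a weak result on one exercise could be offset by a strong result on another. Whether the Michigan standards permit such compensation cannot be determined from the text alone.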
Physical Agility Tests
Physical agility/stamina tests offer two distinct advantages over job simulation tests. First, they can be administered more economically, more safely, and more conveniently. One does not need specialized equipment or facilities. Second, they avoid the co-mingling of physical abilities per se and job related skills that occurs in simulation exercises. (For example, certain techniques with regard to a dummy drag work better than others. Someone who has either been trained in or has practiced the exercises may perform better even though their level of physical stamina is lower.)
Physical agility examinations usually consist of a set of exercises designed to measure general physical strength and stamina. Upper body strength may be measured by a combination of push-ups, pull-ups, weight lifts, or rope climb. Abdominal strength is usually measured using sit-ups or a variant thereof. General agility may be measured by a wall climb, confined crawl, balance walk, and broad jump. Stamina is usually measured by a timed run. In combination, these tests are certainly a reasonable measure of general physical condition.
There is, however, a significant problem with regard to using such tests as an employment selection device. It is one thing to assert that police officers must perform certain tasks which involve physical strength, agility, and stamina. It is wholly another to translate these required job behaviors to specific cutoff scores on physical agility examinations. Will ten push-ups suffice to qualify one to perform the tasks of a police officer requiring upper body strength, or will it take fifteen? How fast must one crawl through a confined space? How far must one be able to jump? The basic validation problem encountered here is that of construct validity. Physical agility tests do not measure direct simulations of required on-the-job behaviors, but rather a construct, general physical condition. There is some evidence which indicates that there is relatively little relationship between performance on such traditional measures of physical condition and performance on functional tasks analogous to those performed by police (Considine, 1976).
Complicating the situation is the fact that females perform substantially less well than males on traditional physical agility examinations. In particular, females have substantially less upper body strength than males. As a consequence, the adverse impact is enormous. Passing rates for females on such tests do not even approach 80 percent of the passing rates for males, the standard set by EEOC. Given the circumstance, courts are prone to require conclusive demonstration that (1) such tests are an absolute business necessity, and (2) there is no alternate screening technique. As a rule, it has been general physical agility tests which have fared the worst when challenged in litigation.
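The 80 percent standard compares the selection rate of one group with that of the group having the highest rate. A minimal sketch of the comparison follows. The applicant counts are hypothetical and chosen only for illustration, and the function names are not drawn from any agency's procedure.

```python
def selection_rate(passed: int, tested: int) -> float:
    """Proportion of tested applicants who passed."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return passed / tested


def impact_ratio(focal_rate: float, reference_rate: float) -> float:
    """Focal group's rate as a fraction of the highest group's rate."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return focal_rate / reference_rate


def shows_adverse_impact(focal_rate: float, reference_rate: float,
                         threshold: float = 0.8) -> bool:
    """True when the impact ratio falls below the four-fifths threshold."""
    return impact_ratio(focal_rate, reference_rate) < threshold


# Hypothetical counts, for illustration only.
male_rate = selection_rate(passed=90, tested=100)
female_rate = selection_rate(passed=30, tested=100)

print(f"impact ratio: {impact_ratio(female_rate, male_rate):.2f}")
print("adverse impact indicated:", shows_adverse_impact(female_rate, male_rate))
```

A ratio below 0.8 is generally treated as evidence of adverse impact. As the cases discussed earlier indicate, statistical disparity alone can establish the prima facie case, after which the burden of demonstrating job relatedness falls on the employer.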
Physical Fitness Norms
A relatively new approach to setting physical standards is to employ norm referenced physical fitness measures. The test consists of a series of physical exercise measures which closely parallel most physical agility/stamina tests, although there is some variation. The Institute for Aerobics Research in Dallas, Texas developed this technique. The Aerobics Institute was founded by the well known author on the subject of physical fitness, Kenneth Cooper. The Aerobics Institute has developed a series of tests measuring five elements of physical fitness. They are:
1. Cardiovascular endurance norms
- 1.5 mile run (time)
2. Body fat norms
- based upon skin folds
3. Flexibility norms
- sit and reach test
4. Abdominal muscular endurance norms
- one minute sit-up test
5. Relative strength norms
- one repetition maximum bench press (ratio score is weight pushed in pounds divided by body weight in pounds).
By testing a large number of individuals in terms of performance on these tests, the Aerobics Institute has published a set of performance standards in terms of age and gender. The norms are published in terms of a percentile performance level for a given gender and age group. Thus, one can ascertain what the 50th percentile performance level is (half the population above, half the population below) for males age thirty to thirty-nine.
The Aerobics Institute asserts that the tests they have devised constitute a measure of general physical fitness, or well-being. Indeed, the tests are often referred to as "wellness measures". One might argue that this is merely making a semantic distinction between the tests developed by the Aerobics Institute and traditional physical agility and stamina measures which were intended to ascertain one's physical condition. The Institute asserts, however, that there is a clear distinction.
Using the Aerobics Institute norm referenced test offers several advantages from a validation perspective. First, it can be argued that adverse impact will not exist. Although data differentiated by race does not exist, one would not anticipate any adverse impact in this regard. Since the data are differentiated by age and sex, adverse impact is by definition eliminated in this regard. The argument stated simply is that since everyone is being compared only to their age and gender category, adverse impact cannot exist. The contrary argument is that norm referenced standards do discriminate, since some individuals are selected in terms of lower absolute criteria than others. Thus, litigation could ensue with regard to reverse discrimination. Nevertheless, in the totality of issues involved, using the norm referenced standards appears to have the least potential for engendering civil rights litigation.
The tests are clearly not as job related as simulation tests might be. However, one can argue that there is a prima facie relationship. If it is documented that police officers do have to run as part of their job duties, that they have to push heavy objects, that they have to lift and carry heavy objects, and that they have to be reasonably agile in order to engage in personal defense activities, documented linkage will exist. The Aerobics Institute tests measure these types of characteristics. A more problematic issue would appear to be that of establishing a cut off score.
The percentile performance levels published by the Aerobics Institute are derived from a particular reference group - those individuals who have appeared at the Dallas center for assistance in achieving physical fitness. Whether the standards are representative of national norms is open to considerable debate. Data gathered by the Illinois Local Governmental Law Enforcement Officers Training Board, the Illinois POST Commission, indicates that the Institute's norms may be high. Thus, one must be very cautious in selecting a required percentile level on the basis of current published data.
An additional issue in this regard is which percentile level to select, regardless of the accuracy of the normative data. The natural inclination appears to be to use the 50th percentile level as a standard. The selection of this level is premised simply upon a test of reasonableness, i.e., it is reasonable to assert that police officers ought to be as physically fit as the average of the population. One can offer arguments, however, that as a selection standard the percentile level ought to be set lower, with the intent that conditioning during a basic training academy will raise the standard of performance.
The Illinois POST Commission implemented such a two-tiered system in July 1987 (I.L.G.L.E.O.T.B., 1988). Police applicants are required to perform at the 30th percentile level across five of the tests promulgated by the Aerobics Institute. By the completion of academy training, recruits must perform at the 50th percentile level. Documentation is available from the Illinois Local Governmental Law Enforcement Officers Training Board regarding their program.
Further, use of the Aerobics Institute norms was recently upheld in U.S. v. City of Wichita Falls, 704 F. Supp. 709 (N.D. Tex., 1988). Wichita Falls employed the Aerobics Institute battery as an initial screening device. The court opinion specifically noted both the issue of the representativeness of the Aerobics Institute sample, and the use of the 50th percentile level as a cut-off. The difficulty of offering incontrovertible defense regarding both issues was acknowledged in upholding the test.
However, consistent with maintaining the ambiguity which has beset this area for years, the U.S. Seventh Circuit Court of Appeals just remanded a case involving relative standards. The Evanston Fire Department employed a physical agility test which requires performance within one standard deviation of the mean. The Seventh Circuit criticized the City for employing a relative standard, since "the ability to perform fire fighting tasks adequately depends not on relative but on absolute test performance." Evans v. City of Evanston, Docket Nos. 88-2928, 88-2995, July 27, 1989. Thus, administrators are left once again with "Who knows?"
Another problem, which has surfaced as a result of the Illinois experience, should be noted at this point. It appears that litigation will ensue between the City of Chicago and the Illinois POST Commission. The City of Chicago asserts that the physical fitness standards, although not discriminatory in intent or direct application, have an aggregate adverse impact upon their efforts to recruit members of protected classes. Specifically, the City of Chicago asserts that the standards exclude by definition a high proportion of the population, either 30 percent or 50 percent depending upon one's perspective. The City further asserts that this exclusionary effect hinders the recruitment of members of protected classes who would otherwise meet minimum selection criteria. The City maintains, of course, that the physical fitness standards may be job related, but are not necessarily job requirements.
Yet another problem with the Aerobics Institute standards is that the norm referenced data is not aggregated. That is, the Aerobics Institute has gathered performance data with regard to the ability of the general population to perform each given task. However, there is no data which tells us what proportion of the population can pass all of the tests involved, or any subset, at any given percentile level. Thus an individual might be able to perform the walk/run at the 50th percentile level, but only be able to perform the sit-up test at the 30th percentile level. Again, preliminary data from the State of Illinois indicates that considerable attrition occurs whenever a percentile level is stipulated for a set of tests. Thus, if the 30th percentile level is set for each test, 70 percent of the population can pass any single test, but a considerably smaller proportion will be able to pass all five tests at that level. An individual who is agile in one respect may not be equally agile in terms of the other tests. Indeed, one could argue that several different tests would not be necessary if they were all measuring the same thing; one could simply implement any one of them. The Aerobics Institute asserts, rightfully so, that several tests are required to measure general level of fitness. Thus variation is certainly expected.
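The size of this attrition can be bounded with a simple calculation. The sketch below is illustrative only. It assumes, purely for the sake of a lower bound, that performance on each test is statistically independent of performance on the others. That assumption is certainly too strong, since the tests are likely to be positively correlated, so the true proportion passing all tests should fall somewhere between the two figures computed. The function names are hypothetical.

```python
def single_test_pass_rate(percentile_cutoff: float) -> float:
    """Share of the norm group at or above the given percentile on one test."""
    if not 0 <= percentile_cutoff <= 100:
        raise ValueError("percentile_cutoff must be between 0 and 100")
    return 1.0 - percentile_cutoff / 100.0


def joint_pass_rate_if_independent(percentile_cutoff: float, n_tests: int) -> float:
    """Share passing every test, ASSUMING the tests are independent."""
    return single_test_pass_rate(percentile_cutoff) ** n_tests


for cutoff in (30, 50):
    # If the tests were perfectly correlated, passing one would mean passing all.
    upper = single_test_pass_rate(cutoff)
    # If the tests were unrelated, the single-test rates would multiply.
    lower = joint_pass_rate_if_independent(cutoff, n_tests=5)
    print(f"{cutoff}th percentile on five tests: "
          f"between {lower:.1%} and {upper:.1%} pass all five")
```

Under the independence assumption, roughly 17 percent of the norm group would pass all five tests at the 30th percentile level, against 70 percent for any single test. Only aggregated data of the kind the Illinois experience is beginning to supply can show where in that range the actual figure lies.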
In summary, like the other two types of physical ability testing described, norm referenced standards have both distinct advantages and disadvantages. Their implementation should not be seen as a quick cure-all for the problems besetting the promulgation of physical ability standards. While some problems are solved, others are created.
Each of the three types of physical ability testing approaches is currently being used in the field. Each has strengths and weaknesses. Each has withstood some court challenges, while being declared invalid in others. While some agencies are experimenting with combinations of the techniques, hybrid approaches do not in and of themselves solve all the validation problems. A dominant model has yet to emerge from the physical agility testing chaos created by the imposition of validation requirements.
References
Burney v. City of Pawtucket, 559 F.Supp. 1089, 34 FEP 1274 BCR 1, (1983).
Considine, W. et al. Developing a physical test battery for screening fire fighter applicants. Public Personnel Management, 5, 7-17, 1976.
Eison v. City of Knoxville, 570 F.Supp. 11, 33 FEP 1141 (D.C. Tenn., 1983).
Equal Employment Opportunity Commission. Uniform guidelines on employee selection procedures. Washington, D.C.: GPO, 1978.
Evans v. City of Evanston, Docket Nos. 88-2928, 88-2995, July 27, 1989.
Guyor, J.R. Spokane learns from experience: A police officer physical fitness program in which men and women compete equally. Public Personnel Management, 3, 10-18, January-February, 1974.
Hughes, M.A., R.A. Ratliff, J.L. Purswell and J. Hadwiger. A content validation methodology for job related physical performance tests. Public Personnel Management, 18, 4, 487-504, Winter, 1989.
Illinois Local Governmental Law Enforcement Officers Training Board. Facts about Illinois law enforcement physical fitness standards and programs. Springfield: ILGLEOTB, 1988.
Michigan Law Enforcement Officers Training Council. A job analysis of police physical skills requirements. Lansing: M.L.E.O.T.C., 1979.
Pina v. City of East Providence, 492 F.Supp. 1240, 31 FEP 230 (D.C. R., 1980).
Schofield, D.L. Establishing health and fitness standards: Legal considerations. FBI Law Enforcement Bulletin, 58, 25-31, June, 1989.
Summers, W.C. Title VII challenges to physical fitness requirements. The Police Chief, 13, February 1985.
United States v. City of Wichita Falls, 704 F.Supp. 709 (N.D. Tex., 1988).
Vanguard Justice Society v. Hughes, 592 F.Supp. 245, 255 (D.C. MD., 1984).
Wilmore, J.H. and J.A. Davis. Validation of a physical abilities field test for the selection of state traffic officers. Journal of Occupational Medicine, 21, 33-40, January, 1979.
Larry T. Hoover
Professor Hoover has extensive experience in law enforcement selection and certification. He has directed police job analysis projects in three states, Illinois, New York, and Texas, which provided the foundation for state licensing examinations. A past president of the Academy of Criminal Justice Sciences, he has several publications addressing police personnel issues.…