The Use of the Range and Mean Deviation in Interpreting the Standard Deviation

By Rhiel, G. Steven | Akron Business and Economic Review, Fall 1990

With the emergence of sophisticated management information systems (MIS), today's businessmen and women have more access to statistical data than ever before. Many of these individuals have difficulty interpreting and, consequently, using the statistics provided on computer printouts. The standard deviation(1) is one of these statistics. Although it is commonly used, it is difficult to interpret because of its mathematical complexity.

Lay people, in particular, have a difficult time understanding the standard deviation.(2) For example, the manager of a credit union who receives reports that contain the mean and standard deviation of loan amounts may have problems explaining these statistics to members of the credit union governing board. If in a particular report the mean and standard deviation of the loans were $5000 and $400, respectively, the average loan of $5000 could be easily understood by the board members, but the standard deviation of $400 may not, even though it may be very important in analyzing the overall loan situation. One can imagine many similar situations in business where an intuitive understanding of the standard deviation would be useful.

One method to facilitate the interpretation of the standard deviation is to relate it to the range. For an infinite, normal population, three standard deviation units above and below the mean encompass 99.73 percent of the distribution, which results in the range equaling approximately six standard deviations. A similar technique for interpreting the standard deviation for a sample would be very useful, since the sample standard deviation is commonly found on business reports.
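The 99.73 percent figure can be checked directly from the normal distribution function. The following is a minimal sketch in Python using only the standard library; the function name `normal_coverage` is illustrative and does not come from the article.

```python
import math


def normal_coverage(k: float) -> float:
    """Proportion of a normal population within k standard deviations of the mean."""
    # For Z ~ N(0, 1), P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


coverage_3sd = normal_coverage(3.0)
print(f"Within three standard deviations: {coverage_3sd:.4%}")
```

The result is roughly 99.73 percent, which is why six standard deviations are often treated as spanning the whole of an infinite normal population.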

According to McNemar [2], for samples from the normal distribution, the range is equal to five standard deviations when n (the sample size) is 50, six standard deviations when n is 200, and seven standard deviation units when n is 1000. Pearson and Stephens [4] provide 12 ratios for the normal distribution (mathematically derived) for sample sizes from three to 100. For a sample size of fifty, Pearson and Stephens' ratio is 4.4212 compared with McNemar's 5.000.

Information on the relationship (or ratio) of the range to standard deviation for samples from non-normal distributions is scarce. Baker [1] compared ratios from the normal distribution to ratios from the platykurtic-bimodal and skewed-bimodal distributions for sample sizes 64 and 100. He found that the ratios for the platykurtic-bimodal distribution are smaller than those for the normal distribution: 4.1349 versus 4.8272 when n is 64 and 4.4864 versus 5.1214 when n is 100. For samples from the skewed-bimodal distribution, the ratios differ minimally from those for the normal distribution. Additional research to determine the extent to which these ratios vary when the population shape becomes non-normal may be critical in establishing the range as an aid in explaining the standard deviation.

Another approach to interpreting or explaining the standard deviation (S) is to describe it in terms of the mean deviation (MD). The mean deviation is easily understood since it is the average of the absolute deviations about the mean or, more plainly stated, the average amount the observations vary from the mean. Because of its simplicity, if the sample mean deviation is approximately equal to the sample standard deviation, regardless of the sample size or population shape, the meaning of the standard deviation to lay people would be clearer.

McNemar [2] states that the relationship of the mean deviation to the standard deviation for the normal distribution is MD = .798S. However, no research is available to substantiate whether this is true for all sample sizes, whether from normal or non-normal distributions.
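For a normal population, the constant in this relationship has a closed form: the expected absolute deviation from the mean equals the standard deviation times the square root of 2/pi. The short Python sketch below evaluates that constant. It describes the population, not any particular sample, and the variable name is illustrative.

```python
import math

# For a normal population, E|X - mu| = sigma * sqrt(2 / pi).
md_over_sigma = math.sqrt(2.0 / math.pi)
print(round(md_over_sigma, 3))  # 0.798
```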

The purpose of this study is to determine the relationship of the range (W) and the mean deviation (MD) to the standard deviation (S) for various sample sizes from various shaped distributions. These relationships are used to facilitate the interpretation or explanation of the standard deviation.

COMPUTER SIMULATION

The relationship of the sample range and sample mean deviation to the sample standard deviation is investigated by using computer simulation techniques. (Computer simulation is used because it is mathematically infeasible to derive these relationships theoretically.) Sampling distributions for the sample range (W), the sample mean deviation (MD), and the sample standard deviation (S) for various sample sizes from each of nine distributions are computer generated. One thousand simulations are used to build each sampling distribution.

These sampling distributions are generated for sample sizes two to 10 by increments of two, sample sizes 10 to 60 by increments of 10, sample sizes 60 to 100 by increments of 20, sample sizes 200 to 1000 by increments of 150, and for sample sizes 1500 and 2000. The ratios of the mean sample-range ($\bar{W}$) and mean sample-mean-deviation ($\overline{MD}$) to the mean sample-standard-deviation ($\bar{S}$) are determined for each of the sample sizes from the nine distributions.

The algorithm for generating the random numbers used in the simulation is stored in the DEC computer library. This algorithm is taken from an article by Payne, Rabung, and Bogyo [3].
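The simulation procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the author's program: it uses Python's built-in Mersenne Twister generator rather than the Lehmer generator cited in the article, it assumes S is computed with the n - 1 divisor and MD with the n divisor, and the names `simulate_ratios` and `standard_normal` are hypothetical.

```python
import random
import statistics


def simulate_ratios(n, draw, replications=1000, seed=12345):
    """Return (mean W / mean S, mean MD / mean S) over simulated samples of size n.

    `draw` takes a random.Random instance and returns one observation.
    """
    rng = random.Random(seed)
    sum_w = sum_md = sum_s = 0.0
    for _ in range(replications):
        sample = [draw(rng) for _ in range(n)]
        mean = statistics.fmean(sample)
        sum_w += max(sample) - min(sample)                # sample range W
        sum_md += sum(abs(x - mean) for x in sample) / n  # sample mean deviation MD
        sum_s += statistics.stdev(sample)                 # S, n - 1 divisor (assumption)
    # The replication count cancels, so ratios of sums equal ratios of means.
    return sum_w / sum_s, sum_md / sum_s


def standard_normal(rng):
    """Draw one value from a normal distribution with mean 0 and standard deviation 1."""
    return rng.gauss(0.0, 1.0)


w_ratio, md_ratio = simulate_ratios(50, standard_normal)
print(f"n = 50: mean W / mean S = {w_ratio:.3f}, mean MD / mean S = {md_ratio:.3f}")
```

Passing a different `draw` function (for example, one returning `rng.random()` for a uniform population) gives the corresponding ratios for another distribution shape.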

DEFINITIONS

The remaining sections of this article contain several technical terms that are used to describe distribution shape. The following definitions clarify these terms.

Skewness - A distribution is either symmetrical or asymmetrical. Skewness is the property of asymmetry of a distribution. One measure of the degree of skewness is $\beta_1$. If $\beta_1 = 0$, the distribution is symmetrical. If $\beta_1 > 0$, the distribution is skewed to the right (i.e., the distribution has a large number of observations clustered toward the left with a long tail to the right). If $\beta_1 < 0$, the distribution is skewed to the left.

Kurtosis - Kurtosis is a measure of the peakedness or flatness of a distribution. One measure of the degree of kurtosis is $\beta_2$. If $\beta_2 = 3$, the distribution is neither peaked nor flat topped ($\beta_2 = 3$ for the normal distribution). If $\beta_2 > 3$, the distribution is peaked with long tails. If $\beta_2 < 3$, the distribution is flat with short tails.

Leptokurtic - Leptokurtic is a condition of kurtosis resulting in a peaked distribution ($\beta_2 > 3$).

Platykurtic - Platykurtic is a condition of kurtosis resulting in a flat distribution ($\beta_2 < 3$).

THE DISTRIBUTIONS SAMPLED

The nine distributions utilized in this study are described in this section in terms of their mean ($\mu$), standard deviation ($\sigma$), skewness ($\beta_1$), and kurtosis ($\beta_2$). These distributions were chosen for this study in order to ensure a wide range of kurtosis, since kurtosis controls the distribution of the range (see Singh [6]).

1. Normal Distribution (N): Sample values from the normal distribution are generated using algorithms obtained from Pritsker [5] such that $\mu = 0.000$, $\sigma = 1.000$, $\beta_1 = 0.000$, and $\beta_2 = 3.000$.

2. Extremely Leptokurtic Distribution (EL): Sample values are generated from a Chi Square distribution with 2 degrees of freedom such that $\mu = 2.000$, $\sigma = 2.000$, $\beta_1 = 2.000$, and $\beta_2 = 9.000$.

3. Leptokurtic Distribution (L): Sample values are generated from a Chi Square distribution with 4 degrees of freedom such that $\mu = 4.000$, $\sigma = 2.828$, $\beta_1 = 1.414$, and $\beta_2 = 6.000$.

4. Slightly Leptokurtic Distribution (SL): Sample values are generated from a Chi Square distribution with 8 degrees of freedom such that $\mu = 8.000$, $\sigma = 4.000$, $\beta_1 = 1.000$, and $\beta_2 = 4.500$.

5. Slightly Platykurtic Distribution (SP): Sample values are generated from a slightly platykurtic distribution such that $\mu = 3.200$, $\sigma = 1.120$, $\beta_1 = 0.000$, and $\beta_2 = 2.7263$.

6. Platykurtic Distribution (P): Sample values are generated from a platykurtic distribution such that $\mu = 3.142$, $\sigma = 1.367$, $\beta_1 = 0.000$, and $\beta_2 = 2.194$.

7. Bimodal, Platykurtic Distribution (B): Sample values are generated from a bimodal distribution such that $\mu = 1.571$, $\sigma = 0.746$, $\beta_1 = 0.000$, and $\beta_2 = 1.932$.

8. Uniform, Platykurtic Distribution (UP): Sample values are generated from a uniform distribution such that $\mu = 0.500$, $\sigma = 0.289$, $\beta_1 = 0.000$, and $\beta_2 = 1.800$.

9. U-shaped, Platykurtic Distribution (US): Sample values are generated from a U-shaped distribution such that $\mu = 0.000$, $\sigma = 0.647$, $\beta_1 = 0.000$, and $\beta_2 = 1.545$.
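The parameters of the three Chi Square distributions in the list above follow from standard closed-form expressions, which offers a quick cross-check on the listed values. The Python sketch below assumes that the article's skewness measure is the signed skewness coefficient (it must be signed, since the definitions allow negative values) and that the kurtosis measure equals 3 for the normal distribution. The function name `chi_square_moments` is illustrative.

```python
import math


def chi_square_moments(df: int) -> dict:
    """Population moments of a chi-square distribution with `df` degrees of freedom."""
    return {
        "mean": float(df),
        "sd": math.sqrt(2.0 * df),
        "skewness": math.sqrt(8.0 / df),  # signed skewness coefficient (assumed measure)
        "kurtosis": 3.0 + 12.0 / df,      # equals 3 for the normal distribution
    }


for df in (2, 4, 8):
    moments = chi_square_moments(df)
    print(df, {name: round(value, 3) for name, value in moments.items()})
```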

INVESTIGATIONS

Investigation 1: The Relationship of W to S

The ratios of the mean sample-range ($\bar{W}$) to the mean sample-standard-deviation ($\bar{S}$), referred to as the $\bar{W}/\bar{S}$ ratios, for various sample sizes from each of the nine distributions are presented in Table 1. The $\bar{W}/\bar{S}$ ratios express how many mean standard deviations equal the mean range. For example, on average, 4.517 standard deviations equal the range for a sample of size 50 from the normal distribution. The $\bar{W}/\bar{S}$ ratios for the normal distribution vary from 1.4312 for sample size two to 6.898 for sample size 2000. The $\bar{W}/\bar{S}$ ratios for the normal distribution in this study agree to the second decimal place with those of Pearson and Stephens [4].

As the distributions become leptokurtic, the $\bar{W}/\bar{S}$ ratios decrease minimally from the $\bar{W}/\bar{S}$ ratios for the normal distribution for the smaller sample sizes and increase substantially for the larger sample sizes. The $\bar{W}/\bar{S}$ ratios consistently decrease as the distribution changes from normal to platykurtic. However, the decrease for the smaller sample sizes is not as extensive as the decrease for the larger sample sizes.

[Table 1: Tabular Data Omitted]

Investigation 2: The Relationship of MD to S

Table 2 contains the ratio of the mean sample-mean-deviation to the mean sample-standard-deviation ($\overline{MD}/\bar{S}$) for the various sample sizes for the nine distributions. The $\overline{MD}/\bar{S}$ ratios can be interpreted as: the mean deviation is that proportion of the standard deviation; or, by moving the decimal point two places to the right, the mean deviation is that percent of the standard deviation. From Table 2, for a sample size of 750 from the normal distribution, the mean deviation is 79.7 percent of the standard deviation. If a sample of size 100 were drawn from a U-shaped distribution, the mean deviation would be 90 percent of the standard deviation.

The $\overline{MD}/\bar{S}$ values are consistently between .75 and .80 (excluding sample size 2) for the leptokurtic and normal distributions. With the platykurtic distributions, the $\overline{MD}/\bar{S}$ ratios range from .75 to .90 (excluding sample size 2). For samples of size 1000 or more from the normal distribution, the $\overline{MD}/\bar{S}$ ratios from this research are all .798, the same as McNemar's [2] ratio.

[Table 2: Tabular Data Omitted]

CONCLUSIONS

Interpreting the sample standard deviation by stating that six standard deviations equal the range is inappropriate for most cases. The ratio of the range to standard deviation varies considerably depending on sample size and distribution shape. The instability of the W/S ratios makes them difficult to use to clarify the standard deviation when the distribution shape is not known.

One suggestion for using the range (W) to interpret the standard deviation (S) is to use the W/S ratios for samples from the normal distribution. Several of these are reported in McNemar's text [2]. However, McNemar's ratios appear to be erroneous and should be corrected as follows: the ratio of the range to standard deviation for samples from the normal distribution is approximately five when n = 100 (instead of n = 50), six when n = 500 (instead of n = 100), and seven when n = 2000 (instead of n = 1000). In addition, it may be worthwhile to include the following ratios: three when n = 10 and four when n = 30. If additional sample sizes are desired, they can be obtained from Table 1. The above values can be used for a general interpretation of the standard deviation.

For a specific interpretation of the standard deviation, let us use the example in the introduction concerning the credit union. If the mean and standard deviation of $5000 and $400, respectively, were calculated from a sample of 100 loans, the manager could explain to the governing board that if the distribution of loans is normal, five standard deviations equal the range of the loans (or the standard deviation is one-fifth of the range of the loans).(3)
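The arithmetic behind this interpretation can be sketched in a few lines of Python. The lookup table holds only the approximate normal-distribution ratios suggested in these conclusions, and the names `APPROX_W_OVER_S` and `implied_range` are illustrative. The result is a rough guide subject to sampling error, not a prediction for any particular sample.

```python
# Approximate W/S ratios for samples from a normal distribution, as suggested
# in the article's conclusions (sample size -> ratio).
APPROX_W_OVER_S = {10: 3, 30: 4, 100: 5, 500: 6, 2000: 7}


def implied_range(std_dev: float, n: int) -> float:
    """Approximate range implied by a sample standard deviation for a listed sample size."""
    if n not in APPROX_W_OVER_S:
        raise ValueError(f"no approximate ratio listed for n = {n}")
    return APPROX_W_OVER_S[n] * std_dev


# Credit union example: S = $400 from a sample of 100 loans.
print(implied_range(400.0, 100))  # 2000.0
```

Under the normality assumption, a standard deviation of $400 from 100 loans corresponds to a range of roughly $2000 between the smallest and largest loans.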

The use of sample-size-specific range ratios for interpreting the sample standard deviation provides a vast improvement over assuming a fixed ratio of range to standard deviation of six to one, as is often done in practice.

The mean deviation should be extremely useful as an aid in interpreting (clarifying) the standard deviation. The mean deviation is 75 to 90 percent of the standard deviation (see Table 2). The stability of this relationship as the shape of the population varies should make the mean deviation very helpful in clarifying the standard deviation. An explanation similar to the following could be used: if the mean deviation and the standard deviation are both calculated from a set of data, the mean deviation, which is the average distance of the observations from the mean, is 10 to 25 percent less than the standard deviation.

Returning to the credit union example, the standard deviation of $400 could be interpreted by saying that the loan amounts vary from the mean of $5000 by slightly less than $400 on average. More specifically, if ten loans were used in determining these statistics and the distribution of loans was normal, the average amount the loans deviate from the mean loan is approximately .775 ($400) = $310 (calculated from .775S = MD).(4)
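As a worked illustration, the Python sketch below applies the MD/S ratios to the credit union's standard deviation of $400 given in the introduction. The ratios .75, .90, and .775 are those quoted in the text, and the function name `estimated_mean_deviation` is illustrative.

```python
def estimated_mean_deviation(std_dev: float, md_over_s: float) -> float:
    """Estimate the mean deviation from a standard deviation and an MD/S ratio."""
    return md_over_s * std_dev


s = 400.0  # standard deviation of loan amounts from the credit union example

# General bounds: the mean deviation is 75 to 90 percent of the standard deviation.
low = estimated_mean_deviation(s, 0.75)
high = estimated_mean_deviation(s, 0.90)

# Specific case quoted in the text: ten loans from a normal distribution, MD = .775S.
normal_n10 = estimated_mean_deviation(s, 0.775)

print(f"General bounds: ${low:.2f} to ${high:.2f}")
print(f"Normal distribution, n = 10: ${normal_n10:.2f}")
```

With S = $400, the loans deviate from the mean by about $300 to $360 on average, and by about $310 in the ten-loan normal case.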

In summary, sample-size-specific ratios of the range to standard deviation provide an improved method of interpreting the sample standard deviation over using the ratio of the range to standard deviation for the normal, infinite population. The author suggests using several W/S values for samples from the normal distribution for interpreting the standard deviation. The mean deviation is a measure that gives businessmen and women a means of comparing the standard deviation with a less abstract measure of dispersion, thus making it easier to understand.

(1) The standard deviation referred to in this study is the root-mean-square estimator of $\sigma$.

(2) The author has taught statistics to graduate and undergraduate business students for twelve years and has observed the difficulty people have in understanding and interpreting the standard deviation.

(3) This ratio is subject to sampling error, and, for any particular sample, the W/S ratio in the table may differ from that for the sample.

(4) The ratio of the mean deviation to standard deviation is subject to sampling error, and, for a particular sample, the MD/S value from the table may differ from that for the sample.

REFERENCES

[1.] Baker, G. A. "Distribution of the Ratio of Sample Range to Sample Standard Deviation for Normal and Combinations of Normal Distributions." Annals of Mathematical Statistics, 17 (1946), 366-69.

[2.] McNemar, Quinn. Psychological Statistics, 4th ed., New York: John Wiley and Sons, Inc., 1969.

[3.] Payne, W. H., J. R. Rabung, and T. P. Bogyo. "Coding the Lehmer Pseudo Random Number Generator." Communications of the ACM, 12, 2 (February, 1969), 85-86.

[4.] Pearson, E. S., and M. A. Stephens. "The Ratio of Range to Standard Deviation in the Same Normal Sample." Biometrika, 51, parts 3 and 4 (December, 1964), 484-87.

[5.] Pritsker, A. A. B. The GASP IV Simulation Language. New York: John Wiley and Sons, Inc., 1974.

[6.] Singh, C. "Moments of the Range of Samples from Nonnormal Populations." Journal of the American Statistical Association, 71, number 356 (December, 1976), 988-91.

G. STEVEN RHIEL is Associate Professor of Management Information Systems/Decision Sciences at Old Dominion University.
