# Information Theory

Information theory, or communication theory, is a mathematical theory formulated principally by the American scientist Claude E. Shannon to explain aspects and problems of information and communication. The theory is not specific in all respects; for example, it proves the existence of optimum coding schemes without showing how to find them. Nevertheless, it succeeds remarkably in outlining the engineering requirements of communication systems and the limitations of such systems.

In information theory, the term *information* is used in a special sense; it is a measure of the freedom of choice with which a message is selected from the set of all possible messages. Information is thus distinct from meaning, since it is entirely possible for a string of nonsense words and a meaningful sentence to be equivalent with respect to information content.

**Measurement of Information Content**

Numerically, information is measured in bits (short for *binary digit*; see binary system). One bit is equivalent to the choice between two equally likely alternatives. For example, if we know that a coin is to be tossed but are unable to see it as it falls, a message telling whether the coin came up heads or tails gives us one bit of information. When there are several equally likely choices, the number of bits is equal to the logarithm of the number of choices taken to the base two. For example, if a message specifies one of sixteen equally likely choices, it is said to contain four bits of information. When the various choices are not equally probable, the situation is more complex.
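The arithmetic above can be sketched in a few lines of Python. The function names `bits_for_equally_likely` and `entropy_bits` are illustrative assumptions, not part of the theory's standard vocabulary. The second function uses Shannon's average-information formula, H = -Σ p log2 p, as one way of handling the case in which the choices are not equally probable.

```python
import math

def bits_for_equally_likely(n_choices: int) -> float:
    """Bits conveyed by selecting one of n equally likely choices."""
    if n_choices < 1:
        raise ValueError("n_choices must be at least 1")
    return math.log2(n_choices)

def entropy_bits(probabilities) -> float:
    """Average information per choice, H = -sum(p * log2(p)), in bits.

    Assumes the probabilities are non-negative and sum to 1.
    """
    probs = list(probabilities)
    if any(p < 0 for p in probs) or not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must be non-negative and sum to 1")
    # Choices with zero probability contribute nothing to the sum.
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(bits_for_equally_likely(2))    # coin toss: 1.0
print(bits_for_equally_likely(16))   # sixteen choices: 4.0
print(entropy_bits([0.9, 0.1]))      # biased coin: roughly 0.47, less than one bit
```

With equal probabilities the two functions agree: `entropy_bits([0.5, 0.5])` also gives one bit. A heavily biased coin conveys less than one bit per toss because its outcome is easier to predict.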

Interestingly, the mathematical expression for information content closely resembles the expression for entropy in thermodynamics. The greater the freedom of choice in selecting a message, and hence the greater the uncertainty about which message will be sent, the greater the information the message conveys and the larger the entropy. Since the information content is, in general, associated with a source that generates messages, it is often called the entropy of the source. Often, because of constraints such as grammar, a source does not use its full range of choice. A source that uses just 70% of its freedom of choice would be said to have a relative entropy of 0.7. The redundancy of such a source is defined as 100% minus the relative entropy, or, in this case, 30%. The redundancy of English is estimated to be about 50%; i.e., about half of the elements used in writing or speaking are freely chosen, and the rest are required by the structure of the language.
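As a rough illustration of relative entropy and redundancy, the following Python sketch compares a source's entropy with the maximum it could have if all of its symbols were equally likely. The function names and the four-symbol example distribution are assumptions made for illustration. The sketch accounts only for unequal symbol frequencies, not for longer-range constraints such as grammar, so it would understate the redundancy of a natural language.

```python
import math

def entropy_bits(probabilities) -> float:
    """Entropy of a source in bits per symbol, H = -sum(p * log2(p))."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

def relative_entropy(probabilities) -> float:
    """Actual entropy divided by the maximum entropy for the same symbol count."""
    n = len(probabilities)
    if n < 2:
        raise ValueError("need at least two symbols")
    return entropy_bits(probabilities) / math.log2(n)

def redundancy(probabilities) -> float:
    """One minus the relative entropy, as a fraction between 0 and 1."""
    return 1.0 - relative_entropy(probabilities)

# Hypothetical four-symbol source with unequal symbol probabilities.
source = [0.5, 0.25, 0.125, 0.125]
print(entropy_bits(source))      # 1.75 bits per symbol (maximum would be 2.0)
print(relative_entropy(source))  # 0.875
print(redundancy(source))        # 0.125, i.e. 12.5% redundant
```

Here "relative entropy" follows the sense used in this article, the ratio of actual to maximum entropy. It is not the Kullback-Leibler divergence that often goes by the same name in later literature.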

**Analysis of the Transfer of Messages through Channels**

A message proceeds along a channel from the source to the receiver; information theory defines for any given channel a limiting capacity, the maximum rate at which it can carry information, expressed in bits per second. In general, it is necessary to process, or encode, information from a source before transmitting it through a given channel. For example, a human voice must be encoded before it can be transmitted by telephone. An important theorem of information theory states that if a source with a given entropy feeds information to a channel with a given capacity, and if the source entropy is less than the channel capacity, a code exists for which the frequency of errors can be made as low as desired. If the channel capacity is less than the source entropy, no such code exists.
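The theorem can be summarized compactly in symbols. The notation is an assumption introduced here rather than taken from the text: H is the source entropy and C the channel capacity, both in bits per second, and P_e is the frequency of errors after decoding.

$$
H < C \;\Longrightarrow\; \text{for every } \varepsilon > 0 \text{ there exists a code with } P_e < \varepsilon,
\qquad
H > C \;\Longrightarrow\; \text{no such code exists.}
$$

The statement asserts only that a suitable code exists. It does not say how to construct one.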

The theory further shows that noise, or random disturbance of the channel, creates uncertainty as to the correspondence between the received signal and the transmitted signal. The average uncertainty in the message when the signal is known is called the equivocation. It is shown that the net effect of noise is to reduce the information capacity of the channel. However, redundancy in a message, as distinguished from redundancy in a source, makes it more likely that the message can be reconstructed at the receiver without error. For example, if something is already known as a certainty, then all messages about it give no information and are 100% redundant, and the information is thus immune to any disturbances of the channel. Using various mathematical means, Shannon was able to define channel capacity for continuous signals, such as music and speech.

**Bibliography**

See C. E. Shannon and W. Weaver, *The Mathematical Theory of Communication* (1949); M. Mansuripur, *Introduction to Information Theory* (1987); J. Gleick, *The Information: A History, a Theory, a Flood* (2011).