11   The Transformation of Entropy in Information Theory

 
In information theory, the amount of ignorance or uncertainty contained in a given piece of information is measured by the number of binary (yes-no) questions one would have to ask in order to know everything. For example, suppose a modem has received one character in 8-bit ASCII code. 256 different characters are possible. If all characters occur with the same probability, 8 yes-no questions are needed on average to identify the character. The entropy of this information is therefore

Sinf = log2(256) = 8

If one knows in addition that the most significant bit is 0, the entropy immediately decreases to 7, because the character must lie in the range 0 to 127. And if we know that the character 'A' = 65 (dec) = 0100'0001 (bin) has been received, the entropy decreases to log2(1) = 0.
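
As a quick check, a minimal Python sketch that simply takes log2 of the number of characters still in question at each stage of knowledge:

    from math import log2

    print(log2(256))  # nothing known about the 8-bit character -> 8.0 bits
    print(log2(128))  # most significant bit known to be 0      -> 7.0 bits
    print(log2(1))    # character known to be 'A' = 65          -> 0.0 bits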

Assuming that all cases are equally likely, entropy measures the number of possibilities that, given our knowledge of a particular situation, still come into question. The measure is taken as a base-2 logarithm. Entropy is thus a quantitative indication of our 'ignorance' or 'uncertainty', of our lack of information.

The fundamental unit of entropy in information theory is the bit; 8 bits are grouped together to form a byte. Physically, the entropy of information is dimensionless; the unit 'bit' plays a role similar to that of the 'radian' in angle measurement. A hard disk of 2 tebibytes (knowing nothing about its content) has an entropy of log2(2·2⁸·2⁴⁰) = 49 (for a 2-terabyte hard drive it is log2(2·2⁸·10¹²) ≈ 48.863 ...). Thanks to the logarithmic function, the entropy does not grow astronomically even for huge amounts of information.
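
The two hard-disk values can be reproduced in the same way; a minimal Python sketch using the factors exactly as quoted above:

    from math import log2

    print(log2(2 * 2**8 * 2**40))   # 2-tebibyte disk  -> 49.0
    print(log2(2 * 2**8 * 10**12))  # 2-terabyte disk  -> approx. 48.863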

If I have left my car key in one of the 8 school rooms I visited today, the entropy is log2(8) = 3. If I also forgot my hat in one of these places, the number of possible situations increases to 8·8 = 64 and the entropy increases to 6. The entropy of N objects that can each be located in any of m places independently is N·log2(m): entropies simply add. This does not apply, however, to the house key on my keychain, because I have not left it somewhere independently of the car key; it hangs on the same key ring.
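
A minimal Python sketch of this additivity, using the 8 rooms of the example:

    from math import log2

    m = 8                  # possible places per independently misplaced object
    print(log2(m))         # car key alone        -> 3.0 bits
    print(log2(m * m))     # car key and hat      -> 6.0 bits
    print(2 * log2(m))     # the same, as a sum   -> 6.0 bits
    # The house key adds nothing: it hangs on the same key ring as the
    # car key, so it does not multiply the number of possible situations.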

The macro-state "my car key is in one of the rooms I visited at school today" describes my current state of knowledge. It comprises 8 possible micro-states, each of which corresponds to a maximum level of knowledge. The entropy of the macro-state is the logarithm of the number of micro-states that belong to this macro-state.

We now have much of the terminology in place for the different kinds of entropy pertinent to physics. It should also be clear that the entropy of information does not change when the school building with the misplaced hat, or the aforementioned 2-terabyte hard drive, is observed from a fast-moving spacecraft. The number of micro-states is independent of the relative motion, so the following transformation formula holds for the entropy of information

Sinf' = Sinf

A bar code still contains the same amount of information when it undergoes length contraction in any direction. This reminds me of a MAD cover illustration from the 1970s: Alfred E. Neuman lowering the price by giving a barcode a 'trim' with a lawn mower ...