This term I’m happy to be lecturing on error correcting codes, particularly because I’ve now cleared up a misconception I somehow managed to develop in my undergraduate days about the relationship between the Hamming and Shannon theories of coding.
Let $C \subseteq \{0,1\}^n$ be a code (not necessarily linear). For $u, v \in \{0,1\}^n$ let $d(u,v)$ be the number of positions in which $u$ and $v$ have different entries; this is the Hamming distance between $u$ and $v$. Let $d$ denote the minimum Hamming distance between two distinct elements of $C$.
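These definitions are easy to check by computer. Here is a short sketch in Python (the example code $C$ below, the length-3 parity-check code, is my own illustration, not one from this post):

```python
from itertools import combinations

def hamming_distance(u, v):
    """Number of positions in which the words u and v differ."""
    assert len(u) == len(v)
    return sum(a != b for a, b in zip(u, v))

def minimum_distance(code):
    """Minimum Hamming distance between two distinct elements of the code."""
    return min(hamming_distance(u, v) for u, v in combinations(code, 2))

# Example: the parity-check code of length 3 (all words of even weight).
C = ["000", "011", "101", "110"]
d = minimum_distance(C)  # any two distinct codewords differ in exactly 2 places
```

Note that `minimum_distance` ranges over unordered pairs of distinct codewords, which is why distinctness matters in the definition of $d$: the distance from a word to itself is always $0$.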
When $C$ is used as a code in the Hamming setup of nearest neighbour decoding, we can guarantee to correct up to $e$ errors, where $e = \lfloor (d-1)/2 \rfloor$. (For the balls of radius $e$ about different $u \in C$ are disjoint if and only if $2e + 1 \le d$.) If each bit is corrupted with probability $p$ then a typical codeword is corrupted in about $pn$ places when it is transmitted. Since we must then correct about $pn$ errors, we need $d \ge 2pn$; so to have any hope of communicating reliably when $p$ is near to $1/2$, we need $d/n$ to be near to $1$. However one of the Plotkin bounds says that if $2d > n$ then

$$|C| \le \frac{2d}{2d-n},$$

so such codes are necessarily rather small. Indeed, $|C| \le 2$ if $d = n$. So the Plotkin bound shows that whenever $p$ is close to $1/2$, it is impossible to reliably communicate using codes given by the Hamming setup, except by using a trivial two-word codebook.
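To see numerically how quickly the Plotkin bound collapses, one can tabulate $\lfloor 2d/(2d-n) \rfloor$ as $d/n$ increases towards $1$ (a sketch; the parameter choices are mine):

```python
from math import floor

def plotkin_bound(n, d):
    """The Plotkin bound in the form stated above: if 2d > n, a binary code
    of length n and minimum distance d has at most 2d/(2d - n) codewords."""
    assert 2 * d > n
    return floor(2 * d / (2 * d - n))

# As d/n increases towards 1, the maximum code size collapses
# to a trivial two-word codebook.
bounds = {d: plotkin_bound(100, d) for d in (60, 75, 90, 100)}
# d = 60 gives at most 6 codewords; d = 100 = n gives at most 2.
```

Taking the floor is legitimate since $|C|$ is an integer; sharper integral forms of the Plotkin bound exist, but this one suffices for the point being made.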
On the other hand, Shannon’s Noisy Coding Theorem says that it is possible to communicate reliably at any rate $R < 1 - H(p)$, where $H(p) = -p \log p - (1-p) \log (1-p)$ is Shannon's entropy function. (All logs in this post are to base $2$, of course.) I won't try to define rate here, but it is certainly the case that if a code $C \subseteq \{0,1\}^n$ can be used to communicate reliably at rate $R$, then $|C| \ge 2^{Rn}$. This seems to contradict what was just said.
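The key numerical point is that $1 - H(p)$ stays strictly positive for every $p < 1/2$, even $p$ quite close to $1/2$, so Shannon promises reliable communication at some positive rate in exactly the regime where the Plotkin bound forbids large codes of high minimum distance. A quick check:

```python
from math import log2

def H(p):
    """Shannon's binary entropy function, with logs to base 2."""
    if p in (0.0, 1.0):
        return 0.0  # by convention 0 log 0 = 0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def capacity(p):
    """1 - H(p): the capacity of the binary symmetric channel
    with bit-flip probability p."""
    return 1 - H(p)

# Even at p = 0.4 the capacity is positive (about 0.029),
# so reliable communication at a positive rate is possible.
c = capacity(0.4)
```

So for $p = 0.4$, say, Shannon guarantees codes with $|C| \ge 2^{0.02n}$ for large $n$, which is exponentially larger than the constant bound Plotkin gives for codes with $d/n$ near $1$.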
The resolution to the paradox is that the high-rate codes whose existence is guaranteed by Shannon's Noisy Coding Theorem do not satisfy the Hamming criterion of high minimum distance. This is allowable because in the Shannon setup the decoder is not specified, and all we need is good performance on average. Evidently this extra freedom makes a really big difference.
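A small simulation illustrates the resolution (a sketch with block length, rate and error probability of my own choosing): a random code typically has small minimum distance, so its worst-case Hamming guarantee is weak, yet nearest-neighbour decoding over the binary symmetric channel still succeeds on average almost all of the time.

```python
import random

random.seed(0)

n = 20   # block length
p = 0.1  # bit-flip probability of the binary symmetric channel
M = 16   # 16 codewords, so the rate log2(16)/20 = 0.2 is below 1 - H(0.1)

# A code of M words chosen uniformly at random from {0,1}^n.
code = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(M)]

def distance(u, v):
    return sum(a != b for a, b in zip(u, v))

# The minimum distance is typically small, so the worst-case Hamming
# guarantee covers only floor((d_min - 1)/2) errors...
d_min = min(distance(u, v) for i, u in enumerate(code) for v in code[i + 1:])

def transmit(word):
    """Flip each bit independently with probability p."""
    return tuple(b ^ (random.random() < p) for b in word)

def nearest(received):
    """Nearest neighbour decoding (ties broken by list order)."""
    return min(code, key=lambda c: distance(c, received))

trials = 2000
successes = sum(nearest(transmit(c)) == c
                for c in (random.choice(code) for _ in range(trials)))
success_rate = successes / trials
# ...yet decoding recovers the transmitted codeword on average
# far more often than the worst-case guarantee alone would suggest.
```

The typical number of bit flips here is $pn = 2$, which can exceed the worst-case correction radius of the random code; the decoder nevertheless succeeds on average because most codeword pairs are much further apart than the minimum distance.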