I thought about asking this question on CS theory stack exchange, but it is not research level, and the closest SE is math. Please migrate this question if it's off topic.
I'm reading *Information Theory, Inference, and Learning Algorithms* by David J. MacKay, and I'm confused about the way he treats code rate. If I understand correctly, the rate of a code answers the question
How many bits of information can we send through a lossy channel, if we encode the message in $N$ bits?
The answer is $K=RN$, and clearly $R\leq 1$. While discussing perfect error-correcting codes, the book contains the following statement:
We will show in several ways that useful perfect [$t$-error correcting] codes do not exist (here, 'useful' means 'having large block length $N$, and rates close neither to $0$ nor to $1$')
Question: why is it bad for a code to have a rate close to $1$? It seems to me that the closer the rate is to $1$ the better, since that way we don't need many additional bits to encode a message. What am I missing?
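To make the definitions concrete, here is a quick sketch (my own illustrative numbers, using a few textbook codes, not figures from MacKay's text) of how $N$, $K$ and $R$ relate:

```python
# Rate R = K/N: each block of N transmitted bits carries K information bits.
# Example parameters for a few standard codes (illustrative choices).
codes = {
    "repetition(3,1)":    (3, 1),  # low rate, lots of redundancy
    "Hamming(7,4)":       (7, 4),  # moderate rate
    "single-parity(8,7)": (8, 7),  # rate close to 1, little redundancy
}
rates = {name: K / N for name, (N, K) in codes.items()}
for name, R in rates.items():
    print(f"{name}: R = {R:.3f}")
```

The closer $R$ is to $1$, the fewer redundant bits we pay for per message bit, which is exactly why the quoted statement puzzles me.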
It depends on the application. The catch is that a rate close to $1$ leaves very little redundancy, so the code can correct only a small number of errors per block; whether that is enough depends on how noisy the channel is.
Anyway, the game to be played is to design the system to cope with a foreseeable probability of a received bit being in error, and also to be able to transmit data at a prescribed rate. We usually cannot guarantee transmission free of errors, but most applications don't need that.

For example, when a Europe-wide system for broadcasting a TV signal was being updated a bit over ten years ago (yours truly was there), the prescribed quality of service was that, after the error-correcting code has done its magic, the system should have a bit error probability of at most $10^{-11}$. That may sound like asking a lot, but consider the following. The data stream consists of compressed video. A single erroneous bit will cause the decompression to fail (how severely depends on exactly where the error is), producing a small black block on the TV screen. A typical HD-quality stream has 10 million bits per second. So if we allowed a bit error probability of $10^{-10}$, there would be a black block on the screen every 1000 seconds, that is, roughly every seventeen minutes. It was decided that the consumers would not be happy with that.
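The arithmetic behind that figure takes a couple of lines to check (the $10^7$ bits/s and $10^{-10}$ numbers are the ones from the paragraph above):

```python
bit_rate = 10_000_000  # bits per second in a typical HD stream
p_error = 1e-10        # hypothetical post-decoding bit error probability

# Expected number of erroneous bits per second, and the mean time between them.
errors_per_second = bit_rate * p_error
seconds_per_error = 1 / errors_per_second

print(seconds_per_error)  # about 1000 s between visible glitches
```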
IIRC the raw (= uncoded) bit error rate in such a broadcast channel was up to something like 5-15 percent for some of the receivers to be served (obviously not those living close to the mast). So BCH codes simply would not work. The so-called LDPC codes, studied extensively by MacKay and others, do get the job done, however. This is because, unlike the algebraically more interesting BCH codes, they are designed to take into account reliability information about the individual received bits. A well-designed receiver can recognize many of the erroneously received bits as unreliable (so-called soft errors), and LDPC codes exploit that without any penalty to the decoding complexity.
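To illustrate what per-bit reliability information buys you, here is a toy example of my own (a 3-fold repetition code over a BPSK-style channel, not anything from MacKay's book): a hard-decision decoder thresholds each received sample first and takes a majority vote, while a soft-decision decoder sums the raw samples, so one confidently received sample can outvote two weak "soft errors".

```python
# 3-fold repetition code, BPSK-style: +1 sent for bit 1, -1 for bit 0.
# The magnitude of each received sample reflects the receiver's confidence.
received = [+2.1, -0.3, -0.4]  # bit 1 was sent; two samples are weak soft errors

# Hard decision: threshold each sample, then take a majority vote.
hard_bits = [1 if r > 0 else 0 for r in received]
hard_decision = 1 if sum(hard_bits) >= 2 else 0  # wrong: two weak bits outvote

# Soft decision: sum the raw samples, decide once.
soft_decision = 1 if sum(received) > 0 else 0    # right: the strong sample wins

print(hard_decision, soft_decision)
```

LDPC decoders exploit exactly this kind of per-bit reliability (usually as log-likelihood ratios passed around the decoding graph), which is why they cope with raw error rates that defeat hard-decision algebraic decoding.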