Does the entropy of a ChatGPT sentence match that of the output?


In this tweet, Nassim Taleb states the following:

If a chatbot writes a complete essay from a short prompt, the entropy of the essay must be exactly that of the initial prompt, no matter the length of the final product.

He then continues:

If the entropy of the output > prompt, you have no control over your essay. In the Shannon sense, with a temperature of 0, you send the prompt and the receivers will recreate the exact same message. In a broader sense, with all BS being the same ornament, receivers get the same message in different but equivalent wrapping.

I have some intuitive feeling of understanding this. It reminds me of this joke about ChatGPT I found somewhere on Twitter, which was dubbed "The ChatGPT future of job applications":

(image: the "ChatGPT future of job applications" joke)

If used in a specific way, ChatGPT can be viewed as a translator between different forms of the same message. It can translate not only between languages, but also between different styles of writing. In that view, the information content (and hence the information entropy) stays the same.
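One way to make this "translator" intuition precise is the fact that a deterministic function of a random variable can never have more entropy than the variable itself (H(f(X)) ≤ H(X)). Here is a minimal sketch with a toy, made-up distribution over prompts and a hypothetical temperature-0 "translator" that maps each prompt to exactly one essay:

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a distribution {outcome: probability}."""
    return sum(-p * math.log2(p) for p in dist.values() if p > 0)

# Toy example: four equally likely prompts (2 bits of entropy).
prompts = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

# A deterministic "translator" (temperature 0): each prompt yields one essay.
# Two prompts happen to yield the same essay, so some information is lost.
translate = {"A": "essay1", "B": "essay2", "C": "essay2", "D": "essay3"}

# Push the prompt distribution through the deterministic map.
essays = defaultdict(float)
for prompt, p in prompts.items():
    essays[translate[prompt]] += p

print(entropy(prompts))        # 2.0 bits
print(entropy(dict(essays)))   # 1.5 bits: H(f(X)) <= H(X)
```

Note the entropy here lives in the *distribution over prompts*, not in the length of any single output: a deterministic expansion can make the text arbitrarily long without adding a single bit.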

However, I have trouble really understanding Taleb's statement. If I ask ChatGPT "Who is the current President of the United States?", it will answer "The current President of the United States is Joseph Biden." In this case it seems clear that the output carries more information than the prompt. Maybe Taleb's statement is meant from the point of view of a world in which everyone has access to an encyclopedia; in that case, the output has the same entropy, because everyone can just look up who the current POTUS is.

This answer tweet argues that each ChatGPT answer must have lower entropy than the prompt, since

Any expanding done on the part of the LLM is merely resolving the uncertainty by providing specific instances of that which is stated generally by the prompt.

These specific instances, by definition, have lower entropy by virtue of their specificity. Thus it would take all possible essays that could be written to match the entropy of the original prompt.

This makes the prompt the entity with the highest possible information content.

Again, this somehow makes sense, but only if you assume some kind of omniscient receiver, because if the "receiver" does not have access to an encyclopedia, the ChatGPT output surely carries more information than the prompt alone.
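The point about the encyclopedia can be phrased in Shannon's terms: entropy is a property of the receiver's probability model, not of the text itself. A sketch with two hypothetical receivers of the answer to the POTUS question, one who already knows the fact and one who only narrows it down to a handful of plausible names (the distributions are invented for illustration):

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# A receiver with an encyclopedia assigns probability 1 to the right name,
# so the answer resolves no uncertainty for them.
informed = [1.0]

# A receiver who only knows it is one of, say, 8 equally plausible names
# gains 3 bits from reading the same answer.
uninformed = [1 / 8] * 8

print(entropy(informed))    # 0.0 bits
print(entropy(uninformed))  # 3.0 bits
```

So whether the output "has more entropy than the prompt" depends on whose uncertainty is being measured, which is exactly the ambiguity the question is circling.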

So, I'm looking for an answer that explains the relationship between the entropy of a ChatGPT prompt and that of its output. Under exactly what conditions is the entropy of the prompt higher than, equal to, or lower than that of the output?