Self-Information, Entropy, and Cross-Entropy
Q: What's the relationship between cross-entropy, entropy, and self-information?
In Information Theory, these three concepts form a hierarchy. You can think of them as building blocks: Self-Information is the atom, Entropy is the molecule built from those atoms, and Cross-Entropy is how that molecule interacts with a different molecule.
Here is the relationship in a nutshell, followed by the deep dive: $$\text{Cross-Entropy} = \text{Entropy} + \text{KL Divergence}$$
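This decomposition is easy to verify numerically. The sketch below (a minimal illustration; the distributions `p` and `q` are made up for the example) computes all three quantities from their standard definitions and checks that cross-entropy equals entropy plus KL divergence:

```python
import math

def entropy(p):
    """H(p) = -sum p(x) log2 p(x): average surprise under the true distribution."""
    return -sum(px * math.log2(px) for px in p if px > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum p(x) log2 q(x): average surprise when we believe q but reality follows p."""
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)

def kl_divergence(p, q):
    """D_KL(p || q) = sum p(x) log2(p(x)/q(x)): extra surprise from using q instead of p."""
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

p = [0.5, 0.25, 0.25]  # hypothetical true distribution
q = [0.8, 0.10, 0.10]  # hypothetical (wrong) model belief

# The identity: cross-entropy decomposes into entropy plus KL divergence.
assert math.isclose(cross_entropy(p, q), entropy(p) + kl_divergence(p, q))
```

Note that cross-entropy is never smaller than entropy, since KL divergence is non-negative; the gap is exactly the penalty for modeling `p` with the wrong distribution `q`.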
The Atom: Self-Information
Before we can understand the average, we must understand the individual unit. Self-information (or “surprisal”) measures the surprise associated with a single outcome.
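As a quick sketch of the idea, assuming the standard definition $I(x) = -\log_2 p(x)$ measured in bits:

```python
import math

def self_information(p_x):
    """I(x) = -log2 p(x): the surprise of a single outcome, in bits."""
    return -math.log2(p_x)

# A fair coin flip carries 1 bit of surprise; a 1-in-1024 event carries 10 bits.
self_information(0.5)       # 1.0
self_information(1 / 1024)  # 10.0
```

Rare outcomes carry more surprise than common ones, and a certain outcome ($p(x) = 1$) carries none at all.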