🎓 Intended learning outcomes
At the end of this lesson, the student is expected to:
- Be able to define the fundamental concepts of information theory, including information content, entropy, cross entropy, and relative entropy
- Understand the intuitive explanations of entropy, cross entropy and relative entropy from the perspective of coding theory
- Be familiar with the computations involved in calculating information content, entropy, cross entropy, and relative entropy in practice (a short code sketch follows this list)
- Be able to recognise and explain the connection between the binary cross entropy loss and the cross entropy between two densities
- Be able to define divergence functions and in particular, the Kullback-Leibler divergence (KLD)
- Be aware of the decomposition of the KLD into a negative entropy and a cross entropy term
- Understand the asymmetric nature of the KL divergence and be able to show mathematically where the asymmetry comes from
- Be able to motivate the minimisation of the KL divergence as an alternative formulation of maximum likelihood density estimation
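As a preview of the computations mentioned above, here is a minimal NumPy sketch of the four quantities for two discrete distributions. The distributions `p` and `q` are made-up numbers for illustration only; the formal definitions follow later in the lesson.

```python
import numpy as np

# Two discrete distributions over the same four outcomes (illustrative values).
p = np.array([0.5, 0.25, 0.125, 0.125])
q = np.array([0.25, 0.25, 0.25, 0.25])

# Information content (self-information) of each outcome under p, in bits.
info_content = -np.log2(p)

# Entropy of p: the expected information content, H(p) = -sum_x p(x) log2 p(x).
entropy_p = -np.sum(p * np.log2(p))

# Cross entropy between p and q: H(p, q) = -sum_x p(x) log2 q(x).
cross_entropy_pq = -np.sum(p * np.log2(q))

# KL divergence: D_KL(p || q) = H(p, q) - H(p), i.e. cross entropy minus entropy.
kl_pq = cross_entropy_pq - entropy_p

print(info_content)      # [1. 2. 3. 3.]
print(entropy_p)         # 1.75 bits
print(cross_entropy_pq)  # 2.0 bits
print(kl_pq)             # 0.25 bits
```

Note how the KL divergence falls out as the difference between the cross entropy and the entropy, which is exactly the decomposition listed in the learning outcomes.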
What is information?
We will begin with an essential question: what is information? More precisely, how do we quantify information?
Picture this: you’re sitting in the KTH library on a Wednesday night, hunting for the last bugs in your solutions to the expectation maximisation exercise, but you just can’t track them down. You decide to take a break and check your e-mails on your phone. This is what you see:
[Screenshot: your inbox, showing several e-mail previews]
Question: Which of these e-mails would you open first? That is, which e-mail is likely to contain the most information, based on the previews? Why?