At the end of this lesson, students are expected to:
Previously, we dived pretty deep into learning theory, which gave us a theoretical framework for understanding generalization, but it didn’t provide much guidance on what we can do in practice to help our models generalize, or on how to test whether we’ve been successful. In this lesson, we will look at generalization from a practical standpoint.
We learned in 3.3 Learning Theory that, to improve generalization in certain situations, we may want to limit the expressivity of our hypothesis class $\mathcal{H}$. There are a number of ways to do this, but one of the simplest and most effective is regularization. In essence, we add a term, or penalty, to the objective function (a.k.a. the average loss $L$ or empirical risk $R_{\text{emp}}$) that imposes an additional cost on the parameters. The added term discourages overly complex or extreme solutions by limiting the numeric range of the parameters, which in turn limits the expressivity of the model.
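To make this concrete, here is a minimal sketch (assuming NumPy and a hypothetical toy dataset) of an objective that combines the average square loss with an L2 penalty on the parameters; the penalty strength `lam` is a hyperparameter we choose ourselves.

```python
import numpy as np

# Hypothetical toy data: noisy observations of a linear relationship
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                    # 20 examples, 5 features
true_w = np.array([1.0, -2.0, 0.0, 0.5, 3.0])
y = X @ true_w + 0.1 * rng.normal(size=20)

lam = 0.1  # regularization strength (chosen by us, e.g. via validation)

def objective(w, X, y, lam):
    """Average square loss (empirical risk) plus an L2 penalty on w."""
    residuals = X @ w - y
    avg_loss = np.mean(residuals ** 2)          # R_emp: the original objective
    penalty = lam * np.sum(w ** 2)              # extra cost for large parameters
    return avg_loss + penalty

# For this particular choice of loss and penalty, the minimizer has a
# closed form: w = (X^T X + n*lam*I)^{-1} X^T y
n, d = X.shape
w_reg = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)
print("penalized objective at w_reg:", objective(w_reg, X, y, lam))
```

Note how the penalty pulls the learned parameters toward zero: increasing `lam` shrinks them further, trading a slightly worse fit on the training data for a less extreme, less expressive model.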
For example, if we are trying to solve a regression problem with square loss