At the end of this lesson, students are expected to:
Previously, we dived pretty deep into learning theory, which gave us a theoretical framework for understanding generalization, but it didn’t provide much guidance on what we can do in practice to help our models generalize, or on how to test whether we’ve been successful. In this lesson, we will look at generalization from a practical standpoint.
We learned in 3.3 Learning Theory that, to improve generalization in certain situations, we may want to limit the expressivity of our hypothesis class $\mathcal{H}$. There are a number of ways to do this, but one of the simplest and most effective is regularization. In essence, we add a term, or penalty, to the objective function (a.k.a. the average loss $L$ or empirical risk $R_{\text{emp}}$) that imposes an additional cost on the parameters. The added term discourages overly complex or extreme solutions by limiting the numeric range of the parameters, which in turn limits the expressivity of the model.
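To make this concrete, here is a minimal sketch (assuming NumPy and a hypothetical toy dataset) of an objective that combines the average square loss with an L2 penalty on the parameters; the penalty strength `lam` is a hyperparameter we choose ourselves.

```python
import numpy as np

# Hypothetical toy data: noisy observations of a linear relationship
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                    # 20 examples, 5 features
true_w = np.array([1.0, -2.0, 0.0, 0.5, 3.0])
y = X @ true_w + 0.1 * rng.normal(size=20)

lam = 0.1  # regularization strength (chosen by us, e.g. via validation)

def objective(w, X, y, lam):
    """Average square loss (empirical risk) plus an L2 penalty on w."""
    residuals = X @ w - y
    avg_loss = np.mean(residuals ** 2)          # R_emp: the original objective
    penalty = lam * np.sum(w ** 2)              # extra cost for large parameters
    return avg_loss + penalty

# For this particular choice of loss and penalty, the minimizer has a
# closed form: w = (X^T X + n*lam*I)^{-1} X^T y
n, d = X.shape
w_reg = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)
print("penalized objective at w_reg:", objective(w_reg, X, y, lam))
```

Note how the penalty pulls the learned parameters toward zero: increasing `lam` shrinks them further, trading a slightly worse fit on the training data for a less extreme, less expressive model.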
For example, if we are trying to solve a regression problem with square loss