Intended learning outcomes
At the end of this lesson, students are expected to:
- Describe what a test set is and what its purpose is
- Recognize underfitting/overfitting and relate it to model expressiveness
- Explain the consequences of overfitting and underfitting
- Understand and explain the i.i.d. assumption and its role in understanding generalization
- Explain what a learning curve is and its various parts
- Understand and explain how the terms error and loss are related
- Describe the generalization gap, be able to identify it, and relate it to underfitting and overfitting
- Be able to diagnose an ML model's generalization ability by inspecting the learning curves
In this lesson, we will discuss the basics of generalization, which, as we learned in the last lesson, is how well an ML model is able to predict outcome values for previously unseen data.
Test data & the return of the avocado
Let's return to our avocado example from the Introduction and look again at the linear regression model we fit to your 50 avocados. We went with a not-too-complex polynomial and were pretty happy with the results. When we applied the model to test data (the new, unseen avocados from customers), we were pleased with the performance, and so were the customers!
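To make the idea of test data concrete, here is a minimal sketch of fitting a low-degree polynomial to 50 training points and then scoring it on separate, unseen points. Everything in it is an assumption made for illustration: the data are synthetic (not the avocado measurements from the Introduction), and the helper names `make_avocados` and `fit_and_evaluate`, the target curve, the noise level, the degree of 2, and the test set size of 20 are all hypothetical choices.

```python
import numpy as np


def fit_and_evaluate(degree=2, n_train=50, n_test=20, seed=0):
    """Fit a polynomial on training data and return (train MSE, test MSE)."""
    rng = np.random.default_rng(seed)

    def make_avocados(n):
        # Hypothetical stand-in for avocado data: one feature x in [0, 1]
        # and a noisy, smooth target y. These are NOT the lesson's real data.
        x = rng.uniform(0.0, 1.0, size=n)
        y = np.sin(2.0 * x) + rng.normal(0.0, 0.1, size=n)
        return x, y

    # Training set: the 50 avocados used to fit the model.
    x_train, y_train = make_avocados(n_train)
    # Test set: new avocados that play no role in fitting.
    x_test, y_test = make_avocados(n_test)

    # Least-squares polynomial fit; the model is linear in its coefficients.
    coeffs = np.polyfit(x_train, y_train, deg=degree)

    def mse(x, y):
        # Mean squared error of the fitted polynomial on the points (x, y).
        return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

    return mse(x_train, y_train), mse(x_test, y_test)


train_mse, test_mse = fit_and_evaluate()
print(f"train MSE: {train_mse:.4f}, test MSE: {test_mse:.4f}")
```

The important design choice is that the test points are drawn separately and never passed to `np.polyfit`, so the test error estimates how the model behaves on avocados it has not seen. Comparing the two numbers is a first, rough look at generalization.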