🎓 Intended learning outcomes
At the end of this lesson, the student is expected to:
- Understand that there are use cases in machine learning where model interpretability matters more than achieving the best possible predictive performance
- Recognise the similarities between flowcharts and decision trees
- Be able to give a formal definition of decision trees and of the recursive, greedy algorithm used to learn them
- Understand how decision trees can be visualised both as graphs and as partitions of the feature space
- Understand how decision trees can be used for both classification and regression problems
- Define questions and question sets for numerical and categorical features
- Explain the learning algorithm used for fitting decision tree models
- Understand and be able to use the loss functions based on mean squared error, information gain and Gini impurity for building decision trees
- Be able to apply simple stopping criteria as regularisation techniques for decision-tree learning
- Be aware of at least one problem that is particularly ill-suited for decision-tree learning
This lesson introduces decision trees, a simple and intuitive model for classification and regression. Decision trees are very popular in applied machine learning and form the basis of state-of-the-art approaches for certain types of problems. They also provide another good illustration of how concepts from information theory are useful in machine learning.
Motivating example: Ice cream preferences
Imagine that you are a data scientist, and a new customer, Märta, approaches you with the following problem. Märta, an old lady, owns a recently opened ice cream shop offering over 100 varieties of ice cream. She would like to add a shelf of special “limited edition” and “recently released” popsicles to her shop, with a selection that changes every month. To get to know her customers’ preferences in popsicles, she has therefore been selling a very wide variety of them since the shop opened.