At the end of this lesson, students are expected to:
To end our module on ML & optimization, let’s look back and see if we can find some common lessons. Consider the examples with Pena the English Setter, your old trusty refrigerator, the biodiversity climate crisis, Cinnamon the cat, and the laundry problem. What did these examples all have in common?
First, in each of these problems we collected or had access to a dataset, and our goal was to make predictions about something related to the data. In the case of Pena, the fridge, biodiversity, and Cinnamon, the problem was supervised: we had access to a label for each data point, and our aim was to predict such labels for new data points. In the laundry case, the problem was unsupervised: we did not have a set of labels, and our task was to predict some reasonable labels anyway.
From the perspective of this module, the most important connection between these problems is that they were all formulated in terms of a model, including a function $h$ that approximates the true function mapping the input space $\mathcal{X}$ to the output space $\mathcal{Y}$. At the heart of the model is an optimization problem defined by the average loss (or objective) function $L$, which quantifies the model’s mistakes. $L$ is written in terms of parameters $\theta$ that dictate how the inputs $x_i$ are mapped to predicted outputs $\hat{y}_i$. Searching for parameters $\theta^*$ that minimize $L$ gets us a model that makes good predictions. This is called learning the parameters or training the model, and we use optimization algorithms to accomplish this.
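To make this recipe concrete, here is a minimal sketch in Python. Everything specific in it is an assumption made for illustration rather than something taken from the earlier examples: the toy dataset is made up, $h$ is taken to be a straight line with $\theta = (w, b)$, $L$ is taken to be the average squared error, and plain gradient descent stands in for the optimization algorithm. Other models, losses, and optimizers fit the same pattern.

```python
# Toy supervised dataset (made up for illustration): y is roughly 2*x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]


def h(theta, x):
    """Assumed linear model: maps an input x to a prediction y_hat."""
    w, b = theta
    return w * x + b


def average_loss(theta, xs, ys):
    """L(theta): average squared error between predictions and labels."""
    return sum((h(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(theta, xs, ys):
    """Gradient of the average squared loss with respect to (w, b)."""
    n = len(xs)
    residuals = [h(theta, x) - y for x, y in zip(xs, ys)]
    dw = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
    db = 2.0 * sum(residuals) / n
    return dw, db


def train(xs, ys, learning_rate=0.05, steps=2000):
    """Gradient descent: repeatedly step theta against the gradient of L."""
    theta = (0.0, 0.0)
    for _ in range(steps):
        dw, db = gradient(theta, xs, ys)
        theta = (theta[0] - learning_rate * dw, theta[1] - learning_rate * db)
    return theta


theta_star = train(xs, ys)
initial_loss = average_loss((0.0, 0.0), xs, ys)
final_loss = average_loss(theta_star, xs, ys)
print("learned parameters:", theta_star)
print("loss before and after training:", initial_loss, final_loss)
```

Here `train` plays the role of "learning the parameters": it starts from an arbitrary $\theta$ and moves it toward a $\theta^*$ that makes $L$ small. The learning rate and step count are hand-picked for this tiny dataset and would likely need tuning on other data.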
<aside> ⚠️ Important — not all ML methods use optimization, although a great many do. The diagram above is from an optimization perspective, so we’ve placed optimization at the heart of the ML model.
</aside>