🎓 Intended learning outcomes
By the end of this lesson, students are expected to:
- Understand the vanishing gradient problem and how it is affected by the choice of activation function
- Know and understand the ReLU activation function and its variants, as well as other common activation functions
- Understand convolutions, how they can be used in neural networks, and why they are well suited to images
- Describe the degradation problem, the insight that led to residual connections, and what a residual connection is
- Understand and describe what pretraining and transfer learning are in the context of neural networks
- Understand and be able to use data augmentation to improve generalization
- Know what an autoencoder is, how it is structured, and how it can be used to learn good features from an unlabeled dataset
- Describe internal covariate shift, and how normalization techniques like batch norm help alleviate the problem
- Understand and be able to implement dropout in neural networks
Up to this point in the module, we have covered only standard fully-connected feed-forward neural networks. In that sense, our knowledge of neural networks is stuck in 1986, the year backpropagation for training such networks was popularized. While understanding basic fully-connected feed-forward neural networks is essential, there have been many exciting advances in recent years. In this lesson, we will cover some of the more modern components and architectures of neural networks.