In training neural networks using Stochastic Gradient Descent (SGD), the learning rate is a critical hyperparameter that influences the convergence behavior of the model. Observing oscillations in training and validation loss suggests that the learning rate may be too high, causing the optimization process to overshoot minima in the loss landscape.
Understanding the Impact of Learning Rate:
High Learning Rate: A high learning rate can cause the model parameters to update too aggressively, leading to oscillations or divergence in the loss function. This manifests as the loss decreasing for a few epochs and then increasing, repeating this cycle without stable convergence.
Low Learning Rate: A low learning rate results in smaller parameter updates, allowing the model to converge more steadily to a minimum, albeit potentially at a slower pace (see the sketch below).
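To make the contrast concrete, the following minimal sketch runs plain gradient descent on the one-dimensional quadratic loss f(w) = w², whose gradient is 2w. The toy setup and the specific step sizes are illustrative choices, not taken from the referenced sources: with a step size above 1.0 each update overshoots the minimum and the loss grows, while a small step size converges steadily.

```python
def gradient_descent(lr, w0=5.0, steps=10):
    """Run plain gradient descent on f(w) = w^2 and return the loss after each step."""
    w = w0
    losses = []
    for _ in range(steps):
        grad = 2.0 * w       # derivative of w^2
        w = w - lr * grad    # SGD-style parameter update
        losses.append(w ** 2)
    return losses

# lr = 1.1 overshoots: w flips sign each step and the loss blows up.
print([round(l, 1) for l in gradient_descent(lr=1.1)])
# lr = 0.1 takes smaller steps: the loss shrinks smoothly toward zero.
print([round(l, 4) for l in gradient_descent(lr=0.1)])
```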
Recommended Action:
Decreasing the learning rate produces smaller, more precise parameter updates, which smooths convergence and damps the oscillations in the loss. This helps the model settle into a minimum more reliably, improving overall performance.
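In practice this can be as simple as lowering the learning rate passed to the optimizer, optionally combined with a scheduler that reduces it further when the validation loss stops improving. The sketch below assumes PyTorch (the original answer names no framework), along with a placeholder model, dummy data, and the specific hyperparameter values shown; it uses torch.optim.SGD and torch.optim.lr_scheduler.ReduceLROnPlateau.

```python
import torch
import torch.nn as nn

# Placeholder model and data; substitute the real network and data loaders.
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
x, y = torch.randn(64, 10), torch.randn(64, 1)

# Start with a smaller learning rate (e.g. 0.01 instead of 0.1) ...
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# ... and cut it in half whenever the validation loss plateaus for 3 epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=3
)

for epoch in range(20):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

    val_loss = criterion(model(x), y).item()  # stand-in for a real validation pass
    scheduler.step(val_loss)                  # scheduler watches the validation loss
    print(f"epoch {epoch:2d}  train loss {loss.item():.4f}  "
          f"lr {optimizer.param_groups[0]['lr']:.5f}")
```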
Supporting Evidence:
Research indicates that large learning rates can lead to phenomena such as "catapults," where spikes in training loss occur due to aggressive updates. Reducing the learning rate mitigates these issues, promoting stable training dynamics.
References:
Catapults in SGD: Spikes in the Training Loss and Their Impact on Generalization Through Feature Learning
Lecture 7: Training Neural Networks, Part 2 – Stanford University
Conclusion:
To address oscillating training and validation loss during neural network training with SGD, decreasing the learning rate is an effective strategy. This adjustment facilitates smoother convergence and enhances the model's performance on the test set.