Another big takeaway is that training for more epochs leads to more memorization.
Should be expected. It's overfitting.
Should be expected. It's overfitting.