The most important rule in machine learning: never evaluate your model on the data it was trained on. A model that memorizes the training data will look perfect on it but fail on new data. This is called overfitting.
Split your data into two parts:

- A training set, which the model learns from.
- A test set, which is held out and used only for evaluation.
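A minimal sketch of such a split, assuming scikit-learn is available; the toy `X` and `y` here are illustrative, and `random_state` is fixed only for reproducibility:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy dataset: 100 samples with one feature (illustrative values)
X = np.arange(100).reshape(-1, 1)
y = 2 * X.ravel() + 1

# Hold out 20% of the data as the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), len(X_test))  # → 80 20
```

The test set must be set aside before any fitting happens; even looking at it while tuning the model leaks information.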
The test set simulates "new, unseen data." If the model performs well on it, you can be more confident it will generalize.
Overfitting happens when a model learns the noise in the training data, not just the signal. Signs:

- Training error is very low, often near zero.
- Test error is much higher, and may get worse as the model grows more complex.
The opposite problem is underfitting: the model is too simple to capture the underlying pattern, so both training and test error stay high.
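Both failure modes can be sketched by fitting polynomials of increasing degree to noisy data and comparing training and test error. This is a minimal sketch using NumPy's `polyfit`; the dataset, noise level, degrees, and every-third-point split are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples from a sine curve (illustrative data)
x = np.sort(rng.uniform(0, 3, 30))
y = np.sin(2 * x) + rng.normal(0, 0.2, 30)

# Simple split: hold out every third point for testing
test_mask = np.zeros(30, dtype=bool)
test_mask[::3] = True
x_tr, y_tr = x[~test_mask], y[~test_mask]
x_te, y_te = x[test_mask], y[test_mask]

results = {}
for degree in (1, 3, 12):
    # Fit a polynomial of this degree to the training points only
    coeffs = np.polyfit(x_tr, y_tr, degree)
    mse_tr = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
    mse_te = np.mean((np.polyval(coeffs, x_te) - y_te) ** 2)
    results[degree] = (mse_tr, mse_te)
    print(f"degree {degree:2d}: train MSE {mse_tr:.4f}, test MSE {mse_te:.4f}")
```

Degree 1 underfits (both errors high), while a high degree drives the training error toward zero without a matching drop in test error.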
Notice how degree 12 has nearly zero training error but much higher test error: that is overfitting. The model has memorized the training points but cannot generalize between them.