Bias-Variance Tradeoff

A model's expected prediction error can be decomposed into three components:

Error = Bias^2 + Variance + Irreducible Noise

Understanding this tradeoff is fundamental to building good models.
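The decomposition can be estimated numerically. Here is a minimal sketch, with every choice illustrative: assume a sine ground truth with known noise, train a straight line on many independent samples, and measure how far the average prediction sits from the truth (bias) and how much individual predictions scatter (variance).

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.3        # known noise level (the irreducible part)
x_test = 2.0       # estimate the decomposition at this single query point

preds = []
for _ in range(2000):
    x = rng.uniform(0, np.pi, 30)             # fresh training sample each round
    y = np.sin(x) + rng.normal(0, sigma, 30)  # noisy targets around sin(x)
    coefs = np.polyfit(x, y, deg=1)           # deliberately rigid: a straight line
    preds.append(np.polyval(coefs, x_test))

preds = np.array(preds)
bias_sq = (preds.mean() - np.sin(x_test)) ** 2  # (average prediction - truth)^2
variance = preds.var()                          # scatter across training sets
noise = sigma ** 2                              # irreducible floor
print(f"bias^2={bias_sq:.4f}  variance={variance:.4f}  noise={noise:.4f}")
```

For this rigid model the bias term dominates the variance term, which is exactly the underfitting signature described below.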

What is Bias?

Bias measures how far off the model's average prediction is from the true value. High bias means the model makes strong assumptions and misses patterns.

  • A linear model fit to curved data has high bias — it systematically underpredicts or overpredicts
  • High bias → underfitting
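The systematic nature of this error is easy to demonstrate. In this sketch (quadratic ground truth chosen purely for illustration), a straight-line fit to curved data leaves residuals with a predictable sign pattern instead of random scatter:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 100)
y = x ** 2 + rng.normal(0, 0.5, x.size)   # curved ground truth + mild noise

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# The line cannot bend: middle points sit below it, edge points sit above it.
print("mean residual, middle:", residuals[40:60].mean())
print("mean residual, edges: ", np.r_[residuals[:10], residuals[-10:]].mean())
```

A well-specified model would leave residuals centered on zero everywhere; the consistent signs here are bias, not noise.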
What is Variance?

Variance measures how much the model's predictions change if you train it on different data. High variance means the model is too sensitive to the specific training data.

  • A high-degree polynomial has high variance — small changes in training data cause wildly different predictions
  • High variance → overfitting
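This instability can be measured directly. A minimal sketch, again assuming a sine ground truth and illustrative sample sizes: refit both a rigid and a flexible polynomial on fresh 20-point samples and compare how much their predictions at one fixed point swing.

```python
import numpy as np

rng = np.random.default_rng(2)
x_test = 2.5  # fixed query point inside the sampled range [0, pi]

def prediction_spread(degree, n_repeats=200):
    """Refit on fresh 20-point samples; return the std of predictions at x_test."""
    preds = []
    for _ in range(n_repeats):
        x = rng.uniform(0, np.pi, 20)
        y = np.sin(x) + rng.normal(0, 0.3, 20)
        preds.append(np.polyval(np.polyfit(x, y, degree), x_test))
    return float(np.std(preds))

std_low = prediction_spread(1)    # rigid model: predictions barely move
std_high = prediction_spread(9)   # flexible model: predictions swing widely
print(f"degree 1 spread: {std_low:.3f}   degree 9 spread: {std_high:.3f}")
```

The spread of the degree-9 predictions dwarfs that of the degree-1 predictions, even though both saw equally informative data.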

The Tradeoff

  • Simple models (low degree) → High bias, low variance → underfitting
  • Complex models (high degree) → Low bias, high variance → overfitting
  • Best model → Right amount of complexity that minimizes total error

You can't minimize both simultaneously: reducing bias tends to increase variance, and vice versa. The goal is to find the sweet spot where their sum is smallest.
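The sweet spot can be located empirically. A minimal sketch, assuming a sine ground truth and a deliberately small training set so overfitting shows up quickly: training error can only fall as degree grows, while held-out error reflects the full bias-plus-variance cost.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, np.pi, 100)
y = np.sin(x) + rng.normal(0, 0.3, 100)
x_tr, y_tr = x[:15], y[:15]      # small training set: overfitting shows up fast
x_val, y_val = x[15:], y[15:]    # held-out set for honest error estimates

train_mse, val_mse = {}, {}
for degree in (1, 3, 9):
    coefs = np.polyfit(x_tr, y_tr, degree)
    train_mse[degree] = np.mean((np.polyval(coefs, x_tr) - y_tr) ** 2)
    val_mse[degree] = np.mean((np.polyval(coefs, x_val) - y_val) ** 2)
    print(f"degree {degree}: train={train_mse[degree]:.3f}  val={val_mse[degree]:.3f}")
```

Training error is a poor guide to the tradeoff: the degree-9 fit looks best on the training points while its held-out error tells the real story.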

How to Manage It

  • More data → reduces variance without increasing bias (usually the most effective fix, when more data is available)
  • Regularization → adds a penalty for model complexity (reduces variance)
  • Cross-validation → estimates the right complexity level
  • Ensemble methods → combine multiple models to reduce variance (Random Forest, Bagging)
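Cross-validation, for instance, takes only a few lines. This sketch reuses the same illustrative polynomial-regression setup (sine ground truth, hypothetical degree range 1–7) and picks the degree with the lowest average held-out error over 5 folds:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0, np.pi, 100)
y = np.sin(x) + rng.normal(0, 0.3, 100)
folds = np.array_split(rng.permutation(len(x)), 5)  # fixed 5-fold split

def cv_mse(degree):
    """Mean held-out MSE over the 5 folds for one candidate degree."""
    errs = []
    for i, val in enumerate(folds):
        tr = np.concatenate([f for j, f in enumerate(folds) if j != i])
        coefs = np.polyfit(x[tr], y[tr], degree)
        errs.append(np.mean((np.polyval(coefs, x[val]) - y[val]) ** 2))
    return float(np.mean(errs))

scores = {d: cv_mse(d) for d in range(1, 8)}
best = min(scores, key=scores.get)
print("selected degree:", best)
```

Because every point serves as held-out data exactly once, the selected degree reflects generalization error rather than training fit.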

Key Takeaways

  • Prediction error = bias^2 + variance + noise
  • Bias = systematic error from wrong assumptions (underfitting)
  • Variance = sensitivity to training data (overfitting)
  • The tradeoff: simpler models have more bias, complex models have more variance
  • More data and regularization help manage the tradeoff