Lycaeum — Education & Interview Prep for AI, ML & Quant

Unsupervised Learning

In supervised learning, you have labels. In unsupervised learning, you don't — you just have data, and the goal is to find hidden structure or patterns.

Common tasks:

Clustering — Group similar data points together (customer segments, document topics)

Dimensionality reduction — Compress high-dimensional data while preserving structure (PCA, t-SNE)

Anomaly detection — Find unusual data points (fraud detection, defect detection)
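To make the anomaly detection task concrete, here is a minimal sketch that flags values lying far from the mean, using only the standard library. The z-score rule, the threshold of 3.0, the function name `zscore_outliers`, and the transaction amounts are all assumptions chosen for illustration; real fraud or defect detection systems typically use more robust methods.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    Illustrative heuristic only: the threshold is an assumed default.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, so nothing stands out
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mean) / std > threshold
    ]

# Made-up transaction amounts: twelve ordinary values and one extreme one.
amounts = [20.5, 18.0, 22.3, 19.9, 21.1, 20.0, 18.7,
           23.4, 19.2, 21.8, 20.9, 19.5, 950.0]

flagged = zscore_outliers(amounts)
print(flagged)
```

No labels were needed: the extreme value is flagged only because it sits far from the rest of the data.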

Clustering with K-Means

You already know K-Means from the problem bank! Let's see it in action on a simple dataset.

Run the code to cluster 2D points into 3 groups:
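Here is a minimal, standard-library-only sketch of such an example. The synthetic blobs, the seed, and the helper names (`dist2`, `init_centroids`, `kmeans`) are assumptions made for illustration. The farthest-point initialisation is a simplification chosen to keep the run deterministic; standard K-Means usually starts from random or k-means++ centroids.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def init_centroids(points, k):
    """Farthest-point initialisation (an assumption made to keep this demo deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, n_iters=100):
    """Alternate between assigning points and updating centroids until nothing changes."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Synthetic 2D data: three blobs of 30 points each (made up for this sketch).
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [
    (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
    for cx, cy in blob_centers
    for _ in range(30)
]

labels, centroids = kmeans(points, k=3)
for j, (cx, cy) in enumerate(centroids):
    print(f"cluster {j}: {labels.count(j)} points, centroid=({cx:.2f}, {cy:.2f})")
```

The only inputs are the points and the number of clusters; the blob each point was drawn from is never passed to `kmeans`.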


Notice that we never told the algorithm which points belong to which group. We only chose the number of clusters (3), and it found the grouping on its own. That's the power of unsupervised learning.

When to Use Unsupervised Learning

You have lots of data but no labels (labels are expensive to create)

You want to explore and understand your data before building a supervised model

The task is inherently about finding structure (market segmentation, topic modeling)

Supervised vs. Unsupervised: A Comparison

	Supervised	Unsupervised
Data	Labeled (X, y)	Unlabeled (X only)
Goal	Predict y for new X	Find patterns in X
Evaluation	Compare predictions to true labels	Harder: no ground truth, so it relies on internal metrics and domain knowledge
Examples	Classification, regression	Clustering, PCA, anomaly detection
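Because there are no true labels to compare against, clusterings are often scored with internal measures instead. One widely used example is the silhouette coefficient, which compares how close each point is to its own cluster versus the nearest other cluster. The sketch below is a plain-Python illustration; the function name `silhouette_score` and the toy points are assumptions for this example, and a high score still doesn't guarantee the clusters are meaningful for your problem.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (ranges from -1 to 1)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other points in the same cluster
        # (the distance from p to itself contributes 0 to the sum)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the points of the nearest other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy data: two tight, well-separated groups (made up for this sketch).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

good_score = silhouette_score(points, good_labels)
bad_score = silhouette_score(points, bad_labels)
print(f"good: {good_score:.2f}, bad: {bad_score:.2f}")
```

A score near 1 means points sit much closer to their own cluster than to any other; scores near 0 or below suggest overlapping or poorly assigned clusters.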

Key Takeaways

Unsupervised learning finds patterns in data without labels

Clustering groups similar points together; K-Means is one of the simplest and most widely used approaches

Unsupervised learning is useful for exploration, segmentation, and when labels aren't available

Evaluation is harder than in supervised learning because there's no "correct answer" to compare against
