# 🛡️ λ-Guard

**Overfitting detection for Gradient Boosting — no validation set required.**

Understand when boosting stops learning signal and starts memorizing structure.

---

## ❓ Why λ-Guard
In Gradient Boosting, the visible symptoms of overfitting usually appear after the real problem has already started.
Before validation error increases, the model is already:
- splitting the feature space into extremely small regions
- fitting leaves supported by very few observations
- becoming sensitive to tiny perturbations
The model is not improving prediction anymore.
It is learning the shape of the training dataset.
λ-Guard detects that moment.
## 🧠 The intuition
A boosting model learns two different things at the same time:
| Component | What it does |
|-----------|--------------|
| Geometry  | partitions the feature space |
| Predictor | assigns values to each region |
Overfitting happens when:
> «the geometry keeps growing but the predictor stops gaining real information.»
So λ-Guard measures three signals:
- 📦 capacity → how complex the partition is
- 🎯 alignment → how much signal is extracted
- 🌊 stability → how fragile predictions are
## 🧩 Representation (the key object)
Every tree divides the feature space into leaves.
We record where each observation falls and build a binary matrix Z:
```
Z(i, j) = 1   if sample i falls inside leaf j
Z(i, j) = 0   otherwise
```
- Rows → observations
- Columns → all leaves across all trees
Think of Z as the representation learned by the ensemble.
- Linear regression → hat matrix `H`
- Boosting → representation matrix `Z`
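A minimal sketch of building `Z` with scikit-learn. The dataset, the estimator settings, and the use of `apply` plus one-hot encoding are illustrative assumptions (and `sparse_output` needs scikit-learn ≥ 1.2); λ-Guard itself does not prescribe them:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.preprocessing import OneHotEncoder

# Toy data and model (illustrative choices, not part of λ-Guard).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
model = GradientBoostingRegressor(n_estimators=100, max_depth=3,
                                  random_state=0).fit(X, y)

# apply() gives, for each sample, the leaf it lands in for every tree:
# shape (n_samples, n_trees).
leaves = model.apply(X)

# One-hot encode each tree's leaf column and stack the blocks side by side:
# Z gets one indicator column per (tree, leaf) pair.
Z = OneHotEncoder(sparse_output=False).fit_transform(leaves)
print(Z.shape)  # (200, total number of leaves across all trees)
```

Each tree contributes a block of mutually exclusive indicator columns, so every row of `Z` sums to the number of trees.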
## 📦 Capacity — structural complexity
```
C = Var(Z)
```
What it means:
- low C → the model uses few effective regions
- high C → the model fragments the space
When boosting keeps adding trees late in training, C grows fast.
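`Var(Z)` leaves the exact aggregation open; one plausible reading, used in this sketch, is the summed variance of the indicator columns:

```python
import numpy as np

def capacity(Z: np.ndarray) -> float:
    """One reading of C = Var(Z): summed variance of the indicator columns.

    Each column of Z is a 0/1 indicator with occupancy rate p, so its
    variance is p * (1 - p). Within one tree the column variances sum to
    1 - sum(p**2), which approaches 1 as that tree's partition fragments,
    and every added tree contributes another such term.
    """
    p = Z.mean(axis=0)                 # fraction of samples in each leaf
    return float(np.sum(p * (1.0 - p)))
```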
## 🎯 Alignment — useful information
```
A = Corr(f(X), y)
```

(or, as a cheaper proxy, the variance of the predictions)
- high A → trees add real predictive signal
- low A → trees mostly refine boundaries
Important behavior:
> «After some number of trees, alignment saturates.»
Boosting continues building structure even when prediction stops improving.
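A sketch of watching alignment saturate via staged predictions (the toy dataset and model settings are illustrative):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
model = GradientBoostingRegressor(n_estimators=500, max_depth=3,
                                  random_state=0).fit(X, y)

# Alignment after each boosting stage: it rises quickly, then flattens
# while the ensemble keeps adding leaves.
A = [np.corrcoef(pred, y)[0, 1] for pred in model.staged_predict(X)]
print(A[9], A[99], A[499])  # later values barely move
```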
## 🌊 Instability — sensitivity to perturbations
We slightly perturb inputs:
```
x' = x + ε,    ε ~ Normal(0, σ²)
```
and measure prediction change:
```
S = average( |f(x) − f(x')| ) / prediction_std
```
- low S → smooth model
- high S → brittle model
This is usually the first signal to blow up during overfitting.
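A sketch of the perturbation measure; the default σ is a placeholder, and the scaling follows the formula above:

```python
import numpy as np

def instability(model, X: np.ndarray, sigma: float = 0.05, seed: int = 0) -> float:
    """S = mean |f(x) - f(x')| / std(f(x)), with x' = x + Normal(0, sigma²).

    sigma should be small relative to the feature scales; 0.05 is only
    a placeholder default.
    """
    rng = np.random.default_rng(seed)
    f_x = model.predict(X)
    f_xp = model.predict(X + rng.normal(0.0, sigma, size=X.shape))
    return float(np.mean(np.abs(f_x - f_xp)) / f_x.std())
```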
## 🔥 The Overfitting Index
```
λ = ( C / (A + C) ) × S
```
Interpretation:
| Situation | λ |
|-----------|---|
| compact structure + stable predictions | low |
| many regions + weak signal | high |
| unstable predictions | very high |
λ measures:
> «how much structural complexity is wasted.»
(You can normalize λ to [0,1] for comparisons.)
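Putting the three signals together; the [0, 1] squashing at the end is one hypothetical normalization, not something the formula above prescribes:

```python
def overfitting_index(C: float, A: float, S: float) -> float:
    """λ = (C / (A + C)) × S: the share of capacity not backed by
    alignment, amplified by how fragile the predictions are."""
    return (C / (A + C)) * S

def normalized(lam: float) -> float:
    # One hypothetical squashing of λ into [0, 1] for cross-model comparison.
    return lam / (1.0 + lam)
```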
## 🧪 Structural Overfitting Test
We can also check if specific training points dominate the model.
Approximate leverage:
```
H_ii ≈ Σ_trees ( learning_rate / leaf_size )
```

where `leaf_size` is the number of training samples sharing observation *i*'s leaf in each tree.
This behaves like regression leverage.
We compute:
```
T1 = mean(H_ii)               # global complexity
T2 = max(H_ii) / mean(H_ii)   # local memorization
```
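A sketch of the leverage statistics; reading `leaf_size` as above is an assumption, and the data and model are the same illustrative setup as before:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
model = GradientBoostingRegressor(n_estimators=100, max_depth=3,
                                  random_state=0).fit(X, y)

def leverage(model, X: np.ndarray) -> np.ndarray:
    """H_ii ≈ Σ_trees learning_rate / leaf_size (leaf_size read as the
    number of samples that share sample i's leaf in that tree)."""
    leaves = model.apply(X)                        # (n_samples, n_trees)
    H = np.zeros(X.shape[0])
    for t in range(leaves.shape[1]):
        _, inv, counts = np.unique(leaves[:, t],
                                   return_inverse=True, return_counts=True)
        H += model.learning_rate / counts[inv]     # lr / size of own leaf
    return H

H = leverage(model, X)
T1 = H.mean()              # global complexity
T2 = H.max() / H.mean()    # local memorization
```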
### Bootstrap procedure
```
repeat B times:
    resample the training data
    recompute T1, T2
```
p-values:
```
p1 = P(T1_boot ≥ T1_obs)
p2 = P(T2_boot ≥ T2_obs)
```
Reject structural stability if:
```
p1 < α   OR   p2 < α
```
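A sketch of the full test; it reuses the hypothetical `leverage()` helper from the previous sketch, and `B`, `α`, and the refit-on-resample protocol are assumptions about the outline above:

```python
import numpy as np
from sklearn.base import clone

def structural_test(model, X, y, B=200, seed=0):
    """One-sided bootstrap p-values for T1 and T2.

    Refits a clone of the model on each resample and recomputes the
    statistics; leverage() is the helper sketched above.
    """
    rng = np.random.default_rng(seed)
    H = leverage(model, X)
    T1_obs, T2_obs = H.mean(), H.max() / H.mean()
    T1_boot, T2_boot = [], []
    n = X.shape[0]
    for _ in range(B):
        idx = rng.integers(0, n, size=n)            # resample with replacement
        m = clone(model).fit(X[idx], y[idx])
        Hb = leverage(m, X[idx])
        T1_boot.append(Hb.mean())
        T2_boot.append(Hb.max() / Hb.mean())
    p1 = float(np.mean(np.asarray(T1_boot) >= T1_obs))
    p2 = float(np.mean(np.asarray(T2_boot) >= T2_obs))
    return p1, p2

alpha = 0.05
p1, p2 = structural_test(model, X, y)
print("reject structural stability:", p1 < alpha or p2 < alpha)
```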
## 📊 What λ-Guard distinguishes

| Regime | Meaning |
|--------|---------|
| ✅ Stable | smooth generalization |
| 📈 Global overfitting | too many effective parameters |
## 🧭 When to use
- monitoring boosting while trees are added (see the sketch after this list)
- hyperparameter tuning
- small datasets (no validation split)
- diagnosing late-stage performance collapse
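For the monitoring use case, a self-contained sketch that combines the pieces above and tracks λ stage by stage (every numeric setting is an illustrative assumption):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.preprocessing import OneHotEncoder

X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
model = GradientBoostingRegressor(n_estimators=300, max_depth=3,
                                  random_state=0).fit(X, y)

rng = np.random.default_rng(0)
X_pert = X + rng.normal(0.0, 0.05, size=X.shape)   # σ = 0.05, placeholder
leaves = model.apply(X)                            # (n_samples, n_trees)

lam = []
for m, (f, f_p) in enumerate(zip(model.staged_predict(X),
                                 model.staged_predict(X_pert)), start=1):
    Z = OneHotEncoder(sparse_output=False).fit_transform(leaves[:, :m])
    p = Z.mean(axis=0)
    C = np.sum(p * (1.0 - p))                  # capacity of first m trees
    A = np.corrcoef(f, y)[0, 1]                # alignment at stage m
    S = np.mean(np.abs(f - f_p)) / f.std()     # instability at stage m
    lam.append((C / (A + C)) * S)

# λ rising while alignment stays flat flags late-stage structural overfitting.
print(lam[9], lam[99], lam[299])
```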
## 🧾 Conceptual summary

- `Z` → learned representation
- `C` → structural dimensionality
- `A` → extracted signal
- `S` → smoothness
- `λ` → structural overfitting
Overfitting = structure grows faster than information.
## 📜 License
MIT