In machine learning, models often face a common problem: they perform well on training data but poorly on unseen data. This phenomenon is known as overfitting, and one of the most effective techniques for combating it is regularization.
Regularization is a technique that adds a penalty term to the loss function to discourage the model from becoming overly complex. By penalizing large weights, regularization encourages simpler models that generalize better to new data. The common variants are summarized in the table below, followed by a short sketch of how each penalty is computed.
Type | Penalty Term | Effect |
---|---|---|
L1 (Lasso) | `λ * Σ\|w\|` | Drives some weights to exactly zero (implicit feature selection) |
L2 (Ridge) | `λ * Σ(w²)` | Shrinks weights uniformly toward zero |
Elastic Net | `λ1 * Σ\|w\| + λ2 * Σ(w²)` | Combines the L1 and L2 penalties |
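To make the penalty terms concrete, here is a minimal NumPy sketch of how each one would be computed from a weight vector (the `weights`, `lam`, `lam1`, and `lam2` names are purely illustrative):

```python
import numpy as np

weights = np.array([0.5, -1.2, 3.0, 0.0])  # example weight vector
lam, lam1, lam2 = 0.1, 0.05, 0.05          # illustrative regularization strengths

l1_penalty = lam * np.sum(np.abs(weights))   # L1 (Lasso): λ * Σ|w|
l2_penalty = lam * np.sum(weights ** 2)      # L2 (Ridge): λ * Σ(w²)
elastic_net_penalty = (lam1 * np.sum(np.abs(weights))
                       + lam2 * np.sum(weights ** 2))  # Elastic Net

# During training, the chosen penalty is added to the base loss (e.g., MSE)
```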
Let’s walk through an example using scikit-learn.
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
```
First, generate a synthetic dataset and hold out a test set:

```python
# Create a noisy synthetic regression problem with 20 features
X, y = make_regression(n_samples=100, n_features=20, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
Next, fit an unregularized baseline alongside Ridge and Lasso:

```python
# No regularization
lr = LinearRegression().fit(X_train, y_train)
# L2 regularization (Ridge)
ridge = Ridge(alpha=1.0).fit(X_train, y_train)
# L1 regularization (Lasso)
lasso = Lasso(alpha=0.1).fit(X_train, y_train)
```
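The `alpha` values above are fixed by hand. In practice you would usually tune the regularization strength; a minimal sketch using scikit-learn's cross-validated estimators (the alpha grid below is purely illustrative):

```python
from sklearn.linear_model import RidgeCV, LassoCV

# Pick the regularization strength by cross-validation over a small grid
ridge_cv = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0]).fit(X_train, y_train)
lasso_cv = LassoCV(alphas=[0.01, 0.1, 1.0, 10.0], cv=5).fit(X_train, y_train)

print(f"Best Ridge alpha: {ridge_cv.alpha_}")
print(f"Best Lasso alpha: {lasso_cv.alpha_}")
```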
Returning to the three fixed-alpha models, compare their test-set error:

```python
# Evaluate each model on the held-out test set
models = {'Linear Regression': lr, 'Ridge': ridge, 'Lasso': lasso}
for name, model in models.items():
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print(f"{name} MSE: {mse:.2f}")
```
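Because the L1 penalty can drive coefficients exactly to zero, it is also worth checking how many features Lasso pruned. A quick check on the `lasso` model fitted above:

```python
# Count coefficients Lasso drove exactly to zero (implicit feature selection)
n_zero = np.sum(lasso.coef_ == 0)
print(f"Lasso zeroed out {n_zero} of {lasso.coef_.size} coefficients")
```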
Finally, plot the fitted coefficients to visualize each penalty's effect:

```python
plt.figure(figsize=(12, 6))
plt.plot(lr.coef_, label='Linear Regression', marker='o')
plt.plot(ridge.coef_, label='Ridge (L2)', marker='x')
plt.plot(lasso.coef_, label='Lasso (L1)', marker='s')
plt.title("Model Coefficients Comparison")
plt.xlabel("Feature Index")
plt.ylabel("Coefficient Value")
plt.legend()
plt.grid(True)
plt.show()
```