Regularization in Machine Learning: Concept and Code Example

In machine learning, models often face a common problem: they perform well on training data but poorly on unseen data. This phenomenon is known as overfitting. One of the most effective techniques to overcome it is Regularization.

What is Regularization?

Regularization is a technique that adds a penalty to the loss function to discourage the model from becoming too complex. By penalizing large weights, regularization helps build simpler models that generalize better on new data.
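
To make the idea of "adding a penalty to the loss function" concrete, here is a minimal NumPy sketch. The function name `penalized_mse`, the `lam` parameter, and the toy data are illustrative assumptions, not part of scikit-learn or of the example later in this article.

```python
import numpy as np

def penalized_mse(X, y, w, lam=1.0, kind="l2"):
    """Mean squared error plus an L1 or L2 penalty on the weights w.

    Hypothetical helper for illustration only.
    """
    residuals = X @ w - y
    mse = np.mean(residuals ** 2)
    if kind == "l1":
        penalty = lam * np.sum(np.abs(w))   # lam * sum(|w|)
    elif kind == "l2":
        penalty = lam * np.sum(w ** 2)      # lam * sum(w^2)
    else:
        raise ValueError("kind must be 'l1' or 'l2'")
    return mse + penalty

# Toy data chosen so that w fits exactly and the plain MSE is zero.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 2.0])
w = np.array([1.0, 2.0])

print(penalized_mse(X, y, w, lam=0.0))             # data term only
print(penalized_mse(X, y, w, lam=0.1, kind="l1"))  # adds 0.1 * (1 + 2)
print(penalized_mse(X, y, w, lam=0.1, kind="l2"))  # adds 0.1 * (1 + 4)
```

Even with a perfect fit, the penalized loss grows with the size of the weights, which is what pushes the optimizer toward smaller coefficients.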

Why Use Regularization?

  • Reduces overfitting
  • Improves generalization
  • Encourages simpler models
  • Helps in feature selection (especially with L1)

Types of Regularization

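Type          Penalty Term              Effect
L1 (Lasso)    λ * Σ|w|                  Shrinks weights and can set some exactly to zero
L2 (Ridge)    λ * Σ(w²)                 Shrinks all weights toward zero without eliminating them
Elastic Net   λ1 * Σ|w| + λ2 * Σ(w²)    Combines the L1 and L2 penalties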
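
Elastic Net is not covered by the walkthrough in this article, so here is a short sketch of how it might be tried with scikit-learn's `ElasticNet`. Note that scikit-learn does not take λ1 and λ2 directly: it uses `alpha` for the overall penalty strength and `l1_ratio` for the L1/L2 mix. The values `alpha=0.1` and `l1_ratio=0.5` are arbitrary assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data with the same settings used in the walkthrough in this article.
X, y = make_regression(n_samples=100, n_features=20, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# alpha: overall penalty strength; l1_ratio=0.5 mixes the L1 and L2 terms evenly.
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X_train, y_train)

enet_mse = mean_squared_error(y_test, enet.predict(X_test))
print(f"Elastic Net MSE: {enet_mse:.2f}")
```

Setting `l1_ratio=1.0` makes the penalty purely L1, while values near 0 make it behave more like Ridge.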

Code Example: Ridge and Lasso Regularization

Let’s walk through an example using scikit-learn.

Step 1: Import Libraries

Python

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

Step 2: Generate Dataset

Python

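# Synthetic regression problem: 100 samples, 20 features
# (make_regression marks 10 of them as informative by default)
X, y = make_regression(n_samples=100, n_features=20, noise=10, random_state=42)

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)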

Step 3: Train Models

Python

# No Regularization
lr = LinearRegression().fit(X_train, y_train)

# L2 Regularization (Ridge)
ridge = Ridge(alpha=1.0).fit(X_train, y_train)

# L1 Regularization (Lasso)
lasso = Lasso(alpha=0.1).fit(X_train, y_train)
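
The `alpha` values above are fixed by hand. In practice the penalty strength is usually tuned, and one possible approach is scikit-learn's built-in cross-validated estimators `RidgeCV` and `LassoCV`. The candidate grid below (`np.logspace(-3, 2, 30)`) and `max_iter=10000` are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=20, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate penalty strengths from 0.001 to 100 on a log scale (assumed grid).
alphas = np.logspace(-3, 2, 30)

# Each estimator picks the alpha with the best cross-validated score.
ridge_cv = RidgeCV(alphas=alphas).fit(X_train, y_train)
lasso_cv = LassoCV(alphas=alphas, cv=5, max_iter=10000).fit(X_train, y_train)

print(f"Ridge chose alpha = {ridge_cv.alpha_:.4f}")
print(f"Lasso chose alpha = {lasso_cv.alpha_:.4f}")
```

Both fitted objects can then be used like the `ridge` and `lasso` models above, for example with `predict` and `coef_`.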

Step 4: Evaluate Models

Python

models = {'Linear Regression': lr, 'Ridge': ridge, 'Lasso': lasso}

for name, model in models.items():
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print(f"{name} MSE: {mse:.2f}")
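
Test MSE alone does not show overfitting. A simple check is to compare training and test error for each model: a large gap suggests the model has fit noise in the training set. The sketch below repeats the setup from the earlier steps so it runs on its own; the `results` dictionary is an illustrative name.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=20, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[name] = (train_mse, test_mse)
    print(f"{name}: train MSE = {train_mse:.2f}, test MSE = {test_mse:.2f}")
```

Unregularized least squares always has the lowest training error of the three, since it minimizes exactly that quantity. Whether regularization helps is decided by the test column.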

Step 5: Visualize Coefficients

Python

plt.figure(figsize=(12, 6))
plt.plot(lr.coef_, label='Linear Regression', marker='o')
plt.plot(ridge.coef_, label='Ridge (L2)', marker='x')
plt.plot(lasso.coef_, label='Lasso (L1)', marker='s')
plt.title("Model Coefficients Comparison")
plt.xlabel("Feature Index")
plt.ylabel("Coefficient Value")
plt.legend()
plt.grid(True)
plt.show()

Takeaways

  • Linear Regression fits every feature with no penalty, so its coefficients are unconstrained.
  • Ridge (L2) regularization shrinks weights toward zero to reduce overfitting but does not set them exactly to zero.
  • Lasso (L1) not only shrinks weights but can zero out irrelevant features, making it useful for feature selection.
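
These takeaways can be checked numerically by counting exact zeros in each model's coefficients. The sketch below refits the three models and adds a more strongly penalized Lasso; `alpha=5.0` is an assumption chosen only to make the sparsity effect easier to see, and how many coefficients Lasso zeroes out depends on `alpha` and on the data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=20, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

fitted = {
    "Linear Regression": LinearRegression().fit(X_train, y_train),
    "Ridge (alpha=1.0)": Ridge(alpha=1.0).fit(X_train, y_train),
    "Lasso (alpha=0.1)": Lasso(alpha=0.1).fit(X_train, y_train),
    # Stronger penalty, assumed here purely for illustration.
    "Lasso (alpha=5.0)": Lasso(alpha=5.0).fit(X_train, y_train),
}

zero_counts = {}
coef_norms = {}
for name, model in fitted.items():
    zero_counts[name] = int(np.sum(model.coef_ == 0))       # exact zeros
    coef_norms[name] = float(np.linalg.norm(model.coef_))   # overall weight size
    print(f"{name}: {zero_counts[name]} zero coefficients, "
          f"L2 norm = {coef_norms[name]:.2f}")
```

Ridge reduces the overall size of the weight vector relative to plain least squares, while exact zeros appear only in the Lasso rows.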

About Author

Jayanti Katariya is the CEO of Moon Technolabs, a fast-growing IT solutions provider, with 18+ years of experience in the industry. Passionate about developing creative apps from a young age, he pursued an engineering degree to further this interest. Under his leadership, Moon Technolabs has helped numerous brands establish their online presence, and he has also launched invoicing software that helps businesses streamline their financial operations.