Regression Model

Created on Aug 03, 2025, Last Updated on Sep 14, 2025, By a Developer

Linear Regression


There are multiple error metrics available:

  • Mean Absolute Error (MAE)
  • Mean Absolute Percentage Error (MAPE)
  • Mean Squared Error (MSE)
  • Root Mean Squared Error (RMSE)
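
As a point of reference, here is a minimal NumPy sketch of the four metrics above (the toy arrays are placeholders):

    import numpy as np

    def regression_metrics(y_true, y_pred):
        # Illustrative implementations of the four error metrics listed above.
        err = y_true - y_pred
        mae = np.mean(np.abs(err))                  # Mean Absolute Error
        mape = np.mean(np.abs(err / y_true)) * 100  # Mean Absolute Percentage Error (y_true must be non-zero)
        mse = np.mean(err ** 2)                     # Mean Squared Error
        rmse = np.sqrt(mse)                         # Root Mean Squared Error
        return mae, mape, mse, rmse

    print(regression_metrics(np.array([3.0, 5.0, 2.5]), np.array([2.5, 5.0, 3.0])))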

The Least Squares method is usually used to estimate the coefficients/weights of the model. The name indicates that it finds the weights that minimize the sum of squared errors (equivalently, the MSE), not that the MSE reaches zero.
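
A minimal sketch of a least-squares fit with NumPy; the toy data and coefficients below are placeholders for illustration only:

    import numpy as np

    # Toy data: y is roughly 2.0 * x + 1.0 plus noise (placeholder values).
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, size=50)
    y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=50)

    # Design matrix with an intercept column.
    X = np.column_stack([np.ones_like(x), x])

    # Least squares: find w that minimizes ||y - X @ w||^2.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    print("intercept, slope:", w)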

Polynomial Regression


Plain linear regression is rarely sufficient, because the relationship between features and target is often non-linear. Instead of using a feature X only as X, higher powers of the feature can be added, such as X**2 and X**3.

Apart from powers of a single feature, combinations (interaction terms) of different features, such as X1*X2, are also useful in many cases.
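
One way to generate both kinds of terms is a polynomial feature expansion. A minimal sketch using scikit-learn's PolynomialFeatures, assuming scikit-learn is available (the toy matrix and feature names are placeholders):

    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures  # assumes scikit-learn is installed

    # Two toy features; the values are placeholders.
    X = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])

    # degree=2 expands [x1, x2] into [x1, x2, x1^2, x1*x2, x2^2],
    # covering both powers of a single feature and feature combinations.
    poly = PolynomialFeatures(degree=2, include_bias=False)
    X_poly = poly.fit_transform(X)
    print(poly.get_feature_names_out(["x1", "x2"]))
    print(X_poly)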

Measurement


While model performance can still be analyzed in terms of Bias and Variance, classic regression differs from Deep Learning: deep models largely learn feature representations on their own, whereas classic regression puts a lot of emphasis on feature selection.

Model Performance


To measure how well the model fits the training data, R-Squared is one option: the fraction of the target's variance explained by the model.
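
A minimal sketch of R-Squared computed by hand (the toy arrays are placeholders):

    import numpy as np

    def r_squared(y_true, y_pred):
        # R^2 = 1 - SS_res / SS_tot: the fraction of variance explained by the model.
        ss_res = np.sum((y_true - y_pred) ** 2)
        ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
        return 1.0 - ss_res / ss_tot

    print(r_squared(np.array([1.0, 2.0, 3.0]), np.array([1.1, 1.9, 3.2])))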

Coefficient Significance


To measure whether the coefficients/weights are significant (see the sketch after this list):

  • Standard Error (SE): describes the uncertainty (spread) of the coefficient estimate.
  • t-value: indicates how large the coefficient is relative to its uncertainty, i.e. the coefficient divided by its SE.
  • p-value: if the true coefficient were zero, how often would a t-value this extreme be observed? A large p-value (usually > 0.05) indicates the coefficient is consistent with noise, meaning it is not significant.
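
A minimal sketch of reading these quantities off an ordinary least squares fit, assuming the statsmodels library is available (the toy data is a placeholder):

    import numpy as np
    import statsmodels.api as sm  # assumes statsmodels is installed

    # Toy data: y depends on x1 but not on x2 (placeholder values).
    rng = np.random.default_rng(0)
    x1 = rng.normal(size=100)
    x2 = rng.normal(size=100)
    y = 3.0 * x1 + rng.normal(size=100)

    X = sm.add_constant(np.column_stack([x1, x2]))  # add an intercept column
    result = sm.OLS(y, X).fit()

    print(result.bse)      # standard errors of the coefficients
    print(result.tvalues)  # t-values
    print(result.pvalues)  # p-values; x2's should be large (not significant)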

Correlated Features


Correlated features means multiple features carry redundant information, share a confounder (a common cause), or are causally related (one drives the other, possibly indirectly).

Variance Inflation Factor (VIF) is a metric to detect collinearity among features. For feature i, VIF_i = 1 / (1 - R_i^2), where R_i^2 is the R-Squared obtained by regressing feature i on all the other features. When VIF is large (usually bigger than 5 or 10), the feature is collinear with the other features.
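
A minimal sketch of computing VIF, assuming the statsmodels library is available (the toy collinear data is a placeholder):

    import numpy as np
    from statsmodels.stats.outliers_influence import variance_inflation_factor  # assumes statsmodels

    # Toy design: x3 is almost a copy of x1 + x2 (placeholder values).
    rng = np.random.default_rng(0)
    x1 = rng.normal(size=200)
    x2 = rng.normal(size=200)
    x3 = x1 + x2 + rng.normal(scale=0.1, size=200)

    # Include an intercept column; compute VIF only for the real features.
    X = np.column_stack([np.ones(200), x1, x2, x3])
    for i in range(1, X.shape[1]):
        # VIF_i = 1 / (1 - R_i^2); large values flag collinear features.
        print(f"x{i}: VIF = {variance_inflation_factor(X, i):.1f}")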

Feature Selection


  • Forward Selection: start from zero features, and repeatedly add the feature that most improves R-Squared (see the sketch after this list).
  • Backward Selection: start from all features, and repeatedly remove the feature with the largest p-value until all remaining p-values fall below some threshold.
  • Mixed Selection: a combination of the two approaches above.
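
A minimal sketch of greedy forward selection, assuming statsmodels is available; the toy data and the stopping rule (a fixed max_features) are simplifications:

    import numpy as np
    import statsmodels.api as sm  # assumes statsmodels is installed

    def forward_selection(X, y, max_features):
        # Greedy forward selection: at each step add the candidate feature that
        # yields the highest R-Squared together with the already-selected features.
        n_features = X.shape[1]
        selected = []
        while len(selected) < max_features:
            best_r2, best_j = -np.inf, None
            for j in range(n_features):
                if j in selected:
                    continue
                design = sm.add_constant(X[:, selected + [j]])
                r2 = sm.OLS(y, design).fit().rsquared
                if r2 > best_r2:
                    best_r2, best_j = r2, j
            selected.append(best_j)
            print(f"added feature {best_j}, R^2 = {best_r2:.3f}")
        return selected

    # Toy data: only the first two of four features matter (placeholder values).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))
    y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=200)
    forward_selection(X, y, max_features=2)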

© 2024-present Zane Chen. All Rights Reserved.