Regression Model
Created on Aug 03, 2025, Last Updated on Sep 14, 2025, By a Developer
Linear Regression
There are multiple error metrics available:
- Mean Absolute Error (MAE)
- Mean Absolute Percentage Error (MAPE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
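As a rough illustration of how the four metrics above relate, here is a minimal NumPy sketch. The function name `regression_errors` and the sample numbers are made up for this example.

```python
import numpy as np

def regression_errors(y_true, y_pred):
    """Return MAE, MAPE, MSE and RMSE for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    mse = np.mean(residual ** 2)
    return {
        "MAE": np.mean(np.abs(residual)),
        # MAPE is undefined when any true value is zero.
        "MAPE": np.mean(np.abs(residual / y_true)) * 100,
        "MSE": mse,
        "RMSE": np.sqrt(mse),
    }

# Hypothetical true values and predictions.
errors = regression_errors([100, 200, 400], [110, 180, 400])
print(errors)
```

MAE and RMSE are in the same unit as the target, MSE is in squared units, and MAPE is a percentage, which is why the four numbers are not directly comparable with each other.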
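The Least Squares Method is usually used to estimate the coefficients (weights) of the model. As the name indicates, it chooses the weights that minimize the sum of squared residuals (equivalently, the MSE). The minimum is found where the gradient of the MSE with respect to the weights is zero, not where the MSE itself is zero.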
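A small sketch of a least-squares fit, assuming NumPy and toy data invented for this example. It uses `np.linalg.lstsq` to find the weights that minimize the sum of squared residuals, then checks that the gradient of the MSE vanishes at that solution.

```python
import numpy as np

# Toy data generated from y = 1 + 2x with no noise (an assumption for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])

# Least-squares solution: the weights that minimize the sum of squared residuals.
weights, *_ = np.linalg.lstsq(X, y, rcond=None)

# Gradient of the MSE with respect to the weights, evaluated at the solution.
gradient = -2.0 / len(y) * X.T @ (y - X @ weights)

print(weights)   # intercept and slope
print(gradient)  # numerically zero at the minimum
```

With noisy data the minimized MSE would be greater than zero, but the gradient at the fitted weights would still be (numerically) zero.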
Polynomial Regression
Linear regression is rarely sufficient on its own, because the relationship between features and target is non-linear in many cases. Instead of using only X as a feature, higher powers of the feature can be added, such as X**2 and X**3. The model stays linear in its weights, so it can still be fitted with least squares.
Apart from powers of a single feature, combinations of different features (for example the product X1*X2) are also useful in many cases.
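One way to picture this is to build the extra columns by hand and reuse an ordinary least-squares fit. The sketch below assumes NumPy and two hypothetical features `x1` and `x2`. The helper `polynomial_features` and the data are invented for illustration.

```python
import numpy as np

def polynomial_features(x1, x2):
    """Build degree-2 features from two input columns, including their product."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.column_stack([
        np.ones_like(x1),  # intercept
        x1, x2,            # original features
        x1 ** 2, x2 ** 2,  # powers of a single feature
        x1 * x2,           # combination of two features
    ])

# Hypothetical data: the target is non-linear in the raw features.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 3.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0, 2.0, 4.0, 5.0])
y = 2.0 + 0.5 * x1 ** 2 + 3.0 * x1 * x2

features = polynomial_features(x1, x2)
weights, *_ = np.linalg.lstsq(features, y, rcond=None)
predictions = features @ weights
```

The fit is still a linear regression. Only the feature list changed, so the same solver applies.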
Measurement
While model performance can still be described in terms of variance and bias, classic regression models put a lot of emphasis on feature selection, unlike Deep Learning, where the model depends far less on hand-picked features.
Model Performance
To measure how well the model fits the training data, R-Squared is one option: the share of the target's variance that the model explains.
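A minimal sketch of the usual definition, R-Squared = 1 - SS_res / SS_tot, assuming NumPy. The function name `r_squared` and the sample values are illustrative only.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R-squared = 1 - (residual sum of squares) / (total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y_true, y_true)       # predictions match exactly
baseline = r_squared(y_true, [2.5] * 4)   # always predicting the mean
```

A perfect fit gives 1, and a model that always predicts the mean of the target gives 0, which is why R-Squared can be read as an improvement over that baseline.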
Coefficient Significance
To measure if the coefficient(s)/weight(s) are significant:
- Standard Error (SE): the estimated standard deviation of a coefficient estimate, that is, how much the coefficient would vary if the model were refit on different samples.
- t-value: the coefficient divided by its standard error, t = coefficient / SE. It indicates how strong the coefficient is compared to its uncertainty.
- p-value: if the true coefficient were zero, how often would a t-value this extreme be observed? A large p-value (usually > 0.05) indicates the coefficient is consistent with noise, meaning it is not significant.
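The three quantities above can be sketched for an ordinary least-squares fit as follows. This assumes NumPy and synthetic data. The function name `coefficient_stats` is invented here, and the p-value uses a normal approximation to the t-distribution, which is only reasonable when there are many more samples than coefficients. A statistics library would normally use the exact t-distribution.

```python
import math
import numpy as np

def coefficient_stats(X, y):
    """Return (weights, standard errors, t-values, approximate p-values) for OLS.

    X must already contain a column of ones if an intercept is wanted.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ weights
    sigma2 = residual @ residual / (n - k)   # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)    # covariance of the weight estimates
    se = np.sqrt(np.diag(cov))
    t = weights / se
    # ASSUMPTION: two-sided p-value from a normal approximation (large n - k).
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in t])
    return weights, se, t, p

# Synthetic data: y depends on x_signal but not on x_noise.
rng = np.random.default_rng(0)
n = 500
x_signal = rng.normal(size=n)
x_noise = rng.normal(size=n)
y = 3.0 * x_signal + rng.normal(size=n)
X = np.column_stack([np.ones(n), x_signal, x_noise])

weights, se, t, p = coefficient_stats(X, y)
```

In this setup the coefficient of `x_signal` should get a large t-value and a p-value near zero, while the coefficient of `x_noise` would typically stay close to zero relative to its standard error.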
Correlated Features
Correlated features are features that carry redundant information, share a confounder (a common cause), or are causally linked (one affects the other, possibly indirectly).
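Variance Inflation Factor (VIF) is a metric for detecting collinearity among features: VIF_j = 1 / (1 - R2_j), where R2_j is the R-Squared obtained by regressing feature j on all the other features.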
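A sketch of VIF computed with its standard definition, 1 / (1 - R-Squared), where the R-Squared comes from regressing one feature on all the others. It assumes NumPy. The function name `vif` and the synthetic columns are made up for this example.

```python
import numpy as np

def vif(X):
    """Variance Inflation Factor per column of X (X has no intercept column)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    result = []
    for j in range(k):
        target = X[:, j]
        # Regress feature j on an intercept plus all the other features.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        w, *_ = np.linalg.lstsq(others, target, rcond=None)
        ss_res = np.sum((target - others @ w) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        r2 = 1.0 - ss_res / ss_tot
        result.append(1.0 / (1.0 - r2))
    return np.array(result)

# Synthetic features: b is almost a copy of a, c is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.1 * rng.normal(size=500)
c = rng.normal(size=500)
values = vif(np.column_stack([a, b, c]))
```

A VIF close to 1 means the feature is barely explained by the others. A commonly quoted rule of thumb treats values above roughly 5 to 10 as a sign of problematic collinearity, though the exact cutoff is a judgment call.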
Feature Selection
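- Forward Selection: Start with no features, and keep adding the feature that most improves R-Squared until the improvement becomes negligible.
- Backward Selection: Start with all features, and keep removing the feature with the largest p-value until every remaining p-value is below some threshold.
- Mixed Selection: A combination of the two: add features as in forward selection, and remove any feature whose p-value rises above the threshold along the way.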
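Forward selection can be sketched as a greedy loop. The code below assumes NumPy and synthetic data. The helper names and the `min_gain` stopping rule are assumptions for illustration, not a standard API. Backward selection would follow the same pattern in reverse, dropping the weakest feature at each step.

```python
import numpy as np

def fit_r2(X, y):
    """R-squared of an OLS fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    w, *_ = np.linalg.lstsq(design, y, rcond=None)
    ss_res = np.sum((y - design @ w) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def forward_selection(X, y, min_gain=0.01):
    """Greedily add the column that raises R-squared most.

    Stops when the best remaining column improves R-squared by less than
    min_gain (a hypothetical stopping rule chosen for this sketch).
    """
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = {j: fit_r2(X[:, selected + [j]], y) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_r2 < min_gain:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best_r2 = scores[j_best]
    return selected, best_r2

# Synthetic data: only columns 0 and 2 influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + 0.5 * rng.normal(size=300)

selected, best_r2 = forward_selection(X, y)
```

Because R-Squared on the training data never decreases when a feature is added, some explicit stopping rule is needed. Criteria such as adjusted R-Squared or a held-out validation score are common alternatives to a fixed gain threshold.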