ZANE.C

Machine Learning Basic

Created on Aug 02, 2025, Last Updated on Sep 02, 2025, By a Developer

Concepts


Supervised Learning & Unsupervised Learning


The key difference between Supervised Learning and Unsupervised Learning is that Supervised Learning is given both data and labels, while Unsupervised Learning is given only data. A supervised learning model tries to figure out the rule that maps input to output, while an unsupervised model tries to find structure or characteristics in the given data.
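The contrast above can be sketched in a few lines of plain Python. This is a minimal illustration with made-up 1-D data: the supervised fit assumes a least-squares rule y = w * x, and the "unsupervised" step is just a crude mean-threshold split standing in for clustering.

```python
# Supervised: both inputs and labels are given; learn the input->output rule.
# Unsupervised: only inputs are given; find structure (here, two groups).

# --- Supervised: least-squares fit of y = w * x (labels ys are provided) ---
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]                      # labels generated by the rule y = 2x
w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# --- Unsupervised: split inputs into 2 groups with no labels at all ---
data = [0.1, 0.2, 0.15, 5.0, 5.2, 4.9]
mean = sum(data) / len(data)
clusters = [0 if v < mean else 1 for v in data]

print(w)         # recovers the rule: w == 2.0
print(clusters)  # discovered grouping: [0, 0, 0, 1, 1, 1]
```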

Gradient Descent


Based on the model evaluation result, we compute the gradient of the loss function at the current parameters, then update each parameter by subtracting learning_rate * gradient, which moves the parameters in the direction that decreases the loss.
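A minimal sketch of that update rule, fitting a single weight w in y = w * x under squared-error loss. The data, learning rate, and step count are all illustrative choices, not prescriptions.

```python
# Minimal gradient descent: fit w in y = w * x to data generated with w = 2.0,
# using mean squared-error loss L = mean((w*x - y)^2).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # generated with the true weight 2.0

w = 0.0                     # initial parameter
learning_rate = 0.01

for _ in range(500):
    # Gradient of the loss with respect to w: dL/dw = mean(2 * (w*x - y) * x)
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # Modify the parameter by learning_rate * gradient, against the slope.
    w = w - learning_rate * grad

print(round(w, 3))  # converges toward 2.0
```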

Regression & Classification Problem


Regression tries to derive a continuous value from the given input, while classification puts the input into one of a set of buckets (categories).

Example:

  • Using the location and size of a house to predict its price -> Regression.
  • Using a lung CT image to tell which disease the lung might have -> Classification.
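The house example can be made concrete: the same inputs feed both problem types, but regression emits a number while classification emits a bucket. The weights and the 200,000 threshold below are made up for illustration, not learned from data.

```python
def predict_price(size_sqft, location_score):
    # Regression: the output is a continuous value (a price).
    # Coefficients are illustrative placeholders.
    return 100.0 * size_sqft + 5000.0 * location_score

def predict_expensive(size_sqft, location_score):
    # Classification: the output is one of a set of buckets.
    price = predict_price(size_sqft, location_score)
    return "expensive" if price > 200000 else "cheap"

print(predict_price(2000, 8))      # a number: 240000.0
print(predict_expensive(2000, 8))  # a bucket: "expensive"
```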

Model Performance


Activation Function


  • Why do we need an Activation Function?
    • Without activation functions, each layer of a Neural Network degrades to a linear map, and a composition of linear maps is still linear, so the entire model degrades to a single linear regression no matter how many layers it has.
  • Activation Function Options:
    • Linear / None
      • Used only on the output layer in some cases (e.g. regression outputs); very rare otherwise.
    • Sigmoid
      • Binary Classification only
    • Tanh
      • Alternative to Sigmoid
    • ReLU
      • Default choice for hidden layers
    • Softmax:
      • Multi-class Classification
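The options above can be written out in plain Python; this is a sketch of the standard definitions (the max-subtraction in softmax is a common numerical-stability trick, not part of the mathematical definition).

```python
import math

def sigmoid(z):
    # Squashes to (0, 1); typical for binary classification output.
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    # Squashes to (-1, 1); a zero-centered alternative to sigmoid.
    return math.tanh(z)

def relu(z):
    # max(0, z); the usual default for hidden layers.
    return max(0.0, z)

def softmax(zs):
    # Turns a list of scores into probabilities; multi-class output.
    exps = [math.exp(z - max(zs)) for z in zs]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))                          # 0.5
print(relu(-3.0))                            # 0.0
print(round(sum(softmax([1.0, 2.0, 3.0])), 6))  # probabilities sum to 1
```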

Data Set


  • Training Set: The subset of data used to train the model.
  • Cross Validation Set / Validation Set - The subset of data used to evaluate the model during training; it does not participate in backward propagation.
  • Test Set - The subset of data that is not exposed to the model until training finishes. Used to measure the final performance of the model.
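A common way to produce the three subsets is to shuffle once and slice. The 60/20/20 ratio below is a conventional choice for illustration, not a rule.

```python
import random

# Shuffle, then split a dataset into train / validation / test subsets.
random.seed(0)              # fixed seed so the split is reproducible
data = list(range(100))     # stand-in for 100 labeled examples
random.shuffle(data)

n = len(data)
train = data[: int(0.6 * n)]             # fits the parameters
val = data[int(0.6 * n): int(0.8 * n)]   # evaluated during training only
test = data[int(0.8 * n):]               # held out until training finishes

print(len(train), len(val), len(test))   # 60 20 20
```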

Forward Propagation & Backward Propagation


  • Forward Propagation - Refers to data flowing in the input-to-output direction. Forward propagation is almost just another name for inference: computing the model's output from its input.
  • Backward Propagation - Refers to gradients flowing in the output-to-input direction. Backward propagation is the process of computing gradients of the loss with respect to each model parameter, and using those gradients to update the model weights.
    • By applying the chain rule of calculus, we only need the local derivative of each layer given its input and output, and can derive the loss gradient of each parameter from those, without worrying about the complexity of the entire model.
    • In other words, we delegate the derivative of one complex function to multiple small functions, layer by layer.
    • Chain Rule of Calculus: for any y = f(u) with u = g(x), we have dy/dx = (dy/du) · (du/dx), i.e. (f(g(x)))' = f'(g(x)) · g'(x).
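The layer-by-layer idea can be worked through by hand on a tiny model. This sketch assumes a made-up two-weight network y_hat = sigmoid(w2 * relu(w1 * x)) with squared-error loss; each backward line is one application of the chain rule using only that layer's local derivative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, y = 1.5, 1.0          # one training example (input, label)
w1, w2 = 0.8, -0.5       # illustrative weights, not learned

# Forward propagation: keep intermediates for the backward pass.
a = w1 * x               # layer 1 pre-activation
h = max(0.0, a)          # ReLU
z = w2 * h               # layer 2 pre-activation
y_hat = sigmoid(z)       # model output
loss = (y_hat - y) ** 2

# Backward propagation: chain local derivatives from the loss to each weight.
dL_dyhat = 2 * (y_hat - y)
dyhat_dz = y_hat * (1 - y_hat)      # sigmoid'(z)
dL_dz = dL_dyhat * dyhat_dz         # chain rule step 1
dL_dw2 = dL_dz * h                  # dz/dw2 = h
dL_dh = dL_dz * w2                  # dz/dh = w2
dh_da = 1.0 if a > 0 else 0.0       # ReLU'
dL_dw1 = dL_dh * dh_da * x          # da/dw1 = x

print(round(dL_dw1, 4), round(dL_dw2, 4))
```

Each weight's gradient is a product of the local derivatives along the path from the loss back to that weight, which is exactly the delegation described above.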

Models


Techniques


Libraries


© 2024-present Zane Chen. All Rights Reserved.