Technique details for machine learning

1. Why and when do we need standardization for machine learning?

Normalization typically means rescaling values into the range [0, 1]. Standardization typically means rescaling data to have a mean of 0 and a standard deviation of 1 (unit variance).
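
As a quick illustration, here is a minimal NumPy sketch of both rescalings (the sample array is made up):

```python
import numpy as np

x = np.array([2.0, 50.0, -14.0, 7.0, 99.0])

# Normalization (min-max scaling): rescale into [0, 1].
x_norm = (x - x.min()) / (x.max() - x.min())

# Standardization (z-score): zero mean, unit variance.
x_std = (x - x.mean()) / x.std()

print(x_norm)                      # every value lies in [0, 1]
print(x_std.mean(), x_std.std())   # ~0.0 and 1.0
```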

The short answer is that you generally train your model with some form of gradient descent, and gradient descent depends on the initial conditions of the parameters. Initial conditions that are poorly matched to the data lead to poor convergence. As an example, consider a single neuron σ(ax + b) with a single feature x, where σ is a ReLU. If you initialize a and b from N(0, 1), they will have magnitudes on the order of 1. Now imagine your input data has x around −100. The pre-activation ax + b is then dominated by the ax term, so any neuron whose weight a has the wrong sign outputs zero on essentially every sample; such a neuron receives zero gradient, and gradient descent cannot recover it. You could remedy this with a leaky ReLU, but training would still be slow. Whereas if you standardize your data so it is commensurate with your initial conditions, you get nonzero gradients.
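
To make this concrete, below is a minimal NumPy sketch (not from the referenced articles; the N(0, 1) initialization and the feature centered at −100 follow the example above). It counts how many ReLU neurons never activate on any sample in a batch, and therefore receive zero gradient, for raw versus standardized inputs:

```python
import numpy as np

rng = np.random.default_rng(0)

n_neurons, n_samples = 1000, 256
a = rng.standard_normal(n_neurons)   # weights ~ N(0, 1)
b = rng.standard_normal(n_neurons)   # biases  ~ N(0, 1)

# Hypothetical raw feature centered far from zero, around -100.
x_raw = rng.normal(loc=-100.0, scale=1.0, size=n_samples)
x_std = (x_raw - x_raw.mean()) / x_raw.std()   # standardized copy

def dead_fraction(x):
    # Pre-activations for every (neuron, sample) pair: shape (n_neurons, n_samples).
    z = np.outer(a, x) + b[:, None]
    active = z > 0   # ReLU passes gradient only where the pre-activation is positive
    # A neuron that is inactive on every sample gets zero gradient for the whole batch.
    return (~active.any(axis=1)).mean()

print(f"dead ReLU neurons, raw inputs:          {dead_fraction(x_raw):.1%}")
print(f"dead ReLU neurons, standardized inputs: {dead_fraction(x_std):.1%}")
```

On a typical run, roughly half the neurons are permanently dead on the raw inputs (every neuron whose weight points the wrong way relative to x ≈ −100), compared with only a small fraction on the standardized inputs.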

References:

  1. Why and When Do We Need Standardization for Machine Learning
  2. Normalization vs Standardization — Quantitative analysis

