
Data Transformation: Standardization vs. Normalization

Model accuracy often improves when data transformations are applied in the first steps of preprocessing. This guide explains the difference between the key feature-scaling methods of standardization and normalization and demonstrates when and how to apply each approach.


Data transformation is one of the fundamental steps in data processing. This article explains the following key aspects of the technique called feature scaling:

  • The difference between standardization and normalization
  • When to use standardization and when to use normalization
  • How to apply feature scaling in Python

What Does Feature Scaling Mean?

In practice, different types of variables are often encountered in the same data set, and a significant issue is that their ranges may differ a lot. Using the original scale may put more weight on the variables with a large range. To deal with this problem, feature rescaling needs to be applied to the independent variables (the features) during data preprocessing. The terms "normalization" and "standardization" are sometimes used interchangeably, but they usually refer to different things.

The goal of applying feature scaling is to bring the features onto roughly the same scale, so that each feature is equally important and easier for most machine-learning algorithms to process.
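
As a concrete illustration, consider a distance-based computation on two features with very different ranges. The numbers below are invented for the example:

# Why differing ranges matter: "age" spans tens of years while
# "income" spans tens of thousands of dollars, so income dominates
# any distance computation on the raw data.
import numpy as np

X = np.array([
    [25, 40000],   # [age in years, income in dollars]
    [60, 42000],
    [30, 90000],
])

# Euclidean distance between the first two rows: the 35-year age gap
# contributes almost nothing next to the 2,000-dollar income gap.
d = np.linalg.norm(X[0] - X[1])
print(d)  # about 2000.3, driven almost entirely by income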

Standardization

The result of standardization (or Z-score normalization) is that the features are rescaled to have a mean of 0 and a standard deviation of 1. Each value x is replaced by x' = (x − μ) / σ, where μ and σ are the mean and standard deviation of that feature.

Rescaling features to zero mean and unit variance is useful for the optimization algorithms, such as gradient descent, that are used within machine-learning algorithms that weight inputs (e.g., regression and neural networks). It also helps algorithms that use distance measurements, for example, k-nearest neighbours (KNN).
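
As a minimal sketch of how this looks in Python, scikit-learn's StandardScaler applies exactly this column-wise transformation; the toy array is the same invented data as above:

# Standardization with scikit-learn: each column is rescaled to
# mean 0 and standard deviation 1.
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[25, 40000],
              [60, 42000],
              [30, 90000]], dtype=float)

scaler = StandardScaler()
X_std = scaler.fit_transform(X)  # (x - mean) / std, per column

print(X_std.mean(axis=0))  # approximately [0, 0]
print(X_std.std(axis=0))   # approximately [1, 1]

In a real pipeline, the scaler is fit on the training set only, and the same fitted scaler is then applied to the test set, so no information leaks from the test data into the transformation.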

Max/Min Normalization

Another common approach is the so-called max/min normalization (min/max scaling). This technique rescales features to a distribution of values between 0 and 1: for every feature, the minimum value is transformed into 0, the maximum value into 1, and every other value into a decimal in between, via x' = (x − min(x)) / (max(x) − min(x)).
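
A minimal sketch of min/max scaling in Python uses scikit-learn's MinMaxScaler, again on the same invented toy data:

# Min/max normalization with scikit-learn: each column is mapped
# into the default feature_range of (0, 1).
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[25, 40000],
              [60, 42000],
              [30, 90000]], dtype=float)

scaler = MinMaxScaler()           # default feature_range=(0, 1)
X_norm = scaler.fit_transform(X)  # (x - min) / (max - min), per column

print(X_norm.min(axis=0))  # [0, 0]
print(X_norm.max(axis=0))  # [1, 1]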
