
Normalization layer in ANNdotNET

Bahrudin Hrnjica edited this page Nov 23, 2018 · 7 revisions

Simply put, data normalization is a set of tasks that transform the values of every feature in a data set into a predefined range. Usually this range is [-1,1], [0,1], or some other specific range. Data normalization plays a very important role in ML, since it can dramatically improve the training process and simplify the tuning of network parameters.

There are two main types of data normalization:

  • MinMax normalization - transforms all values into the range [0,1],
  • Gauss normalization, or Z-score normalization, which transforms values so that the average is zero and the standard deviation is 1.
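Both transformations can be written in a couple of lines. The following is a minimal NumPy sketch (the sample values are taken from the Sale feature shown later in this page; this is illustration, not ANNdotNET code):

```python
import numpy as np

# a single feature column from the training set
x = np.array([18., 10., 14., 6., 5., 18., 14., 6., 4., 19.])

# MinMax normalization: maps all values into [0, 1]
min_max = (x - x.min()) / (x.max() - x.min())

# Gauss (Z-score) normalization: zero mean, unit standard deviation
z_score = (x - x.mean()) / x.std()

print(min_max.min(), min_max.max())          # endpoints of the [0,1] range
print(z_score.mean(), z_score.std())         # approximately 0 and 1
```
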

Besides those two types there are plenty of other methods. Usually those two are used when the size of the data set is known; otherwise we should use one of the other methods, like log scaling, dividing every value by some constant, etc. But why does data need to be normalized? This is an essential question in ML, and the simplest answer is: to give all features an equal influence on changing the output label. More about data normalization and scaling can be found at this link.

The Normalization layer is placed between the input layer and the first hidden layer. The Normalization layer contains the same number of neurons as the input layer and produces output with the same dimension as the input layer.

In order to implement the Normalization layer, the following requirements must be met:

  • calculate the average \mu and standard deviation \sigma of each feature in the training data set, as well as the minimum and maximum value of each feature;

  • this must be done prior to neural network model creation, since those values are needed as constants in the normalization layer;

  • within network model creation, the normalization layer should be defined right after the input layer.
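In a framework-neutral form, the layer itself just applies the precomputed constants element-wise to whatever the input layer produces. A minimal NumPy sketch (the variable names and values are illustrative assumptions, not ANNdotNET API):

```python
import numpy as np

def normalization_layer(x, mean, inv_std):
    """Apply z-score scaling with constants computed from the training set.

    The output has the same dimension as the input, mirroring the fact that
    the normalization layer has as many neurons as the input layer."""
    return (x - mean) * inv_std

# per-feature constants, computed once before model creation
mean = np.array([10.0, 2.0])
inv_std = np.array([0.5, 4.0])    # 1 / sigma for each feature

batch = np.array([[12.0, 2.25],
                  [ 8.0, 1.75]])
print(normalization_layer(batch, mean, inv_std))
# [[ 1.  1.]
#  [-1. -1.]]
```
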

Calculation of mean and standard deviation for the training data set

Before network creation, we should prepare the mean and standard deviation parameters, which will be used in the Normalization layer as constants. Fortunately, CNTK has a static method on the MinibatchSource class for this purpose: MinibatchSource.ComputeInputPerDimMeansAndInvStdDevs. The method takes the whole training data set defined in the minibatch source and calculates the parameters.

//calculate mean and std for the minibatchsource
// prepare the training data
var d = new Dictionary<StreamInformation, Tuple<NDArrayView, NDArrayView>>();
using (var mbs = MinibatchSource.TextFormatMinibatchSource(
    trainingDataPath, streamConfig, MinibatchSource.FullDataSweep, false))
{
    d.Add(mbs.StreamInfo("feature"), new Tuple<NDArrayView, NDArrayView>(null, null));
    //compute mean and standard deviation of the population for input variables
    MinibatchSource.ComputeInputPerDimMeansAndInvStdDevs(mbs, d, device);
}
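Conceptually, the method reduces one full sweep over the training data to a per-dimension mean and inverse standard deviation. A rough NumPy equivalent of that computation (a sketch of the idea, not the CNTK implementation):

```python
import numpy as np

def per_dim_means_and_inv_std(data, eps=1e-8):
    """Population mean and inverse standard deviation for each input dimension.

    data has one row per sample and one column per feature dimension;
    eps guards against division by zero for constant features."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)    # population std, matching "of the population"
    return mean, 1.0 / (std + eps)

data = np.array([[1.0, 10.0],
                 [3.0, 30.0],
                 [5.0, 50.0]])
mean, inv_std = per_dim_means_and_inv_std(data)
print(mean)    # per-dimension averages: [ 3. 30.]
```
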

Now that we have the average and std values for each feature, we can create a network with a normalization layer. The ANNdotNET library supports multiple groups of features. For example, the input variables can consist of categorical and numerical features. Normalization should not be applied to categorical features, so the normalization layer should be created only for the numerical groups of features.

The following example shows two groups of features, Item and Sale:

|Item 1 0 0 0 0 0 0 0 0 0 |Sale 18 10 14 6 5 18 14 6 4 19 12 20 19 12 2 6 16 9 13 10 2 5 5 5 6 6 10 9 12 4 |item_cnt 3

As can be seen, Item is a typical categorical variable represented as a one-hot encoded vector, and Sale is a numerical group of features.

The MLConfig file should be defined as:

  • features:|Item 10 0 |Sale 30 0
  • labels:|item_cnt 1 0
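Each group declaration is just a pipe-delimited name followed by its dimension. A small hypothetical parser sketch makes the structure explicit (the meaning of the trailing 0 and any MLConfig details beyond the sample above are assumptions):

```python
def parse_feature_groups(line):
    """Split an MLConfig-style features line into (name, dimension) pairs.

    Hypothetical helper for illustration: each group looks like '|Item 10 0',
    where the first number after the name is the group dimension."""
    groups = []
    for chunk in line.split('|')[1:]:
        parts = chunk.split()
        groups.append((parts[0], int(parts[1])))
    return groups

print(parse_feature_groups("|Item 10 0 |Sale 30 0"))
# [('Item', 10), ('Sale', 30)]
```
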

Based on the above sample, we should define normalization only for the Sale feature group. So, the training parameters may be defined as:

training: |Type:default|BatchSize: 480|Normalization:Sale|Epochs:1000|SaveWhileTraining: 1|RandomizeBatch: 1|ProgressFrequency: 1|FullTrainingSetEval:1

As can be seen, Normalization contains only the feature groups which are numerical. For example, in case there are two numerical feature groups, Sale and Prices, normalization would be defined as: |Normalization:Sale;Prices.

Note: feature groups are separated with a semicolon.
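Reading the Normalization option back out of the training line therefore amounts to two splits: one on the pipe, one on the semicolon. A hypothetical sketch (the training line follows the sample above; the parser itself is not ANNdotNET code):

```python
def normalized_groups(training_line):
    """Return the feature-group names listed in the Normalization option."""
    for option in training_line.split('|'):
        if option.startswith("Normalization:"):
            value = option.split(':', 1)[1].strip()
            # groups are separated with a semicolon
            return [g.strip() for g in value.split(';') if g.strip()]
    return []

line = "training: |Type:default|BatchSize: 480|Normalization:Sale;Prices|Epochs:1000"
print(normalized_groups(line))
# ['Sale', 'Prices']
```
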