BatchNorm
BatchNorm normalizes each feature across the batch dimension, so it needs to compute the mean and variance over the batch at training time (and keep running estimates for inference).
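A minimal sketch of the per-batch normalization step (NumPy, inference-free version without the learnable scale/shift parameters):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature column across the batch (axis 0).

    x: array of shape [batch, features].
    """
    mean = x.mean(axis=0, keepdims=True)   # per-feature mean over the batch
    var = x.var(axis=0, keepdims=True)     # per-feature variance over the batch
    return (x - mean) / np.sqrt(var + eps)

x = np.random.randn(8, 4)
y = batch_norm(x)
# after normalization, every feature column has ~zero mean and ~unit variance
```

Because the statistics are taken over the batch, each sample's output depends on the other samples in the batch, which is exactly what becomes problematic for variable-length sequences.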
LayerNorm
LayerNorm is mostly used in NLP. Because sentence lengths vary, the batch statistics that BatchNorm relies on are not well defined, so BatchNorm is not suitable. LayerNorm instead normalizes each input along the embedding dimension.
Note that the input has shape [bs, length, embeddings]; LayerNorm normalizes the embedding vector of each word independently.
To be continued…