
PReLU layers have a large number of trainable parameters #2

Open
joseph-sch opened this issue Dec 3, 2019 · 0 comments
In your current implementation, each PReLU layer has as many trainable parameters as there are elements in its input (H×W×C), whereas other implementations (deepinsight's insightface, or TMaysGGS's MobileFaceNet-Keras) have only one trainable parameter per channel.

Indeed, in Keras, the default value of the shared_axes parameter of PReLU ("the axes along which to share learnable parameters for the activation function") is None, so each input element gets its own slope parameter. You have to pass shared_axes=[1, 2] in the calls to PReLU in order to share the parameter across the spatial axes, which keeps the number of trainable parameters reasonable (one per channel) and matches the other implementations; see the sketch below.

Reference: https://keras.io/layers/advanced-activations/
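For illustration, here is a minimal sketch of the difference; the 56×56×64 input shape is hypothetical, chosen only to make the parameter counts concrete:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical channels-last feature map: H=56, W=56, C=64
inp = layers.Input(shape=(56, 56, 64))

# Default (shared_axes=None): one alpha per input element
# -> 56 * 56 * 64 = 200,704 trainable parameters
default_out = layers.PReLU()(inp)

# Sharing across the spatial axes (1 = H, 2 = W): one alpha per channel
# -> 64 trainable parameters
shared_out = layers.PReLU(shared_axes=[1, 2])(inp)

tf.keras.Model(inp, default_out).summary()  # reports 200,704 trainable params
tf.keras.Model(inp, shared_out).summary()   # reports 64 trainable params
```

With channels-last inputs, axes 1 and 2 are height and width, so sharing over them leaves a single learned slope per channel, which is the behavior of the insightface-style implementations.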
