Keras version
- Uses a self-defined `Resnet34` based on the Keras applications ResNet, modified to be similar to the MXNet version. The other parameters mostly mimic the MXNet version.
- `MXNet SGD` behaves differently from `tfa SGDW`; see the linked discussion for details. Mathematically, adding an l2 regularizer works the same as `MXNet SGD` `weight_decay` with momentum, as long as `wd_mult` is applied.
- In my test, MXNet `wd_mult` does NOT work if it is only set in `mx.symbol.Variable`; it has to be set through `opt.set_wd_mult`.
- One epoch of warmup training is needed first, possibly because of the initializer.
- Training accuracy differs because the MXNet version calculates accuracy after applying the `arcface` conversion, while mine calculates it before.
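The weight decay point above can be checked numerically. The following is a minimal NumPy sketch, not the actual MXNet or `tfa` code: the function names, the toy quadratic loss, and the hyperparameter values are all made up for illustration. It assumes weight decay in the coupled case is added to the gradient before the momentum update, and that the l2 regularizer penalty is `l2 * sum(w ** 2)`, so `l2 = wd / 2` gives the same gradient term. The decoupled variant shrinks the weights directly, outside the momentum buffer.

```python
import numpy as np

def grad_fn(w, target):
    # Gradient of the toy data loss 0.5 * sum((w - target) ** 2).
    return w - target

def sgd_coupled_wd(w, target, steps, lr, momentum, wd, wd_mult=1.0):
    # Momentum SGD where weight decay is folded into the gradient.
    state = np.zeros_like(w)
    for _ in range(steps):
        grad = grad_fn(w, target) + wd * wd_mult * w
        state = momentum * state - lr * grad
        w = w + state
    return w

def sgd_l2_regularizer(w, target, steps, lr, momentum, l2):
    # Plain momentum SGD on (data loss + l2 * sum(w ** 2)).
    state = np.zeros_like(w)
    for _ in range(steps):
        grad = grad_fn(w, target) + 2.0 * l2 * w  # gradient of the l2 penalty
        state = momentum * state - lr * grad
        w = w + state
    return w

def sgd_decoupled_wd(w, target, steps, lr, momentum, wd):
    # Momentum SGD with decoupled decay: weights shrink outside the momentum buffer.
    state = np.zeros_like(w)
    for _ in range(steps):
        state = momentum * state - lr * grad_fn(w, target)
        w = w + state - wd * w
    return w

w0 = np.array([1.0, -2.0, 0.5])
target = np.array([0.3, 0.1, -0.4])
kwargs = dict(steps=10, lr=0.1, momentum=0.9)
wd = 0.05  # exaggerated so the difference is easy to see

w_coupled = sgd_coupled_wd(w0, target, wd=wd, **kwargs)
w_l2 = sgd_l2_regularizer(w0, target, l2=wd / 2, **kwargs)
w_decoupled = sgd_decoupled_wd(w0, target, wd=wd, **kwargs)

print("coupled == l2 regularizer:", np.allclose(w_coupled, w_l2))
print("coupled == decoupled:     ", np.allclose(w_coupled, w_decoupled))
```

Under these assumptions the first comparison matches and the second does not, which is the behavior described above. A per-parameter `wd_mult` would simply scale `wd` for that parameter, so the matching l2 factor would be `wd * wd_mult / 2`.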
- The original MXNet version has a self-defined ResNet that differs from the Keras built-in version:
  - In the `Resnet50` case, the number of blocks changes from `[3, 4, 6, 3]` to `[3, 4, 14, 3]`.
  - `bias` is removed from `Conv2D` layers.
  - `PReLU` is used instead of `relu`.
  - `strides=1` instead of `strides=2` in the first `Conv2D` layer.
- The original MXNet version trains `Resnet34` on the `CASIA` dataset.
  - The `CASIA` dataset contains `490623` images belonging to `10572` classes. With `batch_size = 512`, that means `959 steps` per epoch.
  - `epochs = [20, 30]` means `--lr-steps '19180,28770'`, i.e. 20 × 959 and 30 × 959.
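The step counts above follow from simple arithmetic. A small sketch, assuming the last partial batch counts as a full step (hence the ceiling); the variable names are only illustrative:

```python
import math

num_images = 490623      # CASIA image count from the post
batch_size = 512
decay_epochs = [20, 30]  # epochs expressed as MXNet --lr-steps

# The last, partially filled batch still counts as one step.
steps_per_epoch = math.ceil(num_images / batch_size)
lr_steps = [epoch * steps_per_epoch for epoch in decay_epochs]

print(steps_per_epoch)                     # 959
print(",".join(str(s) for s in lr_steps))  # 19180,28770
```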
Results
This result only shows that `Keras` is able to reproduce `MXNet` accuracy using a similar strategy and backbone.