Training is so slow, what's wrong? #10

asjiangh · 2019-09-21T09:42:13Z

Hi, I'm working on a 4-class classification task.
I use a pre-trained VGG19 as the base model and add ArcFace after it.
After training for a while, training gets slower and the training-set accuracy doesn't improve at all. I can't figure out why. Please help. (I trained the same model with softmax before, and everything worked well.)
(Trained on a GTX 1080 Ti.)
This is part of my training process:

This is the key part of my code:

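import keras
from keras import regularizers
from keras.applications.vgg19 import VGG19
from keras.layers import Input, BatchNormalization, Dropout, Flatten, Dense
from keras.models import Model
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator
# ArcFace is a custom layer; its import is not shown in this post.

num_classes = 4
input_shape = (400, 300, 3)
batch_size = 32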
base_model = VGG19(include_top=False, weights='imagenet', input_shape=input_shape)
x = base_model.output
y = Input(shape=(num_classes, ))

x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Flatten()(x)
x = Dense(128, kernel_initializer='he_normal',
                kernel_regularizer=regularizers.l2(1e-4))(x)
#x = Dense(128, activation='relu')(x)
#predictions = Dense(num_classes, activation='softmax')(x)
predictions = ArcFace(num_classes,regularizer=keras.regularizers.l2(1e-4))([x, y])

model = Model(inputs=[base_model.input, y], outputs=predictions)

for layer in base_model.layers:
    layer.trainable = True

model.summary()
opt = SGD(lr=0.1, momentum=0.9, decay=5e-4, nesterov=True)

model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

# generate data
class train_Generator_xandy(object):
    def __init__(self):
        datagen = ImageDataGenerator(rescale=1.0 / 255.)
        train_generator = datagen.flow_from_directory(
            train_data_dir,
            target_size=(image_height, image_width),
            batch_size=batch_size,
            class_mode='categorical',
            shuffle=True)

        self.gene = train_generator

    def __iter__(self):
        return self

    def __next__(self):
        X, Y = self.gene.next()
        return [X, Y], Y

class val_Generator_xandy(object):
    def __init__(self):
        validation_datagen = ImageDataGenerator(rescale=1.0 / 255.)

        validation_generator = validation_datagen.flow_from_directory(
            validation_data_dir,
            target_size=(image_height, image_width),
            batch_size=batch_size,
            class_mode='categorical',
            shuffle=False,
        )

        self.gene = validation_generator

    def __iter__(self):
        return self

    def __next__(self):
        X, Y = self.gene.next()
        return [X, Y], Y

class test_Generator_xandy(object):
    def __init__(self):
        test_datagen = ImageDataGenerator(rescale=1.0 / 255.)

        test_generator = test_datagen.flow_from_directory(
            test_data_dir,
            target_size=(image_height, image_width),
            batch_size=batch_size,
            class_mode='categorical',
            shuffle=False,
        )

        self.gene = test_generator
        self.classes = test_generator.classes

    def __iter__(self):
        return self

    def __next__(self):
        X, Y = self.gene.next()
        return [X, Y], Y

train_generator = train_Generator_xandy()
validation_generator = val_Generator_xandy()
test_generator = test_Generator_xandy()

# Fit model
history = model.fit_generator(train_generator,
                    steps_per_epoch=(nb_train_samples // batch_size),
                    epochs=nb_epoch,
                    validation_data=validation_generator,
                    callbacks=[history, draw_pic],
                    validation_steps=(nb_validation_samples // batch_size)
                   )
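
The three generator classes above differ only in the data directory, the shuffle flag, and the extra `classes` attribute on the test one, so they could probably share a single wrapper. A minimal sketch, assuming the wrapped object is any iterator that yields `(X, Y)` batches (a Keras `DirectoryIterator` should qualify). `LabelAsInputGenerator` and `fake_batches` are hypothetical names used for illustration, not part of the original code:

```python
class LabelAsInputGenerator:
    """Turn an iterator of (X, Y) batches into one of ([X, Y], Y) batches.

    ArcFace-style layers take the labels as a second model input,
    so each batch has to carry Y both as an input and as the target.
    """

    def __init__(self, batches):
        self.batches = iter(batches)

    def __iter__(self):
        return self

    def __next__(self):
        X, Y = next(self.batches)
        return [X, Y], Y


def fake_batches():
    # Stand-in for datagen.flow_from_directory(...): endless (X, Y) batches.
    while True:
        yield [[0.1, 0.2], [0.3, 0.4]], [[1, 0], [0, 1]]


wrapped = LabelAsInputGenerator(fake_batches())
inputs, targets = next(wrapped)
print(len(inputs), targets)
```

With real data, the same wrapper could be built three times, once per `flow_from_directory` call, instead of maintaining three near-identical classes.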

asjiangh · 2019-09-21T15:37:53Z

Hello!
Following the discussion at https://www.reddit.com/r/deeplearning/comments/cg1kev/help_needed_arcface_in_keras/
I changed only the parameter s from 30 to 10, and training now behaves quite strangely.
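
For context on what s does: in the ArcFace formulation the target-class logit is s * cos(theta + m) and every other logit is s * cos(theta), so s multiplies all logits before the softmax. The sketch below is a plain-Python illustration of that formula, not the layer used in this thread. `arcface_logits`, `softmax`, and the cosine values are made up for the example:

```python
import math


def arcface_logits(cosines, target, s=30.0, m=0.5):
    """Scale cosine similarities by s, adding the angular margin m to the target."""
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # keep acos in its valid domain
        if j == target:
            logits.append(s * math.cos(math.acos(c) + m))
        else:
            logits.append(s * c)
    return logits


def softmax(logits):
    mx = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - mx) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical cosines for a 4-class problem early in training; class 0 is the target.
cosines = [0.6, 0.3, 0.2, 0.1]
for s in (30.0, 10.0):
    probs = softmax(arcface_logits(cosines, target=0, s=s))
    print(f"s={s:>4}: p(target)={probs[0]:.4f}, loss={-math.log(probs[0]):.4f}")
```

Because s scales every logit, a smaller s gives a softer softmax distribution and a different loss scale for the same features, which is one reason training curves can look very different after changing only s.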

My code (I only changed this part):

img_input = Input(shape=input_shape)
y = Input(shape=(num_classes,))

base_model = VGG19(include_top=False, weights='imagenet', input_shape=input_shape, input_tensor=img_input)
x = base_model.output

x = BatchNormalization()(x)
x = Dropout(0.4)(x)
x = Flatten()(x)
x = Dense(512, use_bias=False)(x)
x = BatchNormalization()(x)

predictions = ArcFace(num_classes, s=10.0, m=0.5)([x, y])


model = Model(inputs=[img_input, y], outputs=predictions)

model.summary()
opt = Adam(lr=1e-3, beta_1=0.9, beta_2=0.999, decay=5e-4)

I'm confused. Please help me, thank you.
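
One thing worth checking in the optimizer line is the `decay` argument. Assuming the legacy Keras 2.x optimizers, where `decay` applies time-based decay on every batch update as lr / (1 + decay * iterations), the effective learning rate can be sketched like this. `decayed_lr` is a hypothetical helper, not a Keras function:

```python
def decayed_lr(base_lr, decay, iterations):
    """Effective learning rate under time-based decay (assumed Keras 2.x semantics)."""
    return base_lr / (1.0 + decay * iterations)


# Values from the post: Adam(lr=1e-3, decay=5e-4).
for step in (0, 2000, 20000, 200000):
    print(f"iteration {step:>6}: lr = {decayed_lr(1e-3, 5e-4, step):.2e}")
```

Under that assumption, decay=5e-4 halves the learning rate after 2000 batch updates, so it may be worth printing the effective rate before blaming the loss function.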

phambao · 2019-10-21T07:42:59Z

Have you tried different layers after the base model?

x = base_model.output
x1 = GlobalAveragePooling2D(name='gap')(x)
x2 = GlobalMaxPool2D()(x)
x = Concatenate()([x1, x2])
...

It works well for me.
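
A rough way to see why pooling may help here: VGG19 without its top has five 2x2, stride-2 max-pooling layers and 512 output channels, so a (400, 300, 3) input leaves a fairly large feature map for Flatten. The sketch below only does the arithmetic. `vgg_spatial` is a hypothetical helper, and the 512-unit bias-free Dense layer is taken from the code posted earlier in the thread:

```python
def vgg_spatial(size, n_pools=5):
    """Spatial size after n_pools 2x2, stride-2 max-pooling layers (floor division)."""
    for _ in range(n_pools):
        size //= 2
    return size


height, width, channels = 400, 300, 512  # input size from the first post, VGG19 output channels
h, w = vgg_spatial(height), vgg_spatial(width)

flatten_features = h * w * channels  # Flatten keeps every spatial position
pooled_features = 2 * channels       # global average pool + global max pool, concatenated

dense_units = 512                    # Dense(512, use_bias=False) from the thread
print("feature map:", (h, w, channels))
print("Dense weights after Flatten:", flatten_features * dense_units)
print("Dense weights after GAP+GMP:", pooled_features * dense_units)
```

The pooled head feeds far fewer features into the Dense layer, which shrinks that layer's weight matrix considerably and should make each training step cheaper.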

coolEphemeroptera · 2020-10-16T07:04:46Z

I think it's caused by BatchNorm.
