
Triplet loss training #118

Open
SaadSallam7 opened this issue Jul 10, 2023 · 3 comments
@SaadSallam7

I was trying to train FaceNet on Kaggle using a TPU, but I ran into a problem. I noticed that you have trained with TPUs before and got good results, so could you help me, please? I used the batch-hard strategy with the code provided here (I compared it with your implementation and they gave the same results, so the implementation is not the problem). I'm training on the VGGFace2 dataset, taking 32 images per person with a batch size of 1024, so each batch contains 32 different persons with 32 images each. The problem is that there is no improvement on the test set: the accuracy and threshold stay constant at 0.5 and 0, even after 10 epochs.
269/269 [==============================] - ETA: 0s - loss: 1.0424

lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.000000
Improved = 0.500000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_1_0.500000.h5
Epoch 2/50
269/269 [==============================] - ETA: 0s - loss: 1.0030

lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000
Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_2_0.500000.h5
269/269 [==============================] - 191s 712ms/step - loss: 1.0030
Epoch 3/50
213/269 [======================>.......] - ETA: 5s - loss: 1.0015
lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000
Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_3_0.500000.h5
269/269 [==============================] - 190s 710ms/step - loss: 1.0015
Epoch 4/50
269/269 [==============================] - ETA: 0s - loss: 1.0012

lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000
Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_4_0.500000.h5
269/269 [==============================] - 191s 712ms/step - loss: 1.0012
Epoch 5/50
269/269 [==============================] - ETA: 0s - loss: 1.0011

lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000
Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_5_0.500000.h5
269/269 [==============================] - 192s 715ms/step - loss: 1.0011
Epoch 6/50
269/269 [==============================] - ETA: 0s - loss: 1.0011

lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000
Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_6_0.500000.h5
269/269 [==============================] - 193s 718ms/step - loss: 1.0011
Epoch 7/50
269/269 [==============================] - ETA: 0s - loss: 1.0009

lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000
Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_7_0.500000.h5
269/269 [==============================] - 193s 718ms/step - loss: 1.0009
Epoch 8/50
269/269 [==============================] - ETA: 0s - loss: 1.0008

lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000
Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_8_0.500000.h5
269/269 [==============================] - 192s 717ms/step - loss: 1.0008
Epoch 9/50
269/269 [==============================] - ETA: 0s - loss: 1.0008

lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000
Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_9_0.500000.h5
269/269 [==============================] - 193s 718ms/step - loss: 1.0008
Epoch 10/50
269/269 [==============================] - ETA: 0s - loss: 1.0008

lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000
Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_10_0.500000.h5

This is the notebook if you can take a look. Thanks in advance.
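For reference, the batch-hard strategy described above can be sketched as follows. This is a minimal NumPy sketch, not the repository's TensorFlow implementation, and `margin=0.35` is an illustrative value:

```python
import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=0.35):
    """Batch-hard triplet loss: for each anchor, take the hardest
    (farthest) positive and the hardest (closest) negative in the batch."""
    # Pairwise squared Euclidean distances, shape (B, B)
    dots = embeddings @ embeddings.T
    sq_norms = np.diag(dots)
    dists = np.maximum(sq_norms[:, None] - 2 * dots + sq_norms[None, :], 0.0)

    same = labels[:, None] == labels[None, :]            # positive mask (incl. self)
    hardest_pos = np.where(same, dists, -np.inf).max(axis=1)
    hardest_neg = np.where(~same, dists, np.inf).min(axis=1)
    return np.maximum(hardest_pos - hardest_neg + margin, 0.0).mean()
```

Note that if the model collapses (all embeddings identical), every distance is 0 and this loss equals exactly the margin, so a loss curve that flattens at a constant can indicate collapsed embeddings rather than learning.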

@leondgarse (Owner)

I cannot see your notebook; it says "No saved version". Generally, triplet loss is better used after some softmax or ArcFace training, since in the early stage of training the model cannot mine good positive / negative pairs. You may refer to related issues like MobileFacenet SE Train from scratch #9 or the result ResNet101V2 using nadam and finetuning with triplet.
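As a rough illustration of the suggested ArcFace-style pretraining, the core of an additive angular margin can be sketched in NumPy as below. This is not the repository's loss code, and `margin=0.5` / `scale=64.0` are common defaults, not values confirmed from this repo:

```python
import numpy as np

def arcface_logits(embeddings, weights, labels, margin=0.5, scale=64.0):
    """Add an angular margin to the target-class logit:
    cos(theta) becomes cos(theta + margin) for the true class only."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = np.clip(emb @ w, -1.0, 1.0)          # cosine to each class center
    theta = np.arccos(cos)
    add = np.zeros_like(cos)
    add[np.arange(len(labels)), labels] = margin
    return scale * np.cos(theta + add)
```

These logits would then go through an ordinary softmax cross-entropy, which penalizes the true class harder than plain softmax and tends to produce well-separated embeddings that make later triplet mining meaningful.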

@SaadSallam7 (Author)

I'm sorry, but you can open it now. OK, I will train it with ArcFace and then triplet loss, but to be honest, I don't think that is the cause of the problem: an accuracy of exactly 50% indicates that the model isn't really learning, it always predicts true or always false! One last question, please: how do you initialize the dataset for online mining? In my case, when I read the dataset, I read it sorted, so the first 32 examples belong to one class, the next 32 to another class, and so on; the batches are therefore fixed while fitting the model, but I think in the original paper they sampled batches randomly.
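One way to get randomly re-sampled, class-balanced batches rather than a fixed sorted order could be sketched like this. This is a hypothetical helper for illustration, not the repository's data.py code:

```python
import numpy as np

def balanced_batches(labels, classes_per_batch=32, images_per_class=32, rng=None):
    """Yield index batches with a fixed number of images per class,
    re-sampling the class order and the images randomly on each call,
    instead of walking the sorted dataset in a fixed order."""
    rng = rng or np.random.default_rng()
    by_class = {c: np.flatnonzero(labels == c) for c in np.unique(labels)}
    classes = rng.permutation(list(by_class))
    for start in range(0, len(classes) - classes_per_batch + 1, classes_per_batch):
        batch = []
        for c in classes[start:start + classes_per_batch]:
            batch.extend(rng.choice(by_class[c], images_per_class, replace=True))
        yield np.array(batch)
```

With fixed sorted batches, every epoch mines triplets from the same class combinations; re-sampling exposes each class to different negatives over time.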

@leondgarse (Owner)

  • I just ran some basic tests in Colab (Keras_insightface_CASIA.ipynb, the last Test part), using only 4 classes for training. Though the result is not good, at least the loss is dropping and the lfw accuracy is just better than 0.5.
  • For offline mining, the kernel function in the dataset is data.py#L445-L446, which takes image_per_class images from some randomly picked classes. It just makes sure each class has some positive samples. Technically, a regular dataset that randomly picks images without this strategy also works: if we picked classes [0, 1, 1, 2, 2, 2], class 0 would just use itself as its positive.
  • I think you are using the [0, 255] value range for model training and evaluating, which may not be good. Another tiny issue is in eval_callback.__eval_func__: you don't need to call normalize again, as the embeddings are already normalized.
  • At the very least, the loss should be dropping and the lfw threshold value should not be 0. You may check the trained model, e.g. run it manually on some images and compare their similarity.
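The last two points can be checked with a small sanity script along these lines, assuming the backbone was trained on inputs scaled to [-1, 1]; the function names here are hypothetical:

```python
import numpy as np

def preprocess(images):
    """Rescale uint8 images from [0, 255] to [-1, 1] before inference.
    Feeding raw [0, 255] pixels to a model trained on [-1, 1] inputs is a
    common cause of degenerate, near-identical embeddings."""
    return images.astype("float32") / 127.5 - 1.0

def cosine_similarity(a, b):
    """Cosine similarity of two embedding vectors (l2-normalize exactly once)."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(a @ b)
```

Running the model on a few same-person and different-person pairs and comparing `cosine_similarity` of their embeddings should show a clear gap; if every pair scores near the same value, the embeddings have collapsed, which would explain the constant 0.5 accuracy and 0 threshold.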
