
Slow loss convergence #29

Closed
arielverbin opened this issue Mar 5, 2024 · 3 comments

Comments

@arielverbin commented Mar 5, 2024

Hello,
I'm attempting to perform fine-tuning with your implementation (I'm using commit e8e2ad1 from April 24, as I don't need the feet keypoints).
Unfortunately, I think the loss might not be converging properly. I tried running the training without fine-tuning (from scratch): in the first 5 epochs it decreased from 0.0168 to 0.0063, but it remained stuck at 0.0063 for the next 25 epochs.

Do you have any suggestions for how to solve it?
I used the same hyperparameters as in your code, but changed the layer decay rate from 0.75 to 1 - 1e-4.
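
For context, this is roughly what I mean by the layer decay rate: it scales the learning rate per transformer block, so earlier layers train more slowly than the head. The sketch below is mine, not the actual code of this repo (the function name and the layer-id heuristic are just illustrative):

```python
import torch

def layerwise_lr_groups(model, base_lr=5e-4, layer_decay=0.75, num_layers=12):
    """Group parameters so earlier blocks get a smaller learning rate:
    lr = base_lr * layer_decay ** (num_layers - layer_id)."""
    groups = {}
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        # Hypothetical layer-id heuristic: patch embedding -> 0,
        # "blocks.k.*" -> k + 1, everything else (neck/head) -> num_layers.
        if "patch_embed" in name:
            layer_id = 0
        elif "blocks." in name:
            layer_id = int(name.split("blocks.")[1].split(".")[0]) + 1
        else:
            layer_id = num_layers
        scale = layer_decay ** (num_layers - layer_id)
        groups.setdefault(layer_id, {"params": [], "lr": base_lr * scale})
        groups[layer_id]["params"].append(param)
    return list(groups.values())

# With layer_decay=0.75 the earliest layers train roughly 30x slower than the
# head (0.75**12 ≈ 0.03); with layer_decay = 1 - 1e-4 all layers train at
# essentially the same learning rate, which is what I wanted for fine-tuning.
# optimizer = torch.optim.AdamW(layerwise_lr_groups(model), weight_decay=0.1)
```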

Thank you for your time and assistance!

@JunkyByte (Owner)

I'm sorry to hear that. Can you try to repeat your experiments with the original implementation I started from and see if there's any difference? https://github.com/jaehyunnn/ViTPose_pytorch

@arielverbin (Author)

Same problem :( The loss doesn't seem to go below 0.006–0.007.

[screenshot: training loss curve]

I used the exact code from the repository, except:

  • In config.yaml, changed resume_from to False.
  • In COCO.py, changed np.float to float (it raised an error, probably due to a NumPy version difference).
  • In COCO.py, I also added a conversion to RGB when image.ndim == 2 (as you did in this repository); both COCO.py changes are sketched after this list.
  • In train.py, changed data_version from "train_custom" / "valid_custom" to "train2017" / "val2017" so it matches the directory names in COCO. Maybe this is the problem? I used the COCO dataset without any preprocessing.
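
For reference, the two COCO.py changes boil down to something like this (a sketch with toy data, not the repo's exact code):

```python
import numpy as np

# np.float was removed in NumPy 1.24+; the builtin float (or an explicit
# dtype such as np.float32) is the drop-in replacement.
keypoints = [[12.0, 34.0, 1.0], [56.0, 78.0, 2.0]]  # toy COCO-style keypoints
joints = np.array(keypoints, dtype=float)

# If a decoded image is grayscale (H, W), stack it into (H, W, 3) so the
# pipeline always receives a 3-channel image.
image = np.zeros((256, 192), dtype=np.uint8)        # toy grayscale image
if image.ndim == 2:
    image = np.stack([image, image, image], axis=-1)
print(image.shape)                                  # (256, 192, 3)
```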

I might just be impatient, but in the log files of the official repo, the loss reached 0.003 on the first epoch.

@JunkyByte (Owner)

I'm sorry, it seems to be a problem with the original project. If it is broken, I will remove the fine-tuning part completely from the current state of the repository. I would suggest you use the original ViTPose implementation or check whether any obvious bug is present in this one. Good luck!
