
Can you release detailed configuration? #2

Open
csyanbin opened this issue Feb 13, 2018 · 10 comments
csyanbin commented Feb 13, 2018

Hi Jake,
Prototypical Networks is really nice work.

I ran this code to reproduce the results in the NIPS paper. However, the results differ somewhat from the paper.

NIPS2017 paper:

5way1shot 5way5shot 20way1shot 20way5shot
98.8 99.7 96.0 98.9

Reproduced results:

5way1shot 5way5shot 20way1shot 20way5shot
98.4 99.6 94.9 98.6

I ran this code several times and got similar results.
Could you release your hyper-parameter settings? Or is there a technical trick that may affect the performance?

Here are the commands I used in the 20-way 1-shot setting:

python scripts/train/few_shot/run_train.py --data.shot 1 --data.test_shot 1 --data.test_way 20 --data.cuda --log.exp_dir=results/20way1shot 
python scripts/predict/few_shot/run_eval.py --data.test_shot 1 --data.test_way 20 --model.model_path=results/20way1shot/best_model.t7 

Thanks.

@bertinetto

Hi, I got the same results as @csyanbin with Python 3.5, CUDA 8, and PyTorch 0.3.

@csyanbin

@bertinetto @jakesnell
I also reproduced the experiments in TensorFlow with the same settings and parameters, and still got similar results. I wonder if I am missing some implementation or parameter details?


dnlcrl commented Feb 22, 2018

Hello, I just want to add that I implemented the same algorithm in a slightly different way (here), and I got the same results as @csyanbin too (except for 20-way 1-shot, where I obtained 95.1%).

Edit / Side note: I just read this paper: https://arxiv.org/pdf/1711.04043v3.pdf, and it seems they report different accuracies for ProtoNet too (97.4 | 99.3 | 95.4 | 98.8) (page 8).


csyanbin commented Feb 23, 2018

@dnlcrl I think the results above (97.4 | 99.3 | 95.4 | 98.8) cite the original ICLR version of the prototypical networks paper (https://openreview.net/references/pdf?id=BJ-3bnVmg), which uses a different setting from this repo.
The ICLR version always trains with 1-shot episodes, while this repo trains with the same shot as the evaluation setting.

This is also noted in Appendix A of https://arxiv.org/pdf/1703.05175.pdf, in lines 1 and 2 of Table 4.
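To make the distinction concrete, here is a minimal, hypothetical sketch of episodic sampling (this is not the repo's actual code; `sample_episode` and the toy data are made up for illustration). The only difference between the two regimes discussed above is the shot used when building training episodes:

```python
import random

def sample_episode(class_pool, n_way, n_shot, n_query):
    """Pick n_way classes, then n_shot support and n_query query items per class."""
    classes = random.sample(sorted(class_pool), n_way)
    episode = {}
    for c in classes:
        items = random.sample(class_pool[c], n_shot + n_query)
        episode[c] = {"support": items[:n_shot], "query": items[n_shot:]}
    return episode

# Toy data: 30 classes with 20 examples each.
pool = {c: [f"class{c}_img{i}" for i in range(20)] for c in range(30)}

# ICLR-version style: always 1-shot episodes during training.
ep_iclr = sample_episode(pool, n_way=20, n_shot=1, n_query=5)

# This repo's style (per the discussion above): match the evaluation shot,
# e.g. 5-shot training episodes for the 5-shot benchmark.
ep_repo = sample_episode(pool, n_way=20, n_shot=5, n_query=5)

print(len(ep_iclr), len(next(iter(ep_iclr.values()))["support"]))  # 20 1
print(len(ep_repo), len(next(iter(ep_repo.values()))["support"]))  # 20 5
```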


dnlcrl commented Feb 24, 2018

@csyanbin Oh I got it, thank you, I must have missed it.


yannlif commented Apr 4, 2018

Hi,
@csyanbin I think the reported results come from training on both the training and validation sets.
run_trainval.py runs this full training, reusing the epoch count selected on the validation set during the first training run, with the other hyperparameters unchanged.

The results I get are closer to the reported ones but still different:

5way1shot 5way5shot 20way1shot 20way5shot
98.5 99.6 95.3 98.7


csyanbin commented Apr 4, 2018

Hi,
@yannlif I think the validation set cannot be merged into the training set if comparisons are to be fair. Also, in this paper the authors say: "We follow their procedure by training on the 64 training classes and using the 16 validation classes for monitoring generalization performance only."

Although the performance is slightly better, I think this is not a fair comparison.


yannlif commented Apr 4, 2018

@csyanbin

We follow their procedure by training on the 64 training classes and using the 16 validation classes for monitoring generalization performance only.

This is for the mini-ImageNet experiments.
In the Omniglot part:

We use 1200 characters plus rotations for training (4,800 classes in total) and the remaining classes, including rotations, for test.

This corresponds to the trainval split. The train split has only 1028 unique characters.
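For reference, the class counts quoted above work out as follows (a quick arithmetic check; the total of 1623 Omniglot characters is a property of the dataset, not stated in this thread):

```python
# Quick check of the Omniglot split arithmetic discussed above.
# 1623 = total Omniglot characters; 1200 = trainval characters per the paper.
total_chars = 1623
trainval_chars = 1200
rotations = 4  # 0, 90, 180, 270 degree rotations

train_classes = trainval_chars * rotations
test_classes = (total_chars - trainval_chars) * rotations

print(train_classes)  # 4800 -> matches "4,800 classes in total"
print(test_classes)   # 1692 held-out test classes
```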


csyanbin commented Apr 4, 2018

@yannlif
I got it. I think you are right about Omniglot.
Thanks for this information. I will try again.

@debasmitdas

Hi guys,
Does anybody know how many training epochs and episodes per epoch were used to reproduce the paper's results (e.g., for the miniImageNet dataset)?
