Issue with training baseline model #1

Open
chaitanyamalaviya opened this issue Mar 9, 2021 · 1 comment

@chaitanyamalaviya

Hi,

I followed all the preprocessing steps and installed the required packages.
However, I am getting an error when training a baseline model with this code.
I am using exactly the same command as here. I believe it has something to do with the use of extra_train_data. It would be helpful if you had any suggestions for how to resolve this.

Grad overflow on iteration 0
Using dynamic loss scale of 65536
Traceback (most recent call last):
  File "train.py", line 579, in <module>
    sys.exit(main(sys.argv[1:]))
  File "train.py", line 574, in main
    train(opt, shared, m, optim, train_data, val_data, extra_train, extra_val, unlabeled)
  File "train.py", line 410, in train
    train_perf, extra_train_perf, loss, num_ex = train_epoch(opt, shared, m, optim, train_data, i, train_idx, extra, extra_idx, unlabeled, unlabeled_idx)
  File "train.py", line 216, in train_epoch
    batch_ex_idx, batch_l, source_l, target_l, label, res_map) = data[batch_order[i]]
  File "/net/nfs.corp/alexandria/chaitanyam/consistency/data.py", line 264, in __getitem__
    batch_l, source_l, target_l, label) = self.batches[idx]
IndexError: list index out of range
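
To illustrate the kind of failure I suspect, here is a minimal sketch. It is not the repository's code: `BatchedData`, `make_batch_order`, and the toy batch names are hypothetical, and the assumption is that a batch order sized for one dataset ends up indexing a smaller one (for example the extra training data). Under that assumption, `__getitem__` fails with the same IndexError as in the traceback.

```python
import random


class BatchedData:
    """Hypothetical stand-in for a dataset that stores pre-built batches."""

    def __init__(self, batches):
        self.batches = batches

    def __len__(self):
        return len(self.batches)

    def __getitem__(self, idx):
        # Plain list indexing: raises IndexError if idx >= len(self.batches).
        return self.batches[idx]


def make_batch_order(data, seed=0):
    """Build a shuffled index list sized to the dataset it will index."""
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)
    return order


# Toy data (assumption): the main set has more batches than the extra set.
train_data = BatchedData([f"train_batch_{i}" for i in range(5)])
extra_data = BatchedData([f"extra_batch_{i}" for i in range(3)])

# Mismatch: an order sized for train_data is used to index extra_data.
bad_order = make_batch_order(train_data)
mismatch_error = None
try:
    for i in bad_order:
        extra_data[i]
except IndexError as err:
    mismatch_error = str(err)

# Consistent: the order is built from the dataset it actually indexes.
good_order = make_batch_order(extra_data)
fetched = [extra_data[i] for i in good_order]
```

If this is what is happening, checking that every index in the batch order is smaller than `len(data)` right before the training loop should show which dataset is out of sync.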

Thanks for your help!

@t-li
Member

t-li commented Jun 13, 2021

Hey, sorry for the late response. The issues panel is not actively monitored. For any further questions, please email me directly.

It seems the issue was caused by a later commit that was supposed to fix a similar issue. I have just reverted that change and done a quick run from scratch. It should be good now.
