How to train the model with unlabeled data? #59
Comments
I think you may want to take a look at data.py and datasets.py, which are the modules that load your dataset into the training model.
@tarvaina So sorry to bother you several years on, but I have this same question. During training, when some of the samples in each batch are unlabeled ( In this line here: Can you clarify how this is working? Why are the unlabeled samples being taken into account? Thank you so much!
Replying in case this is helpful for others: when the |
I want to transfer the Mean Teacher (MT) framework to an NLP task, but I don't understand how to train it with unlabeled data. I understand the idea of the paper, but I'm confused about the implementation.
I notice that the `TwoStreamBatchSampler` divides the dataset into a labeled part and an unlabeled part, but the code above seems to handle both labeled and unlabeled data in the same way. I think only the labeled part of `model_out` should be used to calculate the `class_loss`. Did I get it wrong?
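For anyone with the same question, here is a minimal sketch of one way a single code path can handle a mixed batch while still computing the classification loss on labeled samples only. It assumes unlabeled samples carry a sentinel target, here a hypothetical `NO_LABEL = -1`, and relies on the `ignore_index` argument of PyTorch's cross-entropy. The function names `class_loss` and `consistency_loss` are illustrative and not necessarily what this repository uses.

```python
import torch
import torch.nn.functional as F

NO_LABEL = -1  # assumption: sentinel target that marks an unlabeled sample


def class_loss(logits, targets):
    # cross_entropy skips every row whose target equals ignore_index,
    # so unlabeled rows add nothing to the sum (and receive no gradient from it).
    summed = F.cross_entropy(
        logits, targets, ignore_index=NO_LABEL, reduction="sum"
    )
    # Average over labeled rows only; clamp avoids division by zero
    # when a batch happens to contain no labeled samples.
    num_labeled = (targets != NO_LABEL).sum().clamp(min=1)
    return summed / num_labeled


def consistency_loss(student_logits, teacher_logits):
    # Uses every row, labeled or not: MSE between the two softmax outputs.
    # The teacher side is detached so gradients flow only into the student.
    student_probs = F.softmax(student_logits, dim=1)
    teacher_probs = F.softmax(teacher_logits, dim=1).detach()
    return F.mse_loss(student_probs, teacher_probs)


torch.manual_seed(0)
student_logits = torch.randn(4, 3)  # batch of 4, 3 classes
teacher_logits = torch.randn(4, 3)
targets = torch.tensor([2, 0, NO_LABEL, NO_LABEL])  # last two rows unlabeled

loss_mixed = class_loss(student_logits, targets)
loss_labeled_only = F.cross_entropy(student_logits[:2], targets[:2])
total = loss_mixed + consistency_loss(student_logits, teacher_logits)
```

In this sketch `loss_mixed` and `loss_labeled_only` agree, which is why the whole batch can be passed through the model and the loss in one call: the unlabeled rows only affect the consistency term.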