
the models folder and predictions folder are empty! #27

Open

hosseinfani opened this issue Apr 9, 2021 · 5 comments

@hosseinfani

Hello there,
Thank you for the package.
I was able to run it successfully, but I am wondering how I can see the prediction results on a test set. Assuming I have trained a model, I now want to load it and see the actual predictions on a test set, as well as the metrics.
However, in the output folder, the models folder and the predictions folder are empty, even though the logfile shows a successful run.

I could dig into the code, but I thought you might already know the reason.

hosseinfani changed the title from "Predictions on Test Set" to "the models folder and predictions folder are empty!" on Apr 9, 2021
@sadaharu-inugami (Contributor) commented Apr 9, 2021

Thank you for your message. The model should be saved to the model.pkl file. The models and predictions directories are remnants of our internal allRank fork, where we saved the model after each iteration and also dumped the dataset predictions for the final model. We will add these files in upcoming allRank releases.
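
In the meantime, here is a minimal sketch of loading that file outside a training run. It assumes model.pkl holds the model's state_dict and that Config, make_model, and attr.asdict live at the module paths shown (they may differ between allRank versions); the run directory and feature count are placeholders:

import torch
from attr import asdict

# assumed module paths; adjust if they differ in your allRank version
from allrank.config import Config
from allrank.models.model import make_model

run_dir = "path/to/finished/run"   # placeholder: output directory of a training run
n_features = 136                   # placeholder: feature count of your LibSVM dataset

# model.pkl is assumed to be a pickled state_dict, which is also how it is loaded later in this thread
config = Config.from_json(f"{run_dir}/used_config.json")
model = make_model(n_features=n_features, **asdict(config.model, recurse=False))
model.load_state_dict(torch.load(f"{run_dir}/model.pkl", map_location="cpu"))
model.eval()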

@hosseinfani (Author)

Thank you for your reply. Yes, I can see the model.pkl.

Also, could you help me with how to use this saved model for prediction on a test set? That is, I want to see the model's actual final predictions.

@hosseinfani (Author) commented Apr 14, 2021

Hello there,
I wrote a piece of code to dump the output predictions for a test set. I made some minor changes to LibSVMDataset in order to keep the query_ids and their order in the test set, which are needed when pairing the ground truth with the predicted order.


# imports (module paths may differ slightly between allRank versions)
from collections import Counter

import numpy as np
import torch
from attr import asdict
from torch.utils.data import Dataset

from allrank.config import Config
from allrank.data.dataset_loading import create_data_loaders, load_libsvm_dataset_role
from allrank.models.model import make_model
from allrank.models.model_utils import load_state_dict_from_file
from allrank.utils.python_utils import all_equal
# __rank_slates and compute_metrics come from allRank's inference and training utilities


class LibSVMDataset(Dataset):
    ...
    def __init__(self, X, y, query_ids, transform=None):
        ...
        X = X.toarray()

        # keep the qids and their order exactly as they were read from the input file
        self.query_ids = Counter(query_ids)
        groups = np.cumsum(list(self.query_ids.values()))

        self.X_by_qid = np.split(X, groups)[:-1]
        self.y_by_qid = np.split(y, groups)[:-1]


def test():
    topn = 10
    test_path = '../'
    dev = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # create dataset and dataloader instances
    test_ds = load_libsvm_dataset_role('test', test_path, topn)
    _, test_dl = create_data_loaders(test_ds, test_ds, num_workers=1, batch_size=test_ds.shape[0])

    n_features = test_ds.shape[-1]
    # vestigial check kept from allRank's main.py (trivially true with a single dataset)
    assert all_equal([n_features]), f"Last dimensions of datasets must match but got {n_features}"

    # load the trained model
    config = Config.from_json(f'{test_path}/used_config.json')
    model = make_model(n_features=n_features, **asdict(config.model, recurse=False))
    model.load_state_dict(load_state_dict_from_file(f'{test_path}/model.pkl', dev))

    # rerank the test slates with the trained model
    x_, y_ = __rank_slates(test_dl, model)

    # path and foldidx are defined elsewhere in my setup
    with open(f'{path}/test{foldidx}.pred.csv', 'w') as f:
        f.write('qid,eid,pred_score,true_sorted_by_pred\n')
        for i, (qid, count) in enumerate(test_dl.dataset.query_ids.items()):
            for j in range(count):
                f.write(f'{qid},{int(x_[i, j, -1])},{topn - j},{int(y_[i, j])}\n')

    # do the prediction again for metrics calculation; I couldn't find a better way to do it reusing y_ from above
    results = compute_metrics(config.metrics, model, test_dl, dev)


@sadaharu-inugami (Contributor) commented Apr 19, 2021

Thank you! The code looks OK. We will be working on a similar script after we release the reproducibility guide.

And I can confirm there is no straightforward way at the moment to calculate the metrics from the config, reusing your reranked x & y.
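
If it helps as a stopgap until then, one rough sketch (not an official allRank API; it assumes allrank.models.metrics.ndcg accepts y_pred and y_true as [n_slates, slate_length] tensors with an ats argument, and that y_ is the tensor returned by __rank_slates in the script above) is to feed y_ back into a rank-only metric with a synthetic, strictly decreasing score, since y_ already stores the true labels in the predicted order:

import torch

# assumed module path and signature: ndcg(y_pred, y_true, ats=[...]) -> per-slate values
from allrank.models.metrics import ndcg

# y_ : [n_slates, slate_length] true labels sorted by predicted score (second output of __rank_slates)
n_slates, slate_length = y_.shape

# a strictly decreasing dummy score per slate, so the order ndcg() infers is exactly the order in y_
dummy_scores = torch.arange(slate_length, 0, -1, dtype=torch.float32).repeat(n_slates, 1)

ndcg_at_10 = ndcg(dummy_scores, y_, ats=[10]).mean()
print(f"NDCG@10 = {ndcg_at_10.item():.4f}")

This only works as a sanity check for order-based metrics such as NDCG; computing the metrics from the config via compute_metrics remains the authoritative path.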

@supercar1

Is there any update on this? I would also like to see the prediction results for the test set.
Thanks in advance :)
