What's the expected input to the losses and metrics? #58

Open
lneukom opened this issue Apr 14, 2023 · 1 comment

Comments

@lneukom

lneukom commented Apr 14, 2023

E.g. for approxNDCGLoss, what are y_pred and y_true?

  • what is the range of the inputs?
  • does a slate have to be in order? e.g. best ranked first?

From what I gathered:

  • values should be in [0, ...)
  • higher values mean a better rank
  • the order does not matter

Is that correct? And are all losses and metrics following this API?
Thanks!

@mhsyno

mhsyno commented Apr 17, 2023

Hello,

Both y_pred and y_true are of type torch.Tensor and have the same shape: [batch_size, slate_length].

  • y_true holds the relevance labels of the items in each slate, in their original order. The higher the value, the more relevant the item is within its slate. Relevance is typically an integer (e.g. from 0 to 4 in MSLR-WEB30K) or a binary value.
  • y_pred holds the real-valued scores from the model, which are used to produce a new ordering of y_true (sorting by score, descending). The scores are in the same item order as y_true.
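
To make the relationship between the two inputs concrete, here is a minimal sketch of how scores reorder labels. It is not allRank code: plain nested lists stand in for the torch.Tensor inputs so that it runs without dependencies, and reorder_by_scores is a hypothetical helper name.

```python
from typing import List


def reorder_by_scores(y_pred: List[List[float]],
                      y_true: List[List[float]]) -> List[List[float]]:
    """For each slate, return the labels sorted by descending model score.

    Hypothetical helper for illustration only; not part of allRank.
    """
    reordered = []
    for scores, labels in zip(y_pred, y_true):
        if len(scores) != len(labels):
            raise ValueError("y_pred and y_true must have the same shape")
        # Item indices of this slate, highest score first.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        reordered.append([labels[i] for i in order])
    return reordered


# Two slates of three items each, i.e. shape [batch_size=2, slate_length=3].
# Item j of y_pred scores item j of y_true; the slates are NOT pre-sorted.
y_true = [[0, 2, 1], [1, 0, 0]]
y_pred = [[0.1, 0.9, 0.4], [-1.2, 0.3, 0.0]]

ranked_labels = reorder_by_scores(y_pred, y_true)
print(ranked_labels)  # [[2, 1, 0], [0, 0, 1]]
```

The first slate is ranked perfectly (labels come out as 2, 1, 0), while the second puts its only relevant item last. Note that the scores can be any real numbers, including negative ones; only their relative order within a slate is used here.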

As for the question:

> are all losses and metrics following this API?

In short: yes. For all the metrics and losses, y_pred and y_true are mandatory arguments. Some of them also take additional arguments (most with default values specified), e.g. ats for metrics, which specifies how many top items of a slate are taken into account when calculating a metric.
Some of the losses have other mandatory arguments: for the ordinal loss you need to pass the number of ordinal values, and for the pointwise RMSE the number of unique ground-truth values is required.
Note that some arguments of the loss functions can drastically change the form of the function, e.g. weighing_scheme for [lambdaLoss](https://github.com/allegro/allRank/blob/master/allrank/models/losses/lambdaLoss.py).
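
As a rough illustration of what a top-n cutoff does to a metric, the sketch below computes NDCG for a single slate at several cutoffs. It is not allRank's implementation: the helper names dcg_at_k and ndcg_at are hypothetical, the ats parameter is assumed here to be a list of cutoffs, and the gain (2^rel - 1) and log2 discount are common conventions chosen for the example.

```python
import math
from typing import List, Sequence


def dcg_at_k(labels_in_rank_order: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k ranked labels."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(labels_in_rank_order[:k])
    )


def ndcg_at(y_pred: Sequence[float], y_true: Sequence[float],
            ats: Sequence[int]) -> List[float]:
    """NDCG of one slate at each cutoff in ats (assumed: a list of ints)."""
    # Rank the labels by descending score, as the metric would.
    order = sorted(range(len(y_pred)), key=lambda i: y_pred[i], reverse=True)
    ranked = [y_true[i] for i in order]
    ideal = sorted(y_true, reverse=True)  # best possible ordering
    results = []
    for k in ats:
        ideal_dcg = dcg_at_k(ideal, k)
        # A slate with no relevant items in the top k gets 0 by convention here.
        results.append(dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0)
    return results


y_true = [0, 2, 1]

# Scores that rank the items perfectly: NDCG is 1 at every cutoff.
perfect = ndcg_at([0.1, 0.9, 0.4], y_true, ats=[1, 3])

# Scores that put the irrelevant item first: NDCG@1 drops to 0.
poor = ndcg_at([0.9, 0.1, 0.4], y_true, ats=[1, 3])

print(perfect, poor)
```

The same predictions can look very different depending on the cutoff: the second set of scores gets zero credit at cutoff 1 but partial credit at cutoff 3, because the relevant items still appear further down the slate.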

If in doubt, I encourage you to take a look at our real-life configs (which correspond to the experiments from our paper) in the reproducibility guide: https://github.com/allegro/allRank/tree/master/reproducibility/configs

Best,
Mikołaj
