Question about answer ranking #118

Open
dhansmair opened this issue Jan 31, 2023 · 2 comments

Comments

@dhansmair

Hi there, I see that in line

log_probs_sum = log_probs.sum(1)

you are using a sum to accumulate the loss for the tokens in the answer sequence. How does this behave if the possible answers have varying lengths? Shouldn't the loss be divided by the sequence length to get the average loss per token? Otherwise, won't the ranking be biased towards shorter sequences?

@LiJunnan1992
Contributor

Hi, the sum of log_probs is the log of the product of the token probs, which is the log of the sequence prob.
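Written out, the identity in this reply looks as follows. The notation is an assumption for illustration and does not come from the code base: $a_1, \dots, a_T$ are the tokens of one candidate answer and $x$ is the conditioning input (image and question).

```latex
\sum_{t=1}^{T} \log p(a_t \mid a_{<t}, x)
  \;=\; \log \prod_{t=1}^{T} p(a_t \mid a_{<t}, x)
  \;=\; \log p(a_1, \dots, a_T \mid x)
```

Each term in the sum is non-positive, so the total depends on $T$ as well as on the per-token probabilities.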

@MLAlex1

MLAlex1 commented Jun 28, 2023

I think @dhansmair makes a good point - indeed I also think it will be biased if we do not divide by the length of each answer sequence.
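A minimal sketch of the two scoring rules under discussion, using plain Python rather than the repository's tensors. The candidate names and per-token probabilities are made-up numbers chosen only to show that the summed score and the length-averaged score can rank the same candidates differently; `sequence_scores` is a hypothetical helper, not a function from the code base.

```python
import math

def sequence_scores(token_probs):
    """Return (summed log-prob, length-averaged log-prob) for one answer."""
    log_probs = [math.log(p) for p in token_probs]
    total = sum(log_probs)            # log of the full sequence probability
    return total, total / len(log_probs)

# Hypothetical per-token probabilities for two candidate answers.
candidates = {
    "short": [0.5, 0.5],              # 2 tokens, less confident per token
    "long": [0.7, 0.7, 0.7, 0.7],     # 4 tokens, more confident per token
}

scores = {name: sequence_scores(probs) for name, probs in candidates.items()}

# Rank by the summed log-prob and by the per-token average.
best_by_sum = max(scores, key=lambda name: scores[name][0])
best_by_mean = max(scores, key=lambda name: scores[name][1])

print("best by sum: ", best_by_sum)
print("best by mean:", best_by_mean)
```

With these numbers the summed score picks the two-token answer while the per-token average picks the four-token one. Which criterion is preferable depends on whether the ranking should reflect the full sequence probability or the average per-token confidence.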
