
How is the input max_length computed in the BAAI/bge-reranker-v2-m3 model? #740

Open
thebarkingdog-yh opened this issue Apr 30, 2024 · 4 comments

Comments

thebarkingdog-yh commented Apr 30, 2024

```python
from FlagEmbedding import FlagReranker  # required import

reranker = FlagReranker('BAAI/bge-reranker-v2-m3', use_fp16=True)
# max_length applies to the tokenized [query, passage] pair; normalize maps scores to [0, 1]
scores = reranker.compute_score(['the question to query', 'the document to query....'], normalize=True, max_length=512)
```

Regarding max_length = 512: what unit is this exactly, tokens or characters? And what happens when the input exceeds it, is it simply truncated?
Does the reranker-v2-m3 model itself have an upper limit on max_length? Can this 512 be adjusted (for example, raised to 1024 or 4096), or is adjusting it not recommended?

staoxiao (Collaborator) commented

max_length is the maximum number of tokens. We will truncate the text and keep only the first 8192 tokens.

The upper bound of max_length in bge-reranker-v2-* is 8192. A larger max_length allows the model to process long texts, but it comes with a higher computational cost.
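
To make the truncation behaviour concrete, here is a minimal sketch using the Hugging Face tokenizer that ships with the model; FlagEmbedding's internal preprocessing may differ in detail, so treat this as an illustration rather than the library's exact code:

```python
from transformers import AutoTokenizer

# Tokenizer bundled with the reranker checkpoint
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-v2-m3')

query = 'the query'
passage = 'a long passage ' * 500  # far more than 512 tokens of text

# The query/passage pair is tokenized together; everything past
# max_length tokens (not characters) is simply cut off
encoded = tokenizer(query, passage, truncation=True, max_length=512)
print(len(encoded['input_ids']))  # <= 512
```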

thebarkingdog-yh (Author) commented May 1, 2024

> max_length is the maximum number of tokens. We will truncate the text and keep only the first 8192 tokens.
>
> The upper bound of max_length in bge-reranker-v2-* is 8192. A larger max_length allows the model to process long texts, but it comes with a higher computational cost.

So, the maximum input to the rerank model is 8192 tokens, and anything beyond that is truncated.

But I found that the max_length parameter of compute_score defaults to 512.
What is the purpose of this parameter?
When I actually input more than 512 tokens, it still works, and the output value still changes.

staoxiao (Collaborator) commented May 2, 2024

A larger max_length allows the model to process long texts, but it comes with a higher computational cost. The small default value of 512 is there to speed up inference. If most of your texts are long, we recommend using a larger max_length.
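
One way to act on this advice is to measure the token lengths of your own corpus and set max_length from that; a sketch along these lines, where the documents and the headroom constant are purely illustrative:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-v2-m3')

# Illustrative corpus; replace with your own documents
documents = ['short doc', 'a much longer document ' * 100]

# Count tokens per document to see how much a 512 limit would cut off
token_lengths = [len(tokenizer(d)['input_ids']) for d in documents]

# Cover the longest document plus some headroom for the query,
# capped at the model's 8192-token upper bound
max_length = min(8192, max(token_lengths) + 64)
print(token_lengths, max_length)
```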

thebarkingdog-yh (Author) commented

> A larger max_length allows the model to process long texts, but it comes with a higher computational cost. The small default value of 512 is there to speed up inference. If most of your texts are long, we recommend using a larger max_length.

So, I should choose an appropriate input length based on the length of the documents I am comparing in order to get better results.
If I use the default value, all documents will always be truncated to 512 tokens for comparison.
If my documents are longer than that, the results might be worse. Am I understanding this correctly?
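
For what it's worth, this is easy to check empirically with the thread's own API; in this sketch the texts are placeholders, and a difference between the two scores would mean the 512-token cut was discarding content the model found relevant:

```python
from FlagEmbedding import FlagReranker

reranker = FlagReranker('BAAI/bge-reranker-v2-m3', use_fp16=True)

pair = ['the query', 'a long document ' * 400]  # well past 512 tokens

# Same pair scored at two truncation lengths
score_512 = reranker.compute_score(pair, normalize=True, max_length=512)
score_2048 = reranker.compute_score(pair, normalize=True, max_length=2048)
print(score_512, score_2048)
```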
