
Potential bug in calculation of gradient updates for BPR #716

Open
mturner-ml opened this issue May 14, 2024 · 0 comments

Comments

mturner-ml commented May 14, 2024

I was familiarizing myself with the code for this library after reading the BPR paper, and I am concerned that I may have found a bug. According to Line 5 in Figure 4 of the paper, the model parameters should receive gradient updates of the form

$$\Theta \leftarrow \Theta + \alpha \left( \frac{e^{-\hat{x}_{uij}}}{1 + e^{-\hat{x}_{uij}}} \cdot \frac{\partial \hat{x}_{uij}}{\partial \Theta} + \lambda_{\Theta} \cdot \Theta \right)$$

However, in this line, `z` is computed as `z = 1.0 / (1.0 + exp(score))`, which looks like the sigmoid function without the derivative being taken (and possibly with a missing negative sign as well?). I compared results against my own implementation of BPR on a problem I am working on (which I cannot share until it is made public in a few months), and on my dataset the version with what I believe are the correct gradient updates performed better. I would appreciate any feedback if there is some nuance I am missing!
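
For concreteness, here is a minimal NumPy sketch of the update I would expect for a matrix-factorization parameterization, one SGD step per sampled `(u, i, j)` triple. The names `user_factors`, `item_factors`, `lr`, and `reg` are just illustrative (they are not the library's actual variables), and I write the regularization term with the usual L2-shrinkage sign rather than copying the paper's pseudocode verbatim:

```python
import numpy as np

def bpr_sgd_step(user_factors, item_factors, u, i, j, lr=0.01, reg=0.01):
    """One BPR SGD step for a sampled (user u, positive item i, negative item j) triple.

    Mirrors Line 5 of Figure 4 in the BPR paper for a matrix-factorization model:
    each parameter block moves by lr * (sigma(-x_uij) * d x_uij / d theta - reg * theta).
    """
    # Snapshot the current factors so all three updates use the same values.
    w_u = user_factors[u].copy()
    h_i = item_factors[i].copy()
    h_j = item_factors[j].copy()

    # x_uij = x_ui - x_uj, the difference of the predicted scores.
    x_uij = w_u @ (h_i - h_j)

    # Multiplicative factor from the paper: e^{-x_uij} / (1 + e^{-x_uij}) = sigma(-x_uij).
    z = np.exp(-x_uij) / (1.0 + np.exp(-x_uij))

    # Partial derivatives of x_uij w.r.t. each parameter block, plus L2 shrinkage.
    user_factors[u] += lr * (z * (h_i - h_j) - reg * w_u)
    item_factors[i] += lr * (z * w_u - reg * h_i)
    item_factors[j] += lr * (z * (-w_u) - reg * h_j)
```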
