Validating partial_fit_users and partial_fit_items vs model.fit #650

Open
Selva163 opened this issue Mar 22, 2023 · 3 comments

Comments

@Selva163

I'm trying to use ALS for real-time recommendations. However, I'm seeing differences in the recommended items between a model trained on the full data and one updated incrementally with new data. Steps followed:

  1. Train the model on the entire dataset (last 30 days of user-item interactions) and save it as a pickle file.
  2. For real-time, load the saved (pickled) model, get the interactions from the last 5 minutes, and use the partial_fit_users and partial_fit_items methods to incrementally train on the new data.
  3. Get the recommendations for the latest active users.

For validation, I then retrained the model from scratch on the combined data (last 30 days + last 5 minutes) and generated recommendations again.

The results from incremental training and retraining are quite different.

How can we get the same recommendations from both models? The idea is to save the few minutes a full retraining takes.
I haven't been able to find any real-life examples of partial_fit_users or partial_fit_items other than the docs and als_test.py.
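
To put a number on "quite different", one option is to compare the top-k lists from the two models user by user. The following is a minimal sketch, not part of implicit: `overlap_at_k`, `mean_overlap`, and the toy recommendation dicts are hypothetical names and data made up for illustration. Note that ALS starts from randomly initialized factors, so some difference can be expected even between two full retrains unless the seed is fixed.

```python
def overlap_at_k(recs_a, recs_b, k=10):
    """Fraction of the top-k items shared by two ranked lists (1.0 = same set)."""
    top_a, top_b = set(recs_a[:k]), set(recs_b[:k])
    if not top_a and not top_b:
        return 1.0
    return len(top_a & top_b) / max(len(top_a), len(top_b))


def mean_overlap(recs_incremental, recs_retrained, k=10):
    """Average overlap@k over the users present in both result dicts."""
    users = recs_incremental.keys() & recs_retrained.keys()
    if not users:
        return 0.0
    total = sum(
        overlap_at_k(recs_incremental[u], recs_retrained[u], k) for u in users
    )
    return total / len(users)


# Hypothetical top-3 item ids per user from the two models.
incremental = {"u1": [10, 11, 12], "u2": [20, 21, 22]}
retrained = {"u1": [10, 12, 13], "u2": [20, 21, 22]}

score = mean_overlap(incremental, retrained, k=3)
print(f"mean overlap@3: {score:.3f}")
```

This compares item sets only and ignores rank order within the top k; a rank-aware measure could be swapped in if ordering matters.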

@Selva163
Author

Selva163 commented Apr 6, 2023

@benfred @Focus could you please provide any suggestions on this?

@benfred
Owner

benfred commented May 30, 2023

> For real-time, load the saved(pickle) model, get the latest interactions in the last 5 mins, use partial_fit_users and partial_fit_items method to incrementally train on the new data.

For partial_fit_items/partial_fit_users: are you including only the interactions from the last 5 minutes? If so, I don't think that will work. The partial fit functions let you recalculate factors for just a subset of users/items, but for each user being updated you need to pass their full history, rather than just the most recently interacted items.
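
A minimal sketch of what "full history" could mean in practice, assuming interactions are available as `(user_id, item_id, weight)` triples. The helpers `merge_full_history` and `to_csr_parts`, the toy data, and `n_items` are all hypothetical and not part of implicit. The sketch only builds the rows in plain Python; the hand-off to `partial_fit_users` is shown as a comment and assumes SciPy and implicit are installed.

```python
from collections import defaultdict


def merge_full_history(history, recent):
    """Return (userids, rows): the users seen in `recent`, each paired with
    their complete {item_id: weight} history (old plus new interactions)."""
    full = defaultdict(dict)
    for user, item, weight in history:
        full[user][item] = full[user].get(item, 0.0) + weight
    touched = set()
    for user, item, weight in recent:
        full[user][item] = full[user].get(item, 0.0) + weight
        touched.add(user)
    userids = sorted(touched)
    return userids, [full[u] for u in userids]


def to_csr_parts(rows):
    """Flatten per-user dicts into CSR arrays (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in rows:
        for item in sorted(row):
            indices.append(item)
            data.append(row[item])
        indptr.append(len(indices))
    return data, indices, indptr


# Toy data: (user_id, item_id, weight) triples.
history = [(0, 0, 1.0), (0, 2, 1.0), (1, 1, 1.0)]  # last 30 days
recent = [(0, 3, 1.0), (2, 1, 1.0)]                # last 5 minutes

userids, rows = merge_full_history(history, recent)
data, indices, indptr = to_csr_parts(rows)

# Assumed hand-off (requires scipy and implicit; n_items is the catalogue size):
#   user_items = scipy.sparse.csr_matrix(
#       (data, indices, indptr), shape=(len(userids), n_items))
#   model.partial_fit_users(userids, user_items)
```

Here user 0's row contains items 0, 2, and 3, i.e. the 30-day history plus the new interaction, rather than item 3 alone. User 1 is left out because nothing changed for them in the recent batch.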

@bahag-rehmanr1

What about new users? Will they be accommodated too?
