build_anti_testset() takes a long time and in the end does not work #451

Open
AbdElrahmanMostafaRifaat1432 opened this issue Dec 5, 2022 · 2 comments

Comments

@AbdElrahmanMostafaRifaat1432

1 - reader = Reader(rating_scale=(1, 5))
2 - data = Dataset.load_from_df(ratings[['userId', 'asin', 'rating']], reader)  # this is my own dataset
3 - svd = SVD(n_factors=30, n_epochs=20, lr_all=0.005, reg_all=0.02)
4 - real_trainset = data.build_full_trainset()
5 - svd.fit(real_trainset)
6 - real_testset = real_trainset.build_anti_testset()  # the code stops here after a long time and finally raises a memory error
7 - predictions = svd.test(real_testset)
8 - top_n = get_top_n(predictions, n=20)

When I run the program, it stops at line 6, at the build_anti_testset() call, and after a long time it raises a memory error.

However, when I replace build_anti_testset() with build_testset(), it works without any problem.

But I need build_anti_testset() rather than build_testset(), because I need predictions for the items that each user has not rated yet.

@AbdElrahmanMostafaRifaat1432
Author

[Image: Capture]
This is my input data. It may help in case my input is not in the standard format and the function cannot handle it.
If that is the case, please tell me the solution.

@mohammadaminvali

Dear @bodymostafa123

Those two functions use very different amounts of memory.

build_testset() converts the trainset back into a list of raw (user, item, rating) tuples. If your trainset has x ratings, the resulting testset also has x entries.

build_anti_testset() uses much more memory. With n users and m items, it returns every (user, item) pair that is not in the trainset, that is (n * m) - x entries. HTH.
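
To make the size difference concrete, and as one possible way around the memory error, here is a minimal sketch. It is not taken from the Surprise docs: anti_testset_size and top_n_for_user are hypothetical helper names, and the user, item, and rating counts in the comment are made-up figures. The idea is to score one user's unrated items at a time and keep only the n best, so the full (n * m) - x list is never held in memory. It assumes algo is a fitted Surprise algorithm (such as the svd above) and trainset is the Trainset it was fitted on.

```python
import heapq


def anti_testset_size(n_users, n_items, n_ratings):
    """Number of (user, item) pairs build_anti_testset() would return."""
    return n_users * n_items - n_ratings


def top_n_for_user(algo, trainset, inner_uid, n=20):
    """Score only this user's unrated items and keep the n best.

    Assumption: `algo` is a fitted Surprise algorithm and `trainset`
    is the Trainset it was fitted on.
    """
    # Inner ids of the items this user has already rated.
    rated = {inner_iid for inner_iid, _ in trainset.ur[inner_uid]}
    raw_uid = trainset.to_raw_uid(inner_uid)

    def candidates():
        # Yield (estimate, raw item id) lazily, one unrated item at a time.
        for inner_iid in trainset.all_items():
            if inner_iid not in rated:
                raw_iid = trainset.to_raw_iid(inner_iid)
                yield algo.predict(raw_uid, raw_iid).est, raw_iid

    # nlargest consumes the generator and keeps at most n candidates.
    best = heapq.nlargest(n, candidates(), key=lambda pair: pair[0])
    return [(raw_iid, est) for est, raw_iid in best]


# Hypothetical figures: 100,000 users, 50,000 items, 1,000,000 ratings.
print(anti_testset_size(100_000, 50_000, 1_000_000))
```

With the names from the original post, this could be called once per user, for example top_n_for_user(svd, real_trainset, u, n=20) for each u in real_trainset.all_users(). It still makes one prediction per unrated pair, so it will not be fast on a large dataset, but memory use stays proportional to a single user's candidates rather than to all pairs at once.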


2 participants