[BUG] sar_movielens.ipynb - top_k = model.recommend_k_items(test, top_k=TOP_K, remove_seen=True) error #2015

Closed
lordaouy opened this issue Oct 9, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@lordaouy

lordaouy commented Oct 9, 2023

Description

I got an error when running the sar_movielens.ipynb notebook on an Azure Machine Learning compute instance.

2023-10-09 20:08:29,498 INFO     Calculating recommendation scores
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 2
      1 with Timer() as test_time:
----> 2     top_k = model.recommend_k_items(test, top_k=TOP_K, remove_seen=True)
      4 print("Took {} seconds for prediction.".format(test_time.interval))

File /anaconda/envs/msftrecsys2/lib/python3.9/site-packages/recommenders/models/sar/sar_singlenode.py:533, in SARSingleNode.recommend_k_items(self, test, top_k, sort_top_k, remove_seen)
    520 def recommend_k_items(self, test, top_k=10, sort_top_k=True, remove_seen=False):
    521     """Recommend top K items for all users which are in the test set
    522 
    523     Args:
   (...)
    530         pandas.DataFrame: top k recommendation items for each user
    531     """
--> 533     test_scores = self.score(test, remove_seen=remove_seen)
    535     top_items, top_scores = get_top_k_scored_items(
    536         scores=test_scores, top_k=top_k, sort_top_k=sort_top_k
    537     )
    539     df = pd.DataFrame(
    540         {
    541             self.col_user: np.repeat(
   (...)
    546         }
    547     )

File /anaconda/envs/msftrecsys2/lib/python3.9/site-packages/recommenders/models/sar/sar_singlenode.py:346, in SARSingleNode.score(self, test, remove_seen)
    344 # calculate raw scores with a matrix multiplication
    345 logger.info("Calculating recommendation scores")
--> 346 test_scores = self.user_affinity[user_ids, :].dot(self.item_similarity)
    348 # ensure we're working with a dense ndarray
    349 if isinstance(test_scores, sparse.spmatrix):

File /anaconda/envs/msftrecsys2/lib/python3.9/site-packages/scipy/sparse/_base.py:411, in _spbase.dot(self, other)
    409     return self * other
    410 else:
--> 411     return self @ other

File /anaconda/envs/msftrecsys2/lib/python3.9/site-packages/scipy/sparse/_base.py:622, in _spbase.__matmul__(self, other)
    620 def __matmul__(self, other):
    621     if isscalarlike(other):
--> 622         raise ValueError("Scalar operands are not allowed, "
    623                          "use '*' instead")
    624     return self._mul_dispatch(other)

ValueError: Scalar operands are not allowed, use '*' instead

In which platform does it happen?

It happens on an Azure ML compute instance.

How do we replicate the issue?

This is the setup script I use in Azure ML:

# 1. Install gcc if it is not installed already. On Ubuntu, this can be done with the command
# sudo apt install gcc

# 2. Create and activate a new conda environment
conda create -n msftrecsys2 python=3.9.16
conda activate msftrecsys2

# 3. Install the core recommenders package. It can run all the CPU notebooks.
pip install recommenders
pip install recommenders[gpu]
# 4. create a Jupyter kernel
python -m ipykernel install --user --name msftrecsys2 --display-name msftrecsys2

# 5. Clone this repo within VSCode or using command line:
git clone https://github.com/recommenders-team/recommenders.git

This is my compute instance setting: [image: compute instance settings]

Expected behavior (i.e. solution)

The cell below should run without throwing an error:

with Timer() as test_time:
    top_k = model.recommend_k_items(test, top_k=TOP_K, remove_seen=True)

print("Took {} seconds for prediction.".format(test_time.interval))

Other Comments

@lordaouy added the bug label on Oct 9, 2023
@Petkomat

Petkomat commented Oct 11, 2023

I faced the same problem. I think it's SciPy-related: the return np.array(result) lines at the end of the similarity functions in recommenders/utils/python_utils.py should be changed to return result.toarray().

For example, the new version of jaccard should read:

def jaccard(cooccurrence):
    """<docstring>"""

    diag_rows, diag_cols = _get_row_and_column_matrix(cooccurrence.diagonal())

    with np.errstate(invalid="ignore", divide="ignore"):
        result = cooccurrence / (diag_rows + diag_cols - cooccurrence)

    # return np.array(result) # not this
    return result.toarray()
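
To illustrate why that one-line change matters, here is a minimal sketch. It assumes the similarity result is a SciPy sparse matrix, which is what the suggested fix implies; the toy matrices and variable names are made up for illustration and are not taken from the notebook.

```python
import numpy as np
from scipy import sparse

# Toy stand-ins (hypothetical data, not from sar_movielens.ipynb).
user_affinity = sparse.csr_matrix(np.array([[1.0, 0.0], [0.0, 2.0]]))
similarity = sparse.csr_matrix(np.array([[1.0, 0.5], [0.5, 1.0]]))

# np.array() does not densify a SciPy sparse matrix. It wraps the whole
# object in a 0-dimensional array of dtype object.
wrapped = np.array(similarity)
print(wrapped.shape, wrapped.dtype)  # () object

# .toarray() returns a regular 2-D ndarray.
dense = similarity.toarray()
print(dense.shape)  # (2, 2)

# Multiplying by the 0-d wrapper is treated as a scalar operand by "@".
try:
    user_affinity @ wrapped
    matmul_error = None
except ValueError as exc:
    matmul_error = str(exc)
print(matmul_error)

# Multiplying by the dense array works as expected.
scores = user_affinity @ dense
print(scores)
```

If the similarity function returns the 0-d wrapper, the later matrix multiplication in score() would see a scalar-like operand, which matches the ValueError in the traceback above.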

@moflotas

moflotas commented Dec 4, 2023

@lordaouy I am not sure, but it might be a problem with scipy itself.
I solved it by downgrading from the latest version to scipy==1.10.1, and it worked perfectly.
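
A quick way to check whether an environment is running something newer than the version that worked here is sketched below. It only compares version numbers and takes 1.10.1 as the reference purely because that is the version reported to work in this thread; the helper names are hypothetical.

```python
import re
from importlib import metadata

KNOWN_GOOD = "1.10.1"  # the version reported to work in this thread


def version_tuple(version):
    """Turn '1.11.3' (or '1.11.0rc1') into a tuple of ints such as (1, 11, 3)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def is_newer_than_known_good(installed, known_good=KNOWN_GOOD):
    """Return True if the installed version is strictly newer than known_good."""
    return version_tuple(installed) > version_tuple(known_good)


try:
    installed = metadata.version("scipy")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("scipy is not installed in this environment")
elif is_newer_than_known_good(installed):
    print(f"scipy {installed} is newer than {KNOWN_GOOD}; the error may occur")
else:
    print(f"scipy {installed} is at or below {KNOWN_GOOD}")
```

If the check reports a newer version, pinning with pip install scipy==1.10.1 is the workaround described above.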

@SimonYansenZhao
Collaborator

SimonYansenZhao commented Feb 22, 2024

See scipy/scipy#18796 and #1954

@SimonYansenZhao
Collaborator

Resolved in PR #2083

4 participants