[🐛BUG] Handling scores on training items when evaluating based on ranking #2038

Open · chanwoo0806 opened this issue Apr 23, 2024 · 0 comments · Labels: bug
chanwoo0806 commented Apr 23, 2024

Here, you described that the scores of training items are set to negative infinity for top-k prediction.
However, dataloader.general_dataloader.FullSortEvalDataLoader.collate_fn returns None as the history index when is_sequential is True (see the code pasted below).
Therefore trainer.Trainer._full_sort_batch_eval cannot use history_index to mask the scores to -inf for sequential models.
Would you please clarify how masking works in that case?

    # dataloader.general_dataloader.FullSortEvalDataLoader
    def collate_fn(self, index):
        index = np.array(index)
        if not self.is_sequential:
            user_df = self.user_df[index]
            uid_list = list(user_df[self.uid_field])

            history_item = self.uid2history_item[uid_list]
            positive_item = self.uid2positive_item[uid_list]

            history_u = torch.cat(
                [
                    torch.full_like(hist_iid, i)
                    for i, hist_iid in enumerate(history_item)
                ]
            )
            history_i = torch.cat(list(history_item))

            positive_u = torch.cat(
                [torch.full_like(pos_iid, i) for i, pos_iid in enumerate(positive_item)]
            )
            positive_i = torch.cat(list(positive_item))

            return user_df, (history_u, history_i), positive_u, positive_i
        else:
            interaction = self._dataset[index]
            transformed_interaction = self.transform(self._dataset, interaction)
            inter_num = len(transformed_interaction)
            positive_u = torch.arange(inter_num)
            positive_i = transformed_interaction[self.iid_field]

            return transformed_interaction, None, positive_u, positive_i
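
For reference, here is how I understand the non-sequential branch: (history_u, history_i) is a coordinate list pairing each batch row with the item ids that user interacted with during training. A minimal standalone sketch (toy tensors, not RecBole data):

    import torch

    # Suppose a batch of two users: user 0 trained on items [3, 5], user 1 on [7].
    history_item = [torch.tensor([3, 5]), torch.tensor([7])]

    # Repeat each batch-row index once per history item of that user.
    history_u = torch.cat(
        [torch.full_like(hist_iid, i) for i, hist_iid in enumerate(history_item)]
    )
    # Flatten the item ids in the same order.
    history_i = torch.cat(list(history_item))

    print(history_u)  # tensor([0, 0, 1])
    print(history_i)  # tensor([3, 5, 7])
    # Coordinate pairs: (0, 3), (0, 5), (1, 7)

This tuple is then unpacked as history_index in _full_sort_batch_eval below.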
    # trainer.Trainer
    def _full_sort_batch_eval(self, batched_data):
        interaction, history_index, positive_u, positive_i = batched_data
        try:
            # Note: interaction without item ids
            scores = self.model.full_sort_predict(interaction.to(self.device))
        except NotImplementedError:
            inter_len = len(interaction)
            new_inter = interaction.to(self.device).repeat_interleave(self.tot_item_num)
            batch_size = len(new_inter)
            new_inter.update(self.item_tensor.repeat(inter_len))
            if batch_size <= self.test_batch_size:
                scores = self.model.predict(new_inter)
            else:
                scores = self._spilt_predict(new_inter, batch_size)

        scores = scores.view(-1, self.tot_item_num)
        scores[:, 0] = -np.inf  # item id 0 is the padding item
        if history_index is not None:  # None for sequential models
            scores[history_index] = -np.inf  # mask items seen in training
        return interaction, scores, positive_u, positive_i
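
For the non-sequential case, the masking itself is plain advanced indexing; a minimal sketch with assumed shapes (toy values, not RecBole internals):

    import torch

    n_users_in_batch, tot_item_num = 2, 10
    scores = torch.randn(n_users_in_batch, tot_item_num)

    # The tuple built by collate_fn above, for the toy batch.
    history_index = (torch.tensor([0, 0, 1]), torch.tensor([3, 5, 7]))

    # Indexing with a tuple of index tensors selects coordinate pairs, so this
    # sets scores[0, 3], scores[0, 5] and scores[1, 7] to -inf.
    scores[history_index] = -float("inf")

My question concerns the sequential branch, where collate_fn returns None instead of such a tuple, so it seems no training items get masked there.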