
RecBole v1.2.0

@BishopLiu BishopLiu released this 04 Nov 11:23
362d31f

RecBole v1.2.0 Release Notes

After a long period of hard work, we have completed the upgrade of RecBole and released a new version: RecBole v1.2.0!

In this release, we took users' feedback and requests into account to make RecBole easier to use. First, we include more benchmark models and datasets to meet users' latest needs. Second, we improve the benchmark framework by adding commonly used data processing methods and efficient training and evaluation APIs, and we provide more support for analyzing and using results. Third, to improve the user experience, we provide more comprehensive project pages and documentation. Based on issues and discussions, we also fix a number of bugs and update the documentation.

In short, RecBole v1.2.0 is more efficient, convenient and flexible than previous versions. Details are given in the following sections:

  • Highlights
  • New Features
  • Bug Fixes
  • Code Refactor
  • Docs

Highlights

The RecBole v1.2.0 release includes a number of new features, bug fixes and code refactoring. A few of the highlights:

  1. We add 7 new models and 2 new datasets.
  2. More flexible data processing. We rebuild the overall data flow on PyTorch to provide a compatible data module, and add more task-oriented data processing methods.
  3. More user-friendly documentation. We update the website and documentation with detailed descriptions, including visualizations of benchmark configurations, more practical examples of customized training strategies, multi-GPU training cases, and detailed running steps. We also add a FAQ page based on existing GitHub issues for RecBole.

New Features

  • Add 7 new models:
    • Context-aware recommendation (3): FiGNN (#1509), KD_DAGFM (#1628), EulerNet (#1744).
    • General recommendation (3): Random (#1690), DiffRec and LDiffRec (#1885).
    • Sequential recommendation (1): FEARec (#1899).
  • Add 2 new datasets: Music4All-Onion (#1668), Amazon-M2 (#1828).
  • Add the pretrain method to ConvNCF (#1651).
  • Support converting results to LaTeX code (#1645).
  • Support different eval dataloaders for valid and test phases (#1666).
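To illustrate what converting results to LaTeX code can look like, here is a minimal, self-contained sketch. It does not use RecBole's own helper; the function name `results_to_latex`, the input layout (model name mapped to metric values), and the numbers are all hypothetical, and it assumes that larger metric values are better.

```python
def results_to_latex(results, metrics, decimals=4):
    """Render {model_name: {metric: value}} as a LaTeX tabular string.

    Hypothetical helper for illustration; not part of RecBole.
    """
    # Best (largest) value per metric, so it can be set in bold.
    best = {m: max(r[m] for r in results.values()) for m in metrics}
    lines = [
        "\\begin{tabular}{l" + "c" * len(metrics) + "}",
        "\\hline",
        "Model & " + " & ".join(m.replace("_", "\\_") for m in metrics) + " \\\\",
        "\\hline",
    ]
    for model, scores in results.items():
        cells = []
        for m in metrics:
            text = f"{scores[m]:.{decimals}f}"
            if scores[m] == best[m]:
                text = "\\textbf{" + text + "}"
            cells.append(text)
        # Escape underscores in model names so LaTeX accepts them.
        lines.append(model.replace("_", "\\_") + " & " + " & ".join(cells) + " \\\\")
    lines += ["\\hline", "\\end{tabular}"]
    return "\n".join(lines)


# Made-up numbers for illustration only, not real benchmark results.
example = {
    "ModelA": {"recall@10": 0.1234, "ndcg@10": 0.0567},
    "ModelB": {"recall@10": 0.1500, "ndcg@10": 0.0500},
}
print(results_to_latex(example, ["recall@10", "ndcg@10"]))
```

The printed string can be pasted into a LaTeX document. For the feature that actually ships with RecBole, see #1645.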

Bug Fixes

  • Model:
    • Fix a bug in DIEN: mask the padding value in aux loss and add softmax to attention values (#1485).
    • Fix a deepcopy bug in DCNV2, xDeepFM, SpectralCF, FOSSIL, HGN, SHAN and SINE (#1488).
    • Fix a bug in EASE: change the data format (#1497).
    • Fix a bug in NeuMF: fix the load_pretrain function (#1502).
    • Fix a bug in LINE: add log function when computing loss (#1507).
    • Fix the field counts for float-like features in abstract_recommender.py (#1603).
    • Fix a bug in GCMC: change the last dense layer to dense_layer_v for item hidden representations (#1635).
    • Fix a bug in KD_DAGFM: use xavier_normal_initialization to initialize embedding (#1641).
    • Fix a bug in KSR: add an extra param kg_embedding_size (#1647).
    • Fix a bug in S3Rec: move item_seq from GPU to CPU for indexing (#1651).
    • Fix a bug in AutoEncoderMixin: move tensors to the correct device (#1749).
    • Fix a bug in DGCF: correct the L2 distance computation (#1845).
  • Dataset:
    • Fix the truncation bug of seq type data (#1481).
    • Fix a bug in interaction.py: transform type from torch.tensor to np.array (#1612).
    • Fix saving and loading for datasets and dataloaders (#1698).
    • Fix reversing kg data (#1829).
    • Fix the if condition in the set_neg_sample_args function (#1863).
  • Trainer:
    • Delete unnecessary import in trainer.py (#1500).
    • Fix the calculate_loss error when using multi-GPU training (#1873).
  • Util:
    • Fix bugs in the file suffix used for dataset downloading (#1501).
    • Fix a bug in alpha parameter for eval sampler (#1504).
  • Evaluator:
    • Fix data.count_users in collector.py (#1526).
  • Config:
    • Fix numpy compatibility issue (#1621).
    • Update configurator.py (#1625).
  • Main:
    • Fix bugs when collecting results from mp.spawn in multi-GPU training (#1875).
  • Typo:
    • Fix typos in dataset_list.json (#1756).
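Several of the model fixes above involve excluding padding positions from a loss, as in the DIEN auxiliary-loss fix (#1485). The sketch below shows the general idea in plain Python. It is not the DIEN code; the function name `masked_bce` and the input layout (padded lists plus true sequence lengths) are assumptions made for illustration.

```python
import math


def masked_bce(probs, labels, lengths):
    """Mean binary cross-entropy over real (non-padding) positions only.

    probs, labels: lists of equal-length padded sequences.
    lengths: number of real items at the start of each sequence.
    Hypothetical helper for illustration; not RecBole code.
    """
    total, count = 0.0, 0
    for seq_probs, seq_labels, n in zip(probs, labels, lengths):
        # Slicing to the true length drops the padded tail from the loss.
        for p, y in zip(seq_probs[:n], seq_labels[:n]):
            p = min(max(p, 1e-8), 1 - 1e-8)  # clamp to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1 - p)
            count += 1
    return total / max(count, 1)


# The third position is padding; its prediction must not affect the loss.
loss = masked_bce([[0.5, 0.5, 0.9]], [[1, 0, 0]], [2])
print(loss)
```

Without the mask, the padded position would add its own term to the average and skew the loss.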

Code Refactor

  • Refactor all autoencoder models: add the class AutoEncoderMixin and move the rating matrix to CUDA only when get_rating_matrix is called (#1491).
  • Refactor BERT4Rec: align with the original paper (#1522, #1639, #1859).

Docs

  • Mask the IP information (#1479).
  • Update docs of train_neg_sample_args parameter (#1513).
  • Add hypertune config docs (#1524).
  • Add model_list and dataset_list (#1525).
  • Add FiGNN to the model_list (#1548).
  • Add numerical_feature to docs (#1560).
  • Replace neg_sampling with train_neg_sample_args in docs (#1569, #1570).
  • Add docs of KD_DAGFM (#1642).
  • Add significance test (#1644).
  • Add the rst file of FiGNN, KD_DAGFM and RecVAE (#1650).
  • Add update for SIGIR 2023 in README.md (#1662).
  • Update requirement.txt (#1870).