[🐛BUG] FEARec model gets stuck in an infinite loop during training #2020

yin214 · 2024-03-18T14:02:57Z

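Describe the bug
When training the FEARec model on the sports dataset, training gets stuck in an infinite loop and hangs.

To reproduce
Steps to reproduce the bug:

Extra yaml file you used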


hidden_dropout_prob: 0.5         # (float) The probability of an element to be zeroed.
attn_dropout_prob: 0.5           # (float) The probability of an attention score to be zeroed.


global_ratio: 0.6                  # (float) The ratio of frequency components
dual_domain: False               # (bool) Frequency domain processing or not
std: False                       # (bool) Use the specific time index or not
spatial_ratio: 0.1                 # (float) The ratio of the spatial domain and frequency domain
fredom: True                    # (bool)  Regularization in the frequency domain or not
fredom_type: None                # (str)  The type of loss in different scenarios
topk_factor: 5                   # (int)  To aggregate time delayed sequences with high autocorrelation


epochs: 100  # maximum number of training epochs
train_batch_size: 8192
eval_batch_size: 8192

learning_rate: 0.001
# training_neg_sample_num: 1  # number of negative samples
eval_step: 1  # number of training epochs between evaluations
stopping_step: 10
valid_metric: recall@20

topk: [1,5,10,20]

neg_sampling: ~

eval_args: {'split':{'RS': [0.8,0.1,0.1]}, 'order': 'TO', 'mode': 'full'}

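Your run script
python run_recbole.py --model=FEARec --dataset=sports --config_files=./config_files/fearec.yaml --checkpoint_dir='./saved/FEARec/sports'

Expected behavior
I ran several other datasets and did not hit this problem.

Screenshots
Training hangs in this state and makes no further progress.

The infinite loop appears to be in lines 213 to 223 of the model code: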

            while True:
                sample_index = random.choice(targets_index)
                cur_item_list = interaction[self.ITEM_SEQ][i].to("cpu")
                sample_item_list = dataset.inter_feat[self.ITEM_SEQ][sample_index]
                are_equal = torch.equal(cur_item_list, sample_item_list)
                sample_item_length = dataset.inter_feat[self.ITEM_SEQ_LEN][sample_index]
                if not are_equal or lens == 1:
                    #print("helllo")
                    sem_pos_lengths.append(sample_item_length)
                    sem_pos_seqs.append(sample_item_list)
                    break
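
A minimal sketch of one way to make this sampling step always terminate. It rests on assumptions the report does not state: that `lens` is `len(targets_index)`, and that the hang occurs when more than one sequence shares the target item and every one of them is identical to the current sequence, so `not are_equal` is never true. The helper name `sample_semantic_positive` and the use of plain Python lists in place of tensors are illustrative only, and this is not necessarily the change made upstream.

```python
import random


def sample_semantic_positive(cur_seq, candidate_seqs, rng=random):
    """Return the index of a candidate sequence to use as a positive sample.

    Prefers a candidate that differs from cur_seq; if none exists, falls back
    to any candidate so the call always terminates.
    """
    if not candidate_seqs:
        raise ValueError("no candidate sequences for this target item")
    # Indices of candidates whose item sequence differs from the current one.
    differing = [i for i, seq in enumerate(candidate_seqs) if seq != cur_seq]
    # Hypothetical fallback: accept a duplicate rather than retry forever.
    pool = differing or list(range(len(candidate_seqs)))
    return rng.choice(pool)


# All candidates duplicate the current sequence: a retry-until-different loop
# would never break here (more than one candidate, none different).
duplicates = [[3, 7, 9], [3, 7, 9]]
idx = sample_semantic_positive([3, 7, 9], duplicates)
```

With tensors, the `seq != cur_seq` comparison would presumably become `not torch.equal(seq, cur_seq)`, as in the snippet above. Filtering once instead of retrying trades a pass over the candidates for a guaranteed exit.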

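Link
Add a link to code that reproduces the bug, such as Colab or another online Jupyter platform. (Optional)

Environment (please complete the following information):
I hit this bug on two different machines.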


yin214 · 2024-03-18T14:12:32Z

# Basic Information
USER_ID_FIELD: user_id          # (str) Field name of user ID feature.
ITEM_ID_FIELD: item_id          # (str) Field name of item ID feature.
RATING_FIELD: rating            # (str) Field name of rating feature.
TIME_FIELD: timestamp           # (str) Field name of timestamp feature.
seq_len: ~                      # (dict) Field name of sequence feature: maximum length of each sequence
LABEL_FIELD: label              # (str) Expected field name of the generated labels for point-wise dataLoaders. 
threshold: ~                    # (dict) 0/1 labels will be generated according to the pairs.
NEG_PREFIX: neg_                # (str) Negative sampling prefix for pair-wise dataLoaders.

# Sequential Model Needed
ITEM_LIST_LENGTH_FIELD: item_length   # (str) Field name of the feature representing item sequences' length. 
LIST_SUFFIX: _list              # (str) Suffix of field names which are generated as sequences.
MAX_ITEM_LIST_LENGTH: 50       # (int) Maximum length of each generated sequence.
POSITION_FIELD: position_id     # (str) Field name of the generated position sequence.

user_inter_num_interval: "[10,inf)"
item_inter_num_interval: "[10,inf)"

load_col:                       # (dict) The suffix of atomic files: (list) field names to be loaded.
    inter: [user_id, item_id, rating, timestamp]
    item: [item_id, categories]
selected_features: [categories]
item_attribute: categories

TayTroye · 2024-03-22T07:02:03Z

@yin214 Hello! Thanks for your careful check! We have fixed this bug in #2024

yin214 added the "bug" label (Something isn't working) on Mar 18, 2024

zhengbw0324 assigned TayTroye and BishopLiu on Mar 21, 2024
