
[🐛BUG] LightGCN performs poorly on the ml-100k dataset #2031

Open · zzzZHANGYIXUAN opened this issue Apr 2, 2024 · 6 comments
zzzZHANGYIXUAN commented Apr 2, 2024

I am using the following yaml file:

# model settings
embedding_size: 64              # (int) The embedding size of users and items.
n_layers: 2                     # (int) The number of layers in LightGCN.
reg_weight: 1e-05               # (float) The L2 regularization weight.

# training settings
stopping_step: 10               # early-stopping patience: stop training if the chosen validation metric does not improve within this many evaluation steps

# evaluation settings
split_ratio: [0.8, 0.1, 0.1]    # train/valid/test split ratio
metrics: ["Recall", "MRR", "NDCG", "Hit", "Precision", "MAP", "GAUC", "ItemCoverage", "AveragePopularity", "GiniIndex", "ShannonEntropy", "TailPercentage"]   # evaluation metrics
topk: [10]                      # evaluate at top-k; with 10, the metrics become e.g. ["Recall@10", "MRR@10", "NDCG@10", "Hit@10", "Precision@10"]
valid_metric: Precision@10      # which metric to use for early stopping
eval_batch_size: 4096           # batch size for evaluation
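
For context, here is a minimal sketch of how a yaml like this is usually passed to RecBole's quick-start entry point; the filename lightgcn.yaml is a hypothetical stand-in for the config above.

# Minimal run sketch, assuming RecBole's quick-start API.
# 'lightgcn.yaml' is a hypothetical filename for the config quoted above.
from recbole.quick_start import run_recbole

run_recbole(
    model='LightGCN',
    dataset='ml-100k',
    config_file_list=['lightgcn.yaml'],
)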

The result I get:
Precision@10: 0.1716

This is far below what I expected.
Is my config file wrong, or does the benchmark (伯乐) use a different dataset or metric formula?
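
For reference, the standard definition of top-K precision (which, to my understanding, is also what RecBole implements) averages per-user precision over all evaluated users:

\mathrm{Precision@}K = \frac{1}{|\mathcal{U}|} \sum_{u \in \mathcal{U}} \frac{|\hat{R}_u^K \cap R_u|}{K}

where \hat{R}_u^K is the top-K recommendation list for user u and R_u is the set of that user's ground-truth items in the test set.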

zzzZHANGYIXUAN added the bug (Something isn't working) label on Apr 2, 2024
@NEUYuYang commented:

Are your LightGCN training and validation sample counts normal? When I ran LightGCN, the training set was far smaller than the validation and test sets.

@zzzZHANGYIXUAN (Author) commented:

> Are your LightGCN training and validation sample counts normal? When I ran LightGCN, the training set was far smaller than the validation and test sets.

Did you set the relevant parameters to split the training, validation, and test sets?

@NEUYuYang commented:
# model config
embedding_size: 32

# dataset config
field_separator: "\t"             # field separator in the dataset files
seq_separator: " "                # separator inside token_seq / float_seq fields
USER_ID_FIELD: user_id            # user id field
ITEM_ID_FIELD: item_id            # item id field
RATING_FIELD: rating              # rating field
TIME_FIELD: timestamp             # timestamp field
NEG_PREFIX: neg_                  # prefix for negative-sample fields
# which columns to load from which file; here, read user_id, item_id, rating,
# and timestamp from ml-1m.inter
load_col:
  inter: [user_id, item_id, rating, timestamp]

# training settings
epochs: 500                       # maximum number of training epochs
train_batch_size: 4096            # training batch size
learner: adam                     # built-in PyTorch optimizer
learning_rate: 0.001              # learning rate
training_neg_sample_num: 1        # number of negative samples
eval_step: 1                      # run evaluation after every epoch
stopping_step: 10                 # early-stopping patience: stop training if the chosen validation metric does not improve within this many evaluation steps
eval_args:
  split: {'RS': [0.8, 0.1, 0.1]}  # shuffle the data, then split by ratio
  group_by: ~                     # whether to group one user's records together
  mode: full
  order: RO
metrics: ["Recall", "NDCG"]       # evaluation metrics
topk: [10]                        # evaluate at top-k; with 10, the metrics become ["Recall@10", "NDCG@10"]
valid_metric: Recall@10           # which metric to use for early stopping
eval_batch_size: 4096             # batch size for evaluation

val_interval:
  rating: "[3,inf)"               # keep only interactions with rating in [3, inf)
unused_col:
  inter: [rating]
user_inter_num_interval: "[10,inf)"
item_inter_num_interval: "[10,inf)"

These are my parameters; I would appreciate any advice.

@NEUYuYang commented:

[screenshot: dataset statistics]
Here is my screenshot: the training set has only 201 interactions, while the validation set has 6033.
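
A sketch (assuming RecBole's Config/Dataset API; attribute names may vary across versions) for printing the split sizes before training, to check whether the split itself is the problem:

# Split-size check sketch; 'lightgcn.yaml' is a hypothetical config filename.
from recbole.config import Config
from recbole.data import create_dataset

config = Config(model='LightGCN', dataset='ml-100k',
                config_file_list=['lightgcn.yaml'])
dataset = create_dataset(config)
# build() applies the eval_args split and returns the three sub-datasets
train_ds, valid_ds, test_ds = dataset.build()
print('train:', train_ds.inter_num,   # inter_num: interaction count (assumption: name may differ by version)
      'valid:', valid_ds.inter_num,
      'test:', test_ds.inter_num)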

@NEUYuYang commented:

Is it the same for you?

@LouHerGetUp commented:

Try increasing stopping_step, or turning early stopping off entirely.
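
A sketch of that suggestion as a config override (assumption: RecBole has no dedicated off switch for early stopping that I know of, so setting the patience at or above the epoch budget effectively disables it):

from recbole.quick_start import run_recbole

run_recbole(
    model='LightGCN',
    dataset='ml-100k',
    config_file_list=['lightgcn.yaml'],   # hypothetical config filename
    config_dict={'stopping_step': 500},   # patience >= epochs, so training never stops early
)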
