Question about data standardization in data_loader.py #414
Comments
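Because the model is fit on the training set. If you standardize with statistics from the whole dataset, the standardization parameters change and the fit gets worse, because the model is no longer fitting the same target.
Thanks for the answer. The main consideration here is avoiding data leakage: in a real application we only know the training data, not the test data. Even the mean and std of the test data are unavailable, so the normalization statistics are extracted from the training set only.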
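To make the data-leakage point concrete, here is a minimal sketch using only the Python standard library. The toy series, the `train_end` split index (playing the role of `border2s[0]`), and the helper names `fit_standardizer` and `standardize` are all made up for illustration; they are not taken from data_loader.py.

```python
from statistics import fmean, pstdev

def fit_standardizer(values):
    """Return (mean, std) computed from the given values only."""
    return fmean(values), pstdev(values)

def standardize(values, mean, std):
    """Apply (v - mean) / std to every value."""
    return [(v - mean) / std for v in values]

# Hypothetical toy series: the first 6 points are "training", the last 2 are "test".
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 100.0, 120.0]
train_end = 6  # assumption: stands in for border2s[0]

# What the loader does: fit on the training slice, transform everything.
train_mean, train_std = fit_standardizer(series[:train_end])
scaled_train_only = standardize(series, train_mean, train_std)

# The alternative asked about: fit on everything, including future test points.
full_mean, full_std = fit_standardizer(series)
scaled_full = standardize(series, full_mean, full_std)

print("train-only stats:", train_mean, train_std)
print("full-data stats: ", full_mean, full_std)
```

With train-only statistics, the training slice ends up with mean 0 and unit variance, and the test points are scaled using only information that was available at training time. With full-data statistics, the two large test values shift the mean and std, so the training inputs themselves depend on data the model is not supposed to have seen.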
In data_loader.py:
if self.scale:
    train_data = df_data[border1s[0]:border2s[0]]
    self.scaler.fit(train_data.values)
    data = self.scaler.transform(df_data.values)
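Why are the standardization parameters computed only on the training data train_data, and then used to standardize the entire dataset df_data? Shouldn't fit be computed on the whole dataset and then applied to the whole dataset?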