Semantic arithmetic not supported #145

Open
SnowyYANG opened this issue Jul 14, 2021 · 7 comments
@SnowyYANG

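I tried the Wikipedia vectors (sgns.wiki.word).
国王 (king) - 男人 (man) + 女人 (woman) != 王后 (queen); the resulting vector is far from the vector for 王后.
梨树 (pear tree) - 树 (tree) + 花 (flower) != 梨花 (pear blossom)

I also compared the Manhattan distances from 梨花 (pear blossom), 茶花 (camellia), and 水花 (water splash) to 花 (flower).
梨花 is relatively far from 花, at 85+.
茶花 and 水花 are at about the same distance from 花: 77.58 and 79.27.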
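
One thing that may be worth checking with Manhattan distance: it depends on vector length as well as direction, so two words can land far apart simply because their vectors have different norms. Below is a minimal C# sketch, assuming the vectors are plain double[] arrays. The class and method names are illustrative and do not come from the project.

```csharp
using System;
using System.Linq;

public static class DistanceSketch
{
    // L1 (Manhattan) distance: sum of absolute coordinate differences.
    public static double Manhattan(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
            sum += Math.Abs(a[i] - b[i]);
        return sum;
    }

    // Scale a vector to unit Euclidean length; a zero vector is returned unchanged.
    public static double[] Normalize(double[] v)
    {
        double norm = Math.Sqrt(v.Sum(x => x * x));
        return norm == 0 ? (double[])v.Clone() : v.Select(x => x / norm).ToArray();
    }
}
```

Two vectors that point in the same direction but differ in length have a nonzero Manhattan distance before normalization and a zero distance after it, so normalizing first (or using cosine similarity) makes the comparison about direction only.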

@shenshen-hungry
Collaborator

Did you use cosine distance to compute the word analogy? For details, see the word analogy evaluation code in this project.
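
For readers working in C#, here is a rough sketch of one common way to score analogies by cosine similarity (often called 3CosAdd). Instead of testing whether b - a + c equals the expected vector, it ranks all other words by cosine similarity to that combination. This is an illustration under assumptions, not a port of the project's evaluation script: the class name, the dictionary layout, and the choice to normalize every vector are mine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnalogySketch
{
    // Scale to unit Euclidean length; a zero vector is returned unchanged.
    static double[] Unit(double[] v)
    {
        double norm = Math.Sqrt(v.Sum(x => x * x));
        return norm == 0 ? (double[])v.Clone() : v.Select(x => x / norm).ToArray();
    }

    static double Dot(double[] a, double[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++) sum += a[i] * b[i];
        return sum;
    }

    // Answers "a is to b as c is to ?" by ranking every other word by
    // cosine similarity to (b - a + c), computed on unit-length vectors.
    public static string Solve(IReadOnlyDictionary<string, double[]> vectors,
                               string a, string b, string c)
    {
        double[] ua = Unit(vectors[a]), ub = Unit(vectors[b]), uc = Unit(vectors[c]);
        double[] target = new double[ua.Length];
        for (int i = 0; i < target.Length; i++) target[i] = ub[i] - ua[i] + uc[i];
        target = Unit(target);

        string best = null;
        double bestScore = double.NegativeInfinity;
        foreach (var pair in vectors)
        {
            // The three query words are skipped; otherwise one of them
            // would often be the nearest neighbour of the target.
            if (pair.Key == a || pair.Key == b || pair.Key == c) continue;
            double score = Dot(target, Unit(pair.Value));
            if (score > bestScore) { bestScore = score; best = pair.Key; }
        }
        return best;
    }
}
```

With real embeddings the call would look like AnalogySketch.Solve(vectors, "男人", "国王", "女人"). The analogy counts as solved when the top-ranked word is 王后, even if the combined vector is not numerically close to the vector for 王后.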

@SnowyYANG
Author

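I computed cosine similarity; the results are below. The similarity between 水花 (water splash) and 花 (flower) is no different from the similarity between 梨花 (pear blossom) or 茶花 (camellia) and 花.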
Similar(梨花,花)=0.29208240106642025
Similar(茶花,花)=0.3482703247116077
Similar(茶花,梨花)=0.469544978245012
Similar(水花,花)=0.3118033594185841
Similar(水花,茶花)=0.6777076692573767

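Code:
public static double Similar(double[] v1, double[] v2)
{
    // Accumulate the dot product and both squared norms in one pass.
    var ab = 0d;
    var aa = 0d;
    var bb = 0d;
    for (int i = 0; i < v1.Length; i++)
    {
        ab += v1[i] * v2[i];
        aa += v1[i] * v1[i];
        bb += v2[i] * v2[i];
    }
    // Cosine similarity: dot(v1, v2) / (|v1| * |v2|); NaN if either vector is all zeros.
    return ab / (Math.Sqrt(aa) * Math.Sqrt(bb));
}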

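The same goes for semantic addition and subtraction: the results of 国王 - 男人 (king - man) and 王后 - 女人 (queen - woman) are completely different.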
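
To put a number on "completely different", one option is to compare the two offset vectors by cosine similarity rather than coordinate by coordinate. Below is a small sketch, again with hypothetical names and assuming double[] vectors of equal length.

```csharp
using System;

public static class OffsetSketch
{
    // Element-wise difference a - b.
    public static double[] Subtract(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");
        var result = new double[a.Length];
        for (int i = 0; i < a.Length; i++) result[i] = a[i] - b[i];
        return result;
    }

    // Cosine similarity of two vectors; NaN if either one is all zeros.
    public static double Cosine(double[] a, double[] b)
    {
        double ab = 0, aa = 0, bb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            ab += a[i] * b[i];
            aa += a[i] * a[i];
            bb += b[i] * b[i];
        }
        return ab / (Math.Sqrt(aa) * Math.Sqrt(bb));
    }
}
```

Calling OffsetSketch.Cosine(OffsetSketch.Subtract(king, man), OffsetSketch.Subtract(queen, woman)) returns a value near 1 when the two offsets point the same way, near 0 when they are unrelated, and negative when they point in opposing directions.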

@dongrongliang

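Semantic arithmetic of this kind depends on the training corpus. If the corpus contains no text that relates 国王 (king), 男人 (man), and 女人 (woman) to one another, the arithmetic cannot be expected to work.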
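
A first sanity check along these lines is to confirm that every query word actually has a vector before doing any arithmetic. The sketch below assumes the vector file uses the word2vec-style text layout (an optional "count dimension" header, then one word followed by its space-separated values per line). That layout is an assumption about sgns.wiki.word, and the class and method names are hypothetical. Having a vector does not by itself show that the corpus relates the words to one another.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class VocabularySketch
{
    // Parses word2vec-style text lines into a word -> vector map.
    // Lines with two or fewer fields (the header, blank lines) are skipped,
    // so this assumes vectors with at least two dimensions.
    public static Dictionary<string, double[]> Parse(IEnumerable<string> lines)
    {
        var vectors = new Dictionary<string, double[]>();
        foreach (string line in lines)
        {
            string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length <= 2) continue;
            vectors[parts[0]] = parts.Skip(1)
                .Select(p => double.Parse(p, CultureInfo.InvariantCulture))
                .ToArray();
        }
        return vectors;
    }

    // Returns the query words that have no vector in the loaded vocabulary.
    public static List<string> Missing(IReadOnlyDictionary<string, double[]> vectors,
                                       params string[] words)
    {
        return words.Where(w => !vectors.ContainsKey(w)).ToList();
    }
}
```

In practice the lines could come from System.IO.File.ReadLines with the path to the downloaded file, and VocabularySketch.Missing(vectors, "国王", "男人", "女人", "王后") would list any words for which no vector was loaded.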

@HunterHeidy

HunterHeidy commented Mar 10, 2022 via email

@Yang2018

Yang2018 commented Apr 8, 2022

Has this been resolved? I am also having trouble computing cosine similarity.

@Yang2018

Yang2018 commented Apr 8, 2022

Has this been resolved? I am also having trouble computing cosine similarity.

@HunterHeidy

HunterHeidy commented Apr 8, 2022 via email
