How to make better use of word vectors in Chinese text classification

doubleEN/wv4tc

A study of word vectors applied to Chinese text classification

Dataset information

corpus     original_size (docs)
fudan      19636
easenet    24000
sohu       1298155 (valid)

Data cleaning

easenet

Auto    Culture    Economy    Medicine    Military    Sports
4000    4000       4000       4000        4000        4000

fudan

Agriculture    Art     Computer    Economy    Environment    Politics    Space    Sports
2043           1482    2715        3201       2435           2050        1282     2507

sohu

Auto     Edu      Ent      Economy    Health    IT       Sports
73087    11635    22064    59880      23140     20551    37211
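
The per-category figures above can be cross-checked against the corpus sizes in the dataset table. The following is a minimal sketch, assuming the per-category numbers are document counts kept after cleaning (the README does not state this explicitly). The names `CLEANED`, `ORIGINAL_SIZE`, and `summarize` are illustrative and do not come from the repository.

```python
# Per-category document counts, copied from the tables above.
# Assumption: these are the documents kept after cleaning.
CLEANED = {
    "easenet": {
        "Auto": 4000, "Culture": 4000, "Economy": 4000,
        "Medicine": 4000, "Military": 4000, "Sports": 4000,
    },
    "fudan": {
        "Agriculture": 2043, "Art": 1482, "Computer": 2715, "Economy": 3201,
        "Environment": 2435, "Politics": 2050, "Space": 1282, "Sports": 2507,
    },
    "sohu": {
        "Auto": 73087, "Edu": 11635, "Ent": 22064, "Economy": 59880,
        "Health": 23140, "IT": 20551, "Sports": 37211,
    },
}

# Original corpus sizes (in documents) from the dataset information table.
ORIGINAL_SIZE = {"fudan": 19636, "easenet": 24000, "sohu": 1298155}


def summarize(name):
    """Return total, kept ratio, and largest/smallest class ratio for a corpus."""
    counts = CLEANED[name]
    total = sum(counts.values())
    return {
        "total": total,
        "kept_ratio": total / ORIGINAL_SIZE[name],
        "imbalance": max(counts.values()) / min(counts.values()),
    }


if __name__ == "__main__":
    for name in CLEANED:
        s = summarize(name)
        print(f"{name}: total={s['total']} "
              f"kept={s['kept_ratio']:.3f} imbalance={s['imbalance']:.2f}")
```

Under that assumption, easenet keeps all 24000 documents and is perfectly balanced, fudan keeps 17715 of 19636, and sohu keeps 247568 of 1298155 with a noticeably skewed class distribution.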
