tensorflow-word2vec: computing word similarity

Code location (personal network drive): 我的网盘/工作/项目源码--CSDN/word2vec/tensorflow-word2vec

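If reading the text file fails with a decoding error, specify the encoding explicitly when opening it: open('stop_words.txt', encoding='utf-8'). This example uses encoding='utf-8'.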
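A minimal sketch of reading the stop-word file with an explicit encoding. The helper name load_stop_words and the one-word-per-line file layout are assumptions for illustration, not taken from the original script.

```python
import os
import tempfile


def load_stop_words(path, encoding="utf-8"):
    """Return the set of non-empty, stripped lines in a stop-word file."""
    # Passing encoding explicitly avoids relying on the platform default
    # (for example cp936/GBK on a Chinese-locale Windows), which is a
    # common cause of UnicodeDecodeError when the file is really UTF-8.
    with open(path, encoding=encoding) as f:
        return {line.strip() for line in f if line.strip()}


# Demo with a temporary file so the sketch runs without stop_words.txt.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "stop_words.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("的\n了\n\n是\n")
    stop_words = load_stop_words(demo_path)
    print(len(stop_words))
```

If the file was saved in a different encoding (for instance GBK), pass that name instead of 'utf-8'.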

The training log is as follows:

E:\programfile\anaconda3\python.exe E:/pycharm_exercise/main.py
Building prefix dict from the default dictionary ...
停用词读取完毕,共1893个单词
Loading model from cache C:\Users\TP\AppData\Local\Temp\jieba.cache
Loading model cost 0.808 seconds.
Prefix dict has been built succesfully.
文本中总共有19780个单词,不重复单词数5568,选取前30000个单词进入词典
WARNING:tensorflow:From E:/pycharm_exercise/main.py:101: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2018-07-10 21:29:52.652391: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
0 sentences dealed, loss: 318.508056640625
1000 sentences dealed, loss: 100.96044921875
2000 sentences dealed, loss: 44.58319854736328
3000 sentences dealed, loss: 28.653278350830078
4000 sentences dealed, loss: 23.59463882446289
5000 sentences dealed, loss: 12.839811325073242
6000 sentences dealed, loss: 11.788054466247559
7000 sentences dealed, loss: 7.652200222015381
8000 sentences dealed, loss: 8.088170051574707
9000 sentences dealed, loss: 6.735531806945801
[679, 239]
['天地', '级别'] [['佩恩', '收', '问道', '听得', '大厅', '清风', '玄黄', '拍卖', '缝隙', '头来'], ['翻', '冲击', '应', '玄阶', '叔叔', '紧闭', '随意', '日后', '本章', '额头']] 0.107279 0.00576229

Process finished with exit code 0
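
The last data line of the log lists, for each query word, its ten nearest words in the embedding space. The sketch below shows one way such a lookup might work, ranking words by cosine similarity in pure Python. The function names and the toy vectors are made up for illustration; the original script's exact computation is not shown in this post.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings, top_k=10):
    """Return the top_k words closest to `word`, excluding the word itself."""
    query = embeddings[word]
    scores = [
        (other, cosine(query, vec))
        for other, vec in embeddings.items()
        if other != word
    ]
    # Highest similarity first.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [w for w, _ in scores[:top_k]]


# Toy 2-d embeddings (hypothetical values, not from the trained model).
toy = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}
neighbours = most_similar("a", toy, top_k=2)
print(neighbours)
```

With real embeddings, the dictionary would map each vocabulary word to its trained vector, and top_k=10 would match the ten neighbours per word shown in the log.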


Reposted from blog.csdn.net/m0_37870649/article/details/80992497