jieba segmentation: recognizing words that contain spaces

Add custom words that contain spaces and have the tokenizer return them as single tokens

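Method: find the relevant variable in the jieba source code and override it
Example: make words with an internal space, such as 【Blade Master】, be recognized as one word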
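
Why overriding a regex helps can be sketched without jieba itself. Before consulting its dictionary, jieba splits the input into blocks with a regular expression, and a space that falls outside the pattern's character class ends a block, so a dictionary entry containing a space can never match. The `default_like` pattern below is an assumption modelled on jieba's default block pattern, not a value read from the installed library, and `blocks` is a hypothetical helper used only for illustration.

```python
import re

# Assumed approximation of jieba's default block pattern: Han characters,
# ASCII letters, digits and a few symbols. The space is NOT in the class.
default_like = re.compile(r'([\u4E00-\u9FD5a-zA-Z0-9+#&._%\-]+)', re.U)

# The replacement pattern used in this article: any run of characters.
permissive = re.compile('(.+)', re.U)

sentence = 'Blade Master疾风刺杀Archmage'


def blocks(pattern, text):
    """Split text on a capturing-group pattern and drop empty pieces."""
    return [piece for piece in pattern.split(text) if piece]


default_blocks = blocks(default_like, sentence)
permissive_blocks = blocks(permissive, sentence)

print(default_blocks)     # the space becomes a block of its own
print(permissive_blocks)  # the whole sentence stays in one block
```

With the permissive pattern the whole sentence reaches the dictionary lookup as one block, which is what lets an added entry such as 'Blade Master' match.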

jieba

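import jieba, re
sentence = 'Blade Master疾风刺杀Archmage'
jieba.add_word('Blade Master')  # add the custom word to the dictionary
print([word for word in jieba.cut(sentence)])  # before the change: the word is still split at the space
jieba.re_han_default = re.compile('(.+)', re.U)  # override the block pattern so spaces no longer split blocks
print([word for word in jieba.cut(sentence)])  # after the change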
Output
['Blade', ' ', 'Master', '疾风', '刺杀', 'Archmage']
['Blade Master', '疾风', '刺杀', 'Archmage']

jieba.posseg

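import jieba, jieba.posseg as jp, re
sentence = 'Demon Hunter斩杀大法师'
jieba.add_word('Demon Hunter', 9, 'hero')  # add the word with frequency 9 and tag 'hero'
print(jp.lcut(sentence))  # before the change: the word is still split at the space
jp.re_han_internal = re.compile('(.+)', re.U)  # override the pattern used by the POS tagger
print(jp.lcut(sentence))  # after the change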
Output
[pair('Demon', 'eng'), pair(' ', 'x'), pair('Hunter', 'eng'), pair('斩杀', 'v'), pair('大法师', 'n')]
[pair('Demon Hunter', 'x'), pair('斩杀', 'v'), pair('大法师', 'n')]
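
Both overrides above change module-level variables, so they affect every later call in the same process. One possible way to limit that, sketched here as an assumption rather than anything jieba provides, is a small context manager that restores the original value afterwards. `temporary_attr` is a hypothetical helper, and `fake_module` is a stand-in object so that the sketch runs without jieba installed.

```python
import re
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_attr(obj, name, value):
    """Set obj.<name> to value inside the with-block, then restore the old value."""
    original = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Runs even if the body raises, so the override never leaks out.
        setattr(obj, name, original)


# Stand-in for a module such as jieba (hypothetical, for illustration only).
fake_module = SimpleNamespace(re_han_default=re.compile('([a-z]+)'))
original_pattern = fake_module.re_han_default

with temporary_attr(fake_module, 're_han_default', re.compile('(.+)', re.U)):
    pattern_inside = fake_module.re_han_default.pattern

pattern_after = fake_module.re_han_default.pattern
print(pattern_inside, pattern_after)
```

With jieba the call would presumably look like `with temporary_attr(jieba, 're_han_default', re.compile('(.+)', re.U)): words = jieba.lcut(sentence)`. Because `jieba.cut` returns a generator, the result should be consumed inside the with-block (for example with `jieba.lcut`), otherwise the original pattern may already be restored by the time segmentation actually runs.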


Reposted from blog.csdn.net/Yellow_python/article/details/82961965