A Detailed Look at ELMo and GPT

For a full write-up, see:

BERT's Predecessors: From Word Embedding to Word2Vec, ELMo, and GPT


Code example for obtaining ELMo word vectors:

This example uses the allennlp library; for installation instructions, see:

https://wangguisen.blog.csdn.net/article/details/127222333

from allennlp.modules.elmo import Elmo, batch_to_ids

ppp = '/Users/wangguisen/Documents/markdowns/AI-note/NLP/bert/data/elmo/'
options_file = ppp + "elmo_2x2048_256_2048cnn_1xhighway_options.json"  # path to the model config file
weight_file = ppp + "elmo_2x2048_256_2048cnn_1xhighway_weights.hdf5"   # path to the pretrained weights
# The third argument (1) means produce one scalar-mixed (linearly weighted)
# set of word vectors; setting it to 2 would produce two independently
# weighted sets.
elmo = Elmo(options_file, weight_file, 1)

# batch_to_ids expects tokenized sentences (a list of token lists) and converts them to character ids
# sentence_lists = [["I", "have", "a", "dog"], ["How", "are", "you", ",", "today", "is", "Monday"], ["I", "am", "fine", "thanks"]]
sentence_lists = [list("我有一条狗"), list("你好吗,今天是星期一"), list("我很好,谢谢")]
character_ids = batch_to_ids(sentence_lists)

print(character_ids.shape)
print(character_ids)

embeddings = elmo(character_ids)['elmo_representations'][0]

print(embeddings.shape)
print(embeddings)
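
For reference, Elmo's forward call returns a dictionary containing both the list of representations and a padding mask. Below is a minimal sketch (not from the original post) of requesting two independently weighted representations; it reuses the options_file and weight_file paths defined above, and the English token lists are illustrative only.

from allennlp.modules.elmo import Elmo, batch_to_ids

# Assumes options_file and weight_file are the same paths as in the snippet above.
elmo2 = Elmo(options_file, weight_file, num_output_representations=2)

sentences = [["I", "have", "a", "dog"], ["I", "am", "fine", "thanks"]]
char_ids = batch_to_ids(sentences)          # character ids, shape (batch, max_len, 50)

out = elmo2(char_ids)                       # dict with 'elmo_representations' and 'mask'
rep_a, rep_b = out['elmo_representations']  # two scalar-mixed outputs, each (batch, max_len, dim)
mask = out['mask']                          # (batch, max_len); 1 for real tokens, 0 for padding

print(rep_a.shape, rep_b.shape, mask.shape)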

Reposted from blog.csdn.net/qq_42363032/article/details/127249738