已解决UnicodeEncodeError: ‘utf-8’ codec can’t encode characters in position 42-43: surrogates not allowed

文章目录

报错代码
报错翻译
报错原因
解决方法

报错代码

需要是python把词库文件写入txt中，写入代码如下：

# 保存结果
with open(out_path, 'w', encoding='utf8') as f:
     f.writelines([word + '\t' + file_name + '\n' for count, py, word, file_name in GTable])

报错信息截图：

在这里插入图片描述

词库名： ＬＯＬ
词库类型： 角色扮演
描述信息： ６６６
词库示例： 大龙走起不要怂就是干 小龙叫了你们看着办吧 中单撸起 辅助走你 中团啦 中单速度推塔 
Traceback (most recent call last):
  File "E:/Python/test3.py", line 139, in <module>
    f.writelines([word + '\t' + file_name + '\n' for count, py, word, file_name in GTable])
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 42-43: surrogates not allowed

报错翻译

报错信息翻译：

Unicode编码错误：“utf-8”编码解码器无法编码位置42-43中的字符：不允许使用代理

报错原因

报错原因：

一些字符串无法被utf-8解码，所以可以把无法转化为utf-8格式的字符‘ignore’掉，再进行解码。

解决方法

遇到这种报错在字符串后面加上如下代码即可：

.encode('UTF-8', 'ignore').decode('UTF-8')

修改写入代码即可：

# 保存结果
with open(out_path, 'w', encoding='utf-8') as f:
    try:
        f.writelines([word + '\t' + file_name + '\n' for count, py, word, file_name in GTable])
    except:
        f.writelines([word.encode('UTF-8', 'ignore').decode('UTF-8') + '\t' + file_name.encode('UTF-8', 'ignore').decode('UTF-8') + '\n' for count, py, word, file_name in GTable])

再次写入就成功了！

扫描二维码关注公众号，回复： 14507076 查看本文章

已解决UnicodeEncodeError: ‘utf-8‘ codec can‘t encode characters in position 42-43: surrogates not allow

文章目录

报错代码

报错翻译

报错原因

解决方法

猜你喜欢