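Some Windows programs (older versions of Notepad, for example) write a BOM (byte order mark, the bytes EF BB BF) at the start of UTF-8 files, whereas tools on other systems usually write UTF-8 without one.
This can break decoding in some cases. For example, when Python's built-in json module reads a UTF-8 file that was saved on Windows, it fails with the following error (the message shown is the one Python 2 produces):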
ValueError: No JSON object could be decoded
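To see where the error comes from, the sketch below builds the bytes of a BOM-prefixed file in memory and decodes them two ways. It assumes Python 3, where the failure surfaces as json.JSONDecodeError (a subclass of ValueError) with a different message; the JSON payload and variable names are made up for illustration.

```python
import codecs
import json

# Bytes as a BOM-writing editor would save them (the payload is illustrative).
raw = codecs.BOM_UTF8 + b'{"name": "value"}'

# Plain "utf-8" keeps the BOM as the character U+FEFF at the start of the text.
text_with_bom = raw.decode("utf-8")
# "utf-8-sig" drops a leading BOM while decoding.
text_without_bom = raw.decode("utf-8-sig")

try:
    json.loads(text_with_bom)
    bom_rejected = False
except ValueError:
    # json refuses text that starts with U+FEFF.
    bom_rejected = True

parsed = json.loads(text_without_bom)
```

The only difference between the two strings is the leading U+FEFF, which is why both methods below work by removing it before parsing.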
Method 1:
with open("data", "rb") as f:  # binary mode, so read() returns the raw bytes
    s = f.read()
u = s.decode("utf-8-sig")  # a unicode string without the BOM
s = u.encode("utf-8")  # encode back to UTF-8 bytes, now without a BOM
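If the goal is to save the file back without the BOM, the same decode and re-encode steps can be wrapped in a small helper. This is only a sketch, assuming Python 3; `remove_bom_in_place` and the temporary demo file are hypothetical names introduced here, not part of the original post.

```python
import codecs
import os
import tempfile


def remove_bom_in_place(path):
    """Hypothetical helper: rewrite the file at `path` as UTF-8 without a BOM.

    Returns True if the file was changed, False if it had no BOM.
    """
    with open(path, "rb") as f:
        raw = f.read()
    # Decoding with utf-8-sig drops a leading BOM; encoding as utf-8 adds none.
    cleaned = raw.decode("utf-8-sig").encode("utf-8")
    changed = cleaned != raw
    if changed:
        with open(path, "wb") as f:
            f.write(cleaned)
    return changed


# Demo on a temporary file that starts with a BOM.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "data")
with open(demo_path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"hello")

changed = remove_bom_in_place(demo_path)
with open(demo_path, "rb") as f:
    after = f.read()
```

The helper reads the whole file into memory, which is fine for small configuration or data files but would need a streaming variant for very large ones.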
Method 2:
import codecs
with open("data", "rb") as f:  # binary mode, so s can be compared with the BOM bytes
    s = f.read()
if s.startswith(codecs.BOM_UTF8):
    s = s[len(codecs.BOM_UTF8):]  # drop the 3-byte BOM prefix
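On Python 3, a further option (not in the original post) is to pass encoding="utf-8-sig" to open(), so the BOM is skipped during decoding and json never sees it. The sketch below assumes Python 3; `load_json_ignoring_bom`, the sample file name, and its contents are hypothetical.

```python
import codecs
import json
import os
import tempfile


def load_json_ignoring_bom(path):
    """Hypothetical helper: parse a UTF-8 JSON file with or without a BOM."""
    # "utf-8-sig" skips a leading BOM if present and otherwise acts like "utf-8".
    with open(path, "r", encoding="utf-8-sig") as f:
        return json.load(f)


# Build a sample file that starts with a BOM, as a BOM-writing editor would.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "data.json")
with open(sample_path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b'{"key": "value"}')

result = load_json_ignoring_bom(sample_path)
```

Because "utf-8-sig" also reads BOM-free files correctly, the same helper can be used regardless of which system produced the file.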
Original: https://blog.csdn.net/founderznd/article/details/52197078
Reference: https://stackoverflow.com/questions/8898294/convert-utf-8-with-bom-to-utf-8-with-no-bom-in-python