phthon踩雷(二):UnicodeEncodeError: ‘UCS-2‘ codec can‘t encode characters in position...

报错如下:

UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 3298-3298: 
Non-BMP character not supported in Tk

翻译一下就是:

Unicode编码错误:'UCS-2’编码器不能编码在3298-3298这个位置的字符类:
Non-BMP 字符类在Tk中不被支持。

大致就是这么个意思,翻译的不一定准,但是大致意思就是某行代码让编译器出现了问题,不能编译了。

解决:
俗话说的好,‘如果不能解决问题,那就把出现问题的代码解决掉’,这样不就没问题了吗? 没错,就是这么个思路!

排查: 所写代码如下

import requests
from bs4 import BeautifulSoup

url = 'https://movie.douban.com/top250'
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'Accept-Encoding': 'gzip, deflate, br',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Cookie': 'douban-fav-remind=1; bid=QhopwFw0HcQ; __utmz=30149280.1586506561.6.5.utmcsr=baidu|utmccn=(organic)|utmcmd=organic; _pk_ses.100001.4cf6=*; __utma=30149280.1613818664.1547469206.1586506561.1594137884.7; __utmb=30149280.0.10.1594137884; __utmc=30149280; __utma=223695111.510808800.1561795806.1561795806.1594137884.2; __utmb=223695111.0.10.1594137884; __utmc=223695111; __utmz=223695111.1594137884.2.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); ap_v=0,6.0; _pk_id.100001.4cf6=b46c129b3f0ba258.1561795806.2.1594137900.1561795859.',
    'Host': 'movie.douban.com',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
    }
res = requests.get(url, headers=headers)
html = res.text
# soup = BeautifulSoup(html, 'html.parser')
print(html)

经过仔细查找,发现headers里面有一句和encoding编码相关的代码'Accept-Encoding': 'gzip, deflate, br', ,就是这句了,翻译一下就是‘可以接受的编码有xxx’,不包含我们需要的呀。 所以 干脆把这代码干掉!!

修改后的代码如下:

import requests
from bs4 import BeautifulSoup

url = 'https://movie.douban.com/top250'
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    # 'Accept-Encoding': 'gzip, deflate, br',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Cookie': 'douban-fav-remind=1; bid=QhopwFw0HcQ; __utmz=30149280.1586506561.6.5.utmcsr=baidu|utmccn=(organic)|utmcmd=organic; _pk_ses.100001.4cf6=*; __utma=30149280.1613818664.1547469206.1586506561.1594137884.7; __utmb=30149280.0.10.1594137884; __utmc=30149280; __utma=223695111.510808800.1561795806.1561795806.1594137884.2; __utmb=223695111.0.10.1594137884; __utmc=223695111; __utmz=223695111.1594137884.2.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); ap_v=0,6.0; _pk_id.100001.4cf6=b46c129b3f0ba258.1561795806.2.1594137900.1561795859.',
    'Host': 'movie.douban.com',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
    }
res = requests.get(url, headers=headers)
html = res.text
# soup = BeautifulSoup(html, 'html.parser')
print(html)

咦?!搞定了!! 哈哈哈, 没想到吧。 这个思路也不错吧~~

如果有更好的方法,或者有更好的解释,欢迎评论区留言讨论。加油!

猜你喜欢

转载自blog.csdn.net/tuzi007a/article/details/107194569