Garbled text when scraping web pages with Python requests and BeautifulSoup - 代码天地

Other | 2018-08-09 15:16:14

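Both requests and BeautifulSoup try to detect the encoding of the page they are given, and either detection can be wrong, which produces garbled text. The fix is to set the response's encoding to the page's actual encoding after requests fetches the page and before BeautifulSoup is called. For example: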

[Figure: the page's encoding, as declared in its source]

# encoding=utf-8
import requests
from bs4 import BeautifulSoup

response = requests.get("http://www.baidu.com/")
# Without this line the output is garbled; with it the text displays correctly.
# utf-8 is the encoding declared in the page's own source code.
response.encoding = 'utf-8'
soup = BeautifulSoup(response.text, 'lxml')
print(soup.title.text)
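
The comment in the code above says that utf-8 was chosen by looking at the encoding declared in the page source. That lookup could also be done in code. The following is a minimal sketch, not part of the original post: `sniff_meta_charset` is a hypothetical helper (it is not a requests or BeautifulSoup function), it assumes the page declares its charset in a `<meta>` tag within the first 4096 bytes, and the utf-8 fallback is an arbitrary choice. The sample bytes stand in for `response.content` so the sketch runs without a network connection.

```python
import re

# Matches both <meta charset="utf-8"> and
# <meta http-equiv="Content-Type" content="text/html; charset=gbk">.
_META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def sniff_meta_charset(raw, default="utf-8"):
    """Return the charset named in a <meta> tag of raw HTML bytes.

    Hypothetical helper for illustration; falls back to `default`
    when no declaration is found.
    """
    match = _META_CHARSET.search(raw[:4096])  # declarations usually sit near the top
    if match is None:
        return default
    return match.group(1).decode("ascii").lower()


# Stand-in for response.content: a GBK-encoded page that declares its own charset.
raw_page = (
    '<html><head><meta charset="gbk"><title>百度一下</title></head></html>'
).encode("gbk")

charset = sniff_meta_charset(raw_page)
text = raw_page.decode(charset)
print(charset)
print(text)
```

With requests, the same idea would be to pass `response.content` to the helper and assign the result to `response.encoding` before reading `response.text`.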

[Figure: garbled output, without the encoding line]

[Figure: correct output, with the encoding line]
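
The difference between the garbled and the correct output can be reproduced offline. When a response has a text content type but no charset in its HTTP headers, requests falls back to ISO-8859-1 for `response.text`. The sketch below assumes the page bytes are really UTF-8 and uses a made-up title string. It shows what that mismatch does to the text, and that decoding the same bytes with the right codec recovers it.

```python
# Bytes as a server might send them: a UTF-8 encoded title (made-up sample).
raw = "百度一下，你就知道".encode("utf-8")

# Decoding with the wrong codec yields mojibake instead of raising an error,
# because every byte value maps to some ISO-8859-1 character.
garbled = raw.decode("iso-8859-1")

# Decoding with the page's real encoding gives the intended text.
correct = raw.decode("utf-8")

# The garbled string is lossless: re-encoding it as ISO-8859-1 restores the
# original bytes, which can then be decoded properly.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

print(repr(garbled))
print(correct)
print(repaired == correct)
```

Setting `response.encoding` before reading `response.text`, as in the example above, avoids the wrong decode in the first place, so no repair step is needed.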

Reposted from blog.csdn.net/leosblog/article/details/79832614