Basic Python Web Scraping Methods - 代码天地

Other 2018-07-17 23:38:26

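import requests
from bs4 import BeautifulSoup

# Fetch the page
response = requests.get(url='http://www.ifeng.com/')
# response.encoding = 'utf8'
# Decode with the encoding detected from the body to avoid garbled text
response.encoding = response.apparent_encoding
# print(response.text)

# Parse the HTML into a BeautifulSoup object; the second argument selects the parser ('lxml' also works if installed)
soup = BeautifulSoup(response.text, features='html.parser')
# Find the element with id="turnRed"
target = soup.find(id='turnRed')
# All <li> elements under that element (empty list if the id is not on the page)
obj = target.find_all('li') if target else []
# print(obj)
for i in obj:
    a = i.find('a')  # first <a> tag inside this <li>, or None
    if a:
        print(a.attrs.get('href'))  # read the href attribute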
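
The script above can also be split into a fetch step and a parse step, so the parsing can be tried without network access. The following is only a sketch under assumptions: the helper names `fetch_html` and `extract_links`, the 10-second timeout, and the `SAMPLE_HTML` markup are all hypothetical and not part of the original post.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its decoded text (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    response.encoding = response.apparent_encoding
    return response.text


def extract_links(html, container_id):
    """Return the href of the first <a> in each <li> under the given id."""
    soup = BeautifulSoup(html, features='html.parser')
    target = soup.find(id=container_id)
    if target is None:  # the id may be absent, e.g. after a page redesign
        return []
    links = []
    for li in target.find_all('li'):
        a = li.find('a')
        if a and a.attrs.get('href'):
            links.append(a.attrs['href'])
    return links


# Hypothetical markup, used only to show the parsing step offline
SAMPLE_HTML = """
<ul id="turnRed">
  <li><a href="http://example.com/a">A</a></li>
  <li>no link here</li>
  <li><a href="http://example.com/b">B</a></li>
</ul>
"""

sample_links = extract_links(SAMPLE_HTML, 'turnRed')
print(sample_links)
```

For a live page, `extract_links(fetch_html('http://www.ifeng.com/'), 'turnRed')` would be the equivalent call, assuming the page still contains an element with that id.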

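Summary:

Summary 1:
1. requests
response = requests.get('URL')
response.content  # returns the raw body as bytes
response.encoding  # encoding used to decode response.text
response.apparent_encoding  # encoding detected from the body
response.status_code  # HTTP status code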
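
As a rough illustration of the attributes listed above, the sketch below gathers them into a dict. The function name `summarize_response` and the dict keys are hypothetical, chosen only for this example.

```python
import requests


def summarize_response(response):
    """Collect the attributes listed above into a plain dict (hypothetical helper)."""
    return {
        'status_code': response.status_code,              # e.g. 200 on success
        'declared_encoding': response.encoding,           # taken from the HTTP headers, may be None
        'detected_encoding': response.apparent_encoding,  # guessed from the body bytes
        'byte_length': len(response.content),             # size of the raw body
    }
```

It could be called as `summarize_response(requests.get('http://www.ifeng.com/', timeout=10))`. When the declared and detected encodings differ, assigning `response.encoding = response.apparent_encoding` before reading `response.text` is the approach the script at the top of this post takes.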

2. BeautifulSoup
soup = BeautifulSoup('<html>...', features='html.parser')
v1 = soup.find('div')  # first matching tag, or None
soup.find(id='il')
soup.find('div', id='il')

v2 = soup.find_all('div')  # returns a list
obj = v1
obj = v2[0]

obj.text  # get the text
obj.attrs  # get the attributes
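
The calls in item 2 can be tried on a small inline document. This is a minimal sketch; the HTML string, including its ids, class, and links, is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for demonstration only
HTML = """
<html><body>
  <div id="il" class="box"><a href="/first">First</a></div>
  <div id="other"><a href="/second">Second</a></div>
</body></html>
"""

soup = BeautifulSoup(HTML, features='html.parser')

v1 = soup.find('div')              # first <div> in document order
by_id = soup.find(id='il')         # match by id only
both = soup.find('div', id='il')   # match by tag name and id
v2 = soup.find_all('div')          # list-like result with every <div>

first_text = v1.text               # text inside the tag
first_attrs = v1.attrs             # dict of attributes; class is a list of names
second_href = v2[1].find('a').attrs.get('href')
missing = soup.find(id='nope')     # find returns None when nothing matches

print(first_text, first_attrs, second_href, missing)
```

Because `find` returns `None` on a miss, a check such as `if a:` before reading attributes avoids an `AttributeError`.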


Reposted from blog.csdn.net/weixin_41701299/article/details/80919514
