# GitHub crawler
import requests
from bs4 import BeautifulSoup
# Log in to GitHub via the HTML form and print the collaborated-repos notice.
# NOTE(review): GitHub's login flow changes over time (and enforces additional
# checks); this scripted form login may no longer succeed — educational only.
with requests.Session() as session:
    # Step 1: fetch the login page. A Session tracks cookies automatically,
    # so no manual get_dict()/update() cookie merging is needed.
    login_page = session.get('https://github.com/login', timeout=10)
    login_page.raise_for_status()
    soup = BeautifulSoup(login_page.text, features='lxml')

    # The CSRF token is a hidden <input name="authenticity_token"> in the form.
    tag = soup.find(name='input', attrs={'name': 'authenticity_token'})
    if tag is None:
        raise RuntimeError('authenticity_token not found on the login page')
    authenticity_token = tag.get('value')

    # Step 2: submit the login form. Credentials below are placeholders —
    # replace 'login' / 'password' with real values before running.
    form_data = {
        'authenticity_token': authenticity_token,
        'commit': 'Sign in',
        'utf8': '✓',
        'login': 'fds',
        'password': 'fdsa',
        'webauthn-support': 'supported',
    }
    session.post(url='https://github.com/session', data=form_data, timeout=10)

    # Step 3: request an authenticated settings page using the session cookies.
    repos_page = session.get(url='https://github.com/settings/repositories',
                             timeout=10)
    soup2 = BeautifulSoup(repos_page.text, features='lxml')
    list_group = soup2.find(name='div', class_='col-9 float-left')
    if list_group is None:
        raise RuntimeError('repository container not found — login may have failed')
    p_list = list_group.find(name='p', class_='js-collaborated-repos-empty')
    # Guard against layout changes: fall back to the container text rather
    # than crashing with AttributeError on None.
    print(p_list.text if p_list is not None else list_group.text)
# In the code above, the key step is extracting the authenticity_token
# (CSRF token) from the login form in the page source; the remaining
# requests reuse it together with the session cookies.