Web Scraping: Basics

The requests module:

More documentation: http://cn.python-requests.org/zh_CN/latest/

Installation

pip install requests

Usage

import requests

response=requests.get("https://movie.douban.com/cinema/nowplaying/beijing/")

Response attributes

print(response.content)      # raw bytes
print(response.text)         # decoded text (str)
print(type(response))        # <class 'requests.models.Response'>
print(response.status_code)  # 200
print(response.encoding)     # utf-8
print(response.cookies)      # <RequestsCookieJar[<Cookie bid=YwWqpRG7Z_E for .douban.com/>]>
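The difference between content (bytes) and text (str) is easiest to see side by side. The sketch below defines a hypothetical helper, summarize, which is not part of requests; it simply collects the attributes listed above from any Response object whose body has already been downloaded.

```python
import requests


def summarize(response):
    """Collect a few commonly inspected attributes of a Response."""
    return {
        "ok": response.ok,                 # True when status_code is below 400
        "status": response.status_code,
        "encoding": response.encoding,     # encoding used to build .text
        "n_bytes": len(response.content),  # .content is bytes
        "n_chars": len(response.text),     # .text is str
    }
```

For a page that contains non-ASCII characters, n_bytes is usually larger than n_chars, because a single character can take several bytes in UTF-8.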

 

GET requests:

If the response body is JSON, you can parse it with the json() method.

response=requests.get("https://github.com/timeline.json")

print(response.text)
print(response.json().get("message"))
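json() raises an exception when the body is not valid JSON, for example when a server answers with an HTML error page. A minimal sketch of guarding against that follows; safe_json is a hypothetical helper name, not something requests provides.

```python
import requests


def safe_json(response, default=None):
    """Return the parsed JSON body, or `default` if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:
        # The decoding error raised by requests is a ValueError subclass.
        return default
```

With this helper, safe_json(response, default={}).get("message") returns None instead of failing when the body cannot be parsed.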

Raw response content

If you want the raw socket response from the server, you can access r.raw. To do so, you must set stream=True in the initial request.

>>> r = requests.get('https://github.com/timeline.json', stream=True)
>>> r.raw
<requests.packages.urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'

In general, though, you should use the following pattern to save the streamed content to a file:

with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size):
        fd.write(chunk) 
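The pattern above leaves filename and chunk_size undefined. One way to package it, as a sketch: save_stream is a hypothetical helper, and the default chunk size of 8192 bytes is an arbitrary assumption rather than a recommended value.

```python
def save_stream(response, filename, chunk_size=8192):
    """Write a streamed response body to `filename` chunk by chunk.

    `response` is expected to come from requests.get(url, stream=True).
    Returns the number of bytes written.
    """
    written = 0
    with open(filename, "wb") as fd:
        for chunk in response.iter_content(chunk_size=chunk_size):
            written += fd.write(chunk)
    return written
```

Because the body is read in chunks, the whole file never has to fit in memory at once, which is the main reason to combine stream=True with iter_content.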

Custom request headers

headers={"Content-Type":"application/json"}
data={"username":"yuan"}
response=requests.get("https://movie.douban.com/cinema/nowplaying/beijing/",params=data,headers=headers)

print(response.url)
print(response.headers)
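To see what params and headers do without touching the network, a request can be built and prepared but not sent. This is a sketch only: the User-Agent value is a made-up placeholder, chosen because crawlers more commonly customize that header than Content-Type on a GET.

```python
import requests

# Build the request without sending it, so the final URL and headers can be inspected.
req = requests.Request(
    "GET",
    "https://movie.douban.com/cinema/nowplaying/beijing/",
    params={"username": "yuan"},
    headers={"User-Agent": "my-crawler/0.1"},  # placeholder value, an assumption
)
prepared = req.prepare()

print(prepared.url)      # the params are appended as a query string
print(prepared.headers)  # the headers set on this request
```

Note that response.headers in the snippet above holds the headers sent back by the server, while prepared.headers here holds the headers that would be sent to it.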

 

POST requests:

payload = {'key1': 'value1', 'key2': 'value2'}
response = requests.post("http://httpbin.org/post", data=payload)
print(response.text)

You can also pass a list of tuples to the data parameter. This is especially useful when several form fields share the same key:

import requests
payload = (('key1', 'value1'), ('key1', 'value2'))
# payload = {"k":"v"}
r = requests.post('http://httpbin.org/post', data=payload)
print(r.text)
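A dict cannot hold the same key twice, which is why the tuple form matters here. The following sketch prepares the request without sending it and shows how the repeated key ends up in the form-encoded body.

```python
import requests

payload = [("key1", "value1"), ("key1", "value2")]
prepared = requests.Request("POST", "http://httpbin.org/post", data=payload).prepare()

print(prepared.headers["Content-Type"])  # application/x-www-form-urlencoded
print(prepared.body)                     # key1=value1&key1=value2
```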

Sending JSON data

Option 1: serialize the payload yourself and pass the string as data

import requests
import json
url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
r = requests.post(url, data=json.dumps(payload))

Option 2: pass the dict through the json parameter and let requests serialize it

import requests
url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
r = requests.post(url, json=payload)
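The two options send the same JSON document but differ in the Content-Type header. A sketch that prepares both requests without sending them makes the difference visible; the endpoint URL is the placeholder from the examples above.

```python
import json

import requests

url = "https://api.github.com/some/endpoint"
payload = {"some": "data"}

manual = requests.Request("POST", url, data=json.dumps(payload)).prepare()
auto = requests.Request("POST", url, json=payload).prepare()

print(manual.headers.get("Content-Type"))  # None: a string body gets no default type
print(auto.headers.get("Content-Type"))    # application/json
print(json.loads(manual.body) == json.loads(auto.body))  # True: same JSON document
```

If a server insists on application/json, option 1 therefore also needs headers={"Content-Type": "application/json"}, whereas option 2 sets it automatically.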

Uploading files with requests

import requests
url = 'http://httpbin.org/post'
with open('test', 'rb') as f:  # the file is closed once the upload finishes
    r = requests.post(url, files={'file': f})
print(r.text)
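The files parameter also accepts a (filename, content, content_type) tuple, which lets you control the name and type reported to the server. The sketch below passes the content inline as bytes and only prepares the request, so it needs neither a file on disk nor a network connection; the file name and content are made up for illustration.

```python
import requests

# (filename, content, content_type): the content is given inline as bytes,
# so no file needs to exist on disk for this sketch.
files = {"file": ("test.txt", b"hello world", "text/plain")}
prepared = requests.Request("POST", "http://httpbin.org/post", files=files).prepare()

print(prepared.headers["Content-Type"])         # multipart/form-data; boundary=...
print(b'filename="test.txt"' in prepared.body)  # True
```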


Reposted from www.cnblogs.com/yanxiaoge/p/10631300.html