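I've been reading about coroutines and asyncio and still felt a bit lost. I understand the principle, but it needs a practical application to sink in. Coroutines have a natural advantage for IO-bound work. I haven't looked at aiohttp yet, so this example is a first hands-on taste of coroutines in practice, and also a chance to learn how chunked downloading works.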
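1. First, get the size of the file to download (size)
The size of the file is usually given by the "Content-Length" response header, so send a HEAD request first and read size from the headers:
    resp = requests.head(url)
    size = int(resp.headers["Content-Length"])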
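A HEAD response doesn't always carry a usable Content-Length, and not every server honours Range requests, so it may be worth checking both before splitting the file. Here is a minimal sketch of such a check. The helper name probe_headers is hypothetical, and it takes a plain mapping of header names to values so it can be tried without touching the network. Treating "Accept-Ranges: bytes" as the signal is an assumption: some servers support ranges without advertising it.

```python
from typing import Mapping, Optional, Tuple


def probe_headers(headers: Mapping[str, str]) -> Tuple[Optional[int], bool]:
    """Return (size, ranges_advertised) read from a mapping of response headers.

    size is None when Content-Length is missing or not a plain number.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    raw = lowered.get("content-length")
    size = int(raw) if raw is not None and raw.strip().isdigit() else None

    # Assumption: the server advertises range support via Accept-Ranges.
    ranges_advertised = lowered.get("accept-ranges", "").strip().lower() == "bytes"
    return size, ranges_advertised
```

With requests this could be called as probe_headers(requests.head(url).headers). If size comes back as None, or ranges are not advertised, falling back to a single plain download is the safer choice.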
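2. Split the file into n chunks and record each chunk's boundaries
The file usually can't be divided evenly, so the last chunk is made a little larger to absorb the remainder:
    spos = []  # start offset of each chunk
    fpos = []  # end offset of each chunk (inclusive)
    persize = size // n
    for i in range(n):
        spos.append(i * persize)
        fpos.append((i + 1) * persize - 1)
    # Range offsets are inclusive, so the last valid byte is size - 1
    fpos[-1] = size - 1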
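The same splitting logic can be pulled into a small function, which makes it easy to check that the chunks cover every byte exactly once. This is only a sketch: split_ranges is a made-up name, and clamping n to size (so that no chunk is empty) is my own assumption about how tiny files should be handled.

```python
from typing import List, Tuple


def split_ranges(size: int, n: int) -> List[Tuple[int, int]]:
    """Split bytes 0..size-1 into n inclusive (start, end) ranges.

    All chunks have size // n bytes except the last, which also
    takes the remainder.
    """
    if size <= 0 or n <= 0:
        raise ValueError("size and n must be positive")
    n = min(n, size)  # assumption: never create empty chunks
    persize = size // n
    ranges = [(i * persize, (i + 1) * persize - 1) for i in range(n)]
    # The last chunk is extended to the final byte of the file.
    ranges[-1] = (ranges[-1][0], size - 1)
    return ranges


if __name__ == "__main__":
    # Print start, end and length of each chunk for a small example.
    for start, end in split_ranges(1003, 4):
        print(start, end, end - start + 1)
```

Each (start, end) pair can be dropped straight into a "bytes=start-end" Range value, since both use inclusive offsets.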
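3. Download a specified range of the file
The byte range to download is set through the "Range" request header passed to requests.get(). For example, to download bytes 10 through 20 of the file (both ends are inclusive, so that is 11 bytes):
    headers["Range"] = "bytes=10-20"
4. Define a coroutine function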
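An ordinary blocking function can't be awaited directly, so it has to be wrapped with loop.run_in_executor. Passing arguments to loop.run_in_executor is a bit of a trap, though: it doesn't accept **kwargs, so the call requests.get(url, headers=headers) needs to be wrapped once more:
    get = lambda: requests.get(url, headers=headers)
    resp = await loop.run_in_executor(None, get)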
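functools.partial is another common way to bind keyword arguments before handing a callable to run_in_executor. The sketch below shows the idea under a few assumptions: fake_get is a hypothetical stand-in that just sleeps instead of calling requests.get, and the URL is a placeholder, so the example runs without a network. For real use the bound call would be functools.partial(requests.get, url, headers=headers).

```python
import asyncio
import functools
import time


def fake_get(url, headers=None, delay=0.1):
    """Hypothetical stand-in for a blocking call such as requests.get."""
    time.sleep(delay)  # simulate waiting on the network
    return (url, dict(headers or {}))


async def fetch(url, headers):
    loop = asyncio.get_running_loop()
    # partial binds the keyword argument that run_in_executor cannot forward.
    call = functools.partial(fake_get, url, headers=headers)
    # None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, call)


async def fetch_all():
    url = "http://example.invalid/file"  # placeholder URL, never contacted
    jobs = [
        fetch(url, {"Range": "bytes=%d-%d" % (i * 10, i * 10 + 9)})
        for i in range(3)
    ]
    # gather returns the results in the same order as the jobs.
    return await asyncio.gather(*jobs)


results = asyncio.run(fetch_all())
```

One practical difference from a lambda: partial captures the argument values at the moment it is created, which avoids surprises when the callables are built inside a loop.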
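Also, when a downloaded chunk is written to the file, the write has to begin at that chunk's starting offset, which is what f.seek(offset, whence) is for.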
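To see why seek makes the write order irrelevant, here is a small offline sketch. It uses a temporary file and made-up sample bytes rather than a real download, and writes the chunks in reverse order on purpose.

```python
import os
import tempfile

data = bytes(range(256)) * 4                  # 1024 bytes of sample content
chunks = [(0, 299), (300, 599), (600, 1023)]  # inclusive (start, end) pairs

fd, path = tempfile.mkstemp()  # creates an empty temporary file
os.close(fd)
try:
    with open(path, "rb+") as f:
        # Write the chunks last-to-first; seek puts each one where it belongs.
        for start, end in reversed(chunks):
            f.seek(start, 0)              # whence=0 means "from the start of the file"
            f.write(data[start:end + 1])  # slice end is exclusive, hence + 1
    with open(path, "rb") as f:
        rebuilt = f.read()
finally:
    os.remove(path)

print(rebuilt == data)
```

In the coroutine version the seek and the write happen back to back with no await in between, so another coroutine cannot move the file position between the two calls.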
The complete code is as follows:
    import asyncio
    import time

    import requests

    url = "http://xia2.kekenet.com/Sound/2018/06/bbcdqmd175_3937944FiP.mp3"
    path = "D:\\kekenet.mp3"


    async def download(spos, fpos, f, i):
        """Download bytes spos..fpos (inclusive) and write them at offset spos in f."""
        loop = asyncio.get_running_loop()
        headers = {"Range": "bytes=%d-%d" % (spos, fpos)}
        try:
            # run_in_executor takes no keyword arguments, so wrap the call
            get = lambda: requests.get(url, headers=headers)
            print('part of %d is ready!' % i)
            resp = await loop.run_in_executor(None, get)
            # no await between seek and write, so the pair is not interrupted
            f.seek(spos, 0)
            f.write(resp.content)
            print('part of %d is completed!' % i)
        except Exception as e:
            print("download file error:", e)


    async def main():
        n = 10
        resp = requests.head(url)
        size = int(resp.headers["Content-Length"])

        # n chunks of persize bytes; the last one also takes the remainder
        persize = size // n
        spos = [i * persize for i in range(n)]
        fpos = [(i + 1) * persize - 1 for i in range(n)]
        fpos[-1] = size - 1
        print(spos)
        print(fpos)

        # create (or truncate) the file, then reopen it for random-access writes
        open(path, 'wb').close()
        start_time = time.time()
        with open(path, 'rb+') as f:
            await asyncio.gather(*[download(spos[i], fpos[i], f, i + 1) for i in range(n)])
        finish_time = time.time()
        # KB/s = size in KB divided by elapsed seconds
        print('average speed is %0.2f KB/s' % (size / 1000.0 / (finish_time - start_time)))


    if __name__ == '__main__':
        asyncio.run(main())
The printed output is as follows:
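    [0, 207553, 415106, 622659, 830212, 1037765, 1245318, 1452871, 1660424, 1867977, 2075530]
    [207552, 415105, 622658, 830211, 1037764, 1245317, 1452870, 1660423, 1867976, 2075529, 2283082]
    part of 4 is ready!
    part of 10 is ready!
    part of 5 is ready!
    part of 2 is ready!
    part of 6 is ready!
    part of 1 is ready!
    part of 7 is ready!
    part of 3 is ready!
    part of 8 is ready!
    part of 9 is ready!
    part of 4 is completed!
    part of 5 is completed!
    part of 8 is completed!
    part of 3 is completed!
    part of 2 is completed!
    part of 9 is completed!
    part of 7 is completed!
    part of 1 is completed!
    part of 10 is completed!
    part of 6 is completed!
    average speed is 6390.67 KB/s
As the output shows, in this run the coroutines did not start in order, and they did not necessarily finish in the order they started either. The requests overlap while each one waits on the network, which is presumably where the efficiency comes from.
The two lists in this log hold 11 boundaries each, although only 10 parts were downloaded: the run that produced it stepped through range(0, size, persize), which yields one extra start position whenever size is not an exact multiple of persize.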