Python Full Stack (4): Advanced Programming Techniques, Part 10. Python Multitasking: Coroutines

I. Generators: the send() Method

1. Synchronous vs. Asynchronous

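  • Synchronous:
    A call is synchronous when the code that invokes an IO operation must wait for the operation to complete before the call returns.
  • Asynchronous:
    A call is asynchronous when the code that invokes an IO operation gets control back without waiting for the operation to complete.
    [Figure: comparison of synchronous and asynchronous calls]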
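
To make the distinction concrete, here is a minimal sketch using the standard library's concurrent.futures. The function slow_io and its 0.2-second delay are hypothetical stand-ins for a real IO operation, and a thread pool is only one of several ways to obtain an asynchronous call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_io():
    # Hypothetical stand-in for a slow IO operation
    time.sleep(0.2)
    return 'done'


# Synchronous call: control comes back only after the IO has finished
start = time.perf_counter()
sync_result = slow_io()
sync_elapsed = time.perf_counter() - start

# Asynchronous call: submit() returns a Future right away,
# while the IO keeps running in a worker thread
with ThreadPoolExecutor(max_workers=1) as executor:
    start = time.perf_counter()
    future = executor.submit(slow_io)
    async_elapsed = time.perf_counter() - start
    async_result = future.result()  # wait only when the value is needed

print('sync call returned after %.2fs' % sync_elapsed)
print('async call returned after %.2fs' % async_elapsed)
```

Both calls produce the same result; the difference is only in when the caller gets control back.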

2. Blocking vs. Non-blocking

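  • Blocking:
    From the caller's point of view, a call is blocking if the caller gets stuck at the call, cannot continue running, and has to wait.
    Examples of blocking:
    • Multiple users operating on a database at the same time under a locking mechanism
    • The accept() method of a socket
    • input()
  • Non-blocking:
    From the caller's point of view, a call is non-blocking if the caller does not get stuck at the call and can continue running without waiting.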
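
As an illustration, the standard library's queue.Queue offers both styles of call on the same object. This is only a sketch: the empty queue and the 0.2-second timeout are chosen for the demonstration.

```python
import queue
import time

q = queue.Queue()  # an empty queue: there is nothing to get yet

# Blocking call: the caller is stuck until an item arrives or the timeout expires
start = time.perf_counter()
try:
    q.get(timeout=0.2)
except queue.Empty:
    blocking_result = 'timed out'
blocking_elapsed = time.perf_counter() - start

# Non-blocking call: comes back immediately (here by raising queue.Empty)
start = time.perf_counter()
try:
    q.get_nowait()
except queue.Empty:
    nonblocking_result = 'empty'
nonblocking_elapsed = time.perf_counter() - start

print('blocking get waited %.2fs' % blocking_elapsed)
print('non-blocking get waited %.2fs' % nonblocking_elapsed)
```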

3. The Generator send() Method

Generators were covered earlier:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        yield a
        a, b = b, a + b
        current_num += 1


g = create_fib(5)
print(next(g))
print()
for i in g:
    print(i)

Output:

0

1
1
2
3

If the generator has a return value, it can be retrieved like this:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        yield a
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


while True:
    try:
        ret = next(g)
        print(ret)
    except StopIteration as e:
        # The generator's return value is carried by StopIteration
        print(e.value)
        break

Output:

0
1
1
2
3
hello

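Clearly, hello is retrieved and printed in the except clause: the generator's return value is attached to the StopIteration exception raised when the generator finishes.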
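
The same idea can be wrapped in a small helper. This sketch assumes a hypothetical function drain() that catches StopIteration specifically and reads the return value from the exception's value attribute.

```python
def create_fib(num):
    a, b = 0, 1
    for _ in range(num):
        yield a
        a, b = b, a + b
    return 'hello'


def drain(gen):
    """Collect every yielded value plus the generator's return value."""
    items = []
    while True:
        try:
            items.append(next(gen))
        except StopIteration as stop:
            # The return value travels on the exception's value attribute
            return items, stop.value


items, returned = drain(create_fib(5))
print(items)     # [0, 1, 1, 2, 3]
print(returned)  # hello
```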
send() takes one argument, which becomes the value of the yield expression at which the generator was last suspended.
Starting a generator with send():

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        yield a
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(g.send(None))
print(g.send('hello'))

Output:

0
1

When the very first call on a generator is send(), the argument must be None; anything else raises an error:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        yield a
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(g.send('hello'))
print(g.send('world'))

Output:

Traceback (most recent call last):
  File "xxx/demo.py", line 14, in <module>
    print(g.send('hello'))
TypeError: can't send non-None value to a just-started generator

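Clearly, if the first call on a generator is send() rather than next(), its argument must be None: a just-started generator has not yet reached a yield expression that could receive the value.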
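
A common way to avoid forgetting this first step is to do it automatically. The decorator below is a sketch; the names primed and echo are hypothetical and not part of the original examples.

```python
import functools


def primed(func):
    """Hypothetical decorator: advance a new generator to its first yield."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # same effect as gen.send(None)
        return gen
    return wrapper


@primed
def echo():
    value = None
    while True:
        # Hand back the last received value and wait for the next one
        value = yield value


g = echo()
first = g.send('hello')   # allowed: the generator is already paused at a yield
second = g.send('world')
print(first)   # hello
print(second)  # world
```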
send() can be combined with next():

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        yield a
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(next(g))
print(g.send('hello'))

Output:

0
1

When the value of yield a is assigned to a variable:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        result = yield a
        print(result)
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(next(g))
print(g.send('hello'))

Output:

0
hello
1

hello is printed. To confirm where it comes from, label the print:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        result = yield a
        print('result-->', result)
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(next(g))
print(g.send('hello'))

Output:

0
result--> hello
1

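Clearly, hello is printed by the print of result.
Explanation:
The call next(g) runs the generator up to result = yield a, where it yields a (printed as 0) and pauses before the assignment takes place. The call g.send('hello') resumes the generator with 'hello' as the value of the whole yield a expression, so 'hello' is assigned to result and printed. Execution then continues into the second loop iteration, which yields 1.
Test again: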

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        result = yield a
        print('result-->', result)
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(next(g))
print(g.send('hello'))
print(g.send('world'))

Output:

0
result--> hello
1
result--> world
1
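
The two-way traffic through yield described above (a value goes out, the argument of send() comes in) can be put to use in a running average. This is a minimal sketch; the name averager is an assumption made for illustration.

```python
def averager():
    total = 0.0
    count = 0
    average = None
    while True:
        # Pause here; the argument of the next send() becomes `value`
        value = yield average
        total += value
        count += 1
        average = total / count


avg = averager()
next(avg)  # run to the first yield
results = [avg.send(x) for x in (10, 20, 30)]
print(results)  # [10.0, 15.0, 20.0]
```

Each send() both delivers a new number and collects the updated average, so the generator keeps its state between calls without any class or global variable.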

Using the generator's close() method:

def create_fib(num):
    a, b = 0, 1
    current_num = 0
    while current_num < num:
        result = yield a
        print('result-->', result)
        a, b = b, a + b
        current_num += 1
    return 'hello'


g = create_fib(5)


print(next(g))
# Close the generator
g.close()
print(g.send('hello'))
print(g.send('world'))

Output:

0
Traceback (most recent call last):
  File "xxx/demo.py", line 18, in <module>
    print(g.send('hello'))
StopIteration

After close() is called, the generator is closed and iteration stops: calling next() or send() on it again raises StopIteration.
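
Inside the generator, close() shows up as a GeneratorExit exception raised at the paused yield, so a try/finally block can be used for cleanup. The sketch below uses a hypothetical worker generator and an events list to record what happens.

```python
events = []


def worker():
    try:
        while True:
            item = yield
            events.append('got %s' % item)
    finally:
        # Runs when close() raises GeneratorExit at the paused yield
        events.append('cleanup')


w = worker()
next(w)
w.send('a')
w.close()

try:
    w.send('b')  # the generator is closed, so this raises StopIteration
except StopIteration:
    events.append('closed')

print(events)  # ['got a', 'cleanup', 'closed']
```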

II. Multitasking with yield, and yield from

1. Multitasking with yield

A test of multitasking with yield:

import time


def task1():
    while True:
        print('---1---')
        time.sleep(0.1)
        yield


def task2():
    while True:
        print('---2---')
        time.sleep(0.1)
        yield


def main():
    t1 = task1()
    t2 = task2()
    while True:
        next(t1)
        next(t2)


if __name__ == '__main__':
    main()

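Output:
[Figure: result1 (program output)]
The two tasks take turns running, which gives the effect of multitasking while consuming fewer resources than threads or processes.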
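
The main() loop above assumes exactly two tasks that never end. A slightly more general sketch is a round-robin scheduler that handles any number of finite tasks; the names task and run_all, and the log list, are hypothetical.

```python
from collections import deque


def task(name, steps, log):
    for i in range(steps):
        log.append('%s-%d' % (name, i))
        yield  # hand control back to the scheduler


def run_all(tasks):
    """Hypothetical round-robin scheduler: resume each task in turn."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)
        except StopIteration:
            continue           # this task is finished; drop it
        ready.append(current)  # otherwise queue it for another turn


log = []
run_all([task('A', 3, log), task('B', 1, log), task('C', 2, log)])
print(log)  # ['A-0', 'B-0', 'C-0', 'A-1', 'C-1', 'A-2']
```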

2. Using yield from

itertools.chain iterates over several iterables one after another:

from itertools import chain


lis = [1, 2, 3]
dic = {
    'name':'Corley',
    'age':18
}

for value in chain(lis, dic, range(5,10)):
    print(value)

Output:

1
2
3
name
age
5
6
7
8
9

An itertools.chain object can also be converted to a list:

from itertools import chain


lis = [1, 2, 3]
dic = {
    'name':'Corley',
    'age':18
}

print(list(chain(lis, dic, range(5,10))))
for value in chain(lis, dic, range(5,10)):
    print(value)

Output:

[1, 2, 3, 'name', 'age', 5, 6, 7, 8, 9]
1
2
3
name
age
5
6
7
8
9

The same behavior can be implemented with yield:

lis = [1, 2, 3]
dic = {
    'name':'Corley',
    'age':18
}


def my_chain(*args, **kwargs):
    for my_iterable in args:
        for value in my_iterable:
            yield value


for value in my_chain(lis, dic, range(5,10)):
    print(value)

Output:

1
2
3
name
age
5
6
7
8
9

Python 3.3 added the yield from syntax.
yield from achieves the same effect:

lis = [1, 2, 3]
dic = {
    'name':'Corley',
    'age':18
}


def my_chain(*args, **kwargs):
    for my_iterable in args:
        yield from my_iterable


for value in my_chain(lis, dic, range(5,10)):
    print(value)

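The result is the same as before: when applied to a plain iterable, yield from behaves like a for loop that yields each item.
Comparing yield and yield from: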

def generator1(lis):
    yield lis


def generator2(lis):
    yield from lis


lis = [1, 2, 3, 4, 5]

for i in generator1(lis):
    print(i)

for i in generator2(lis):
    print(i)

Output:

[1, 2, 3, 4, 5]
1
2
3
4
5
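
Because yield from delegates to any iterable, including another generator, it also works recursively. The following sketch flattens a nested list; the function name flatten is an assumption made for illustration.

```python
def flatten(items):
    """Yield the leaves of an arbitrarily nested list, depth first."""
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)  # delegate to the nested generator
        else:
            yield item


nested = [1, [2, [3, 4]], [5], 6]
flat = list(flatten(nested))
print(flat)  # [1, 2, 3, 4, 5, 6]
```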

Summing the values sent into a generator:

# Subgenerator
def generator_1():
    total = 0
    while True:
        x = yield
        print('add --', x)
        if not x:
            break
        total += x
    return total


# Delegating generator
def generator_2():
    while True:
        total = yield from generator_1()  # subgenerator
        print('sum is --', total)


# Caller
def main():
    g1 = generator_1()
    g1.send(None)
    g1.send(2)
    g1.send(3)
    g1.send(None)


if __name__ == '__main__':
    main()

Output:

add -- 2
add -- 3
add -- None
Traceback (most recent call last):
  File "xxx/demo.py", line 121, in <module>
    main()
  File "xxx/demo.py", line 112, in main
    g1.send(None)
StopIteration: 5

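That is, driving generator_1 directly does not work: when it returns, the total arrives only as the value of a StopIteration exception that the caller would have to catch.
Try again through generator_2: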

# Subgenerator
def generator_1():
    total = 0
    while True:
        x = yield
        print('add --', x)
        if not x:
            break
        total += x
    return total


# Delegating generator
def generator_2():
    while True:
        # yield from opens a channel between the caller and the subgenerator
        total = yield from generator_1()  # subgenerator
        print('sum is --', total)


# Caller
def main():
    g2 = generator_2()
    g2.send(None)
    g2.send(2)
    g2.send(3)
    g2.send(None)


if __name__ == '__main__':
    main()

Output:

add -- 2
add -- 3
add -- None
sum is -- 5

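This works as intended.
Notes and explanation:
Subgenerator: generator_1(), the generator that follows yield from, is the subgenerator;
Delegating generator: generator_2() is the delegating generator, which hands the actual work to the subgenerator;
Caller: main() is the caller, which drives the delegating generator.
yield from opens a channel between the caller and the subgenerator: values passed to send() go through the delegating generator straight to the subgenerator;
yield from also removes a lot of exception handling, for example catching the subgenerator's StopIteration to obtain its return value.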
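
The three roles can be seen together in a slightly larger sketch that sums several groups of numbers. The names subtotal, collector, and the sample data are hypothetical; unlike generator_1 above, this version ends a group only on None, so that 0 could be summed as well.

```python
def subtotal():
    # Subgenerator: adds up the values it is sent
    total = 0
    while True:
        x = yield
        if x is None:  # None marks the end of one group
            return total
        total += x


def collector(results, key):
    # Delegating generator: store each finished subtotal under `key`
    while True:
        results[key] = yield from subtotal()


def main(data):
    # Caller: drives one delegating generator per group
    results = {}
    for key, values in data.items():
        group = collector(results, key)
        next(group)            # prime: run to the yield inside subtotal()
        for value in values:
            group.send(value)  # passes straight through to subtotal()
        group.send(None)       # ends subtotal(); collector stores the result
    return results


totals = main({'a': [2, 3], 'b': [10, 20, 30]})
print(totals)  # {'a': 5, 'b': 60}
```

Note that the caller never has to catch StopIteration: the return value of subtotal() becomes the value of the yield from expression inside collector().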

III. Coroutines: Multitasking with greenlet & gevent

1. The Concept of Coroutines

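A coroutine, also called a micro-thread, is another way to implement multitasking in Python; it is an execution unit that needs fewer resources than a thread.
Coroutines in Python have gone through roughly three stages:

  • (1) The original generator-based form using yield/send
  • (2) yield from
  • (3) The async/await keywords introduced in Python 3.5

A coroutine carries its own CPU context: the running state is saved at a yield so that the context can be restored when the coroutine resumes.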
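
For comparison with the generator-based stages, here is a minimal sketch of the third stage using the standard library's asyncio. It assumes Python 3.7 or later (for asyncio.run); the names task and main are hypothetical.

```python
import asyncio


async def task(name, steps, log):
    for i in range(steps):
        log.append('%s-%d' % (name, i))
        await asyncio.sleep(0)  # give the other coroutines a turn


async def main():
    log = []
    # Run both coroutines concurrently on one event loop
    await asyncio.gather(task('A', 2, log), task('B', 2, log))
    return log


log = asyncio.run(main())
print(log)  # ['A-0', 'B-0', 'A-1', 'B-1']
```

Here await plays the role that yield played earlier: it marks the point where a coroutine pauses and lets others run.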

2. Multitasking with greenlet

Install the module:

pip install greenlet

Using greenlet:

from greenlet import greenlet
import time


def demo1():
    while True:
        print('---demo1---')
        gr2.switch()
        time.sleep(0.5)


def demo2():
    while True:
        print('---demo2---')
        gr1.switch()
        time.sleep(0.5)


gr1 = greenlet(demo1)
gr2 = greenlet(demo2)
gr1.switch()

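Output:
[Figure: result 2 (program output)]
As shown, coroutines are meant to switch tasks while the program waits on IO, but with the greenlet module every switch has to be written by hand.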

3. Multitasking with gevent

Install the module:

pip install gevent

A first attempt with gevent:

import gevent
import time


def f1(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)


def f2(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)


def f3(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)


g1 = gevent.spawn(f1, 5)
g2 = gevent.spawn(f2, 5)
g3 = gevent.spawn(f3, 5)

g1.join()
g2.join()
g3.join()

Output:

<Greenlet at 0x25f41185378: f1(5)> 0
<Greenlet at 0x25f41185378: f1(5)> 1
<Greenlet at 0x25f41185378: f1(5)> 2
<Greenlet at 0x25f41185378: f1(5)> 3
<Greenlet at 0x25f41185378: f1(5)> 4
<Greenlet at 0x25f41185598: f2(5)> 0
<Greenlet at 0x25f41185598: f2(5)> 1
<Greenlet at 0x25f41185598: f2(5)> 2
<Greenlet at 0x25f41185598: f2(5)> 3
<Greenlet at 0x25f41185598: f2(5)> 4
<Greenlet at 0x25f411856a8: f3(5)> 0
<Greenlet at 0x25f411856a8: f3(5)> 1
<Greenlet at 0x25f411856a8: f3(5)> 2
<Greenlet at 0x25f411856a8: f3(5)> 3
<Greenlet at 0x25f411856a8: f3(5)> 4

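Clearly this does not achieve the expected multitasking: time.sleep() blocks the whole thread, so each greenlet runs to completion before the next one starts.
An improvement, using gevent.sleep():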

import gevent
import time


def f1(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f2(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f3(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


g1 = gevent.spawn(f1, 5)
g2 = gevent.spawn(f2, 5)
g3 = gevent.spawn(f3, 5)

g1.join()
g2.join()
g3.join()

Output:

<Greenlet at 0x2518fda5378: f1(5)> 0
<Greenlet at 0x2518fda5598: f2(5)> 0
<Greenlet at 0x2518fda56a8: f3(5)> 0
<Greenlet at 0x2518fda5378: f1(5)> 1
<Greenlet at 0x2518fda5598: f2(5)> 1
<Greenlet at 0x2518fda56a8: f3(5)> 1
<Greenlet at 0x2518fda5378: f1(5)> 2
<Greenlet at 0x2518fda5598: f2(5)> 2
<Greenlet at 0x2518fda56a8: f3(5)> 2
<Greenlet at 0x2518fda5378: f1(5)> 3
<Greenlet at 0x2518fda5598: f2(5)> 3
<Greenlet at 0x2518fda56a8: f3(5)> 3
<Greenlet at 0x2518fda5378: f1(5)> 4
<Greenlet at 0x2518fda5598: f2(5)> 4
<Greenlet at 0x2518fda56a8: f3(5)> 4

Multitasking now works.
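
The contrast between a blocking sleep and a cooperative sleep is not specific to gevent. The sketch below reproduces it with the standard library's asyncio as a stand-in (an assumption made for illustration; gevent's scheduler works differently internally). The function names and the 0.01-second delay are hypothetical.

```python
import asyncio
import time


async def blocking_task(name, n, log):
    for i in range(n):
        log.append((name, i))
        time.sleep(0.01)           # blocks the whole thread; nobody else can run


async def cooperative_task(name, n, log):
    for i in range(n):
        log.append((name, i))
        await asyncio.sleep(0.01)  # suspends only this coroutine


async def run(task_func):
    log = []
    names = ('f1', 'f2', 'f3')
    await asyncio.gather(*(task_func(name, 2, log) for name in names))
    return log


blocking_log = asyncio.run(run(blocking_task))
cooperative_log = asyncio.run(run(cooperative_task))

print(blocking_log)     # each task finishes before the next one starts
print(cooperative_log)  # all three tasks take their first step before any second step
```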
Another test: what if time.sleep(2) is called in the main program?

import gevent
import time


def f1(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f2(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f3(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)

print('--1--')
g1 = gevent.spawn(f1, 5)
print('--2--')
time.sleep(2)
g2 = gevent.spawn(f2, 5)
print('--3--')
g3 = gevent.spawn(f3, 5)
print('--4--')

g1.join()
g2.join()
g3.join()

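Output:
[Figure: result 3 (program output)]
Clearly, time.sleep() does not give gevent a chance to run: the greenlets only start executing after the sleep() has finished.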
Changing it to gevent.sleep() gives a different result:

import gevent
import time


def f1(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f2(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)


def f3(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        gevent.sleep(0.5)

print('--1--')
g1 = gevent.spawn(f1, 5)
print('--2--')
gevent.sleep(2)
g2 = gevent.spawn(f2, 5)
print('--3--')
g3 = gevent.spawn(f3, 5)
print('--4--')

g1.join()
g2.join()
g3.join()

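Output:
[Figure: result4 (program output)]
That is, gevent.sleep() does affect the execution of the coroutines: the already spawned greenlet runs while the main program sleeps. In real code, the place of gevent.sleep() is taken by time-consuming IO operations.
If the code contains many blocking calls such as time.sleep(), they do not all have to be changed to gevent.sleep() by hand; gevent's monkey module can patch them: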

import gevent
import time
from gevent import monkey


# Replace the blocking calls used in the program with gevent's cooperative versions
monkey.patch_all()  # i.e. apply the patch


def f1(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)


def f2(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)


def f3(n):
    for i in range(n):
        print(gevent.getcurrent(), i)
        time.sleep(0.5)

print('--1--')
g1 = gevent.spawn(f1, 5)
print('--2--')
time.sleep(2)
g2 = gevent.spawn(f2, 5)
print('--3--')
g3 = gevent.spawn(f3, 5)
print('--4--')

g1.join()
g2.join()
g3.join()

Output:
[Figure: result5 (program output)]
The effect is the same as with gevent.sleep().
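
The general idea behind monkey patching is to replace an attribute of a module at runtime so that existing code picks up the new behavior without being edited. The sketch below shows that idea in plain Python; it is not how gevent implements patch_all(), and fake_sleep and legacy_code are hypothetical names.

```python
import time

calls = []
original_sleep = time.sleep


def fake_sleep(seconds):
    # Hypothetical replacement: record the request instead of sleeping
    calls.append(seconds)


def legacy_code():
    # Existing code that calls time.sleep() and is left unmodified
    time.sleep(5)
    return 'finished'


time.sleep = fake_sleep          # apply the patch
try:
    result = legacy_code()       # returns at once; no real 5-second wait
finally:
    time.sleep = original_sleep  # always undo the patch

print(result)  # finished
print(calls)   # [5]
```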

4. A Simple gevent Application

import gevent
from gevent import monkey
monkey.patch_all()
import requests


def download(url):
    print('to get:%s' % url)
    res = requests.get(url)
    data = res.text
    print('Got:', len(data), url)


g1 = gevent.spawn(download, 'http://www.baidu.com')
g2 = gevent.spawn(download, 'https://www.csdn.net/')
g3 = gevent.spawn(download, 'https://stackoverflow.com')

g1.join()
g2.join()
g3.join()

Output:
[Figure: result6 (program output)]
import requests must come after monkey.patch_all(); otherwise a warning is issued.
The code can be simplified further:

import gevent
from gevent import monkey
monkey.patch_all()
import requests


def download(url):
    print('to get:%s' % url)
    res = requests.get(url)
    data = res.text
    print('Got:', len(data), url)


gevent.joinall([
    gevent.spawn(download, 'http://www.baidu.com'),
    gevent.spawn(download, 'https://www.csdn.net/'),
    gevent.spawn(download, 'https://stackoverflow.com')
])

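The result is the same as before.
It follows that coroutines are concurrent rather than parallel, because a single thread carries out all of the tasks.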

5. Processes, Threads, and Coroutines Compared

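  • A process is the unit of resource allocation;
  • A thread is the unit of operating-system scheduling;
  • Switching between processes requires a lot of resources and is inefficient;
  • Switching between threads requires a moderate amount of resources and has moderate efficiency (leaving the GIL aside);
  • Switching between coroutines requires very few resources and is efficient;
  • Depending on the number of CPU cores, multiple processes or threads may run in parallel, whereas coroutines run within a single thread and are therefore concurrent.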
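
The last point can be checked by recording which thread each unit of work runs on. This sketch uses the standard library's asyncio for the coroutines and a thread pool for the threads; all function names are hypothetical, and the barrier is there only to force the three thread workers to be alive at the same time.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


async def coroutine_worker(seen):
    seen.add(threading.get_ident())
    await asyncio.sleep(0)


async def run_coroutines():
    seen = set()
    await asyncio.gather(*(coroutine_worker(seen) for _ in range(3)))
    return seen


barrier = threading.Barrier(3)


def thread_worker():
    barrier.wait(timeout=5)  # all three workers must be running together
    return threading.get_ident()


coroutine_threads = asyncio.run(run_coroutines())
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(thread_worker) for _ in range(3)]
    worker_threads = {f.result() for f in futures}

print(len(coroutine_threads))  # 1
print(len(worker_threads))     # 3
```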

Reposted from blog.csdn.net/CUFEECR/article/details/104211154