First, import the packages:
>>> import heapq
>>> import numpy
Then generate some random numbers:
>>> a = numpy.random.randn(15)
>>> a
array([-0.49531266, -0.79290141, 0.71889591, -1.0173929 , -2.68083486,
        0.47483837, -0.93054724,  0.25544862,  1.07715225, -0.61792921,
        0.18161928,  0.18686656, -0.63361904,  0.54088353, -0.12846399])
Next, pull out the largest and smallest values from these random numbers (the top N of each):
>>> heapq.nlargest(3, a)
[1.0771522469304198, 0.7188959125120067, 0.5408835322858578]
>>> heapq.nlargest(5, a)
[1.0771522469304198, 0.7188959125120067, 0.5408835322858578, 0.47483836525940487, 0.25544862334733875]
>>> heapq.nsmallest(5, a)
[-2.6808348625209417, -1.0173928966949122, -0.9305472394653257, -0.7929014073665832, -0.6336190387049688]
>>> heapq.nsmallest(3, a)
[-2.6808348625209417, -1.0173928966949122, -0.9305472394653257]
>>>
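The arguments are simple: the first is how many items to return, and the second is any iterable.
You can also pass a custom key function, which is called on each element to produce the value that gets compared: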
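As a minimal sketch of the point above, the second argument does not have to be a list or an array; any iterable works, including a generator. The sample values below are made up for illustration and are not from the session above.

```python
import heapq

# Hypothetical sample data, chosen only for illustration.
values = [4, 1, 7, 3, 8, 5]

# A generator expression is an iterable too, so it can be passed directly.
top2_squares = heapq.nlargest(2, (v * v for v in values))
print(top2_squares)   # [64, 49]

# Asking for more items than exist returns everything, in sorted order.
all_ascending = heapq.nsmallest(10, values)
print(all_ascending)  # [1, 3, 4, 5, 7, 8]
```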
>>> heapq.nlargest(5, a, key=lambda x: abs(x))
[-2.6808348625209417, 1.0771522469304198, -1.0173928966949122, -0.9305472394653257, -0.7929014073665832]
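Because of this, we can also handle records such as objects or dicts: use key to pick out the field that should be compared.
You can also use heapify to turn a sequence into a heap in place.
However, its argument must be a list.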
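Here is a small sketch of using key with dicts. The records and the field names ("name", "score") are hypothetical, invented only to show the idea.

```python
import heapq

# Hypothetical records; the field names are assumptions for illustration.
records = [
    {"name": "alpha", "score": 72},
    {"name": "beta", "score": 91},
    {"name": "gamma", "score": 58},
    {"name": "delta", "score": 85},
]

# key extracts the value to compare; whole dicts are returned.
top2 = heapq.nlargest(2, records, key=lambda r: r["score"])
lowest = heapq.nsmallest(1, records, key=lambda r: r["score"])

print([r["name"] for r in top2])    # ['beta', 'delta']
print([r["name"] for r in lowest])  # ['gamma']
```

The key function only decides the ordering, so the results still carry every field of the original records.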
>>> heapq.heapify(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: heap argument must be a list
So we convert the ndarray to a list first:
>>> b = list(a)
>>> b
[-0.4953126601742176, -0.7929014073665832, 0.7188959125120067, -1.0173928966949122, -2.6808348625209417, 0.47483836525940487, -0.9305472394653257, 0.25544862334733875, 1.0771522469304198, -0.617929213222602, 0.18161927846394466, 0.18686656428500778, -0.6336190387049688, 0.5408835322858578, -0.12846398782575524]
>>> heapq.heapify(b)
>>> b
[-2.6808348625209417, -1.0173928966949122, -0.9305472394653257, -0.4953126601742176, -0.7929014073665832, -0.6336190387049688, -0.12846398782575524, 0.25544862334733875, 1.0771522469304198, -0.617929213222602, 0.18161927846394466, 0.18686656428500778, 0.47483836525940487, 0.5408835322858578, 0.7188959125120067]
>>>
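Once a list has been heapified, the smallest element always sits at index 0, and heappop and heappush keep that property as items are removed and added. The following is a minimal sketch with made-up integers rather than the random floats above.

```python
import heapq

# Hypothetical data, kept small so the results are easy to follow.
heap = [5, 1, 8, 3, 2]

# heapify rearranges the list in place and returns None.
heapq.heapify(heap)
first = heap[0]                  # smallest item, without removing it

# heappop removes and returns the smallest item.
smallest = heapq.heappop(heap)   # 1

# heappush inserts a new item while keeping the heap property.
heapq.heappush(heap, 0)

# Popping repeatedly yields the remaining items in ascending order.
drained = [heapq.heappop(heap) for _ in range(len(heap))]
print(drained)                   # [0, 2, 3, 5, 8]
```

Note that only heap[0] is guaranteed to be the minimum; the rest of the list is not fully sorted, as the heapified output above also shows.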
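Speed comparison (taken from the book)
Condition | Recommended method |
---|---|
The number of items wanted is small compared with the length of the sequence | nlargest(), nsmallest() |
Only the single smallest or largest item is needed | min(), max() |
N is close to the length of the sequence | sorted(), then take a slice |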
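
The three approaches in the table return the same values and differ only in cost. The sketch below checks that equivalence on a small made-up list; it does not measure timings, so treat it as an illustration of the choices rather than a benchmark.

```python
import heapq

# Hypothetical data for illustration.
data = [9, 4, 7, 1, 8, 2, 6]

# N == 1: min()/max() give the same answer as a one-item heap query.
single_matches = min(data) == heapq.nsmallest(1, data)[0]

# Small N: nsmallest()/nlargest() avoid sorting the whole sequence.
smallest3 = heapq.nsmallest(3, data)
print(smallest3)  # [1, 2, 4]

# N close to len(data): sort once and slice.
largest6 = sorted(data, reverse=True)[:6]
print(largest6)   # [9, 8, 7, 6, 4, 2]

# Both routes agree on the result.
same_result = largest6 == heapq.nlargest(6, data)
print(single_matches, same_result)  # True True
```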