使用line_profiler记录每行代码的运行时间
1. 需求
Python中使用requests包去访问一个网站的所花费的时间比浏览器去直接访问的时间还要多的多。Python程序使用time类去计时的,大约花费9秒多访问一个请求,而浏览器马上就有响应了,差距实在太大。
在网上找一些针对python计时的工具时,发现了这个line_profiler,可以对每行代码计时,直观的就能看到每行代码所花费的时间,对程序中耗时操作的发现和查找十分方便。
2. 安装
IDE的环境是pycharm 64位版,python 2.7。
在pycharm中,file->Settings在搜索框中输入Project InterPreter,找到自己当前使用的python版本,在右边点击加号,搜索line_profiler,进行在线安装。
出现问题:
开始几次死活安装不上,怀疑是网络问题,发现开了科学上网之后,也安装不了。最后报了一个错误提示,缺少windows下的VC编译环境,按照提示给的网址https://www.microsoft.com/en-us/download/details.aspx?id=44266,去微软官网下载了这个支持python的VC环境。
然后安装,很快就装好了。
3. 测试
defLogin(_idcard,_password):
global g_Session
param = {'idcard': _idcard,'password': _password,'g_validationCode':'', 'g_success_func': '',
'g_success_param':''}
time_start = time.time()
page = g_Session.post(login_url, data=param,headers=headers)
time_finish = time.time()
msg = ("GetAllReserve::g_Session.postelapse: %s",time_finish -time_start)
log(msg)
ifpage.status_code == 200:
json = page.json()
if json["Result"] ==True and json["Data"]==True:
log("登陆成功")
else:
msg = ("登陆失败:%s%s") %(page.status_code,page.text)
log(msg)
GetAllReserve()
#获得所有预约的信息
def GetAllReserve():
global g_Reserve
global g_Session
time_start =time.time()
rresponse = g_Session.post(r_url, data='',headers=headers)
time_finish = time.time()
msg = ("GetAllReserve::g_Session.postelapse: %s",time_finish -time_start)
log(msg)
if rresponse.status_code ==200:
json = rresponse.json()
g_Reserve =json['rows']
else:
msg = ("获取预约信息失败:%s%s") %(rresponse.status_code,rresponse.text)
log(msg)
这里Login()函数中包含了GetAllReserve()函数,所以在测试的时候,需要添加lp.add_function(GetAllReserve)
开始包含包:
from line_profiler import LineProfiler
测试代码:
if __name__ == '__main__':
time_start = time.time()
lp = LineProfiler()
lp.add_function(GetAllReserve)
lp_wrapper = lp(Login)
lp_wrapper(_idcard, _password)
lp.print_stats()
time_finish = time.time()
print 'main thread over,elapse ' , time_finish-time_start
lp = LineProfiler() lp.enable_by_count() #lp.add_function(test2) lp_wrapper = lp(do_getdata) do_getdata() # lp_wrapper(canshu) lp.print_stats()
加了这句话才能按行显示消耗的时间 lp.enable_by_count()
把函数加到lp_wrapper之后,还需要把这个函数运行起来,才能被统计到
4. 结果
2018-01-23 16:03:17.479000('GetAllReserve::g_Session.post elapse: %s', 0.06099987030029297)
2018-01-23 16:03:17.480000 登陆成功
2018-01-23 16:03:17.505000('GetAllReserve::g_Session.post elapse: %s', 0.02500009536743164)
Timer unit: 4.10544e-07 s
Total time: 0.0889514 s
File:C:/Users/Administrator/PycharmProjects/***/login.py
Function: Login at line 80
Line # Hits Time Per Hit % Time Line Contents
==============================================================
80 defLogin(_idcard,_password):
81 global g_Session
82 1 7.0 7.0 0.0 param = {'idcard':_idcard, 'password': _password, 'g_validationCode': '', 'g_success_func': '',
83 1 4.0 4.0 0.0 'g_success_param': ''}
84 1 3.0 3.0 0.0 time_start = time.time()
85 1 148311.0 148311.0 68.5 page = g_Session.post(login_url, data=param, headers=headers)
86 1 7.0 7.0 0.0 time_finish = time.time()
87 1 2.0 2.0 0.0 msg =("GetAllReserve::g_Session.post elapse: %s", time_finish -time_start)
88 1 2717.0 2717.0 1.3 log(msg)
89 #data ={"Result":'true',"Data":'true',"Msg":'null'}
90 #page = json.dumps(data)
91
92 1 4.0 4.0 0.0 if page.status_code ==200:
93 1 118.0 118.0 0.1 json = page.json()
94 #json = json.loads(page)
95
96 #print json
97 #print page.text
98 #print page.content
99
100 #for d in json:
101 # print ("%s = %s") % (d,json[d])
102 1 24.0 24.0 0.0 ifjson["Result"] == True and json["Data"]== True:
103 1 2361.0 2361.0 1.1 log("登陆成功")
104 else:
105 msg = ("登陆失败:%s%s") % (page.status_code, page.text)
106 log(msg)
107 1 63109.0 63109.0 29.1 GetAllReserve()
Total time: 0.0258717 s
File:C:/Users/Administrator/PycharmProjects/yyrmyy/login.py
Function: GetAllReserve at line 110
Line # Hits Time Per Hit % Time Line Contents
==============================================================
110 defGetAllReserve():
111 global g_Reserve
112 global g_Session
114 1 3.0 3.0 0.0 time_start = time.time()
115 1 59912.0 59912.0 95.1 rresponse = g_Session.post(r_url,data='', headers=headers)
116 1 7.0 7.0 0.0 time_finish = time.time()
117 1 2.0 2.0 0.0 msg = ("GetAllReserve::g_Session.postelapse: %s", time_finish - time_start)
118 1 2969.0 2969.0 4.7 log(msg)
119 1 4.0 4.0 0.0 if rresponse.status_code== 200:
120 1 101.0 101.0 0.2 json = rresponse.json()
121 '''
122 log(json)
123 log("total: " + json['total'])
124 for i in json["rows"]:
125 for ii in i:
126 for iii in ii:
127 msg = ("%s = %s") % (iii,ii[iii])
128 log(msg)
129 '''
130
131 1 18.0 18.0 0.0 g_Reserve =json['rows']
132 '''
133 for i in json['rows']:
134 msg = ''
135 for ii in i:
136 print ("%s = %s")%(ii,i[ii][0])
137 print "\n"
138 1 1.0 1.0 0.0 '''
139 else:
140 msg = ("获取预约信息失败:%s%s")% (rresponse.status_code, rresponse.text)
141 log(msg)
main thread over,elapse 0.0909998416901
5. 参考资料
http://blog.csdn.net/guofangxiandaihua/article/details/77825524
python性能调试过程中最突出的问题就是耗时,性能测试工具有很多,像profiler,cprofiler等等,都是只能返回函数整体的耗时,而line_profiler就能够很好解决这个问题(大家可以试试就知道了)。
怎么使用这个工具呢?网上大部分都是说在所需要测的函数前面加一个@profile,如文档所说。但是加了@profile后函数无法直接运行,只能优化的时候加上,调试的时候又得去掉。文章中提到了这个问题的解决办法,个人觉得还是有点麻烦,不太能理解这是为什么。我在stackoverflow上看到了另一种关于line_profile的使用方法,简单而且实用。
这是使用案例:
from line_profiler importLineProfiler
import random
def do_stuff(numbers):
s = sum(numbers)
l =[numbers[i]/43for i in range(len(numbers))]
m =['hello'+str(numbers[i])for i in range(len(numbers))]
numbers =[random.randint(1,100)for i in range(1000)]
lp =LineProfiler()
lp_wrapper = lp(do_stuff)
lp_wrapper(numbers)
lp.print_stats()
输出结果:
Timer unit:1e-06 s
Total time:0.000649 s
File:<ipython-input-2-2e060b054fea>
Function: do_stuff at line 4
Line# Hits Time Per Hit % Time Line Contents
==============================================================
4 def do_stuff(numbers):
5 1 10 10.0 1.5 s = sum(numbers)
6 1 186 186.0 28.7 l =[numbers[i]/43for i in range(len(numbers))]
7 1 453 453.0 69.8 m =['hello'+str(numbers[i])for i in range(len(numbers))]
这种方法就比较好用了,如果你想在其他文件中调用这个包含line_profiler函数测试的文件,又不想把这个信息显示出来,使用
if __name__ == '__main__':
把需要测试的函数放到这个下面,这样就能完美解决上述问题。
对于函数内部调用函数的情况:
from line_profiler importLineProfiler
import random
def do_other_stuff(numbers):
s = sum(numbers)
def do_stuff(numbers):
do_other_stuff(numbers)
l =[numbers[i]/43for i in range(len(numbers))]
m =['hello'+str(numbers[i])for i in range(len(numbers))]
numbers =[random.randint(1,100)for i in range(1000)]
lp =LineProfiler()
lp_wrapper = lp(do_stuff)
lp_wrapper(numbers)
lp.print_stats()
这样做的话,只能显示子函数的总时间。输出如下:
Timer unit:1e-06 s
Total time:0.000773 s
File:<ipython-input-3-ec0394d0a501>
Function: do_stuff at line 7
Line# Hits Time Per Hit % Time Line Contents
==============================================================
7 def do_stuff(numbers):
8 1 11 11.0 1.4 do_other_stuff(numbers)
9 1 236 236.0 30.5 l =[numbers[i]/43for i in range(len(numbers))]
10 1 526 526.0 68.0 m =['hello'+str(numbers[i])for i in range(len(numbers))]
为了能够同时显示函数每行所用时间和调用函数每行所用时间,加入add_function就能够解决。
from line_profiler importLineProfiler
import random
def do_other_stuff(numbers):
s = sum(numbers)
def do_stuff(numbers):
do_other_stuff(numbers)
l =[numbers[i]/43for i in range(len(numbers))]
m =['hello'+str(numbers[i])for i in range(len(numbers))]
numbers =[random.randint(1,100)for i in range(1000)]
lp =LineProfiler()
lp.add_function(do_other_stuff) # add additional function to profile
lp_wrapper = lp(do_stuff)
lp_wrapper(numbers)
lp.print_stats()
Timer unit:1e-06 s
Total time:9e-06 s
File:<ipython-input-4-dae73707787c>
Function: do_other_stuff at line 4
Line# Hits Time Per Hit % Time Line Contents
==============================================================
4 def do_other_stuff(numbers):
5 1 9 9.0 100.0 s = sum(numbers)
Total time:0.000694 s
File:<ipython-input-4-dae73707787c>
Function: do_stuff at line 7
Line# Hits Time Per Hit % Time Line Contents
==============================================================
7 def do_stuff(numbers):
8 1 12 12.0 1.7 do_other_stuff(numbers)
9 1 208 208.0 30.0 l =[numbers[i]/43for i in range(len(numbers))]
10 1 474 474.0 68.3 m =['hello'+str(numbers[i])for i in range(len(numbers))]
结束