Python: Two Ways to Catch urllib.request Timeout Exceptions
1. Background
When using urllib.request.urlopen, a timeout exception often terminates the program, and having to restart it by hand each time hurts robustness. The goal here is to catch urllib's timeout exception so the program can handle it and keep running.
from urllib import request

headers = {  # User-Agent header so the request looks like a normal browser visit
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3941.4 Safari/537.36'
}

# Check whether a URL is reachable
def test_url(url):
    r = request.Request(url, headers=headers)
    r1 = request.urlopen(r, timeout=0.1)  # deliberately short timeout to provoke the exception
    print(r1.status)

if __name__ == '__main__':
    url1 = 'https://www.baidu.com/'
    url2 = 'http://httpbin.org/get'
    url3 = 'https://www.jianshu.com/p/5d6f1891354f'
    test_url(url2)
2. Methods
2.1 except Exception as e
from urllib import request

headers = {  # User-Agent header so the request looks like a normal browser visit
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3941.4 Safari/537.36'
}

# Check whether a URL is reachable
def test_url(url):
    try:
        r = request.Request(url, headers=headers)
        r1 = request.urlopen(r, timeout=0.1)
        print(r1.status)
    except Exception as e:  # catches everything except interpreter-exit exceptions such as SystemExit and KeyboardInterrupt
        print(e)

if __name__ == '__main__':
    url1 = 'https://www.baidu.com/'
    url2 = 'http://httpbin.org/get'
    url3 = 'https://www.jianshu.com/p/5d6f1891354f'
    test_url(url2)
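In practice the broad `except Exception` pairs naturally with a retry loop, so a transient timeout merely delays the run instead of aborting it. A minimal sketch (the name `fetch_with_retry` and the retry/delay defaults are illustrative, not from the original):

```python
import time

def fetch_with_retry(opener, retries=3, delay=0.5):
    """Call opener() up to `retries` times, sleeping `delay` seconds
    between attempts; re-raise the last exception if all attempts fail."""
    last_exc = None
    for _ in range(retries):
        try:
            return opener()
        except Exception as e:  # broad catch, as in method 1
            last_exc = e
            time.sleep(delay)
    raise last_exc
```

Here `opener` would be any zero-argument callable, e.g. `lambda: request.urlopen(r, timeout=0.1)`.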
2.2 except error.URLError as e
from urllib import request, error
import socket

headers = {  # User-Agent header so the request looks like a normal browser visit
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3941.4 Safari/537.36'
}

# Check whether a URL is reachable
def test_url(url):
    try:
        r = request.Request(url, headers=headers)
        r1 = request.urlopen(r, timeout=0.1)
        print(r1.status)
    except error.HTTPError as e:
        print('%d: %s' % (e.code, e.reason))
    except error.URLError as e:
        print(e.reason)
        if isinstance(e.reason, socket.timeout):
            print('request timed out')

if __name__ == '__main__':
    url1 = 'https://www.baidu.com/'
    url2 = 'http://httpbin.org/get'
    url3 = 'https://www.jianshu.com/p/5d6f1891354f'
    test_url(url2)
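Method 2 has a gap: urllib wraps connect-phase timeouts in a URLError, but a timeout that fires while the response is being read propagates as a bare socket.timeout, which none of the clauses above match. A sketch that handles both forms (`check_url`, the injectable `opener` parameter, and the shortened User-Agent are additions for testability, not from the original):

```python
import socket
from urllib import request, error

HEADERS = {'User-Agent': 'Mozilla/5.0'}  # shortened; any browser UA string works

def check_url(url, timeout=0.1, opener=request.urlopen):
    """Return a short status string for `url`, catching both timeout forms.

    `opener` is injectable so the handler logic can be exercised without
    network access; by default it is urllib's own urlopen.
    """
    try:
        req = request.Request(url, headers=HEADERS)
        resp = opener(req, timeout=timeout)
        return str(resp.status)
    except error.HTTPError as e:
        return '%d: %s' % (e.code, e.reason)
    except error.URLError as e:
        if isinstance(e.reason, socket.timeout):
            return 'timeout (wrapped in URLError)'
        return 'URLError: %s' % e.reason
    except socket.timeout:
        # a timeout during the response read is NOT wrapped in URLError,
        # so it needs its own clause (since Python 3.10 socket.timeout
        # is an alias of the built-in TimeoutError)
        return 'timeout (raised directly)'
```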
3. Notes
- With url1 = 'https://www.baidu.com/', Baidu responds so quickly that the request usually completes even with the 0.1 s timeout.
- With url2 = 'http://httpbin.org/get', both methods catch the timeout exception as expected.
- With url3 = 'https://www.jianshu.com/p/5d6f1891354f', method 1 catches the timeout, but method 2 exits with an uncaught exception. The page is large and the network may be slow, so the timeout likely fires while the response is being transferred; in that phase urllib raises a bare socket.timeout rather than wrapping it in URLError, so none of method 2's except clauses match.
4. Summary
"except error.URLError as e" catches only the timeouts that urllib wraps in a URLError; a timeout raised directly as socket.timeout slips past it. "except Exception as e" catches every form of timeout, at the cost of also swallowing unrelated errors. A good middle ground is to keep the specific HTTPError/URLError handlers and add an explicit "except socket.timeout" clause alongside them.