Learning the urllib Library (1)

The urllib library

urllib.request: request module

urllib.parse: URL parsing module

urllib.error: exception handling module

1. urllib.request: request module

This example fetches a Youdao dictionary page and prints its body, decoded as UTF-8.

urlopen() returns the body as a bytes object. In general, a program decodes those bytes to a string once it determines or guesses the appropriate encoding.

import urllib.request  # import the request module

with urllib.request.urlopen("http://www.youdao.com/w/bytes/#keyfrom=dict2.top") as fd:
    print(fd.read().decode("utf-8"))  # decode the response bytes as UTF-8
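Hard-coding "utf-8" works only when the server really sends UTF-8. As a minimal sketch, the encoding can instead be read from the response's Content-Type header. The helper name fetch_text and its fallback_encoding parameter are made up for illustration, and the example uses data: URLs (which urlopen handles without network access) rather than the Youdao page.

```python
import urllib.request
from urllib.parse import quote


def fetch_text(url, fallback_encoding="utf-8"):
    """Fetch a URL and decode the body using the charset the response declares.

    fetch_text is a hypothetical helper, not part of urllib.
    """
    with urllib.request.urlopen(url) as resp:
        raw = resp.read()  # bytes
        # get_content_charset() returns None if no charset is declared.
        charset = resp.headers.get_content_charset() or fallback_encoding
    return raw.decode(charset, errors="replace")


# A data: URL that declares UTF-8.
print(fetch_text("data:text/plain;charset=utf-8,hello%20urllib"))

# A data: URL whose body is GBK-encoded; decoding it as UTF-8 would garble it.
gbk_url = "data:text/plain;charset=gbk," + quote("有道".encode("gbk"))
print(fetch_text(gbk_url))
```

When the header declares no charset, this sketch falls back to UTF-8, which is still a guess.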

2. urllib.parse: URL parsing module

def urlparse(url, scheme='', allow_fragments=True):
    """Parse a URL into 6 components:
    <scheme>://<netloc>/<path>;<params>?<query>#<fragment>
    Return a 6-tuple: (scheme, netloc, path, params, query, fragment).
    Note that we don't break the components up in smaller bits
    (e.g. netloc is a single string) and we don't expand % escapes."""
    url, scheme, _coerce_result = _coerce_args(url, scheme)
    splitresult = urlsplit(url, scheme, allow_fragments)
    scheme, netloc, url, query, fragment = splitresult
    if scheme in uses_params and ';' in url:
        url, params = _splitparams(url)
    else:
        params = ''
    result = ParseResult(scheme, netloc, url, params, query, fragment)
    return _coerce_result(result)
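The docstring above makes two points that are easy to miss: netloc stays a single string, and % escapes are not expanded. The sketch below illustrates both, and also compares urlparse with urlsplit, which leaves ;params inside the path. The example.com URL is invented for illustration.

```python
from urllib.parse import unquote, urlparse, urlsplit

# Hypothetical URL containing every component.
url = "http://user@example.com:8080/a%20b;v=1?q=x#frag"

parsed = urlparse(url)  # 6 components, params split out
split = urlsplit(url)   # 5 components, params left in the path

print(parsed.netloc)    # one string: user@example.com:8080
print(parsed.hostname)  # convenience attribute: example.com
print(parsed.port)      # convenience attribute: 8080

print(parsed.path)      # /a%20b      (the % escape is kept)
print(parsed.params)    # v=1
print(split.path)       # /a%20b;v=1  (urlsplit does not separate params)

print(unquote(parsed.path))  # /a b   (expand % escapes explicitly)
```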

import urllib.parse  # import the URL parsing module

url = "http://www.youdao.com/w/bytes/#keyfrom=dict2.top"
result = urllib.parse.urlparse(url=url)  # identify and split the URL into its components
print(result)

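Component descriptions:
scheme: the protocol
netloc: the network location (domain name, plus the port if one is given)
path: the path
params: parameters attached to the last path segment
query: the query string, usually seen in the URLs of GET requests
fragment: the anchor, used to jump directly to a specific position on the page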
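To connect these components to the URL used in this post, the sketch below inspects each field of the result. Note that in this URL the "#" comes before any "?", so keyfrom=dict2.top ends up in fragment and query stays empty. Passing the fragment to parse_qs is only an illustration of how to read key=value pairs out of it.

```python
from urllib.parse import parse_qs, urlparse

url = "http://www.youdao.com/w/bytes/#keyfrom=dict2.top"
result = urlparse(url)

print(result.scheme)    # http
print(result.netloc)    # www.youdao.com
print(result.path)      # /w/bytes/
print(result.params)    # (empty string)
print(result.query)     # (empty string)
print(result.fragment)  # keyfrom=dict2.top

# The fragment happens to look like a query string, so parse_qs can split it.
print(parse_qs(result.fragment))  # {'keyfrom': ['dict2.top']}
```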

3. urllib.error: exception handling module

import urllib.request  # import the request module
import urllib.error  # import the error module

def check_urlerror():
    try:
        # The host name is deliberately wrong so that the request fails
        with urllib.request.urlopen("http://www.youdaoxxx.com/w/bytes/#keyfrom=dict2.top") as fd:
            print(fd.read().decode("utf-8"))
    except urllib.error.URLError as err:
        print(err.reason)  # print why the request failed

check_urlerror()
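urllib.error also defines HTTPError, a subclass of URLError that is raised when the server answers with an error status and that carries the status code. A minimal sketch of handling both follows. The helper name fetch_or_explain and its return convention are invented for illustration, and the foo:// URL is a made-up address with an unsupported scheme, chosen so that the failure occurs without any network access.

```python
import urllib.error
import urllib.request


def fetch_or_explain(url, timeout=5):
    """Return (text, None) on success, or (None, message) on failure.

    fetch_or_explain is a hypothetical helper, not part of urllib.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace"), None
    except urllib.error.HTTPError as err:
        # HTTPError subclasses URLError, so it has to be caught first.
        return None, f"HTTP error {err.code}: {err.reason}"
    except urllib.error.URLError as err:
        # Covers DNS failures, refused connections, unsupported schemes, etc.
        return None, f"URL error: {err.reason}"


text, problem = fetch_or_explain("foo://example.invalid/")
print(text)     # None
print(problem)  # URL error: unknown url type: foo
```

Catching HTTPError before URLError matters: with the order reversed, the URLError branch would swallow HTTP errors and the status code would never be reported.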
