urllib.parse
Mainly used to split a URL string into its components, or to assemble URL components back into a URL string.
-
Splitting
urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)
Example
    from urllib.parse import urlparse

    result = urlparse("http://www.baidu.com/index.html;user?id=5#comment")
    print(result)
Or specify a default scheme:
    from urllib.parse import urlparse

    # Without a leading "//", the host is parsed as part of path, not netloc.
    result = urlparse("www.baidu.com/index.html;user?id=5#comment", scheme="https")
    print(result)
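The value returned by urlparse is a named tuple, so each component can be read by name or by index. The following is a minimal sketch of that, and of how the host is only recognised as netloc when the string starts with "//". The variable names `no_slashes` and `with_slashes` are illustrative only.

```python
from urllib.parse import urlparse

# Components can be read by attribute name or by position.
result = urlparse("http://www.baidu.com/index.html;user?id=5#comment")
print(result.scheme, result.netloc, result.path)
print(result.params, result.query, result.fragment)
print(result[1] == result.netloc)

# The default scheme is applied in both cases, but only the "//" form
# places the host in netloc; otherwise it stays in path.
no_slashes = urlparse("www.baidu.com/index.html", scheme="https")
with_slashes = urlparse("//www.baidu.com/index.html", scheme="https")
print(repr(no_slashes.netloc), no_slashes.path)
print(repr(with_slashes.netloc), with_slashes.path)
```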
If the URL string already contains a scheme, the scheme argument is ignored.
-
Assembling
urllib.parse.urlunparse(data)
Example
    from urllib.parse import urlunparse

    # Six components in order: scheme, netloc, path, params, query, fragment
    data = ['http', 'www.baidu.com', 'index.html', 'user', 'a=123', 'commit']
    print(urlunparse(data))
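Because urlunparse accepts the same six components that urlparse produces, the two can be combined to edit a single part of a URL. This is a rough sketch of that round trip. The example URL and the replacement values are made up for illustration, and `_replace` is the standard named-tuple method rather than anything specific to urllib.

```python
from urllib.parse import urlparse, urlunparse

url = "http://www.baidu.com/index.html;user?a=123#commit"

# Splitting and reassembling an already well-formed URL gives it back unchanged.
parts = urlparse(url)
rebuilt = urlunparse(parts)
print(rebuilt == url)

# Swap out individual components, then reassemble.
https_url = urlunparse(parts._replace(scheme="https", query="a=456"))
print(https_url)
```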
-
Joining
urllib.parse.urljoin(base, url, allow_fragments=True)
Example
    from urllib.parse import urljoin

    print(urljoin('http://www.baidu.com', 'FAQ.html'))
    print(urljoin('http://www.baidu.com', 'https://pythonsite.com/FAQ.html'))
    print(urljoin('http://www.baidu.com/about.html', 'https://pythonsite.com/FAQ.html'))
    print(urljoin('http://www.baidu.com/about.html', 'https://pythonsite.com/FAQ.html?question=2'))
    print(urljoin('http://www.baidu.com?wd=abc', 'https://pythonsite.com/index.php'))
    print(urljoin('http://www.baidu.com', '?category=2#comment'))
    print(urljoin('www.baidu.com', '?category=2#comment'))
    print(urljoin('www.baidu.com#comment', '?category=2'))
When joining, the second argument takes precedence over the base URL: an absolute URL replaces the base entirely, while a relative one only fills in the parts it omits from the base.
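To make the precedence rule concrete, here is a small sketch of how different kinds of relative references resolve against one base. The base URL with its /docs/guide/ path is a made-up example, not taken from the notes above.

```python
from urllib.parse import urljoin

# Hypothetical base URL with a nested path, used only for illustration.
base = "http://www.baidu.com/docs/guide/index.html"

sibling = urljoin(base, "FAQ.html")        # replaces the last path segment
parent = urljoin(base, "../FAQ.html")      # goes up one directory
rooted = urljoin(base, "/FAQ.html")        # keeps scheme and host only
other_host = urljoin(base, "//pythonsite.com/FAQ.html")  # keeps scheme only
query_only = urljoin(base, "?category=2")  # keeps the whole path

for joined in (sibling, parent, rooted, other_host, query_only):
    print(joined)
```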
-
Converting a dict to a URL query string
urllib.parse.urlencode(dict)
Example
    from urllib.parse import urlencode

    params = {
        "name": "zhaofan",
        "age": 23,
    }
    base_url = "http://www.baidu.com?"
    url = base_url + urlencode(params)
    print(url)
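urlencode also escapes characters that would otherwise break the query string, and with doseq=True it expands a list into repeated keys. The sketch below uses made-up parameter names (`wd`, `tag`, `note`) purely for illustration.

```python
from urllib.parse import urlencode

# Hypothetical parameters: a space, a list value, and reserved characters.
params = {
    "wd": "python urllib",
    "tag": ["a", "b"],
    "note": "a&b=c",
}

# doseq=True emits one key=value pair per list element.
query = urlencode(params, doseq=True)
print(query)
```

Spaces become "+" and reserved characters such as "&" and "=" are percent-encoded, so the values survive being embedded in a URL.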
-
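Converting a URL query string to a dict
urllib.parse.parse_qs(qs)
urllib.parse.unquote(urlstr) only decodes percent-escapes and returns a str; parse_qs is the function that returns a dict.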
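As a rough sketch of the reverse of urlencode, the example below extracts the query string from a URL and parses it with parse_qs, which maps each key to a list of values. The `flat` dict is just an illustrative convenience for the case where every key appears once. unquote is included for comparison.

```python
from urllib.parse import urlparse, parse_qs, unquote

url = "http://www.baidu.com?name=zhaofan&age=23"

# parse_qs works on the query part only, so split the URL first.
query = urlparse(url).query
params = parse_qs(query)
print(params)

# Every value is a list of strings; keep the first one per key if keys never repeat.
flat = {key: values[0] for key, values in params.items()}
print(flat)

# unquote only reverses percent-encoding and returns a plain string.
decoded = unquote("a%26b%3Dc")
print(decoded)
```

All parsed values are strings, so a number such as age needs an explicit int() conversion.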