When scraping data with the urllib module, the following error may be raised:
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:748)>
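Cause:
Starting with Python 2.7.9 (and 3.4.3 on the Python 3 line), opening an https URL with urllib.urlopen verifies the server's SSL certificate. When the target uses a self-signed certificate, the call fails with urllib2.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)>.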
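The check described above can be seen on the SSL context objects themselves, without any network access. The following is a minimal sketch, assuming Python 3; ssl._create_unverified_context is a private helper of the ssl module (the one PEP 476 refers to), and describe is a hypothetical function written only for this illustration.

```python
import ssl

# The kind of context used by default for HTTPS:
# certificates and hostnames are both checked.
verified = ssl.create_default_context()

# The private helper that PEP 476 points to:
# no certificate check and no hostname check.
unverified = ssl._create_unverified_context()


def describe(ctx):
    """Return (verify_mode name, check_hostname) for an SSLContext."""
    return ctx.verify_mode.name, ctx.check_hostname


print(describe(verified))
print(describe(unverified))
```

A self-signed certificate fails against the first context because no trusted CA has signed it; the second context simply skips that check.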
To work around this, PEP 476 says:
import ssl
# This restores the same behavior as before.
context = ssl._create_unverified_context()
urllib.urlopen("https://no-valid-cert", context=context)
It is also possible, though highly discouraged, to globally disable verification by monkeypatching the ssl module in versions of Python that implement this PEP:
import ssl

try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    # Legacy Python that doesn't verify HTTPS certificates by default
    pass
else:
    # Handle target environment that doesn't support HTTPS verification
    ssl._create_default_https_context = _create_unverified_https_context
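Because the global monkeypatch affects every HTTPS request in the process, one option is to limit how long it stays in effect. The sketch below is one possible way to do that, assuming Python 3; unverified_https is a hypothetical name, not part of the ssl module or of PEP 476.

```python
import contextlib
import ssl


@contextlib.contextmanager
def unverified_https():
    """Temporarily make the default HTTPS context skip verification."""
    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        # Always restore the previous default, even if the body raised.
        ssl._create_default_https_context = original
```

It would be used as "with unverified_https(): ..." around the calls that need it. The patch is still process-wide while the block runs, so other threads making HTTPS requests at the same time are affected too.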
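In other words, certificate verification can be turned off. For urllib there are two ways: pass urllib.urlopen() (urllib.request.urlopen() in Python 3) a context argument set to ssl._create_unverified_context(), or replace the global default ssl._create_default_https_context with ssl._create_unverified_context.
Example: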
import ssl
import urllib.request
# Unverified context: skips certificate and hostname checks
context = ssl._create_unverified_context()
req = urllib.request.Request('https://book.douban.com/latest?icn=index-latestbook-all', headers={"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.101 Safari/537.36"})
# The context must be passed explicitly, otherwise urlopen still verifies
res = urllib.request.urlopen(req, context=context)
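The example above can be wrapped in a small helper so that verification is only switched off on request and failures are reported instead of crashing the script. This is a sketch under assumptions: build_request, fetch and the USER_AGENT value are hypothetical names chosen for illustration.

```python
import ssl
import urllib.error
import urllib.request

# Hypothetical User-Agent value, used only for illustration.
USER_AGENT = "Mozilla/5.0 (example)"


def build_request(url, verify=True):
    """Return (Request, SSLContext) for url; verify=False disables checks."""
    # strip() guards against stray spaces around a pasted URL.
    request = urllib.request.Request(
        url.strip(), headers={"User-Agent": USER_AGENT}
    )
    if verify:
        context = ssl.create_default_context()
    else:
        context = ssl._create_unverified_context()
    return request, context


def fetch(url, verify=True, timeout=10):
    """Return the response body as bytes, or None if a URLError occurs."""
    request, context = build_request(url, verify)
    try:
        with urllib.request.urlopen(
            request, timeout=timeout, context=context
        ) as response:
            return response.read()
    except urllib.error.URLError as err:
        print("request failed:", err.reason)
        return None
```

Calling fetch(url) keeps verification on; fetch(url, verify=False) is the equivalent of the unverified example above.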
Disabling SSL verification with requests:
import requests
print(requests.get('https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=xxxxx&client_secret=xxxxxxxx', verify=False).content)
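SSL errors when downloading with urlretrieve:
import ssl
import urllib.request
# Globally replace the default HTTPS context with the unverified one
ssl._create_default_https_context = ssl._create_unverified_context
# The ./url directory must already exist
urllib.request.urlretrieve('https://www.baidu.com', './url/1.html')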
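urlretrieve opens the destination file directly, so it fails if the target directory (./url in the call above) does not exist. A minimal sketch of a wrapper that creates the directory first; download is a hypothetical helper name.

```python
import os
import urllib.request


def download(url, dest):
    """Save url to dest, creating the destination directory if needed."""
    parent = os.path.dirname(os.path.abspath(dest))
    os.makedirs(parent, exist_ok=True)
    # urlretrieve returns (filename, headers); only the filename is kept.
    filename, _headers = urllib.request.urlretrieve(url, dest)
    return filename
```

For example, download('https://www.baidu.com', './url/1.html') would create ./url before saving the page. The SSL settings in effect are whatever the process currently uses.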
urlretrieve can also download an image directly:
url = 'path'
urllib.request.urlretrieve(url, '1.jpg')
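This downloads the image at url and saves it locally as 1.jpg. Note that PyCharm keeps reporting a missing-quote error when a long URL string is wrapped onto the next line, because a bare line break is not allowed inside a quoted string. Ending the wrapped line with a \ fixes it. Triple quotes (''') also silence the error, but the line break then becomes part of the string, so it has to be removed before the URL is used.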
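The options for wrapping a long URL behave differently, which the following sketch makes visible. The URL is a hypothetical placeholder; the parenthesized form with adjacent string literals is an additional option not mentioned above.

```python
# Hypothetical long URL, used only to illustrate line wrapping.

# Adjacent string literals inside parentheses are joined at compile time.
joined = ("https://example.com/images/"
          "photo.jpg")

# A trailing backslash inside the quotes continues the string on the
# next line without inserting a newline.
backslash = "https://example.com/images/\
photo.jpg"

# Triple quotes keep the line break as part of the string.
triple = """https://example.com/images/
photo.jpg"""

print(joined == backslash)
print(repr(triple))
```

Only the first two forms produce a usable URL as written; the triple-quoted one needs something like triple.replace("\n", "") first.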