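While scraping a novel website, fetching the page content failed: the site uses HTTPS, so the certificate has to be verified on each request. The error was: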
Traceback (most recent call last):
  File "C:/Users/cxqy003/PycharmProjects/untitled1/books.py", line 149, in <module>
    book_detail()
  File "C:/Users/cxqy003/PycharmProjects/untitled1/books.py", line 127, in book_detail
    req = requests.get(url).text
  File "C:\Users\cxqy003\Anaconda2\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\cxqy003\Anaconda2\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\cxqy003\Anaconda2\lib\site-packages\requests\sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\cxqy003\Anaconda2\lib\site-packages\requests\sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\cxqy003\Anaconda2\lib\site-packages\requests\adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.dingdiann.com', port=443)
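The way to fix this kind of problem is to deal with the certificate. There are many ways to do that; I went with the quick and dirty one, which is to skip verification altogether.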
req = requests.get(url, verify=False).text
Adding "verify=False" to the request skips certificate verification, and the content comes back.
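
For reference, here is a slightly fuller sketch of the same idea. It is not the code from books.py: the helper name fetch_text, its session and timeout parameters, and the 10-second default are assumptions made for illustration. With verify=False, requests (through urllib3) emits an InsecureRequestWarning on each call, so the sketch silences that warning only for the unverified request. The verify argument can also be a path to a CA bundle file, which keeps verification on instead of skipping it.

```python
import warnings

import requests
import urllib3


def fetch_text(url, verify=True, session=None, timeout=10):
    """Return the response body as text (hypothetical helper).

    verify=True    -> normal certificate validation (the requests default)
    verify=False   -> skip validation, as in the one-line fix above
    verify="path"  -> validate against a specific CA bundle file
    """
    # Use the caller's session if given, otherwise the module-level API.
    http = session if session is not None else requests
    with warnings.catch_warnings():
        if verify is False:
            # Silence InsecureRequestWarning for this call only.
            warnings.simplefilter(
                "ignore", urllib3.exceptions.InsecureRequestWarning
            )
        resp = http.get(url, verify=verify, timeout=timeout)
    # Raise on 4xx/5xx instead of silently returning an error page.
    resp.raise_for_status()
    return resp.text


# Usage, with url defined as in book_detail():
#     req = fetch_text(url, verify=False)
```

Skipping verification means the client no longer checks who it is talking to, so this is best kept to throwaway scraping jobs like this one.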