Python Crawler (1): First Steps, Installing Python 3 on a Mac and a Simple Crawler

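I use a Mac. macOS ships with Python 2.7, and I ran into problems when I tried to install Python 3 through brew, so in the end I took the simpler route: download the installer from the official site, install it, and then change the default Python runtime.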

Official Python download page: https://www.python.org/downloads/mac-osx/ . I downloaded the latest version at the time, 3.8.2; the installer is a simple click-through.

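After installation, change the default version:

# Find the Python 3 installation path (brew info python3 only helps for a Homebrew install)

which python3

# Edit the shell configuration file (use ~/.zshrc instead if your shell is zsh)

vi ~/.bash_profile

# Add the alias

alias python="/usr/local/bin/python3"

# Reload the configuration file

source ~/.bash_profile

# Check the current Python version

python -V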

That completes the Python environment setup.
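As a quick sanity check from inside Python itself, a small script can report which interpreter actually runs it. This is only a sketch: the helper name `describe_interpreter` is made up for illustration, and the code deliberately avoids Python 3-only syntax so that it still runs (and complains) if the alias did not take effect.

```python
import sys


def describe_interpreter():
    """Return (major, minor, path) for the interpreter running this script."""
    return sys.version_info[0], sys.version_info[1], sys.executable


if __name__ == "__main__":
    major, minor, path = describe_interpreter()
    # str.format keeps this line valid under Python 2.7 as well.
    print("Python {}.{} at {}".format(major, minor, path))
    if major < 3:
        # The alias was not picked up; the old system Python is still in use.
        raise SystemExit("Still running Python 2, check the alias in the shell profile.")
```

Saved as a .py file and run with the `python` command, it should print a 3.x version if the alias is active in the current shell.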


Now we can start coding. Open the "IDLE" application from Applications, or create a text file and change its extension to .py.

Write the following code:
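A minimal sketch of such a first crawler, assuming it does nothing more than fetch one page with `urllib.request.urlopen` (the call that appears in the traceback further down). The function name `fetch_page`, the timeout, and the UTF-8 decoding are assumptions made for illustration, not the original listing.

```python
import urllib.request


def fetch_page(url, timeout=10):
    """Download `url` and return the response body decoded as text."""
    # urlopen returns a response object that can be used as a context manager.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
    # Assumption: the page is UTF-8; undecodable bytes are replaced, not fatal.
    return raw.decode("utf-8", errors="replace")
```

Calling `print(fetch_page("https://example.com/"))` would print the page's HTML; the URL here is a placeholder for whatever site you want to crawl.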


I saved the file under /Users/zhengbozheng/Documents/python. Next, in the terminal, change into that directory and run the script with: python <file name>.py

This produces the following error:



Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1319, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1230, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1276, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1225, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1004, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 944, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1399, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "python-pc1-fb.py", line 9, in <module>
    response = urllib.request.urlopen(url)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1362, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1322, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)>



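This error is caused by SSL certificate verification: Python cannot find a local issuer certificate to validate the site against. The quick workaround is to turn certificate verification off: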
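One way to do this, sketched with the public `ssl` API: build an SSL context that skips verification and pass it to `urlopen` through its `context` parameter. The function names are hypothetical, and this may not be the exact change made in the original script. Skipping verification removes protection against forged certificates, so it is only reasonable for throwaway experiments. The python.org installer normally also places an `Install Certificates.command` script in the `/Applications/Python 3.8` folder, and running it installs a CA bundle so that verification can stay on.

```python
import ssl
import urllib.request


def make_unverified_context():
    """Return an SSL context that does NOT verify server certificates."""
    context = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be set to CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_page_insecure(url, timeout=10):
    """Fetch `url` without certificate verification (for experiments only)."""
    context = make_unverified_context()
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        raw = response.read()
    # Assumption: the page is UTF-8 text.
    return raw.decode("utf-8", errors="replace")
```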


Running the file again produced a different error:



Traceback (most recent call last):
  File "python-pc1-fb.py", line 14, in <module>
    response = urllib.request.urlopen(url)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden



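The error shows that the connection now succeeds but the site answers with HTTP 403 Forbidden. This is probably because the site has anti-crawler measures in place. After checking the official documentation, I adjusted the code.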

The modified code is as follows:
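A common fix for this kind of 403, sketched here as an assumption about what the adjustment looked like: by default urllib identifies itself with a `Python-urllib/3.x` User-Agent, which some sites reject, so the request is wrapped in a `urllib.request.Request` that carries a browser-like User-Agent header. The header value, the function names, and the optional `context` parameter (for passing an SSL context if one is needed) are all illustrative choices.

```python
import urllib.request

# Illustrative browser-like value; any current browser's User-Agent string would do.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Wrap `url` in a Request that sends a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_page_as_browser(url, timeout=10, context=None):
    """Fetch `url` with the custom header; `context` is an optional SSL context."""
    request = build_request(url)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        raw = response.read()
    # Assumption: the page is UTF-8 text.
    return raw.decode("utf-8", errors="replace")
```

Sites with stricter anti-crawler checks may look at more than this one header, so a browser-like User-Agent is a first thing to try rather than a guarantee.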


Run it again and you will get the information you wanted.
