MacBook-Air:~ huangyong$ python3
Python 3.6.1 (default, Apr 4 2017, 09:40:21)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request as ur
>>> s=ur.urlopen('https://www.zhihu.com')
>>> sl=s.read()
#print(sl) output omitted
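>>> from bs4 import BeautifulSoup
>>> bsObj = BeautifulSoup(sl, 'html.parser')
#Pass sl here rather than s.read(): the response was already consumed above, so a second s.read() returns empty bytes. Plain BeautifulSoup(sl) works but warns that no parser was specified; naming 'html.parser' avoids the warning.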
>>> print(bsObj.h1)
<h1 class="logo hide-text">知乎</h1>
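Calling read() a second time on the same response returns nothing, which is why the bytes are kept in sl. A minimal sketch of that behaviour, using io.BytesIO as a stand-in for the HTTP response (an assumption made here so the example runs without network access; the bytes are made up):

```python
import io

# Stand-in for the object returned by urlopen(); the content is invented
stream = io.BytesIO(b'<h1>hello</h1>')

first = stream.read()    # consumes the whole stream
second = stream.read()   # nothing is left, so this is empty

print(first)
print(second)
```

The same pattern applies to the real response: read once, keep the result in a variable, and reuse that variable.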
bs4 parses HTML into a tree of tags, so individual elements such as h1 can be pulled out by name.
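A small sketch of what that looks like in practice. The HTML string below is a made-up stand-in for a downloaded page, and the example assumes bs4 is installed:

```python
from bs4 import BeautifulSoup

# Hypothetical page content, used instead of a live download
html = ('<html><body>'
        '<h1 class="logo">Title</h1>'
        '<p>first</p><p>second</p>'
        '</body></html>')

soup = BeautifulSoup(html, 'html.parser')

h1_text = soup.h1.get_text()       # text inside the first <h1>
h1_classes = soup.h1['class']      # class attribute, returned as a list
paragraphs = [p.get_text() for p in soup.find_all('p')]

print(h1_text, h1_classes, paragraphs)
```

Attribute access such as soup.h1 returns only the first matching tag; find_all() is the way to get every match.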
>>> f=open('test.txt','w+')
If test.txt does not exist, it is created automatically. Reading and writing files in Python is quite simple.
>>> f.write(sl.decode('utf-8'))
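This saves the whole page. On a file opened in text mode, f.write() accepts only str, so the bytes in sl have to be decoded before they can be written. Call f.close() afterwards so the data is flushed to disk.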
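The text-versus-bytes rule can be sketched as follows. The page bytes and the temporary file paths are stand-ins chosen for illustration, not part of the session above:

```python
import os
import tempfile

# Stand-in for the bytes returned by urlopen().read(); \u77e5\u4e4e is the page title
page = '<h1>\u77e5\u4e4e</h1>'.encode('utf-8')

folder = tempfile.mkdtemp()
text_path = os.path.join(folder, 'test.txt')
bin_path = os.path.join(folder, 'test.bin')

# Writing bytes to a text-mode file fails with TypeError
wrote_bytes_in_text_mode = True
try:
    with open(text_path, 'w', encoding='utf-8') as f:
        f.write(page)
except TypeError:
    wrote_bytes_in_text_mode = False

# Option 1: decode first, then write str in text mode
with open(text_path, 'w', encoding='utf-8') as f:
    f.write(page.decode('utf-8'))

# Option 2: open in binary mode and write the bytes unchanged
with open(bin_path, 'wb') as f:
    f.write(page)

with open(text_path, encoding='utf-8') as f:
    saved_text = f.read()
with open(bin_path, 'rb') as f:
    saved_bytes = f.read()
```

Using with closes the file automatically, and passing encoding='utf-8' keeps the result independent of the platform default.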
Make sure you use the right version of pip or easy_install for your Python version (these may be named pip3 and easy_install3 respectively if you’re using Python 3).
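The difference between pip and pip3 is where packages end up: on a machine with both versions installed, pip usually installs into Python 2.* and pip3 into Python 3.*.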
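One way to sidestep the naming question is to run pip as a module of a specific interpreter. A minimal sketch that only builds the command, without running it; pip_command is a hypothetical helper name, and beautifulsoup4 is the package that provides bs4:

```python
import sys

def pip_command(package):
    """Build a pip install command tied to the interpreter running this code."""
    return [sys.executable, '-m', 'pip', 'install', package]

cmd = pip_command('beautifulsoup4')
print(' '.join(cmd))
```

Typed in a shell, the equivalent is python3 -m pip install beautifulsoup4, which installs into whichever Python the python3 command starts.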