BeautifulSoup warning: BeautifulSoup([your markup], "html.parser")

Today I was reading the book "Python数据采集" (Web Scraping with Python) and typed in some of its code as I went. Here is the code:

from urllib.request import urlopen
from bs4 import BeautifulSoup


html = urlopen("http://www.pythonscraping.com/pages/page1.html")
bsObj = BeautifulSoup(html.read())
print(bsObj.h1)

The book doesn't mention it, but running this code produces a warning:

/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/jiangxiaohan/Desktop/PythonDemo/beautifulSoupDemo/beautifulSoupTest1.py
/Users/jiangxiaohan/Library/Python/3.6/lib/python/site-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 6 of the file /Users/jiangxiaohan/Desktop/PythonDemo/beautifulSoupDemo/beautifulSoupTest1.py. To get rid of this warning, change code that looks like this:

 BeautifulSoup([your markup])

to this:

 BeautifulSoup([your markup], "html.parser")

  markup_type=markup_type))
<h1>An Interesting Title</h1>

Process finished with exit code 0

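The warning says that no parser was specified for BeautifulSoup, so the library picked the best HTML parser available on this system, which here is html.parser. That is usually not a problem, but on another system or in a different virtual environment it may pick a different parser and therefore produce different results. To get rid of the warning, pass the parser name explicitly as the second argument: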

BeautifulSoup(html.read(), "html.parser")

With that change the warning goes away.
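To check the fix without depending on the network, here is a minimal sketch. The inline `markup` string is my own stand-in for the page fetched with urlopen, and the `caught` list is only there to show which warnings, if any, were raised while parsing. It assumes the bs4 package is installed.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical inline stand-in for the downloaded page, so no network is needed.
markup = "<html><body><h1>An Interesting Title</h1></body></html>"

with warnings.catch_warnings(record=True) as caught:
    # Record every warning raised inside this block instead of printing it.
    warnings.simplefilter("always")
    # The parser is named explicitly, so BeautifulSoup does not have to guess.
    soup = BeautifulSoup(markup, "html.parser")

print(soup.h1)  # <h1>An Interesting Title</h1>

# Collect any "no parser specified" warnings; the list should be empty here.
parser_warnings = [
    w for w in caught
    if "No parser was explicitly specified" in str(w.message)
]
print(len(parser_warnings))
```

If you delete the "html.parser" argument and run it again, the warning should show up in `caught` rather than on the console.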
