Python Study Notes, Part 2 [BeautifulSoup4]

Download and installation
https://pypi.python.org/pypi?%3Aaction=search&term=BeautifulSoup&submit=search


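Once the download has finished, start the installation:
Extract the archive, then run the following command from the extracted directory:
python setup.py install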

    C:\Python34\beautifulsoup4-4.5.3>python setup.py install
    running install
    running bdist_egg
    running egg_info
    writing beautifulsoup4.egg-info\PKG-INFO
    writing dependency_links to beautifulsoup4.egg-info\dependency_links.txt
    writing top-level names to beautifulsoup4.egg-info\top_level.txt
    writing requirements to beautifulsoup4.egg-info\requires.txt
    reading manifest file 'beautifulsoup4.egg-info\SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    writing manifest file 'beautifulsoup4.egg-info\SOURCES.txt'
    installing library code to build\bdist.win-amd64\egg
    running install_lib
    running build_py
    creating build
    creating build\lib
    creating build\lib\bs4
    copying bs4\1631353.py -> build\lib\bs4
    copying bs4\dammit.py -> build\lib\bs4
    copying bs4\diagnose.py -> build\lib\bs4
    copying bs4\element.py -> build\lib\bs4
    copying bs4\testing.py -> build\lib\bs4
    copying bs4\__init__.py -> build\lib\bs4
    creating build\lib\bs4\builder
    copying bs4\builder\_html5lib.py -> build\lib\bs4\builder
    copying bs4\builder\_htmlparser.py -> build\lib\bs4\builder
    copying bs4\builder\_lxml.py -> build\lib\bs4\builder
    copying bs4\builder\__init__.py -> build\lib\bs4\builder
    creating build\lib\bs4\tests
    copying bs4\tests\test_builder_registry.py -> build\lib\bs4\tests
    copying bs4\tests\test_docs.py -> build\lib\bs4\tests
    copying bs4\tests\test_html5lib.py -> build\lib\bs4\tests
    copying bs4\tests\test_htmlparser.py -> build\lib\bs4\tests
    copying bs4\tests\test_lxml.py -> build\lib\bs4\tests
    copying bs4\tests\test_soup.py -> build\lib\bs4\tests
    copying bs4\tests\test_tree.py -> build\lib\bs4\tests
    copying bs4\tests\__init__.py -> build\lib\bs4\tests
    Fixing build\lib\bs4\1631353.py build\lib\bs4\dammit.py build\lib\bs4\diagnose.p
    y build\lib\bs4\element.py build\lib\bs4\testing.py build\lib\bs4\__init__.py bu
    ild\lib\bs4\builder\_html5lib.py build\lib\bs4\builder\_htmlparser.py build\lib\
    bs4\builder\_lxml.py build\lib\bs4\builder\__init__.py build\lib\bs4\tests\test_
    builder_registry.py build\lib\bs4\tests\test_docs.py build\lib\bs4\tests\test_ht
    ml5lib.py build\lib\bs4\tests\test_htmlparser.py build\lib\bs4\tests\test_lxml.p
    y build\lib\bs4\tests\test_soup.py build\lib\bs4\tests\test_tree.py build\lib\bs
    4\tests\__init__.py
    Skipping optional fixer: buffer
    Skipping optional fixer: idioms
    Skipping optional fixer: set_literal
    Skipping optional fixer: ws_comma
    Fixing build\lib\bs4\1631353.py build\lib\bs4\dammit.py build\lib\bs4\diagnose.p
    y build\lib\bs4\element.py build\lib\bs4\testing.py build\lib\bs4\__init__.py bu
    ild\lib\bs4\builder\_html5lib.py build\lib\bs4\builder\_htmlparser.py build\lib\
    bs4\builder\_lxml.py build\lib\bs4\builder\__init__.py build\lib\bs4\tests\test_
    builder_registry.py build\lib\bs4\tests\test_docs.py build\lib\bs4\tests\test_ht
    ml5lib.py build\lib\bs4\tests\test_htmlparser.py build\lib\bs4\tests\test_lxml.p
    y build\lib\bs4\tests\test_soup.py build\lib\bs4\tests\test_tree.py build\lib\bs
    4\tests\__init__.py
    Skipping optional fixer: buffer
    Skipping optional fixer: idioms
    Skipping optional fixer: set_literal
    Skipping optional fixer: ws_comma
    creating build\bdist.win-amd64
    creating build\bdist.win-amd64\egg
    creating build\bdist.win-amd64\egg\bs4
    copying build\lib\bs4\1631353.py -> build\bdist.win-amd64\egg\bs4
    creating build\bdist.win-amd64\egg\bs4\builder
    copying build\lib\bs4\builder\_html5lib.py -> build\bdist.win-amd64\egg\bs4\buil
    der
    copying build\lib\bs4\builder\_htmlparser.py -> build\bdist.win-amd64\egg\bs4\bu
    ilder
    copying build\lib\bs4\builder\_lxml.py -> build\bdist.win-amd64\egg\bs4\builder
    copying build\lib\bs4\builder\__init__.py -> build\bdist.win-amd64\egg\bs4\build
    er
    copying build\lib\bs4\dammit.py -> build\bdist.win-amd64\egg\bs4
    copying build\lib\bs4\diagnose.py -> build\bdist.win-amd64\egg\bs4
    copying build\lib\bs4\element.py -> build\bdist.win-amd64\egg\bs4
    copying build\lib\bs4\testing.py -> build\bdist.win-amd64\egg\bs4
    creating build\bdist.win-amd64\egg\bs4\tests
    copying build\lib\bs4\tests\test_builder_registry.py -> build\bdist.win-amd64\eg
    g\bs4\tests
    copying build\lib\bs4\tests\test_docs.py -> build\bdist.win-amd64\egg\bs4\tests
    copying build\lib\bs4\tests\test_html5lib.py -> build\bdist.win-amd64\egg\bs4\te
    sts
    copying build\lib\bs4\tests\test_htmlparser.py -> build\bdist.win-amd64\egg\bs4\
    tests
    copying build\lib\bs4\tests\test_lxml.py -> build\bdist.win-amd64\egg\bs4\tests
    copying build\lib\bs4\tests\test_soup.py -> build\bdist.win-amd64\egg\bs4\tests
    copying build\lib\bs4\tests\test_tree.py -> build\bdist.win-amd64\egg\bs4\tests
    copying build\lib\bs4\tests\__init__.py -> build\bdist.win-amd64\egg\bs4\tests
    copying build\lib\bs4\__init__.py -> build\bdist.win-amd64\egg\bs4
    byte-compiling build\bdist.win-amd64\egg\bs4\1631353.py to 1631353.cpython-34.py
    c
    byte-compiling build\bdist.win-amd64\egg\bs4\builder\_html5lib.py to _html5lib.c
    python-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\builder\_htmlparser.py to _htmlpars
    er.cpython-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\builder\_lxml.py to _lxml.cpython-3
    4.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\builder\__init__.py to __init__.cpy
    thon-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\dammit.py to dammit.cpython-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\diagnose.py to diagnose.cpython-34.
    pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\element.py to element.cpython-34.py
    c
    byte-compiling build\bdist.win-amd64\egg\bs4\testing.py to testing.cpython-34.py
    c
    byte-compiling build\bdist.win-amd64\egg\bs4\tests\test_builder_registry.py to t
    est_builder_registry.cpython-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\tests\test_docs.py to test_docs.cpy
    thon-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\tests\test_html5lib.py to test_html
    5lib.cpython-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\tests\test_htmlparser.py to test_ht
    mlparser.cpython-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\tests\test_lxml.py to test_lxml.cpy
    thon-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\tests\test_soup.py to test_soup.cpy
    thon-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\tests\test_tree.py to test_tree.cpy
    thon-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\tests\__init__.py to __init__.cpyth
    on-34.pyc
    byte-compiling build\bdist.win-amd64\egg\bs4\__init__.py to __init__.cpython-34.
    pyc
    creating build\bdist.win-amd64\egg\EGG-INFO
    copying beautifulsoup4.egg-info\PKG-INFO -> build\bdist.win-amd64\egg\EGG-INFO
    copying beautifulsoup4.egg-info\SOURCES.txt -> build\bdist.win-amd64\egg\EGG-INF
    O
    copying beautifulsoup4.egg-info\dependency_links.txt -> build\bdist.win-amd64\eg
    g\EGG-INFO
    copying beautifulsoup4.egg-info\requires.txt -> build\bdist.win-amd64\egg\EGG-IN
    FO
    copying beautifulsoup4.egg-info\top_level.txt -> build\bdist.win-amd64\egg\EGG-I
    NFO
    zip_safe flag not set; analyzing archive contents...
    creating dist
    creating 'dist\beautifulsoup4-4.5.3-py3.4.egg' and adding 'build\bdist.win-amd64
    \egg' to it
    removing 'build\bdist.win-amd64\egg' (and everything under it)
    Processing beautifulsoup4-4.5.3-py3.4.egg
    Copying beautifulsoup4-4.5.3-py3.4.egg to c:\python34\lib\site-packages
    Adding beautifulsoup4 4.5.3 to easy-install.pth file
    
    Installed c:\python34\lib\site-packages\beautifulsoup4-4.5.3-py3.4.egg
    Processing dependencies for beautifulsoup4==4.5.3
    Finished processing dependencies for beautifulsoup4==4.5.3
    
    C:\Python34\beautifulsoup4-4.5.3>

Seeing "Finished" at the end of the log confirms that BeautifulSoup has been installed.
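Besides reading the install log, the installation can also be checked from Python itself. The snippet below is only a minimal sketch: the helper name `is_installed` is made up for illustration, and it relies on the standard-library `importlib.util.find_spec` to see whether the `bs4` package can be located.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("bs4"):
    import bs4
    # bs4 exposes its version string as bs4.__version__
    print("bs4 version:", bs4.__version__)
else:
    print("bs4 is not installed")
```

If the install above succeeded, this should report the installed bs4 version; otherwise it reports that the package is missing.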

Now we can happily start testing.

    # coding=utf-8
    __author__ = 'zdz8207'

    import http.cookiejar
    import urllib.request

    from bs4 import BeautifulSoup


    def getHtml(url):
        """Fetch url through a cookie-aware opener and return the page as a string."""
        cj = http.cookiejar.CookieJar()
        opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
        opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36'), ('Cookie', '4564564564564564565646540')]
        urllib.request.install_opener(opener)
        html_bytes = urllib.request.urlopen(url).read()
        html_string = html_bytes.decode('utf-8')
        return html_string


    html_doc = getHtml("http://zst.aicai.com/ssq/openInfo/")
    soup = BeautifulSoup(html_doc, 'html.parser')

    # print(soup.title)
    # table = soup.find_all('table', class_='fzTab')
    # print(table)  # rows like <tr onmouseout="this.style.background=''" were lost this way
    # soup.strip(): after adding strip, find_all('tr') often returned only the first tr
    tr = soup.find('tr', attrs={"onmouseout": "this.style.background=''"})
    # print(tr)
    tds = tr.find_all('td')
    opennum = tds[0].get_text()  # draw (issue) number
    # print(opennum)
    reds = []
    for i in range(2, 8):  # cells 2..7 hold the six red balls
        reds.append(tds[i].get_text())
    print(reds)
    blue = tds[8].get_text()  # cell 8 holds the blue ball
    print(blue)

    # Convert a list to a string: ','.join(list)
    # Final output format, e.g.: 2015075期开奖号码:6,11,13,19,21,32, 蓝球:4
    # (期开奖号码 = "draw winning numbers", 蓝球 = "blue ball")
    # print(opennum + '期开奖号码:' + ','.join(reds) + ", 蓝球:" + blue)
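The row lookup in the script can be tried offline, without fetching the live page. The following is a sketch under stated assumptions: `SAMPLE_HTML` is a made-up fragment that only mimics the cell layout the script relies on (cell 0 is the draw number, cells 2 to 7 are the red balls, cell 8 is the blue ball), and `parse_draw` is a hypothetical helper name. It also returns `None` when the row is not found, since `soup.find` returns `None` in that case and `tr.find_all` would otherwise raise an `AttributeError`.

```python
from bs4 import BeautifulSoup

# Made-up sample markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<table class="fzTab">
  <tr onmouseout="this.style.background=''">
    <td>2015075</td><td>(unused)</td>
    <td>6</td><td>11</td><td>13</td><td>19</td><td>21</td><td>32</td>
    <td>4</td>
  </tr>
</table>
"""


def parse_draw(html):
    """Extract the draw number, red balls and blue ball, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    tr = soup.find("tr", attrs={"onmouseout": "this.style.background=''"})
    if tr is None:
        return None  # the expected row is not in this document
    tds = tr.find_all("td")
    if len(tds) < 9:
        return None  # fewer cells than the layout assumed above
    return {
        "opennum": tds[0].get_text(strip=True),
        "reds": [td.get_text(strip=True) for td in tds[2:8]],
        "blue": tds[8].get_text(strip=True),
    }


print(parse_draw(SAMPLE_HTML))
```

Keeping the parsing in a function that takes an HTML string makes it easy to test against a saved copy of the page before pointing it at the network.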

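Test (in this run the two print calls were executed inside the for loop, so the growing list and the blue ball are printed on every iteration):

    C:\Users\wang\python>python xinlangshuangseqiu.py
    ['02']
    15
    ['02', '04']
    15
    ['02', '04', '12']
    15
    ['02', '04', '12', '14']
    15
    ['02', '04', '12', '14', '17']
    15
    ['02', '04', '12', '14', '17', '24']
    15

    C:\Users\wang\python>

Success.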
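The list-to-string step mentioned in the script's comments can be sketched as a small function. This is only an illustration: the name `format_result` is hypothetical, and the sample values are the ones from the output format quoted in the script's comments, not from the test run above.

```python
def format_result(opennum, reds, blue):
    """Join the red balls with commas and build the one-line summary."""
    # ','.join(...) turns ['6', '11'] into '6,11'
    return opennum + "期开奖号码:" + ",".join(reds) + ", 蓝球:" + blue


line = format_result("2015075", ["6", "11", "13", "19", "21", "32"], "4")
print(line)  # 2015075期开奖号码:6,11,13,19,21,32, 蓝球:4
```

Note that `str.join` only accepts strings, which is why the values taken from `get_text()` can be joined directly, while a list of integers would first need `map(str, ...)`.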
