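Week 1-2 homework:
This was my first week of learning Python. On May 16 I scraped a local static page.
The goal was to extract each product's title, price, rating count, star rating, and image path from the page shown in the image above. The scraping code is as follows: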
from bs4 import BeautifulSoup

# Open the local HTML file and parse it with the lxml parser
with open('C:/Users/rjkf/Desktop/python/Plan-for-combating-master/week1/1_2/1_2answer_of_homework/1_2_homework_required/index.html', 'r') as wb_data:
    Soup = BeautifulSoup(wb_data, 'lxml')

# Each selector below is the CSS selector path copied from the browser for that element
titles = Soup.select('body > div > div > div.col-md-9 > div > div > div > div.caption > h4 > a')
images = Soup.select('body > div > div > div.col-md-9 > div > div > div > img')
gradeNums = Soup.select('body > div > div > div.col-md-9 > div > div > div > div.ratings > p.pull-right')
prices = Soup.select('body > div > div > div.col-md-9 > div > div > div > div.caption > h4.pull-right')
stars = Soup.select('body > div > div > div.col-md-9 > div > div > div > div.ratings > p:nth-of-type(2)')

# zip pairs the lists by position, so each iteration handles one product
for title, price, gradeNum, image, star in zip(titles, prices, gradeNums, images, stars):
    data = {
        'title': title.get_text(),
        'price': price.get_text(),
        'image_path': image.get('src'),
        'rating_count': gradeNum.get_text(),
        # Count the spans whose class attribute is exactly 'glyphicon glyphicon-star'
        'stars': len(star.find_all("span", class_='glyphicon glyphicon-star'))
    }
    print(data)
The output was:
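Summary:
- Through the week 1-2 exercise I got a basic grasp of how to use BeautifulSoup.
- I learned how to use with open to read a file.
- I got a basic grasp of tuples, which is what zip yields for each product in the loop.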
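
The last two points can be tried out without any HTML at all. The following is only a minimal sketch: the two lists hold made-up sample values standing in for the scraped titles and prices, and the file name products.txt is hypothetical. It shows that zip yields one tuple per product, that zip stops at the shortest list (so a selector that misses one element quietly drops records), and that with open closes the file once its block ends.

```python
import os
import tempfile

# Hypothetical sample data standing in for the scraped lists.
titles = ["Product A", "Product B", "Product C"]
prices = ["$24.99", "$64.99", "$74.99"]

# zip pairs items by position and yields one tuple per product.
pairs = list(zip(titles, prices))
first_title, first_price = pairs[0]  # tuple unpacking, as in the for loop

# zip stops at the shortest input, so a list that is one element short
# silently drops the trailing record instead of raising an error.
short_prices = prices[:2]
truncated = list(zip(titles, short_prices))

# with open closes the file when the block exits, even if an error occurs.
path = os.path.join(tempfile.mkdtemp(), "products.txt")  # hypothetical file name
with open(path, "w", encoding="utf-8") as f:
    for title, price in pairs:
        f.write(f"{title}\t{price}\n")
file_closed_after_block = f.closed

# Read the file back to confirm what was written.
with open(path, "r", encoding="utf-8") as f:
    lines = f.read().splitlines()

print(pairs)
print(len(truncated), file_closed_after_block, lines)
```

Because of the truncation behaviour, it may be worth comparing the lengths of the selected lists before zipping them in the scraper, so a mismatched selector is noticed rather than hidden.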