Using what we covered earlier, we can write a simple spider and then improve it step by step:
# -*- coding: utf-8 -*-
import scrapy


class QuotesSpider(scrapy.Spider):
    name = 'quotes'
    allowed_domains = ['quotes.toscrape.com']
    start_urls = ['http://quotes.toscrape.com/']

    def parse(self, response):
        # Each quote on the page lives in a node with class="quote"
        quotes = response.xpath('//*[@class="quote"]')
        for quote in quotes:
            # extract_first() returns a single string (or None);
            # extract() returns a list of all matches
            text = quote.xpath('.//*[@class="text"]/text()').extract_first()
            author = quote.xpath('.//*[@itemprop="author"]/text()').extract_first()
            tags = quote.xpath('.//*[@itemprop="keywords"]/@content').extract()
            print('\n')
            print(text)
            print(author)
            print(tags)
            print('\n')
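To see what those XPath expressions actually match, here is a minimal stand-alone sketch using only the standard library. The sample HTML below is a made-up, well-formed stand-in for one quote block on quotes.toscrape.com (the real page is larger and is parsed by Scrapy's own selectors, not ElementTree):

```python
import xml.etree.ElementTree as ET

# Tiny, well-formed sample mimicking one quote block (illustrative only)
html = """
<div>
  <div class="quote">
    <span class="text">"Hello"</span>
    <small itemprop="author">Jane Doe</small>
    <meta itemprop="keywords" content="life,example" />
  </div>
</div>
"""

root = ET.fromstring(html)
# Same attribute predicates as the spider's XPath expressions
for quote in root.findall(".//*[@class='quote']"):
    text = quote.find(".//*[@class='text']").text
    author = quote.find(".//*[@itemprop='author']").text
    tags = quote.find(".//*[@itemprop='keywords']").get('content')
    print(text, author, tags)
```

ElementTree only supports a limited XPath subset, but attribute predicates like `[@class='quote']` behave the same way as in the spider above.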
Run the following command from the project's root directory:

scrapy crawl quotes