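1. Install Python. This step needs no explanation.
2. Install the dependencies
2.1 Install wheel, which is needed because the libraries will be installed offline from .whl files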
pip install wheel
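2.2 Install the offline library files
Dependencies that Scrapy uses: lxml, Twisted
Installing Scrapy directly usually fails, mainly because of its dependencies.
The main culprit is Twisted, which fails with: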
error: Microsoft Visual C++ 14.0 is required.
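Installing offline from prebuilt wheels is recommended.
Go to http://www.lfd.uci.edu/~gohlke/pythonlibs/,
press Ctrl+F and search for lxml, Twisted and Scrapy (Twisted is the important one).
Download the file that matches your setup. For example, lxml-3.7.3-cp35-cp35m-win_amd64.whl is lxml version 3.7.3 for 64-bit Python 3.5.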
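To check which file matches the local setup, the tags in a wheel filename can be compared with the running interpreter. The following is a minimal sketch, assuming CPython and the usual name-version-python-abi-platform filename layout; the helper names parse_wheel_filename, local_python_tag and local_bits are made up for this illustration.

```python
import struct
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # Assumed layout: name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def local_python_tag():
    """Return a CPython-style tag for the running interpreter, e.g. 'cp35'."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


def local_bits():
    """Return the pointer width of the running interpreter (32 or 64)."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    tags = parse_wheel_filename("lxml-3.7.3-cp35-cp35m-win_amd64.whl")
    print(tags)
    print("this interpreter:", local_python_tag(), "{}-bit".format(local_bits()))
```

If the "python" tag of the file differs from the tag printed for the local interpreter, or a win_amd64 file is paired with a 32-bit interpreter, pip will most likely refuse the wheel as unsupported on this platform.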
pip install lxml-3.7.3-xxxxxx.whl
pip install Twisted-18.9.0-xxxxx.whl
Scrapy can now be installed, either offline or online.
pip install Scrapy
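The install order above (local wheels first, Scrapy last) could also be scripted. This is only a sketch under stated assumptions: build_pip_commands and install_all are hypothetical helpers, and the wheel names are the placeholders from the steps above, not real filenames.

```python
import subprocess
import sys


def build_pip_commands(wheel_files, final_package="Scrapy"):
    """Return pip argv lists: the local wheels first, then the final package."""
    base = [sys.executable, "-m", "pip", "install"]
    commands = [base + [wheel] for wheel in wheel_files]
    commands.append(base + [final_package])
    return commands


def install_all(wheel_files):
    """Run each pip command in order and stop at the first failure."""
    for command in build_pip_commands(wheel_files):
        subprocess.check_call(command)


if __name__ == "__main__":
    # Placeholder names copied from the steps above; replace with the downloaded files.
    wheels = ["lxml-3.7.3-xxxxxx.whl", "Twisted-18.9.0-xxxxx.whl"]
    for command in build_pip_commands(wheels):
        print(" ".join(command))
```

Running pip through sys.executable keeps the install tied to the same interpreter that runs the script, which helps when several Python versions are installed side by side.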
3. Done
After installation, type scrapy on the command line; it prints the Scrapy version and the available commands:
Scrapy 1.5.1 - XXXXX

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  check         Check spider contracts
  crawl         Run a spider
  edit          Edit spider
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  list          List available spiders
  parse         Parse URL (using its spider) and print the results
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy
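
Besides running the scrapy command, the installation can be checked from Python. A minimal sketch, assuming the import names are lxml, twisted and scrapy; check_installed is a hypothetical helper written for this example.

```python
import importlib.util


def check_installed(module_names):
    """Map each module name to True if it can be found in this environment."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


if __name__ == "__main__":
    for name, found in check_installed(["lxml", "twisted", "scrapy"]).items():
        print("{:<8} {}".format(name, "OK" if found else "missing"))
```

A "missing" entry suggests the package went into a different Python installation than the one running the script.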