anaconda
Tsinghua mirror: https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/?C=M&O=A
Environment variables: https://blog.csdn.net/baidu_32542573/article/details/79361456
Further reading:
https://www.jianshu.com/p/eaee1fadc1e9
https://www.cnblogs.com/zhusleep/p/5616099.html
chromedriver
Matching Chrome versions: https://blog.csdn.net/huilan_same/article/details/51896672
Scripts path under Anaconda: C:\ProgramData\Anaconda3\Scripts
geckodriver
Firefox browser driver
Download: https://github.com/mozilla/geckodriver/releases/
phantomjs
Download: http://phantomjs.org/download.html
Unzip the archive and copy the exe file from bin into the Scripts directory, or put the bin folder there directly.
Selenium no longer supports it: https://blog.csdn.net/qq_30242609/article/details/79323963
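A quick way to confirm that the driver executables from the three sections above ended up somewhere on PATH (for example in the Anaconda Scripts directory) is sketched below. The executable names in DRIVERS are assumptions based on the usual names of the downloaded files, and find_drivers is a hypothetical helper, not part of Selenium.

```python
import shutil

# Assumed executable names; adjust them if the downloaded files are named differently.
DRIVERS = ("chromedriver", "geckodriver", "phantomjs")


def find_drivers(names=DRIVERS):
    """Map each driver name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


for name, path in find_drivers().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

On Windows, shutil.which also tries the extensions listed in PATHEXT, so chromedriver.exe is matched by the name chromedriver.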
pycharm
aiohttp
lxml
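Download: https://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml
cp denotes the Python version (for example, cp36 is CPython 3.6).
Open cmd, change to the folder where the lxml file was downloaded, and run the following command: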
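To pick the right file, it can help to print the cp tag of the interpreter that will install it. This is a minimal sketch that assumes CPython; cp_tag is a hypothetical helper name.

```python
import sys


def cp_tag(version_info=sys.version_info):
    """Build the CPython tag from a (major, minor, ...) version tuple."""
    return f"cp{version_info[0]}{version_info[1]}"


# Tag of the interpreter running this script
print(cp_tag())
```

A wheel is only installable if the cp tag in its file name matches this value.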
ana
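Message: You should consider upgrading via the 'python -m pip install --upgrade pip' command.
pip needs to be upgraded.
Check the pip version: pip show pip
Message: You are using pip version 10.0.1, however version 18.1 is available.
Upgrade: python -m pip install --upgrade pip
The upgrade failed.
Anaconda Prompt reported: access denied.
Running Anaconda Prompt as administrator made the upgrade succeed.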
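The pip message compares two version strings. A rough sketch of that comparison is shown below; it assumes purely numeric dotted versions such as the ones in the message, and both function names are hypothetical. Versions with suffixes (for example release candidates) need a real parser such as the packaging library.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '10.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def needs_upgrade(installed, available):
    """True if the installed version is older than the available one."""
    return parse_version(installed) < parse_version(available)


print(needs_upgrade("10.0.1", "18.1"))
```

Comparing tuples of integers avoids the trap of comparing the raw strings, where "9.0.3" would sort after "10.0.1".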
beautiful soup
pip install beautifulsoup4
Code:
from bs4 import BeautifulSoup
import lxml
soup = BeautifulSoup('<p>hello</p>', 'lxml')
print(soup.p.string)
Error:
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
This happens even though lxml is clearly installed:
Requirement already satisfied: lxml in c:\programdata\anaconda3\lib\site-packages (4.2.5)
Eventually found two solutions:
https://www.cnblogs.com/zrdm/p/8490767.html
https://blog.csdn.net/qq_16546829/article/details/79405605
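Independently of the two linked solutions (which are not reproduced here), the sketch below shows one way to see which file Python would actually import as lxml, and to fall back to the parser in the standard library when lxml cannot be used. module_origin and parse_with_fallback are hypothetical helper names.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def parse_with_fallback(html):
    """Parse html with lxml if bs4 can use it, otherwise with html.parser."""
    # bs4 is imported here so that module_origin() works even without bs4 installed.
    from bs4 import BeautifulSoup, FeatureNotFound
    try:
        return BeautifulSoup(html, "lxml")
    except FeatureNotFound:
        return BeautifulSoup(html, "html.parser")


# A path outside site-packages here would suggest a local file is shadowing lxml.
print("lxml resolves to:", module_origin("lxml"))
```

With bs4 installed, parse_with_fallback('<p>hello</p>').p.string can replace the direct BeautifulSoup call in the code above.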
pyquery
tesserocr
Install tesseract first.
Download: https://digi.bib.uni-mannheim.de/tesseract/
pip install tesserocr pillow
import tesserocr
Error.
Workaround:
Download the whl file.
Download: https://github.com/simonflueckiger/tesserocr-windows_build/releases
Add the Tesseract-OCR directory to the environment variables.
tesseract image.png result -l eng && cat result.txt
Error.
Remove "&& cat result.txt" (cat is not a built-in command in Windows cmd).
from PIL import Image
Error: ImportError: DLL load failed: The specified module could not be found.
Uninstall and reinstall:
pip uninstall Pillow
pip install Pillow
image = Image.open('2.png')
RuntimeError: Failed to init API, possibly an invalid tessdata path: D:\ProgramData\Anaconda3\
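Analysis: no global variable was configured, so the data conversion could not run across drives. Add an entry to the environment variables:
Add a variable named TESSDATA_PREFIX whose value is my install path: C:\Program Files (x86)\Tesseract-OCR\tessdata
RuntimeError: Failed to init API, possibly an invalid tessdata path: C:\Program Files (x86)\Tesseract-OCR\
Uninstalling and reinstalling Tesseract still did not help; I tried every method I could find online.
In the end I found the cause myself: the versions did not match. The latest tesserocr build corresponds to Tesseract 3.05.02, while the previous build corresponded to 4.0. Uninstalling Tesseract and installing 3.05 solved the problem.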
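Before reinstalling anything, a small check like the one below can show whether TESSDATA_PREFIX points at a folder that actually contains the English language data. It is only a sanity check and would not have detected the version mismatch described above. find_traineddata is a hypothetical helper, and it looks in both the folder itself and a tessdata subfolder on the assumption that Tesseract versions differ in which of the two they expect.

```python
import os


def find_traineddata(prefix, lang="eng"):
    """Return the path of <lang>.traineddata under prefix, or None if it is missing."""
    candidates = [
        os.path.join(prefix, f"{lang}.traineddata"),
        os.path.join(prefix, "tessdata", f"{lang}.traineddata"),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


prefix = os.environ.get("TESSDATA_PREFIX", "")
print("TESSDATA_PREFIX =", prefix or "(not set)")
print("eng.traineddata:", find_traineddata(prefix) if prefix else None)
```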
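Databases
Relational databases: SQLite, MySQL, and other SQL-based systems. Data is stored in tables.
Non-relational databases: MongoDB. Data is stored as documents made of key-value pairs.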
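The difference between the two storage styles can be illustrated with the standard library alone. This is only a sketch: SQLite stands in for the relational side, and a plain dict stands in for a MongoDB-style document (no MongoDB server or driver is used). The table name, column names, and values are made up for the example.

```python
import sqlite3

# Relational: every row in the table has the same fixed columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO pages VALUES (?, ?)", ("https://example.com", "Example"))
row = conn.execute("SELECT url, title FROM pages").fetchone()
conn.close()

# Document style: key-value pairs, where each document may carry different keys.
document = {"url": "https://example.com", "title": "Example", "tags": ["demo"]}

print(row)
print(document["tags"])
```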
MySQL
Official site: https://www.mysql.com/
Download tutorial: https://www.jianshu.com/p/2337d8fd0863
Installation tutorial: http://www.cnblogs.com/zlslch/p/6961598.html
PyMySQL
pip install pymysql
Flask
pip install flask
Tornado
pip install tornado
Charles
Download: https://www.charlesproxy.com/latest-release/download.do
mitmproxy
pip install mitmproxy
Appium
Download: https://github.com/appium/appium-desktop/releases
https://nodejs.org/en/download/
Tutorial: https://www.runoob.com/nodejs/nodejs-install-setup.html
npm install -g appium
http://www.android-studio.org/index.php/download
pyspider
pip install pyspider
Scrapy
conda install Scrapy
Scrapy-Splash
Scrapy-Redis
Download: https://oomake.com/download/docker-windows
Scrapyd
Scrapyd-Client
Scrapyd API
pip install python-scrapyd-api
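Once everything is installed, a short script can report which of the Python packages from these notes are still missing from the current environment. The import names in MODULES are assumptions about how each package is imported (for example, beautifulsoup4 as bs4, Pillow as PIL, python-scrapyd-api as scrapyd_api), and missing_modules is a hypothetical helper.

```python
import importlib.util

# Assumed import names for the packages listed in these notes.
MODULES = [
    "aiohttp", "lxml", "bs4", "pyquery", "tesserocr", "PIL",
    "pymysql", "flask", "tornado", "mitmproxy", "pyspider",
    "scrapy", "scrapy_splash", "scrapy_redis", "scrapyd", "scrapyd_api",
]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing:", missing_modules(MODULES))
```

The check only locates each module without importing it, so a package that is present but broken (such as the DLL load failure seen with Pillow) would still be reported as found.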