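Logging in with selenium often requires entering a CAPTCHA. I tried many recognition methods, but none of them was accurate enough (tesseract was the most reliable). The following approaches can be used instead:
- Remove the CAPTCHA
- Set up a universal CAPTCHA value that always passes
- Use a cookie in place of the login
I used the third approach: first log in manually in a browser, then inspect the request headers and copy them into the headers that selenium sends.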
# -*- coding: utf-8 -*-
from selenium import webdriver
import os
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'en-US,en;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Cookie': 'thinkphp_show_page_trace=0|0; token=lufs9rm61fl5jqecb3gqh56i66; Hm_lvt_080836300300be57b7f34f4b3e97d911=1478065312,1478138929; Hm_lpvt_080836300300be57b7f34f4b3e97d911=1478140277',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36',
}
# Register every header as a PhantomJS custom header capability
for key in headers:
    webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.customHeaders.{}'.format(key)] = headers[key]
phantomjs_path = r"E:\app\phantomjs-2.1.1-windows\bin\phantomjs.exe"
driver = webdriver.PhantomJS(executable_path=phantomjs_path, service_log_path=os.path.devnull)
driver.get("url")
driver.maximize_window()  # maximize the browser window
print(driver.title)
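
If the driver in use does not support custom headers, the same cookie string can probably be applied through selenium's `add_cookie` method instead. The following is only a minimal sketch of that idea, not the method used above: `parse_cookie_header` and `apply_cookies` are hypothetical helper names, and it assumes the copied `Cookie` header is a plain `name=value; name=value` string.

```python
def parse_cookie_header(cookie_header):
    """Split a raw Cookie request-header value into name/value dicts."""
    cookies = []
    for pair in cookie_header.split(';'):
        pair = pair.strip()
        # Skip empty fragments and anything that is not a name=value pair
        if not pair or '=' not in pair:
            continue
        # Split on the first '=' only, since values may contain '='
        name, value = pair.split('=', 1)
        cookies.append({'name': name.strip(), 'value': value.strip()})
    return cookies


def apply_cookies(driver, cookie_header):
    """Add each parsed cookie to a driver (hypothetical helper).

    Assumes `driver` is a selenium WebDriver that has already loaded
    a page on the target domain.
    """
    for cookie in parse_cookie_header(cookie_header):
        driver.add_cookie(cookie)
```

A typical flow under these assumptions would be to call `driver.get(...)` once so the browser is on the site's domain, call `apply_cookies(driver, headers['Cookie'])`, and then load the page again so that the request carries the logged-in cookies.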