Getting Started with Python Web Crawlers

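I. Environment setup: installing PyCharm

PyCharm crack guide: https://www.exception.site/essay/how-to-free-use-pycharm-2020

PyCharm official download: https://www.jetbrains.com/pycharm/download

PyCharm sets up pip automatically in the interpreter environment it creates for a project.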

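II. Other official sites and documentation

Python official download (the installer also installs pip): https://www.python.org/

Installing and using pip: https://www.runoob.com/w3cnote/python-pip-install-usage.html

requests (quickstart): https://requests.readthedocs.io/zh_CN/latest/

WebDriver matching your browser version: https://sites.google.com/a/chromium.org/chromedriver/home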

III. A simple crawler example

1. Install selenium
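
The package is normally installed with `pip install selenium`, either from a terminal or through PyCharm's package manager. As a quick check that the install landed in the interpreter the project actually uses, a minimal sketch follows; the helper name `is_installed` is made up for this illustration.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the top-level package can be found by the current interpreter."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed("selenium"):
        print("selenium is available")
    else:
        # Install it into this interpreter with: python -m pip install selenium
        print("selenium is missing")
```

If the script reports that selenium is missing even though pip reported success, the package was most likely installed into a different interpreter than the one selected for the project.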


2. Add the downloaded Chrome WebDriver to the project
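
Judging from the code in step 3, the driver is expected in a `drivers` folder inside the project directory, under the Windows file name `chromedriver.exe`. On Linux and macOS the binary is simply called `chromedriver`. The sketch below builds the expected location for either case and reports whether the file is there; the helper names `chromedriver_filename` and `chromedriver_path` are hypothetical, and using the current working directory as the project directory is an assumption.

```python
import platform
from pathlib import Path


def chromedriver_filename(system=None):
    """Return the driver file name for an OS name (defaults to the current OS)."""
    system = system or platform.system()
    return "chromedriver.exe" if system == "Windows" else "chromedriver"


def chromedriver_path(project_dir, system=None):
    """Build <project_dir>/drivers/<driver file name> without touching the disk."""
    return Path(project_dir) / "drivers" / chromedriver_filename(system)


if __name__ == "__main__":
    # Assumption: the script is started from the project directory.
    path = chromedriver_path(Path.cwd())
    print(path, "found" if path.is_file() else "missing")
```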


3. Simple crawler code

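import os
from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Locate chromedriver.exe in the "drivers" folder next to this script.
project_path = os.path.abspath(os.path.dirname(__file__))
drivers_path = os.path.join(project_path, 'drivers')
chrome_driver_path = os.path.join(drivers_path, 'chromedriver.exe')

print(chrome_driver_path)


def learning():
    # Selenium 4 takes the driver path through a Service object;
    # the old executable_path argument has been removed.
    driver = webdriver.Chrome(service=Service(chrome_driver_path))
    try:
        sleep(2)  # the pauses make each step visible in the browser window
        driver.get("http://www.python.org")
        sleep(2)
        assert "Python" in driver.title
        # find_element_by_name() was removed in Selenium 4; use By.NAME instead.
        elem = driver.find_element(By.NAME, "q")
        sleep(2)
        elem.clear()
        sleep(2)
        elem.send_keys("pycon")
        sleep(2)
        elem.send_keys(Keys.RETURN)
        assert "No results found." not in driver.page_source
        sleep(10)
    finally:
        # quit() closes the browser and also stops the chromedriver process.
        driver.quit()


if __name__ == "__main__":
    learning()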
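
The fixed `sleep(2)` pauses in the script wait the same amount of time whether or not the page is ready. Selenium ships `WebDriverWait` for waiting on a condition instead; the plain-Python sketch below only illustrates the underlying idea of polling until something becomes available. The helper `wait_until` and its parameters are hypothetical and not part of Selenium.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value, then return it.

    Exceptions raised by condition() are treated as "not ready yet".
    Raises TimeoutError if nothing truthy comes back within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception:
            pass  # e.g. the element is not in the page yet
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

Inside `learning()`, the search box lookup could then be wrapped as `elem = wait_until(lambda: driver.find_element(By.NAME, "q"))`, assuming `By` is imported from `selenium.webdriver.common.by`. The call returns as soon as the element exists rather than after a fixed delay.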
