Playwright错误处理与重试机制实现
在实际的自动化测试和网络爬虫开发中,稳定性是衡量脚本质量的重要指标。即使编写了最完善的Playwright脚本,也不可避免地会遇到各种运行时错误:元素加载延迟、网络波动、页面响应超时等。本文将分享如何构建健壮的Playwright错误处理与重试机制。
为什么需要错误处理机制?
我曾遇到过这样的场景:一个精心编写的爬虫脚本在本地运行完美,但放到服务器上却频繁失败。调查后发现,服务器与目标网站之间的网络延迟较高,导致元素加载时间超出预期。这就是缺乏错误处理机制的典型表现。
基础错误处理策略
1. 智能等待替代硬性等待
新手常犯的错误是使用page.waitForTimeout(5000)这样的固定等待。更好的做法是使用Playwright内置的智能等待方法:
<pre data-tool="mdnice编辑器" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0); margin: 10px 0px; padding: 0px; outline: 0px; max-width: 100%; box-sizing: border-box !important; overflow-wrap: break-word !important; border-radius: 5px; box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px; text-align: left; visibility: visible;">// 不推荐 - 硬性等待 await page.waitForTimeout(5000); await page.click('[#submit](javascript:;)'); // 推荐 - 智能等待 await page.waitForSelector('[#submit](javascript:;)', { state: 'visible' }); await page.click('[#submit](javascript:;)'); </pre>
2. 异常捕获基础
最简单的错误处理是try-catch块:
<pre data-tool="mdnice编辑器" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0); margin: 10px 0px; padding: 0px; outline: 0px; max-width: 100%; box-sizing: border-box !important; overflow-wrap: break-word !important; border-radius: 5px; box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px; text-align: left; visibility: visible;">async function safeClick(page, selector) { try { await page.click(selector); return true; } catch (error) { console.warn(点击元素 ${selector} 失败:, error.message); return false; } } </pre>
实现重试机制
基础重试函数
下面是一个通用的重试函数,可以包装任何可能失败的操作:
<pre data-tool="mdnice编辑器" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0); margin: 10px 0px; padding: 0px; outline: 0px; max-width: 100%; box-sizing: border-box !important; overflow-wrap: break-word !important; border-radius: 5px; box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px; text-align: left;">async function withRetry( operation, maxAttempts = 3, delay = 1000, backoffFactor = 2 ) { let lastError; for (let attempt = 1; attempt <= maxAttempts; attempt++) { try { returnawait operation(); } catch (error) { lastError = error; console.log(尝试 {maxAttempts} 失败:
, error.message); if (attempt === maxAttempts) break; const waitTime = delay * Math.pow(backoffFactor, attempt - 1); console.log(等待 ${waitTime}ms 后重试...); awaitnewPromise(resolve => setTimeout(resolve, waitTime)); } } throw lastError; } </pre>
页面操作重试包装器
针对常见的页面操作,我们可以创建专门的重试包装器:
<pre data-tool="mdnice编辑器" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0); margin: 10px 0px; padding: 0px; outline: 0px; max-width: 100%; box-sizing: border-box !important; overflow-wrap: break-word !important; border-radius: 5px; box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px; text-align: left;">class RetryablePage { constructor(page) { this.page = page; } async clickWithRetry(selector, options = {}) { const maxAttempts = options.maxAttempts || 3; for (let attempt = 1; attempt <= maxAttempts; attempt++) { try { // 确保元素可见且可点击 awaitthis.page.waitForSelector(selector, { state: 'visible', timeout: 10000 }); awaitthis.page.click(selector); return; } catch (error) { console.log(点击 {attempt}/
{selector} 均失败:
{response.status()}:
{url} 失败 (尝试
{maxAttempts}):
, error.message); if (attempt === maxAttempts) { throw error; } // 如果是网络错误,尝试刷新页面 if (error.message.includes('net::') || error.message.includes('Navigation')) { awaitthis.page.reload(); } awaitthis.page.waitForTimeout(2000 * attempt); } } } } </pre>
Playwright Test中的重试机制
如果你使用Playwright Test框架,它内置了重试功能:
<pre data-tool="mdnice编辑器" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0); margin: 10px 0px; padding: 0px; outline: 0px; max-width: 100%; box-sizing: border-box !important; overflow-wrap: break-word !important; border-radius: 5px; box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px; text-align: left;">// playwright.config.js module.exports = { // 全局重试配置 retries: process.env.CI ? 2 : 1, use: { // 操作失败时的截图配置 screenshot: 'only-on-failure', // 视频录制配置 video: 'retain-on-failure', }, // 项目级别的重试配置 projects: [ { name: 'chromium', retries: 2, use: { browserName: 'chromium' }, }, ], }; </pre>
自定义测试重试逻辑
对于需要特殊处理的重试场景,可以在测试内部实现:
<pre data-tool="mdnice编辑器" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0); margin: 10px 0px; padding: 0px; outline: 0px; max-width: 100%; box-sizing: border-box !important; overflow-wrap: break-word !important; border-radius: 5px; box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px; text-align: left;">import { test, expect } from'@playwright/test'; test('重要的支付流程测试', async ({ page }) => { let paymentSuccessful = false; for (let attempt = 1; attempt <= 3; attempt++) { try { // 执行支付流程 await page.goto('/payment'); await page.fill('[#card](javascript:;)-number', '4111111111111111'); await page.click('[#pay](javascript:;)-now'); // 验证支付成功 await expect(page.locator('.success-message')).toBeVisible({ timeout: 10000 }); paymentSuccessful = true; break; } catch (error) { console.log(支付测试尝试 ${attempt} 失败:, error.message); if (attempt === 3) { throw error; } // 清理状态,准备重试 await page.goto('/cart'); await page.waitForTimeout(2000 * attempt); } } expect(paymentSuccessful).toBeTruthy(); }); </pre>
高级错误处理策略
1. 错误分类与不同处理策略
<pre data-tool="mdnice编辑器" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0); margin: 10px 0px; padding: 0px; outline: 0px; max-width: 100%; box-sizing: border-box !important; overflow-wrap: break-word !important; border-radius: 5px; box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px; text-align: left;">class ErrorHandler { static shouldRetry(error) { const retryableErrors = [ 'TimeoutError', 'NetworkError', 'net::', 'Target closed', 'Element not found' ]; const errorMessage = error.toString(); return retryableErrors.some(retryableError => errorMessage.includes(retryableError) ); } static classifyError(error) { const message = error.toString(); if (message.includes('Timeout')) { return'TIMEOUT'; } elseif (message.includes('net::')) { return'NETWORK'; } elseif (message.includes('not found') || message.includes('not visible')) { return'ELEMENT_NOT_FOUND'; } else { return'UNKNOWN'; } } } asyncfunction resilientOperation(operation) { const maxAttempts = 3; let lastError; for (let attempt = 1; attempt <= maxAttempts; attempt++) { try { returnawait operation(); } catch (error) { lastError = error; if (!ErrorHandler.shouldRetry(error) || attempt === maxAttempts) { break; } const errorType = ErrorHandler.classifyError(error); const delay = this.calculateDelay(attempt, errorType); console.log([{attempt} 失败,${delay}ms 后重试
); awaitnewPromise(resolve => setTimeout(resolve, delay)); } } throw lastError; } </pre>
2. 上下文恢复与状态重置
某些错误需要重置浏览器上下文:
<pre data-tool="mdnice编辑器" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0); margin: 10px 0px; padding: 0px; outline: 0px; max-width: 100%; box-sizing: border-box !important; overflow-wrap: break-word !important; border-radius: 5px; box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px; text-align: left;">async function withContextRecovery(browser, operation) { let context; let page; for (let attempt = 1; attempt <= 3; attempt++) { try { if (!context || context._closed) { context = await browser.newContext(); page = await context.newPage(); } returnawait operation(page); } catch (error) { console.log(操作失败,尝试 ${attempt}/3:, error.message); if (attempt === 3) { throw error; } // 清理旧上下文 if (context && !context._closed) { await context.close(); } // 短暂等待后继续 awaitnewPromise(resolve => setTimeout(resolve, 1000 * attempt)); } } }</pre>
实战:完整的爬虫错误处理示例
<pre data-tool="mdnice编辑器" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0); margin: 10px 0px; padding: 0px; outline: 0px; max-width: 100%; box-sizing: border-box !important; overflow-wrap: break-word !important; border-radius: 5px; box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px; text-align: left;">const { chromium } = require('playwright'); class RobustCrawler { constructor() { this.maxRetries = 3; this.requestTimeout = 30000; } async crawl(url) { const browser = await chromium.launch(); const results = []; try { for (let retry = 1; retry <= this.maxRetries; retry++) { try { const page = await browser.newPage(); // 设置超时 page.setDefaultTimeout(this.requestTimeout); // 监听请求失败 page.on('requestfailed', request => { console.warn(请求失败: {request.failure().errorText}
); }); // 访问页面 console.log(尝试 {this.maxRetries}: 访问
{response.status()}:
{retry} 失败:
, error.message); if (retry === this.maxRetries) { thrownewError(爬取 {error.message}
); } // 指数退避 awaitnewPromise(resolve => setTimeout(resolve, 1000 * Math.pow(2, retry - 1)) ); } } } finally { await browser.close(); } return results; } async extractData(page) { // 使用选择器重试提取数据 const extractWithRetry = async (selector, extractor) => { for (let i = 0; i < 3; i++) { try { await page.waitForSelector(selector, { timeout: 5000 }); returnawait extractor(); } catch (error) { if (i === 2) throw error; await page.waitForTimeout(1000); } } }; const title = await extractWithRetry('h1', async () => { returnawait page.textContent('h1'); }); return { title }; } } </pre>
监控与日志记录
完善的错误处理还需要良好的监控:
<pre data-tool="mdnice编辑器" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0); margin: 10px 0px; padding: 0px; outline: 0px; max-width: 100%; box-sizing: border-box !important; overflow-wrap: break-word !important; border-radius: 5px; box-shadow: rgba(0, 0, 0, 0.55) 0px 2px 10px; text-align: left;">class MonitoringErrorHandler { constructor() { this.errors = []; this.stats = { totalOperations: 0, failedOperations: 0, retriedOperations: 0, recoveredOperations: 0 }; } async trackOperation(operationName, operation) { this.stats.totalOperations++; const startTime = Date.now(); try { const result = await operation(); return result; } catch (error) { this.stats.failedOperations++; this.errors.push({ operation: operationName, error: error.message, timestamp: newDate().toISOString(), duration: Date.now() - startTime }); // 可以发送到监控系统 this.reportToMonitoringSystem(error, operationName); throw error; } } reportToMonitoringSystem(error, operationName) { // 发送到Sentry, Datadog等 console.error([MONITORING] ${operationName} failed:, error.message); } getStats() { return { ...this.stats, successRate: ((this.stats.totalOperations - this.stats.failedOperations) / this.stats.totalOperations * 100).toFixed(2) + '%' }; } } </pre>
最佳实践总结
分级处理策略:根据错误类型采取不同的重试策略,网络错误可以立即重试,业务错误可能需要延迟重试。
避免无限重试:始终设置最大重试次数,避免陷入死循环。
指数退避算法:重试间隔应逐渐增加,避免对目标服务器造成压力。
上下文隔离:每次重试前清理状态,确保测试的独立性。
详细日志记录:记录每次重试的上下文信息,便于问题排查。
监控集成:将错误信息集成到现有监控系统,实现主动告警。
用户可配置:将重试参数(次数、延迟等)设计为可配置项,适应不同场景需求。
结语
错误处理不是Playwright脚本的事后考虑,而是应该在设计初期就纳入架构的重要部分。一个健壮的脚本不仅要能完成任务,更要能优雅地处理失败。通过实现智能的重试机制,你的自动化脚本将能够在生产环境中稳定运行,显著减少人工干预的需要。
记住,好的错误处理机制是透明的一一当它正常工作时,用户几乎感觉不到它的存在;当问题出现时,它又能提供足够的信息帮助快速定位问题。这才是真正有价值的自动化解决方案。