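Puppeteer's network interception lets developers intercept, modify, block, and monitor network requests, which is useful for testing, debugging, and performance optimization.
1. Enabling request interception
Call page.setRequestInterception(true) to enable request interception.
```javascript
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Enable request interception
  await page.setRequestInterception(true);

  // Register a request listener
  page.on('request', (request) => {
    // Handle the request; here it is passed through unchanged
    request.continue();
  });

  await page.goto('https://example.com');
  await browser.close();
})();
```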
2. Basic interception operations
Continue a request (continue):
```javascript
page.on('request', (request) => {
  request.continue(); // Let the original request proceed
});
```
Block a request (abort):
```javascript
page.on('request', (request) => {
  if (request.resourceType() === 'image') {
    request.abort(); // Block image loading
  } else {
    request.continue();
  }
});
```
Modify a request (continue with overrides):
```javascript
page.on('request', (request) => {
  // Override the request headers, keeping the existing ones
  request.continue({
    headers: {
      ...request.headers(),
      'Authorization': 'Bearer token123'
    }
  });
});
```
Answer a request yourself (respond):
```javascript
page.on('request', (request) => {
  if (request.url().includes('/api/data')) {
    // Return a mock response without hitting the server
    request.respond({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ data: 'mock data' })
    });
  } else {
    request.continue();
  }
});
```
3. Intercepting specific resource types
```javascript
page.on('request', (request) => {
  const resourceType = request.resourceType();

  // Block images, fonts, and stylesheets
  if (['image', 'font', 'stylesheet'].includes(resourceType)) {
    request.abort();
  } else {
    request.continue();
  }
});
```
Resource types include:
- document: the HTML document
- stylesheet: CSS stylesheets
- image: images
- media: media files
- font: fonts
- script: scripts
- texttrack: text tracks
- xhr: XMLHttpRequest
- fetch: Fetch API
- eventsource: event sources
- websocket: WebSocket
- manifest: web app manifest
- other: any other type
4. Intercepting specific URLs
```javascript
page.on('request', (request) => {
  const url = request.url();

  // Block a specific domain
  if (url.includes('analytics.com')) {
    request.abort();
  }
  // Mock one API endpoint; this check must come before the broader
  // '/api/' check below, otherwise it would never be reached
  else if (url.includes('/api/mock')) {
    request.respond({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ success: true })
    });
  }
  // Modify all other API requests
  else if (url.includes('/api/')) {
    request.continue({
      headers: {
        ...request.headers(),
        'X-Custom-Header': 'value'
      }
    });
  } else {
    request.continue();
  }
});
```
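Substring checks such as url.includes('analytics.com') also match unrelated URLs, for example a query string that merely mentions the domain. The following is a minimal sketch of matching on the parsed hostname instead. The helper names isBlockedHost and makeDomainBlocker are hypothetical (not part of Puppeteer), and the domain list is illustrative.

```javascript
// Hypothetical helper (not a Puppeteer API): decide whether a URL belongs to
// a blocked domain by comparing its parsed hostname, not a raw substring.
function isBlockedHost(rawUrl, blockedDomains) {
  let hostname;
  try {
    hostname = new URL(rawUrl).hostname;
  } catch {
    return false; // not a parseable absolute URL; let it through
  }
  // Match the domain itself or any of its subdomains
  return blockedDomains.some(
    (domain) => hostname === domain || hostname.endsWith(`.${domain}`)
  );
}

// Hypothetical factory: builds a request handler from a domain list.
// `request` is expected to behave like a Puppeteer HTTPRequest.
function makeDomainBlocker(blockedDomains) {
  return (request) => {
    if (isBlockedHost(request.url(), blockedDomains)) {
      request.abort();
    } else {
      request.continue();
    }
  };
}
```

With interception enabled, the handler could be registered as page.on('request', makeDomainBlocker(['analytics.com'])).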
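5. Observing responses
Listen for the response event to inspect what the server returned.
```javascript
page.on('response', async (response) => {
  const url = response.url();
  const status = response.status();

  console.log(`Response from ${url}: ${status}`);

  // Read the response body for selected URLs
  if (url.includes('/api/data')) {
    try {
      const data = await response.json();
      console.log('API Response:', data);
    } catch (err) {
      // The body may be missing or not valid JSON
      console.warn(`Could not parse JSON from ${url}:`, err.message);
    }
  }
});
```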
6. Practical scenarios
Scenario 1: Blocking ads and trackers
```javascript
const blockedDomains = ['ads.com', 'analytics.com', 'tracker.com'];

page.on('request', (request) => {
  const url = request.url();

  if (blockedDomains.some(domain => url.includes(domain))) {
    request.abort();
  } else {
    request.continue();
  }
});
```
Scenario 2: API mocking (for tests)
```javascript
const mockResponses = {
  '/api/users': { users: [{ id: 1, name: 'John' }] },
  '/api/posts': { posts: [{ id: 1, title: 'Hello' }] }
};

page.on('request', (request) => {
  const url = request.url();

  for (const [path, mockData] of Object.entries(mockResponses)) {
    if (url.includes(path)) {
      request.respond({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify(mockData)
      });
      return; // Already answered; do not fall through to continue()
    }
  }

  request.continue();
});
```
Scenario 3: Adding an authentication header
```javascript
const authToken = 'your-auth-token';

page.on('request', (request) => {
  const headers = {
    ...request.headers(),
    'Authorization': `Bearer ${authToken}`
  };

  request.continue({ headers });
});
```
Scenario 4: Performance optimization (blocking unnecessary resources)
```javascript
page.on('request', (request) => {
  const resourceType = request.resourceType();

  // Block images, fonts, and media to speed up page loads
  if (['image', 'font', 'media'].includes(resourceType)) {
    request.abort();
  } else {
    request.continue();
  }
});
```
Scenario 5: Monitoring network traffic
```javascript
const requests = [];
const responses = [];

// Passive monitoring: these listeners only record events, so request
// interception does not need to be enabled for them
page.on('request', (request) => {
  requests.push({
    url: request.url(),
    method: request.method(),
    resourceType: request.resourceType(),
    timestamp: Date.now()
  });
});

page.on('response', (response) => {
  responses.push({
    url: response.url(),
    status: response.status(),
    headers: response.headers(),
    timestamp: Date.now()
  });
});

// Use the collected data (run inside an async function)
await page.goto('https://example.com');
console.log('Total requests:', requests.length);
console.log('Total responses:', responses.length);
```
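The collected arrays are usually more useful once aggregated. The following is a small sketch, assuming entries shaped like the ones pushed in the monitoring scenario (url, resourceType, status). The function name summarizeTraffic and the status >= 400 threshold are illustrative choices, not part of Puppeteer.

```javascript
// Hypothetical helper: aggregate the records collected by the
// 'request' and 'response' listeners into a short summary.
function summarizeTraffic(requests, responses) {
  // Count requests per resource type
  const byType = {};
  for (const { resourceType } of requests) {
    byType[resourceType] = (byType[resourceType] || 0) + 1;
  }

  // Collect URLs whose response status indicates a client or server error
  const errorUrls = responses
    .filter(({ status }) => status >= 400)
    .map(({ url }) => url);

  return { totalRequests: requests.length, byType, errorUrls };
}
```

Calling summarizeTraffic(requests, responses) after page.goto() resolves gives a per-type breakdown that can be logged or asserted on in tests.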
7. Error handling
```javascript
page.on('requestfailed', (request) => {
  console.log('Request failed:', request.url());
  console.log('Failure:', request.failure());
});
```
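request.failure() returns an object with an errorText field, or null when the request did not fail, so logging it directly prints a raw object. The following is a minimal sketch of turning it into a single log line. The helper name describeFailure and the 'unknown error' fallback text are assumptions for illustration.

```javascript
// Hypothetical helper: format a failed request as one readable line.
// `request` is expected to behave like a Puppeteer HTTPRequest.
function describeFailure(request) {
  const failure = request.failure(); // null unless the request failed
  const reason = failure ? failure.errorText : 'unknown error';
  return `${request.method()} ${request.url()} failed: ${reason}`;
}
```

It could be used as page.on('requestfailed', (request) => console.warn(describeFailure(request))). Requests that your own handler aborts may also appear in this event, so expect them in the log.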
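Best practices:
- Clean up promptly: remove listeners and disable interception once it is no longer needed
- Use conditions: modify or block only the requests you care about and continue the rest immediately
- Handle failures: add error handling around interception logic
- Mind performance: every intercepted request waits for your handler, which adds latency
- Test and verify: confirm that the interception logic behaves as intended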
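The cleanup advice above can be sketched as a small wrapper that enables interception only for the duration of one task. The name withInterception is hypothetical. The sketch assumes a Puppeteer version whose page object supports page.off() for removing listeners, and that no other code on the same page relies on interception staying enabled.

```javascript
// Hypothetical helper: enable interception, run `task`, then always
// remove the listener and disable interception again.
async function withInterception(page, handler, task) {
  await page.setRequestInterception(true);
  page.on('request', handler);
  try {
    return await task();
  } finally {
    // Runs whether the task succeeded or threw
    page.off('request', handler);
    await page.setRequestInterception(false);
  }
}
```

A call might look like await withInterception(page, (request) => request.continue(), () => page.goto('https://example.com')). The try/finally keeps a failed navigation from leaving interception switched on.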
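Notes:
- Enable request interception before navigating; requests issued before it is enabled are not intercepted
- Every intercepted request must be resolved with exactly one of continue(), abort(), or respond(); otherwise it stays paused
- Interception affects page performance (enabling it also disables the page cache), so use it sparingly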
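- Some requests may not reach the page's request handler, for example requests issued by a service worker; navigation requests themselves can be intercepted and are identifiable with request.isNavigationRequest()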
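The rule that each request is resolved exactly once can be enforced with a guard around the handler. The following is a sketch under these assumptions: the installed Puppeteer version provides request.isInterceptResolutionHandled(), interception runs in the default mode without priorities, and the wrapped callback decides synchronously. The wrapper name safeHandler is hypothetical.

```javascript
// Hypothetical wrapper: guarantees a request is never left paused and
// never resolved twice. `decide` must call continue(), abort(), or
// respond() synchronously, or do nothing at all.
function safeHandler(decide) {
  return (request) => {
    // Another listener may already have resolved this request
    if (request.isInterceptResolutionHandled()) return;

    try {
      decide(request);
    } catch (err) {
      console.error('Interception handler threw:', err);
    }

    // Fall back to continuing so the request does not hang
    if (!request.isInterceptResolutionHandled()) {
      request.continue();
    }
  };
}
```

For example, page.on('request', safeHandler((request) => { if (request.resourceType() === 'image') request.abort(); })) blocks images and lets everything else continue through the fallback. An asynchronous decide callback would need a different design, because the fallback would fire before it finished.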