Puppeteer的入门教程和实践

Puppeteer的入门教程和实践

2023年6月29日发(作者:)

Puppeteer的⼊门教程和实践Puppeter是什么的?Puppeter在github上对⾃⼰的介绍是:Haedless Chrome Node APIpuppeteer是⼀个nodejs的库,⽀持调⽤Chrome的API来操纵Web,相⽐较Selenium或是PhantomJs,它最⼤的特点就是它的操作Dom可以完全在内存中进⾏模拟既在V8引擎中处理⽽不打开浏览器(headless⽆界⾯)。但要注意的是,它虽然很好⽤,但⼀般却不建议⽤来做测试使⽤,因为是专门针对Chrome处理的,当然你也可以根据业务需要来选择。Puppeter能做什么?给了⼏个例⼦,分别是:(1)⽹页截图。(2)⽣成页⾯的PDF。(3)分析当前页的脚本。(4) 写爬⾍(5) ....安装Puppeteer ⾄少需要 Node v6.4.0,如要使⽤ async / await,只有 Node v7.6.0 或更⾼版本才⽀持。如果项⽬路径下没有就先执⾏“npm init”,然后按照提⽰填写完毕后,⽣成⼀个⽂件,然后执⾏:npm i puppeteer我在安装过程中遇到了错误:是在执⾏ 下载Chromium时出错,你也可以通过设置环境变量setPUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1阻⽌下载 Chromium,稍后再⼿动下载,但⼿动下载后还要配置路径,太⿇烦啦,所以解决⽅案是打开FQ软件再重新执⾏下“npm i puppeteer”。使⽤(1)⽹页截图//t puppeteer = require('puppeteer');const config = require('./config/config');(async () => { const browser = await (); const page = await e(); await (''); await shot({ path:`${shot}/${()}.png`, }); await ();})();//t path = require('path')s ={ screenshot:e(__dirname,'../../screenshot')}(2) 将⽹页⽣成pdfconst puppeteer = require('puppeteer');const config = require('./config/config');(async () => { const browser = await (); const page = await e(); await ('',{waitUntil:'networkidle2'}); await ({path: `${t}/${()}.pdf`, format: 'A4'}); await ();})();(3)分析⽹页const puppeteer = require('puppeteer');(async () => { const browser = await (); const page = await e(); await (''); // Get the "viewport" of the page, as reported by the page. const dimensions = await te(() => { return { width: Width, height: Height, deviceScaleFactor: PixelRatio }; }); ('Dimensions:', dimensions); await ();})();(4) 写爬⾍//t puppeteer = require('puppeteer');const config = require('./config/config');const srcToImg = require('./helper/srcToImg');const chalk = require('chalk'); (async () => { const browser = await (); const page = await e(); await ('/'); ('go to /') await wport({ width: 1920, height: 1080 }) ("reset viewpoint"); await ('#kw'); await aracter('单⾝狗'); await ('.s_search'); ((("reset viewpoint"))); ('go to searchlist'); ('load', async () => { ('page loading done,.') const srcs = await te(() => { const images = electorAll('_img'); return (images, img => ); }) h(src => { srcToImg(src,) }); await (); }) })();//t http = require('http');const https = require('https');const path = require('path');const fs = require('fs');const { promisify } = require('util');const writeFile = promisify(ile)s = async (src,dir) =>{ if(/.(jpg|png|gif)$/.test(src)){ await urlToImg(src,dir); }else{ await base64ToImg(src,dir); }}//url => imgconst urlToImg = async (url,dir) =>{ const mod = /^https:/.test(url)?https:http; const ext = e(url); const file = (dir,`${()}${ext}`)

(url, res => { (WriteStream(file)) .on('finish',() =>{ (file); }) })}//base64 => imgconst base64ToImg = async function(base64Str,dir){ const matches = (/^data:(.+?);base64,(.+)$/); try{ const ext = matches[1].split('/')[1] .replace('jpeg','jpg'); const file = (dir,`${()}.${ext}`) await writeFile(file,matches[2],'base64'); (file); }catch(err){ ("⾮法的base64 字符串") }};//t path = require('path')s ={ imgUrl:e(__dirname,'../../images'),

}

发布者:admin,转转请注明出处:http://www.yc00.com/xiaochengxu/1687985791a63948.html

相关推荐

发表回复

评论列表(0条)

  • 暂无评论

联系我们

400-800-8888

在线咨询: QQ交谈

邮件:admin@example.com

工作时间:周一至周五,9:30-18:30,节假日休息

关注微信