$ npm install
$ sls offline
- Add your profile in serverless.yml and run
$ sls deploy
AWS Lambda functions have their own limits, notably on deployment package size.
So when you try to use Puppeteer, your unzipped deployment package easily goes above 250 MB, because installing Puppeteer downloads a recent version of Chromium (~170MB Mac, ~282MB Linux, ~280MB Win) that is guaranteed to work with the API.
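To make the size problem concrete, here is a minimal sketch of the arithmetic. The 250 MB limit and the Chromium sizes are the figures quoted above; the helper name `fitsLambdaLimit` and the 40 MB figure for the rest of node_modules are hypothetical, chosen only for illustration.

```javascript
// Unzipped deployment package limit, in MB (figure quoted above).
const LAMBDA_UNZIPPED_LIMIT_MB = 250;

// Approximate Chromium download sizes quoted above, in MB.
const CHROMIUM_SIZE_MB = { mac: 170, linux: 282, win: 280 };

// Hypothetical helper: sums component sizes and compares against the limit.
function fitsLambdaLimit(componentSizesMb) {
  const total = componentSizesMb.reduce((sum, size) => sum + size, 0);
  return { total, fits: total <= LAMBDA_UNZIPPED_LIMIT_MB };
}

// Lambda runs on Linux, so a usable package would need the Linux build.
// 40 MB for the remaining node_modules is an assumed, illustrative number.
const withChromium = fitsLambdaLimit([CHROMIUM_SIZE_MB.linux, 40]);
const withoutChromium = fitsLambdaLimit([40]);

console.log(withChromium);    // { total: 322, fits: false }
console.log(withoutChromium); // { total: 40, fits: true }
```

The Linux build of Chromium is larger than the limit on its own, so no amount of trimming elsewhere in the package would make it fit.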
The best solution I found for this problem is the Serverless Framework Headless Chrome plugin, i.e.
serverless-plugin-chrome
plugins:
  - serverless-plugin-chrome
Install the following dependencies:
- superagent
- @serverless-chrome/lambda
- puppeteer
$ npm i superagent @serverless-chrome/lambda puppeteer
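To keep the package under the size limit, exclude the Chromium binary that Puppeteer downloads. We can do this in the package section of our serverless.yml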
package:
  exclude:
    - node_modules/puppeteer/.local-chromium/**
Add the following lines to chrome-script.js
const launchChrome = require("@serverless-chrome/lambda");
const request = require("superagent");

module.exports.getChrome = async () => {
  // Start headless Chrome using the binary bundled by serverless-plugin-chrome
  const chrome = await launchChrome();

  // Ask Chrome's debugging endpoint for its version info
  const response = await request
    .get(`${chrome.url}/json/version`)
    .set("Content-Type", "application/json");

  // The WebSocket URL is what Puppeteer connects to
  const endpoint = response.body.webSocketDebuggerUrl;

  return {
    endpoint,
    instance: chrome
  };
};
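For reference, the /json/version endpoint of Chrome's remote debugging interface returns a small JSON document, and getChrome only needs its webSocketDebuggerUrl field. The sketch below isolates that step so a missing field fails loudly; `extractEndpoint` and the sample payload values are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: pull the DevTools WebSocket URL out of the
// /json/version response body, throwing if it is missing.
function extractEndpoint(versionInfo) {
  const endpoint = versionInfo && versionInfo.webSocketDebuggerUrl;
  if (typeof endpoint !== "string" || !endpoint.startsWith("ws")) {
    throw new Error("Chrome did not report a webSocketDebuggerUrl");
  }
  return endpoint;
}

// Sample payload with made-up values, shaped like a /json/version response.
const sample = {
  Browser: "HeadlessChrome/0.0.0.0",
  "Protocol-Version": "1.3",
  webSocketDebuggerUrl: "ws://127.0.0.1:9222/devtools/browser/example-id",
};

console.log(extractEndpoint(sample));
// prints ws://127.0.0.1:9222/devtools/browser/example-id
```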
@serverless-chrome/lambda provides scaffolding for using Headless Chrome during a serverless function invocation. Serverless Chrome takes care of building and bundling the Chrome binaries and making sure Chrome is running when your serverless function executes. In addition, the project provides a few example services for common patterns (e.g. taking a screenshot of a page, printing to PDF, some scraping, etc.)
- Import chrome in our handler.js
const puppeteer = require('puppeteer')
const { getChrome } = require('./chrome-script')
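- Connect it with Puppeteer
// Inside an async handler: start Chrome, then attach Puppeteer to it
const chrome = await getChrome();
const browser = await puppeteer.connect({
  browserWSEndpoint: chrome.endpoint
});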
That's all. You can now use Puppeteer on AWS Lambda.
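Putting the two steps together, a handler might look like the sketch below. It is an assumption about how the pieces fit, not the exact handler from this project: the factory `createHandler`, the handling of the `url` query parameter, and the response shape are all illustrative. The dependencies are passed in so the sketch does not hard-code how they are loaded.

```javascript
// Hypothetical factory: takes getChrome (from chrome-script.js) and the
// puppeteer module, and returns a Lambda-style async handler.
function createHandler({ getChrome, puppeteer }) {
  return async (event) => {
    const params = (event && event.queryStringParameters) || {};
    if (!params.url) {
      return { statusCode: 400, body: "Missing ?url= query parameter" };
    }

    // Start headless Chrome and attach Puppeteer to it.
    const chrome = await getChrome();
    const browser = await puppeteer.connect({
      browserWSEndpoint: chrome.endpoint,
    });

    let page;
    try {
      page = await browser.newPage();
      await page.goto(params.url, { waitUntil: "networkidle0" });
      const content = await page.content();
      return { statusCode: 200, body: content };
    } finally {
      // Close the tab, then detach without killing the browser process.
      if (page) await page.close();
      await browser.disconnect();
    }
  };
}

// In handler.js this could be wired up as follows (the export name
// "content" is a placeholder and must match serverless.yml):
// module.exports.content = createHandler({
//   getChrome: require("./chrome-script").getChrome,
//   puppeteer: require("puppeteer"),
// });
```

Injecting `getChrome` and `puppeteer` also makes the handler easy to exercise with stubs, without launching a real browser.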
$ npm i serverless-offline
$ npm i chrome-launcher
- Make the following request (replace {{URL}} with the page you want to get content for)
curl -X GET \
  'http://localhost:3000?url={{URL}}'
$ sls deploy
- Make the following request (replace {{URL}} with the page you want to get content for and {{lambda_url}} with your lambda url)
curl -X GET \
  '{{lambda_url}}?url={{URL}}'
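Because the target page is passed as a query parameter, it should be URL-encoded; otherwise any ? or & inside it is read as part of the outer request. Here is a small sketch using Node's built-in URL class. `buildRequestUrl` is a hypothetical helper and the addresses are placeholders.

```javascript
// Hypothetical helper: builds the request URL for the local or deployed
// endpoint, percent-encoding the target page address.
function buildRequestUrl(baseUrl, targetUrl) {
  const requestUrl = new URL(baseUrl);
  requestUrl.searchParams.set("url", targetUrl);
  return requestUrl.toString();
}

console.log(buildRequestUrl("http://localhost:3000", "https://example.com/?q=a&b=c"));
// prints http://localhost:3000/?url=https%3A%2F%2Fexample.com%2F%3Fq%3Da%26b%3Dc
```

The resulting string can be passed to curl as-is, in place of the hand-written '...?url={{URL}}' form.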