Commit

fix: Use puppeteer adblocker to block cookies notices
MohamedBassem committed Mar 5, 2024
1 parent fdd60a5 commit 591358c
Showing 3 changed files with 120 additions and 0 deletions.
6 changes: 6 additions & 0 deletions packages/workers/crawler.ts
@@ -18,6 +18,7 @@ import { db } from "@hoarder/db";
import { Browser } from "puppeteer";
import puppeteer from "puppeteer-extra";
import StealthPlugin from "puppeteer-extra-plugin-stealth";
import AdblockerPlugin from "puppeteer-extra-plugin-adblocker";

import metascraper from "metascraper";

@@ -70,6 +71,11 @@ async function launchBrowser() {
export class CrawlerWorker {
  static async build() {
    puppeteer.use(StealthPlugin());
    puppeteer.use(
      AdblockerPlugin({
        blockTrackersAndAnnoyances: true,
      }),
    );
    await launchBrowser();

    logger.info("Starting crawler worker ...");
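
The adblocker plugin's documented options include `blockTrackers` and `blockTrackersAndAnnoyances`; the latter is the one described as covering cookie notices, which is why this commit enables it. As a rough sketch only, the choice of blocking level could be isolated in a small helper. The names `BlockingLevel` and `adblockerOptionsFor` are hypothetical and are not part of this commit.

```typescript
// Hypothetical helper, not part of the commit: maps a coarse blocking level
// to the option object passed to AdblockerPlugin(). The two option names are
// those documented by puppeteer-extra-plugin-adblocker.
interface AdblockerOptions {
  blockTrackers: boolean;
  blockTrackersAndAnnoyances: boolean;
}

type BlockingLevel = "ads" | "trackers" | "annoyances";

function adblockerOptionsFor(level: BlockingLevel): AdblockerOptions {
  return {
    // Ads are blocked by default; trackers are opt-in.
    blockTrackers: level === "trackers",
    // "Annoyances" is the level that includes cookie notices.
    blockTrackersAndAnnoyances: level === "annoyances",
  };
}

// Assumed usage in the worker, before the browser is launched:
//   puppeteer.use(AdblockerPlugin(adblockerOptionsFor("annoyances")));
```

Calling `adblockerOptionsFor("annoyances")` yields the same setting the commit hard-codes, with `blockTrackers` spelled out as `false`.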
1 change: 1 addition & 0 deletions packages/workers/package.json
@@ -26,6 +26,7 @@
    "openai": "^4.26.1",
    "puppeteer": "^22.0.0",
    "puppeteer-extra": "^3.3.6",
    "puppeteer-extra-plugin-adblocker": "^2.13.6",
    "puppeteer-extra-plugin-stealth": "^2.11.2",
    "tsx": "^4.7.1",
    "typescript": "^5",
113 changes: 113 additions & 0 deletions pnpm-lock.yaml
