This repository has been archived by the owner on Sep 29, 2024. It is now read-only.

🐭 Get notified on Slack about new offers on Tutti.ch

livioso/moritz-tutti-scrapy

Moritz

General-purpose Tutti crawler with an optional pipeline that posts to Slack whenever a new offer matching a search term is published on Tutti.ch.

Scrapinghub

  1. Set up a new Scrapinghub project.
  2. Deploy the spider using shub deploy.
  3. Optional: Set SLACK_WEBHOOK and SCRAPINGHUB_API_KEY in the settings of your project to receive Slack notifications.
  4. Run the spider with the desired searchterm argument on Scrapinghub (manually or on a periodic schedule).
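
To illustrate what the optional Slack step amounts to, here is a minimal sketch of posting one offer to a Slack incoming webhook using only the standard library. It is not the pipeline shipped in this repository: the function names build_slack_payload and notify_slack and the offer fields title, price and url are assumptions made for the example. Only the SLACK_WEBHOOK setting name comes from the steps above.

```python
import json
import os
import urllib.request
from typing import Optional


def build_slack_payload(offer: dict) -> dict:
    """Format an offer as the JSON body of a Slack incoming-webhook message.

    The keys "title", "price" and "url" are hypothetical; adapt them to the
    fields the spider actually emits.
    """
    title = offer.get("title", "(no title)")
    price = offer.get("price", "n/a")
    url = offer.get("url", "")
    return {"text": f"New offer: {title} ({price}) {url}".strip()}


def notify_slack(offer: dict, webhook_url: Optional[str] = None) -> bool:
    """Post the offer to Slack; return False when no webhook is configured."""
    webhook_url = webhook_url or os.environ.get("SLACK_WEBHOOK")
    if not webhook_url:
        # Notifications are optional, so a missing webhook is not an error.
        return False
    body = json.dumps(build_slack_payload(offer)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 200
```

Building the payload separately from the network call keeps the message format easy to check without contacting Slack.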

Development

Installation

python3 -m venv .venv
. ./.venv/bin/activate
pip install -r requirements.txt

Add an optional .env file:

# Optional: Slack Webhook to be called
# SLACK_WEBHOOK=https://hooks.slack.com/services/XXXXXXXX/XXXXXXXX/XXXXXXXX

# Optional: Scrapinghub project and key
# (these only make sense for development)
# SCRAPINGHUB_API_KEY=xxx
# SCRAPY_PROJECT_ID=xxx
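
For readers unfamiliar with .env files, the sketch below shows how such a file could be read: blank lines and lines starting with # are ignored, so every setting above stays disabled until it is uncommented. The helper parse_env_lines is hypothetical and written for illustration only; the project may well rely on a dedicated library for this instead.

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # commented-out settings stay unset
        key, separator, value = line.partition("=")
        if not separator:
            continue  # not a KEY=VALUE line
        settings[key.strip()] = value.strip()
    return settings


example = [
    "# SLACK_WEBHOOK=https://hooks.slack.com/services/XXXXXXXX/XXXXXXXX/XXXXXXXX",
    "",
    "SCRAPINGHUB_API_KEY=xxx",
]
print(parse_env_lines(example))
```

In this example only SCRAPINGHUB_API_KEY ends up in the result, because the SLACK_WEBHOOK line is still commented out.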

Running the spider to crawl for a searchterm

Example 1: Crawl the latest roomba offers:

scrapy crawl tutti -a searchterm=roomba

Example 2: Crawl the latest 100 pages of all offers and dump the results to a JSON file:

scrapy crawl tutti -o offers.json -a pages=100
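
Scrapy's JSON feed export writes the scraped items as a single JSON array, so the dump can be inspected with a few lines of Python. The sketch below assumes each offer carries a title field, which is a guess for illustration; check offers.json for the field names the spider really produces. The helpers load_offers and filter_offers are likewise hypothetical.

```python
import json
from pathlib import Path


def load_offers(path):
    """Read a JSON array of offers, as written by `scrapy crawl ... -o file.json`."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def filter_offers(offers, keyword, field="title"):
    """Return offers whose `field` contains `keyword`, ignoring case.

    `field` defaults to "title", an assumed item key.
    """
    needle = keyword.lower()
    return [offer for offer in offers if needle in str(offer.get(field, "")).lower()]
```

For example, filter_offers(load_offers("offers.json"), "roomba") would keep only the offers whose title mentions roomba.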

Screenshot of Slack integration
