This project was completed in week 6 of the Data Science Bootcamp at Spiced Academy in Berlin.
This is a simple implementation of a dockerized data pipeline that posts randomly selected tweets about politics, together with their sentiment scores, to a Slack channel.
The Docker Compose pipeline consists of five containers. With the following folder structure, the data pipeline...
- collects tweets with the Twitter API and tweepy
- stores the tweets in a MongoDB database
- applies an ETL job that
  - extracts the tweets from MongoDB
  - gets the sentiment of each tweet's text with VADERSentiment
- loads the tweets and their sentiment scores into a Postgres database
- creates a Slackbot that posts a randomly selected, anonymized tweet from the Postgres database into a Slack channel.
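
The transform, load, and select steps listed above can be sketched roughly as follows. This is only an illustration, not the code in this repository: an in-memory SQLite database stands in for Postgres, a plain list of dicts stands in for the documents extracted from MongoDB, and `toy_score`, `anonymize`, `run_etl`, and `pick_random_tweet` are hypothetical names. The table layout and the `@user` placeholder are assumptions as well. In the actual pipeline, the score would come from VADER, for example `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`.

```python
import random
import re
import sqlite3
from typing import Callable, Iterable, Tuple


def anonymize(text: str) -> str:
    """Replace @mentions with a neutral placeholder (assumed anonymization rule)."""
    return re.sub(r"@\w+", "@user", text)


def toy_score(text: str) -> float:
    """Hypothetical stand-in for VADER: count positive minus negative words."""
    positive = {"good", "great"}
    negative = {"bad", "awful"}
    words = re.findall(r"[a-z]+", text.lower())
    return float(sum(w in positive for w in words) - sum(w in negative for w in words))


def run_etl(
    tweets: Iterable[dict],
    score: Callable[[str], float],
    conn: sqlite3.Connection,
) -> int:
    """Score every extracted tweet and load text plus score into a SQL table."""
    conn.execute("CREATE TABLE IF NOT EXISTS tweets (text TEXT, sentiment REAL)")
    rows = [(t["text"], score(t["text"])) for t in tweets]
    conn.executemany("INSERT INTO tweets VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def pick_random_tweet(conn: sqlite3.Connection, rng=random) -> Tuple[str, float]:
    """Select one stored tweet at random and anonymize it before posting."""
    rows = conn.execute("SELECT text, sentiment FROM tweets").fetchall()
    text, sentiment = rng.choice(rows)
    return anonymize(text), sentiment


if __name__ == "__main__":
    # SQLite in memory stands in for the Postgres container.
    conn = sqlite3.connect(":memory:")
    # These dicts stand in for documents extracted from MongoDB.
    extracted = [
        {"text": "@alice this policy is great"},
        {"text": "bad debate, @bob"},
    ]
    run_etl(extracted, toy_score, conn)
    # The Slackbot would post this tuple's contents to a Slack channel.
    print(pick_random_tweet(conn))
```

Passing the scoring function as an argument keeps the sketch runnable without vaderSentiment installed; swapping in the VADER compound score requires no other change.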
The pipeline folder, including the docker-compose.yml, is here.
The tweet_collector.py is taken from Paul Wlodkowski's twitter-mongoDB repository.
Various code snippets in tweet_collector.py and slackbot.py are adopted from Krystana Föh's code.