Running on Telegram @NYTtopic
This code maintains a simple Telegram bot which collects fresh updates from the Twitter account of The New York Times and allows the user to look for recent articles on topics of their choice.
Hosted on Amazon EC2, the NYTtopic Bot consists of a pipeline of Docker containers:
➤ the first container runs a Python module which uses Tweepy to access The New York Times's profile via the Twitter API, creating a stream of tweets and storing them in a MongoDB database (second container);
➤ the third container carries out ETL tasks. It uses spaCy to perform named-entity recognition (NER) on the text of each tweet extracted from MongoDB. The recognized entities are then formatted as #hashtags, and all the data are eventually stored in a PostgreSQL database (fourth container);
➤ the fifth container feeds all the data into the Telegram bot, which is controlled and kept online using the python-telegram-bot library;
➤ the sixth and last container runs once per week, removing the records older than a year from both databases, so as to prevent them from growing too large.
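The hashtag formatting done in the ETL step can be illustrated with a minimal sketch. It assumes the entity strings have already been extracted (for example from the `.text` of each item in spaCy's `doc.ents`); the function names `entity_to_hashtag` and `hashtags_for` are hypothetical and not taken from this repository, whose actual formatting rules may differ.

```python
import re


def entity_to_hashtag(entity_text):
    """Turn an entity string such as 'White House' into '#WhiteHouse'.

    Hypothetical helper: the repository's real formatting may differ.
    """
    # Keep runs of letters and digits only; punctuation and spaces split words.
    words = re.findall(r"[^\W_]+", entity_text)
    if not words:
        return ""
    # Capitalize the first character of each word and join without spaces.
    return "#" + "".join(word[0].upper() + word[1:] for word in words)


def hashtags_for(entities):
    """Format a sequence of entity strings, dropping empties and duplicates."""
    tags = []
    for entity in entities:
        tag = entity_to_hashtag(entity)
        if tag and tag not in tags:
            tags.append(tag)
    return tags
```

For instance, `hashtags_for(["White House", "Senate"])` would yield one hashtag per entity, ready to be stored next to the tweet text.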
I hope this bot will be useful whenever you are looking for high-quality information.
📌 STEP 1: Obtain credentials for the Twitter API and the Telegram Bot API
- Open profiles on Twitter and Telegram if you do not already have them.
- Four authentication keys are needed to access Twitter's Streaming API: API Key, API Secret, Access Token and Access Token Secret.
  - You can obtain them by registering an application on apps.twitter.com.
  - Once in possession of the access keys, store them locally as environment variables with the following names: `API_KEY`, `API_SECRET`, `ACCESS_TOKEN`, `SECRET_ACCESS_TOKEN`.
- Authentication to the Telegram Bot API is comparatively easier, as you only need one access token.
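As a minimal sketch of how the four Twitter keys from Step 1 might be read at start-up, the snippet below checks that each environment variable is set before returning them. The function name `load_twitter_credentials` is hypothetical and not taken from this repository; the Telegram token is left out because its variable name is not stated here.

```python
import os

# Variable names as listed in Step 1.
REQUIRED_TWITTER_VARS = (
    "API_KEY",
    "API_SECRET",
    "ACCESS_TOKEN",
    "SECRET_ACCESS_TOKEN",
)


def load_twitter_credentials(env=None):
    """Return the four Twitter keys as a dict, or raise if any is missing.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict for testing.
    """
    env = os.environ if env is None else env
    # Treat unset and empty variables alike.
    missing = [name for name in REQUIRED_TWITTER_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_TWITTER_VARS}
```

Failing early with the list of missing names tends to be easier to debug than an authentication error raised later by the Twitter client.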
- Clone this repository and install Docker if needed.
- Go into the folder `NYTopic_twitter_to_telegram`:
  - run `docker-compose build` and wait for Docker to set up everything for you;
  - run `docker-compose up`. The bot should start responding within a few seconds.
- Open a Telegram chat with your new bot and start browsing The New York Times!
- Add a container for removing old records from Mongo and Postgres.
- Provide the user with links to similar content in other newspapers.
- Make hashtag-based queries possible, so as to return all the available articles related to a precise topic in a single message.