This tool is fine for occasional one-off use, but Twint is CPU-intensive. If you want to do extended crawling of a user's Twitter timeline, or crawl multiple timelines at the same time, I recommend spinning up a virtual machine and running Twint there. This document will show you how.
You will need:

- An account on DigitalOcean
- The doctl command-line app installed
- An SSH public key uploaded to your DigitalOcean account
- Create a Droplet:

```bash
doctl compute droplet create twint \
    --size s-3vcpu-1gb \
    --image ubuntu-18-04-x64 \
    --region nyc1 \
    --ssh-keys $(doctl compute ssh-key list --no-header | head -n1 | awk '{print $3}')
```
- As of this writing, the above command will create a Droplet with 3 CPUs running Ubuntu 18.04, and it will cost $15/mo, or 2.2 cents/hour.
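To confirm the Droplet came up, and to see the public IP that the later commands fetch inline, you can run something like:

```bash
doctl compute droplet list twint --format Name,PublicIPv4,Status
```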
- Next, copy up your tmux configuration files. You are using tmux, right? I recommend running it on your Droplet so that you can disconnect and let long-running instances of Twint continue to run:

```bash
scp ~/.tmux* root@$(doctl compute droplet list twint --format PublicIPv4 --no-header):.
```
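If you don't have any tmux configuration yet, tmux works fine with its defaults, and a minimal ~/.tmux.conf along these lines (just an example, nothing here requires it) is plenty:

```
# ~/.tmux.conf -- a minimal example; tmux runs fine without one
set -g history-limit 50000   # plenty of scrollback for long-running crawls
set -g mouse on              # scroll and switch panes with the mouse
```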
- Now, SSH in, install Docker, and clone this app:

```bash
doctl compute ssh twint     # opens a shell on the Droplet

# The rest of these commands run on the Droplet:
apt-get update
apt-get install -y docker.io pv
docker run hello-world      # sanity check that Docker is working
git clone git@github.com:dmuth/twint-splunk.git
cd twint-splunk
```
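Note that the git@github.com: URL assumes the Droplet has an SSH key that GitHub knows about, which a fresh Droplet won't. If the clone fails, the HTTPS URL for the same repo works without any keys:

```bash
git clone https://github.com/dmuth/twint-splunk.git
```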
- Back on your machine, if you have run Twint locally and wish to copy up your logs/ directory, do so with:

```bash
rsync -avz logs/ root@$(doctl compute droplet list twint --format PublicIPv4 --no-header):twint-splunk/logs
```
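If you want to see what would be transferred before copying anything, rsync's -n (--dry-run) flag prints the file list without sending it:

```bash
rsync -avzn logs/ root@$(doctl compute droplet list twint --format PublicIPv4 --no-header):twint-splunk/logs
```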
- On the Droplet again, start tmux and then grab a user's tweets with this command:

```bash
tmux
./twint-user-by-year USERNAME 2005 2019 | pv -l > /dev/null
```
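Since the whole point of a multi-CPU Droplet is crawling several timelines at once, one way to do that is to give each user their own tmux window. This is just a sketch, with placeholder usernames, to be run from inside the tmux session you just started:

```bash
# One window per timeline; alice, bob, and carol are placeholder usernames.
for user in alice bob carol; do
    tmux new-window -n "$user" -c "$PWD" "./twint-user-by-year $user 2005 2019 | pv -l > /dev/null"
done
```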
- Back on your machine, you can download all the tweets and destroy the Droplet when you're done:

```bash
rsync -avz root@$(doctl compute droplet list twint --format PublicIPv4 --no-header):twint-splunk/logs/ logs/
doctl compute droplet delete twint -f
```
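If you'd rather not delete the Droplet until you know the download finished cleanly, chaining the two commands with && means the delete only runs if rsync exits successfully:

```bash
rsync -avz root@$(doctl compute droplet list twint --format PublicIPv4 --no-header):twint-splunk/logs/ logs/ \
    && doctl compute droplet delete twint -f
```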
- Make a backup of the tweets:

```bash
tar cfvz ~/Dropbox/tweets.tgz logs/
```
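Before trusting that archive as a backup, it's worth listing its contents to confirm your logs actually made it in:

```bash
tar tzvf ~/Dropbox/tweets.tgz | head
```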
- Finally, start up Splunk with `./bin/devel.sh` or `./bin/run.sh` and go to https://localhost:8000/