Requirements and References:
Apache Kafka
https://www.digitalocean.com/community/tutorials/how-to-install-apache-kafka-on-ubuntu-20-04
Version used: https://archive.apache.org/dist/kafka/2.1.1/kafka_2.11-2.1.1.tgz
twint
https://github.com/twintproject/twint
CMAK
https://github.com/yahoo/CMAK
MongoDB
https://linuxhint.com/install_mongodb_ubuntu_20_04/
MongoDB-Compass
https://docs.mongodb.com/compass/current/install/
Java 11+
Python 3.6+
Run the twitter_shell.sh script to install the basic packages, including twint.
Execution steps:
Step 1:
Start Kafka and MongoDB and check that both are running:
$sudo systemctl start kafka
$sudo systemctl status kafka
$sudo systemctl start mongodb
$sudo systemctl status mongodb
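As a complement to the systemctl checks above, the following is a minimal sketch that tests whether the service ports accept TCP connections. The ports are assumptions: 9092 and 2181 are the Kafka and ZooKeeper addresses used later in these steps, and 27017 is the usual MongoDB default, which may differ on your installation. The names port_open and check_services are made up for this illustration.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Assumed ports: 9092 (Kafka broker), 2181 (ZooKeeper), 27017 (MongoDB default).
SERVICES = {"kafka": 9092, "zookeeper": 2181, "mongodb": 27017}


def check_services(host="localhost", services=SERVICES):
    """Map each service name to True or False depending on port reachability."""
    return {name: port_open(host, port) for name, port in services.items()}
```

Calling print(check_services()) prints one True/False entry per service. An open port only shows that something is listening there, so the systemctl status output remains the more reliable check.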
Step 2:
Open the first terminal.
Navigate to your "CMAK" directory and run the following commands:
$cd target/universal/cmak-3.0.0.5
$bin/cmak -java-home /usr/lib/jvm/java-11-openjdk-amd64/
Step 3:
Open the second terminal.
Navigate to your "kafka" home directory and start the ZooKeeper shell:
$bin/zookeeper-shell.sh localhost:2181
Once the ZooKeeper shell is open and waiting for input, run the following commands:
$ls /kafka-manager
$create /kafka-manager/mutex ""
$create /kafka-manager/mutex/locks ""
$create /kafka-manager/mutex/leases ""
Open localhost:9000 in a web browser, add a cluster with the following details, and save: name: (any_name), host: localhost:2181, kafka-version: 2.1.1.
Step 4:
Open the third terminal.
Navigate to your "kafka" home directory and create a topic:
$bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic topic_name
Navigate to your twitter folder and start the producer:
$python3 producer_filename.py --broker-list localhost:9092 --topic topic_name > /dev/null
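The producer script itself is not shown here, so the following is only a sketch of how a script accepting the --broker-list and --topic options above could be organised. It assumes the kafka-python package (KafkaProducer) and sends tweets as JSON; the actual producer may use a different client library or message format. The function names and the placeholder tweet are made up for this illustration.

```python
import argparse
import json


def build_parser():
    """Command-line options matching the producer command in Step 4."""
    parser = argparse.ArgumentParser(description="Send tweets to a Kafka topic")
    parser.add_argument("--broker-list", required=True, help="e.g. localhost:9092")
    parser.add_argument("--topic", required=True, help="Kafka topic to write to")
    return parser


def encode_tweet(tweet):
    """Serialize one tweet (a dict) to UTF-8 encoded JSON bytes."""
    return json.dumps(tweet, ensure_ascii=False).encode("utf-8")


def send_tweets(tweets, broker_list, topic):
    """Send every tweet in the iterable to the given topic."""
    # Assumption: kafka-python is installed. It is imported here so that
    # the helper functions above can be used without it.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=broker_list.split(","),
        value_serializer=encode_tweet,
    )
    for tweet in tweets:
        producer.send(topic, tweet)
    producer.flush()  # block until buffered messages have been sent
    producer.close()


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Placeholder data; a real producer would collect tweets with twint.
    tweets = [{"id": 1, "tweet": "hello kafka"}]
    send_tweets(tweets, args.broker_list, args.topic)
```

A script built this way would call main() at the bottom of the file. Keeping the argument parsing and serialization separate from the Kafka call makes them easy to check without a running broker.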
Step 5:
Open the fourth terminal and start MongoDB Compass:
$mongodb-compass
Connect to the database you want to use.
Open the fifth terminal, navigate to the twitter folder, and start the consumer:
$python3 consumer_filename.py --bootstrap-server localhost:9092 --topic topic_name --from-beginning
Press "Ctrl+C" after the consumer has consumed all the tweets.