Updated implementation of SocialSent, a method for generating domain-specific sentiment scores proposed in "Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora" by Hamilton et al. Code adapted from https://github.com/williamleif/socialsent (ported to Python 3 and simplified).
- Download comment data from http://files.pushshift.io/reddit/comments/ in zst format and save it as filename.zst
- Extract only the posts from the subreddits of interest by running
python filter.py filename.zst subreddit1 subreddit2 ...
- Run
python -m subreddit1 subreddit2 ...
The polarity scores will be saved under
data/subreddit_name/stemmed-polarities.pkl
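
The filtering and loading steps above can be sketched roughly as follows. This is only an illustration, not the code in filter.py: `filter_comments` and `load_polarities` are hypothetical helper names, the sketch assumes the dump has already been decompressed into newline-delimited JSON lines (reading the .zst file directly would need a third-party decompressor such as the `zstandard` package), and it assumes the pickle holds a plain dict mapping words to polarity scores.

```python
import json
import pickle
from typing import Iterable, Iterator


def filter_comments(lines: Iterable[str], subreddits: Iterable[str]) -> Iterator[dict]:
    """Yield parsed comments whose "subreddit" field matches one of `subreddits`.

    Hypothetical helper: each line is assumed to be one JSON object,
    as in the Pushshift comment dumps.
    """
    wanted = {name.lower() for name in subreddits}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            comment = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the run
        if not isinstance(comment, dict):
            continue  # ignore JSON values that are not objects
        if str(comment.get("subreddit", "")).lower() in wanted:
            yield comment


def load_polarities(path: str) -> dict:
    """Load a pickled polarity lexicon.

    Assumption: the file contains a dict of word -> score.
    Only unpickle files you produced yourself; pickle can run arbitrary code.
    """
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

For example, `list(filter_comments(open_lines, ["subreddit1"]))` would keep only the comments from that subreddit, and `load_polarities("data/subreddit_name/stemmed-polarities.pkl")` would return the saved scores, provided the file has the assumed layout.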
In this notebook we apply the method to data from various subreddits and perform some analysis that goes beyond the scope of the paper.