GitHub - Aminoid/supervised-sentiment-analysis: Supervised Learning Techniques for Sentiment Analytics

Supervised Learning Techniques for Sentiment Analytics

Classify tweets or movie reviews as either positive or negative. The project uses logistic regression and a Naive Bayes classifier from Python's well-regarded machine learning package scikit-learn. As a point of reference, Stanford's Recursive Neural Network code produced an accuracy of 51.1% on the IMDB dataset and 59.4% on the Twitter data.

Approaches

A more traditional NLP technique where the features are simply “important” words and the feature vectors are simple binary vectors.
A Doc2Vec technique, where document vectors are learned via artificial neural networks.
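
The first approach can be illustrated with a minimal, standard-library-only sketch. It is not the code in sentiment.py: the helper names (build_vocabulary, binary_vector), the toy sentences, and the rule that a word counts as "important" when it appears in at least two documents are all assumptions made for illustration. The resulting 0/1 vectors are the kind of input one would then pass to a scikit-learn classifier.

```python
from collections import Counter


def build_vocabulary(documents, min_count=2):
    """Return the sorted words that appear in at least `min_count` documents.

    Assumption: document frequency stands in for "importance" here.
    """
    doc_freq = Counter()
    for doc in documents:
        # Count each word once per document.
        doc_freq.update(set(doc.lower().split()))
    return sorted(word for word, count in doc_freq.items() if count >= min_count)


def binary_vector(document, vocabulary):
    """Return 1 for each vocabulary word present in the document, else 0."""
    words = set(document.lower().split())
    return [1 if word in words else 0 for word in vocabulary]


# Toy data, invented for this example.
train_docs = [
    "a great and moving film",
    "great cast and a great script",
    "a dull and boring film",
    "boring plot and dull acting",
]

vocabulary = build_vocabulary(train_docs)
vectors = [binary_vector(doc, vocabulary) for doc in train_docs]

print(vocabulary)
for vec in vectors:
    print(vec)
```

Each vector has one position per vocabulary word, so every document maps to a vector of the same length regardless of how long the text is.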

Requirements

The Python packages you will need for this project are scikit-learn, nltk, and gensim. To install them, run sudo pip install X or, if you are using Anaconda, conda install X, where X is the package name.
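
A quick way to confirm the packages are installed is to check whether they can be found by the import system. The sketch below is an illustration, not part of the repository; the helper name missing_packages is hypothetical. Note that scikit-learn is imported under the module name sklearn.

```python
import importlib.util

# Maps the name used when installing to the name used when importing.
REQUIRED = {"scikit-learn": "sklearn", "nltk": "nltk", "gensim": "gensim"}


def missing_packages(required=REQUIRED):
    """Return the install names of packages whose module cannot be found."""
    return [
        install_name
        for install_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```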

Datasets

The IMDB reviews and tweets can be found in the data folder. These have already been divided into train and test sets.

The IMDB dataset, originally found here, contains 50,000 reviews split evenly into 25k train and 25k test sets. Overall, there are 25k positive and 25k negative reviews. In the labeled train/test sets, a negative review has a score <= 4 out of 10, and a positive review has a score >= 7 out of 10, so reviews with more neutral ratings are not included.
The Twitter dataset, taken from here, contains 900,000 classified tweets split into 750k train and 150k test sets. The overall distribution of labels is balanced (450k positive and 450k negative).
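
The IMDB labeling rule described above can be written as a small function. This is a sketch of the rule only; the data folder already contains labeled reviews, and the function name label_from_rating and the "pos"/"neg" strings are assumptions for illustration.

```python
def label_from_rating(rating):
    """Map a 1-10 review score to a sentiment label.

    Scores <= 4 are negative, scores >= 7 are positive, and the
    more neutral scores in between are left out (None).
    """
    if rating <= 4:
        return "neg"
    if rating >= 7:
        return "pos"
    return None


# Hypothetical scores, used only to show which ones would be kept.
ratings = [1, 4, 5, 6, 7, 10]
labels = [label_from_rating(r) for r in ratings]
kept = [(r, label) for r, label in zip(ratings, labels) if label is not None]
print(kept)
```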

How to run

python sentiment.py <folder> <approach>

folder can be data/imdb/ or data/twitter/
approach can be 0 or 1, selecting one of the two techniques listed under Approaches.
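
For readers who want to see how a script with this command line might validate its two positional arguments, here is a minimal sketch using argparse. It is an assumption about one reasonable way to do it, not a description of how sentiment.py actually parses its arguments, and the help strings are invented.

```python
import argparse


def parse_args(argv=None):
    """Parse `<folder> <approach>` the way the usage line describes."""
    parser = argparse.ArgumentParser(prog="sentiment.py")
    parser.add_argument("folder", help="dataset folder, e.g. data/imdb/ or data/twitter/")
    parser.add_argument(
        "approach",
        type=int,
        choices=[0, 1],
        help="which of the two techniques to run",
    )
    return parser.parse_args(argv)


# Equivalent to running: python sentiment.py data/imdb/ 0
args = parse_args(["data/imdb/", "0"])
print(args.folder, args.approach)
```

With choices set, any approach value other than 0 or 1 makes argparse print a usage message and exit instead of running with an unexpected setting.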

