This comparative study aims to find the relation between the social popularity of a company and its stock price value
Interpretation of Problem Statement
Why build this?
Business use cases
Assumptions
The Dataset
Baseline Model
Features For Without Sentiment
Features For With Sentiment
Features For With Sentiment And The Tweet Encoded
Effects of tweets on stock price prediction
Comparision
Conclusion
Future scope
Finding statistical correlation between social media sentiment and stock prices of companies. First what we tried to do is take six features excluding social media sentiment and predict the stock prices and then use the social media sentiment as an additional feature and predict the same. The difference in accuracy in both the models proved the postive correlation between them.
Social media plays very important role in changing a company's perception towards customers. Every company is trying to create a positive image through social media, Companies have even started customer care services using their twitter handles. Even our government uses twitter for solving common man problems.A very recent example is our Honorable Railway Minister Piyush Goyal answering Does India actually need bullet train.
This can benifit companies in this manner
- This model can be used to test whether a particular post will affect positively or negatively before posting it on social platforms
- Also if positive social media presence is a factor for rising stock price of a company,every company will eventually try to build a good overall social media profile.
Given the limited time of the hackathon these were our constraints:
- One company's(Google) stock data was taken and evaluated
- Tweets were taken into consideration as they were easily available through the twitter API
- Tweets of the past day were taken into account.
- We have focused on only 1 company as the main aim is to find the relation between sentiment and the stock price
- We have used the stock prices of Google (GOOG) in the NASDAQ market
- The series that we monitored was 1 day long (9:30am-4pm EDT)
- The aim was to predict the stock price in the time interval of 2 mins
- The stock prices were scraped from yahoo finance
- For the sentiment analysis we took tweets related to google using the twitter's api.
The model that we used to predict the stock prices was stack many to one LSTM
The model remained the same for the comparision purpose, only the input features were changed
Loss : MSE
Error: Mean Absolute Error
Optimization: rmsprop
Epochs: 500
Activation: tanh
For the 1st task we used the following Features
- Open Stock price
- Low price
- High price
- Volume
- Running Max price
- Running Min price
The variable that was to be predcited was the closing price within that time span
Preprocessing
Along with the previous features we added some features related to sentiment
The below image explains it all
Then we took the average of the positiveness, and the negativeness across the different tweets in a given time span
So two new features
- positiveness
- negativeness
Total features - 8
Along with the previous features, we encoded each tweet using a variation of the Word2Vec model(encoded in 50 dimension)
Again we took weighted averages of the tweets belonging to one time span
And now along with the previous 8 features we added these 50 features totalling upto 58 features
Let's see how does tweet affect the stock value of google.Our aim here is to see does taking tweet into consideration improves the accuracy.
Tweet under test
Thanks to #Google for the extended warranty, but I have to say my #Pixel2XL is fantastic, no issues whatsoever. Best phone of 2017 IMHO.
numerically:
positiveness - 0.625, negativeness - 0.375
This tweet is positive in sense since it praises google pixel.
At first the price was predicted on the baseline model i.e without considering the tweet.We gave the features of the previous time step as the input so that the model could predict the stock value at the next time step
The actual stock value - 1032 USD
The value predicted by the baseline model - 965 USD
Now, we considered tweet as a input feature
The value predicted by the updated model - 1030 USD
Thus we could infer that considering the tweet sentiment does affect the stock price value positively
The Mae curve
The below curve compares the error of various model trained for prdicting the stock price value
So the conclusion that we deduced, was that there is some relation between the popularity on the social media and the companie's stock price (This relation can be evaluated statistically using the various correlation coefficient)
- Data can be taken from other social media platforms such as Facebook,LinkedIn and other news platforms
- Data for a longer period and multiple companies can be used to improve accuracy