Master Team Project at the University of Mannheim for M.Sc. Business Informatics and M.Sc. Data Science, in cooperation with AbbVie Inc.
The project “Topic Monitoring in the Pharmaceutical Industry” has two goals. The first is to gain better insight into the public's opinions, sentiments, evaluations, attitudes, and emotions, as expressed on social media platforms (especially Facebook and Twitter), towards our client company AbbVie Inc., its competitors, and the pharmaceutical industry as a whole. The second is to identify emerging topics in real time and to provide meaningful analytics that synthesize an accurate description of each topic.
This repository contains the files and code used during the project. It ranges from undocumented "files which are there to do some stuff" to documented final versions of the scripts used to generate meaningful results.
Developing
This directory contains all source code files, separated by person. Part of everyone's code in this directory can also be found in the FINAL directory.
FINAL
Inside this directory you can find all finalized code, separated by functionality:
- Data collection
- Data Preprocessing
- Sentiment Analysis
- Topic Detection
- Trend Detection
- Datasets
- sentitomo
Each file contains comments that explain how to use it. This is especially important when models are trained or evaluated with specific preprocessed data sets.
Data Collection
Contains everyone's crawling code for Twitter and Facebook. To use this code, you need to generate API keys for the respective service.
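One common way to supply such keys without committing them to the repository is to read them from environment variables. The following is only a minimal sketch of that approach: the variable names and the `load_api_keys` helper are hypothetical and are not taken from the crawling scripts, which may expect the keys in a different form.

```python
import os

# Hypothetical variable names; the crawling scripts in this repository
# may expect their credentials under different names or in a config file.
REQUIRED_KEYS = ("TWITTER_API_KEY", "TWITTER_API_SECRET", "FACEBOOK_ACCESS_TOKEN")


def load_api_keys(env=None):
    """Return the required credentials, or raise if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing API credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Calling `load_api_keys()` at the start of a crawler would then fail early, with a message naming the missing variables, instead of failing on the first API request.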
Data preprocessing
Contains everything related to preprocessing texts. In addition, it contains one file, evaluation_Alex, for evaluating confusion matrices.
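For readers unfamiliar with confusion matrices, the sketch below shows how one can be computed from true and predicted labels. It is a generic illustration, not the contents of evaluation_Alex; the function name, the label set, and the toy data are assumptions made for the example.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, labels):
    """Build a confusion matrix: rows are true labels, columns are predictions."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true_labels and predicted_labels must have equal length")
    # Count how often each (true, predicted) pair occurs.
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Toy data, invented purely for illustration.
labels = ["negative", "neutral", "positive"]
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "neutral", "neutral", "negative"]

matrix = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

The diagonal entries count correct predictions, so the off-diagonal cells show which classes a model confuses with each other.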
Sentiment Analysis
Every file which is related to sentiment analysis can be found here. The files