A walkthrough of how to extract sentiment from quarterly conference calls, comparing three approaches: FinBERT, Loughran-McDonald, and Naive Bayes. It provides examples and practical considerations at every stage of the process, from data collection to sentiment modeling to quantitative analysis.
These notebooks were part of a larger presentation titled "Hands-On Data Science in Investment Management," presented at the Columbus CFA Society.
- Data Collection - text from conference calls, universe, sectors, returns, growth/value indices
- Sentiment Modeling - FinBERT + Loughran & McDonald + Naive Bayes (via TextBlob)
- Quantitative Analysis - Risk and Return Characteristics
Data_Collection.ipynb
-- steps required to build the corpus and other relevant data for this project. Data includes text from conference calls (detailed in a separate repo), universe constituents, sector constituents, returns, and growth/value indices.
sentiment_models.ipynb
-- how to build three sentiment models (FinBERT, Loughran & McDonald, Naive Bayes). Includes pre-processing steps like tokenization and lemmatization.
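
To give a feel for the dictionary-based approach, here is a minimal sketch of word-list scoring after simple tokenization. It is not taken from the notebook: the function names are hypothetical, the two word sets are tiny stand-ins rather than the actual Loughran & McDonald lists, and lemmatization is skipped for brevity.

```python
import re

# Stand-in word lists for illustration only (assumption); the real
# Loughran & McDonald dictionary is far larger and is not reproduced here.
POSITIVE = {"strong", "improved", "gain", "growth"}
NEGATIVE = {"loss", "decline", "impairment", "weak"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def dictionary_sentiment(text):
    """Return (positive count - negative count) / total tokens, or 0.0 if empty."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


score = dictionary_sentiment(
    "Margins improved on strong demand despite a one-time loss."
)
print(score)
```

Normalizing by token count keeps long and short calls comparable; other normalizations (for example, dividing by the number of matched words) are equally plausible choices.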
sentiment_analysis.ipynb
-- how to analyse and contextualize results with respect to returns, sectors, growth/value, etc. Connects sentiment models to market/economic data.
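
One possible way to connect sentiment scores to returns is a rank correlation between each call's score and the stock's subsequent return. The sketch below is an assumption about how such a check might look, not the notebook's method; the function names and the four-element toy inputs are hypothetical and carry no empirical meaning.

```python
from statistics import mean


def ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(sentiment, forward_returns):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(sentiment), ranks(forward_returns))


# Toy inputs for illustration only (hypothetical, not real data).
toy_sentiment = [0.10, -0.05, 0.30, 0.00]
toy_returns = [0.02, -0.01, 0.04, 0.01]
print(rank_correlation(toy_sentiment, toy_returns))
```

Using ranks rather than raw values makes the measure less sensitive to outlier returns. The same calculation could be repeated within each sector or growth/value bucket to contextualize results.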