This project was completed in the final week of the Data Science Bootcamp at Spiced Academy in Berlin.
This project builds a dictionary that measures organizational practices from employee reviews. The workflow is as follows.
- Collect / prep data / corpus --- Cheers to Matthew Chatham
- Explore data & obtain seed words
  - Topic modeling
  - Joint Maximum Likelihood Estimation for High-Dimensional Item Factor Analysis (IFA)
  - Deep artificial neural network model: importance-weighted autoencoder for exploratory IFA --- Cheers to Christopher J. Urban
- Build the dictionary with word2vec --- Cheers to Kai Li, Feng Mai, Rui Shen, Xinyan Yan
- Validate the dictionary
  - Check for the pattern of correlations across train / test sets
  - Dictionary (saliency) scores & employee ratings
  - Repeat the step above with a different corpus
  - Dictionary (saliency) scores & topic sentiments (estimated with joint sentiment topic models) & regular sentiment scores
  - Check the associations between dictionary (saliency) scores & dimensions obtained with autoencoders
  - Check for the pattern of correlations across train / test sets
- Visualizations
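The dictionary-building step can be illustrated with a minimal sketch: expand a handful of seed words by picking the vocabulary words whose vectors lie closest (by cosine similarity) to the average seed vector. This is not the project's actual code. The 3-dimensional vectors and the words below are made up for illustration, and `expand_seeds` is a hypothetical helper; in practice the vectors would come from a word2vec model trained on the review corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_seeds(seeds, vectors, top_n=2):
    """Return the top_n non-seed words closest to the mean seed vector."""
    seed_vecs = [vectors[w] for w in seeds if w in vectors]
    dim = len(seed_vecs[0])
    # Average the seed vectors component by component
    centroid = [sum(v[i] for v in seed_vecs) / len(seed_vecs) for i in range(dim)]
    candidates = [w for w in vectors if w not in seeds]
    ranked = sorted(candidates, key=lambda w: cosine(vectors[w], centroid), reverse=True)
    return ranked[:top_n]


# Toy vectors, invented for this example (assumption, not real embeddings)
toy_vectors = {
    "autonomy": [0.9, 0.1, 0.0],
    "freedom": [0.8, 0.2, 0.1],
    "flexible": [0.85, 0.15, 0.05],
    "micromanage": [0.1, 0.9, 0.0],
    "bureaucracy": [0.0, 0.8, 0.2],
    "salary": [0.1, 0.1, 0.9],
}

expanded = expand_seeds(["autonomy", "freedom"], toy_vectors, top_n=1)
print(expanded)
```

With real embeddings, the expanded word list would still need manual review before it goes into the dictionary.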
The measurement instrument quantifies the saliency of the following tensions in employee reviews.
- the conflict between the interests of employees vs. the organization
- the conflict between employees’ independence to organize their own work vs. the need for control & centralization
- the conflict between stability & change
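As a rough illustration of what a saliency score could look like, the sketch below computes, for each dimension, the share of tokens in a review that appear in that dimension's word list. The dimension labels, the word lists, and the scoring rule itself are assumptions made for this example; the project's actual scoring may weight words differently.

```python
import re
from collections import Counter

# Hypothetical dimension labels and word lists, for illustration only
toy_dictionary = {
    "employee_vs_organization": {"benefits", "overtime", "layoffs"},
    "autonomy_vs_control": {"micromanagement", "flexible", "approval"},
    "stability_vs_change": {"restructuring", "routine", "innovation"},
}


def saliency_scores(text, dictionary):
    """Share of tokens in `text` that belong to each dimension's word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        # Empty review: no dimension is salient
        return {dim: 0.0 for dim in dictionary}
    return {
        dim: sum(counts[w] for w in words) / total
        for dim, words in dictionary.items()
    }


review = "Flexible hours but constant restructuring and more restructuring"
scores = saliency_scores(review, toy_dictionary)
print(scores)
```

Normalizing by review length keeps long and short reviews comparable, which matters when scores are later correlated with ratings.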
There is no inherently good or bad company culture. With this measurement instrument, companies can see where they stand in comparison to others and aim for a culture that supports their business strategies and goals.
This is an ongoing project that may eventually become a product. At this stage, I'm sharing the dictionary as a .dic file in LIWC format.
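For readers who want to load the file without LIWC, here is a minimal parsing sketch. It assumes the common LIWC dictionary layout: a category header enclosed between two `%` lines (one `id<TAB>name` pair per line), followed by one entry per line with the word and its category ids, where a trailing `*` marks a prefix wildcard. The sample content and category names below are invented, and `parse_liwc_dic` and `entry_matches` are hypothetical helper names.

```python
def parse_liwc_dic(text):
    """Parse LIWC-style .dic text into (categories, words) dictionaries."""
    categories = {}  # category id -> category name
    words = {}       # dictionary entry -> list of category names
    percent_seen = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("%"):
            percent_seen += 1
            continue
        if percent_seen == 1:
            # Inside the header: "id<TAB>name"
            cat_id, name = line.split(None, 1)
            categories[cat_id] = name
        elif percent_seen >= 2:
            # Entry line: "word<TAB>id<TAB>id..."
            parts = line.split()
            words[parts[0]] = [categories[c] for c in parts[1:]]
    return categories, words


def entry_matches(entry, token):
    """True if a token matches a dictionary entry, honoring a trailing '*'."""
    if entry.endswith("*"):
        return token.startswith(entry[:-1])
    return token == entry


# Invented sample content (assumption, not the project's real dictionary)
sample = "%\n1\tautonomy\n2\tchange\n%\nflexib*\t1\nrestructur*\t2\nroutine\t2\n"
categories, words = parse_liwc_dic(sample)
print(categories)
print(words)
```

To read the shared file, the same function could be called on the file's contents, for example `parse_liwc_dic(open(path, encoding="utf-8").read())`, where `path` points at the downloaded .dic file.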
Planned next steps:
- Explore the associations with some ground truth
  - revenues / revenue %s for employee well-being
  - hiring & firing %s
  - ranking in the Global Innovation Index
  - etc.
- The current version is based on "employees talking about their companies" type of text data
  - collect & append "companies talking about themselves" type of text data for a more accurate picture of organizational culture
Presentation slides are available at https://1drv.ms/p/s!AuQR1pEfkazyliMRvAgVAqcpZ9Ze?e=6k6kS7