The Bechdel test is a measure of representation of women in fiction. It asks whether a work features at least two female characters who talk to each other about something other than a man.
In this project, I applied text mining techniques to more than 1,500 movie scripts, which I downloaded from the internet and parsed, and ran the test on each of them to explore the long-term trend in female representation in movies.
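As a rough illustration of how the three criteria can be checked on a parsed script, here is a minimal sketch. It is not the code used in this project: the scene format (lists of `(speaker, line)` tuples), the `gender` lookup, the `MALE_WORDS` list, and the function names are all assumptions made for the example. "Talking to each other" is approximated as two different women speaking consecutive lines in the same scene, and "about a man" as either line containing a male word or a male character's name.

```python
import re

# Hypothetical word list: a rough proxy for "talking about a man".
MALE_WORDS = {"he", "him", "his", "man", "men", "boy", "boyfriend",
              "husband", "father", "mr"}


def mentions_man(line, male_names):
    """Return True if the line contains a male word or a male character name."""
    tokens = set(re.findall(r"[a-z]+", line.lower()))
    return bool(tokens & (MALE_WORDS | male_names))


def bechdel_score(scenes, gender):
    """Return how many of the three criteria a script passes (0 to 3).

    scenes: list of scenes, each a list of (speaker, line) tuples.
    gender: dict mapping a character name to "F" or "M" (assumed to be
            known in advance; names are assumed to be single words).
    """
    women = {name for name, g in gender.items() if g == "F"}
    male_names = {name.lower() for name, g in gender.items() if g == "M"}

    # Criterion 1: at least two female characters.
    if len(women) < 2:
        return 0
    score = 1

    for scene in scenes:
        # Look at each pair of consecutive lines within a scene.
        for (a, line_a), (b, line_b) in zip(scene, scene[1:]):
            # Criterion 2: two different women speak one after the other.
            if a in women and b in women and a != b:
                score = max(score, 2)
                # Criterion 3: neither line refers to a man.
                if not (mentions_man(line_a, male_names)
                        or mentions_man(line_b, male_names)):
                    return 3
    return score
```

For example, a scene in which `ANNA` asks "Did you finish the report?" and `MARIA` answers "Yes, I sent it this morning." passes all three criteria, while an exchange such as "Is John coming?" / "He said yes." only passes the first two. A heuristic like this will misclassify some conversations, which is why manual evaluation of the results remains useful.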
Data files can be downloaded from:
https://drive.google.com/drive/folders/1konx-AYGYk2zGTdHR97vgQAl_IB2r9Q2
Data files not included:
- Raw, unprocessed movie scripts (2071 txt/pdf/rtf/doc files, ~1.1 GB) - can be downloaded using the .py files
- Evaluation results from 3 judges
- Exported CSV files
The project report can be accessed at: https://www.dropbox.com/s/96czpl7e5xerhtp/IRTM_project_report.pdf?dl=0
The data structures were inspired by: https://www.youtube.com/watch?v=jRKKPYDs44o