deanturpin/trumpasaurus

About to move to a static analysis version. Big changes coming!


Trumpasaurus - word usage analysis of political speeches

Motivation

I started this purely out of curiosity to see if Theresa May was really repeating herself a lot. But by way of comparison I quickly extended it to other speakers. I also wondered if I could capture "the essence" of a piece without actually having to read it.

It was originally called Theresaurus - which looks cool - but Trumpasaurus is much more enjoyable to say.

The analysis is client-side JavaScript, but the speeches are loaded on demand via AJAX, so the page must be served by a web server.

Like this one: https://deanturpin.github.io/trumpasaurus/

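During development you can run one locally with Python, from the directory one level above your repo:

python -m SimpleHTTPServer

SimpleHTTPServer is the Python 2 module; on Python 3 the equivalent is python3 -m http.server.

Then connect with your web browser: http://localhost:8000/trumpasaurus/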

There's also a Greasemonkey script.
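
As a rough illustration of the approach described above (not the project's actual code), the sketch below fetches a speech as plain text and counts how often each word occurs. The function names `countWords` and `loadAndCount` and the tokenising rule are assumptions made for illustration, and `fetch` stands in for whatever AJAX call the page really uses.

```javascript
// Hypothetical sketch: count word frequencies in a block of text.
// Words are lower-cased runs of letters and apostrophes (an assumed rule).
function countWords(text) {
	const counts = new Map();
	const words = text.toLowerCase().match(/[a-z']+/g) || [];
	for (const word of words)
		counts.set(word, (counts.get(word) || 0) + 1);

	// Return [word, count] pairs, most frequent first
	return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Hypothetical loader: fetch a speech over HTTP and analyse it.
// Browsers generally block such requests from file:// pages,
// which is why the page needs a web server.
async function loadAndCount(url) {
	const response = await fetch(url);
	if (!response.ok)
		throw new Error("Failed to load " + url + ": " + response.status);
	return countWords(await response.text());
}
```

Calling `loadAndCount` with the relative path of a file in the speeches folder would resolve to a list of word and count pairs that the page could then render.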

Testing

Tested on recent desktop versions of Firefox, Safari and Chrome, and on Chrome and Safari on iPhone.

Adding a new speech

This can be done entirely within GitHub.

  • Fork this repo
  • Add a new text file in speeches
  • In index.html: add a new option in the select tag with the new file
  • In your repo settings: select "master" branch as the source in the GitHub Pages section
  • View it on your github.io site

Preprocessing

Conversations need preprocessing to split them into separate files; I used the tools/split-speech.sh script for this. Otherwise I've left the speeches largely untouched, only cleaning them up when anomalies jump out of the results. The Lib Dem manifesto, for example, was littered with ●● (an artifact of the PDF to text conversion), so I removed them by hand in vim.
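
The contents of split-speech.sh are not shown here, so the following is only a sketch of the idea in JavaScript. It assumes a transcript format in which each turn starts with an upper-case speaker label followed by a colon, for example "MAY: ...". Both function names are hypothetical.

```javascript
// Hypothetical sketch: group a transcript's lines by speaker.
// Assumes each turn starts with an upper-case label such as "MAY: ..."
// (an assumed format); unlabelled lines continue the current speaker.
function splitBySpeaker(transcript) {
	const bySpeaker = new Map();
	let current = null;

	for (const line of transcript.split("\n")) {
		const match = line.match(/^([A-Z][A-Z .'-]*):\s*(.*)$/);
		let text = line;
		if (match) {
			current = match[1].trim();
			text = match[2];
		}

		// Skip blank lines and anything before the first speaker label
		if (current === null || text.trim() === "")
			continue;

		if (!bySpeaker.has(current))
			bySpeaker.set(current, []);
		bySpeaker.get(current).push(text.trim());
	}
	return bySpeaker;
}

// Strip stray bullet characters left behind by PDF to text conversion
function stripBullets(text) {
	return text.replace(/●/g, "");
}
```

Each entry in the returned map could then be joined with newlines and saved as one file per speaker.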

Keyword counts were generated by running the tools/keywords.sh script from the speeches folder. Pass it a list of files to compare.

$ cd speeches
$ ../tools/keywords.sh 
Conserv	SinnFei	DUP	Labour	Green	UKIP	Libdem	
 	 	 	1	 	 	 	abortion
 	 	 	1	 	 	 	badger
 	 	 	 	 	6	 	blair
8	7	1	25	2	42	23	brexit
 	 	1	 	 	1	 	cameron
 	 	 	 	 	 	 	clegg
 	 	 	14	6	5	12	climate
28	4	6	66	11	21	68	community|communities
41	 	 	68	 	17	32	conservative
 	 	 	1	 	1	3	corbyn
13	1	2	23	 	25	15	crime
8	 	6	4	 	1	2	cyber
2	 	2	7	1	5	4	debt
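
The same kind of table could be produced in JavaScript. The sketch below is not a port of keywords.sh, whose contents are not shown here. It assumes case-insensitive substring matching, with each keyword treated as a regular expression so that alternations such as community|communities work. The names `keywordTable` and `formatTable` are hypothetical.

```javascript
// Hypothetical sketch: count case-insensitive matches of each keyword
// pattern in each named text. Patterns may use "|" alternation.
function keywordTable(texts, patterns) {
	const names = Object.keys(texts);
	const rows = patterns.map(pattern => {
		const regex = new RegExp(pattern, "gi");
		const counts = names.map(name => (texts[name].match(regex) || []).length);
		return { pattern, counts };
	});
	return { names, rows };
}

// Render as tab-separated text, leaving a blank where the count is zero
function formatTable(table) {
	const header = table.names.join("\t");
	const lines = table.rows.map(row =>
		row.counts.map(n => (n === 0 ? " " : String(n))).join("\t") + "\t" + row.pattern);
	return [header, ...lines].join("\n");
}
```

Here `texts` is an object mapping a short column name to the full text of a manifesto or speech, and `patterns` is the list of keywords to compare.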

Converting manifesto PDFs to text

$ pdftotext DUP_Wminster_Manifesto_2017_v5.{pdf,txt}

JavaScript dev

I find reloading the page automatically really useful during development.

	// Periodically reload the page if the URL ends with a "?reload" token
	setInterval(function() {
		if (window.location.href.split("?").pop() === "reload")
			window.location.reload()
	}, 2000)