-
Notifications
You must be signed in to change notification settings - Fork 1
/
README.Rmd
154 lines (108 loc) · 5.73 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
warning = F,
message = F,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
library(demdebates2020)
```
```{r eval=F, echo=F}
## basic example code
debates <- readRDS("../dem_debates2/data/debates.RDS")
# use_data(debates)
m_chr <- c("Anderson Cooper", "Andrew Yang", "Bernie Sanders", "Beto O'Rourke", "Bill de Blasio", "Chuck Todd", "Cory Booker", "David Muir", "Don Lemon", "Eric Swalwell", "George Stephanopoulos", "Jake Tapper", "Jay Inslee", "Joe Biden", "John Delaney", "John Hickenlooper", "Jorge Ramos", "Joe Diaz-Balart", "Julian Castro", "Lester Holt", "Mark Lacey", "Michael Bennet", "Pete Buttigieg", "Steve Bullock", "Tim Alberta", "Tim Ryan", "Tom Steyer", "Wolf Blizer", "Jose Diaz-Balart")
f_chr <- c("Amy Klobuchar", "Andrea Mitchell", "Ashley Parker", "Elizabeth Warren", "Judy Woodruff", "Kamala Harris", "Kirsten Gillibrand", "Kristen Welker", "Marianne Williamson", "Rachel Maddow", "Tulsi Gabbard", "Abby Phillip", "Amna Nawaz", "Brianne Pfannenstiel", "Dana Bash", "Erin Burnett", "Lindsey Davis", "Savannah Guthrie", "Yamiche Alcindor")
debates <- debates %>%
# mutate(speaker = lagged) %>%
fill(type, .direction = "down") %>%
fill(speaker, .direction = "down") %>%
select(-lagged) %>%
mutate_all(str_trim) %>%
# mutate_all(~str_remove(.x, "<U\\+2009>") %>% str_trim) %>%
mutate_all(~stringi::stri_trans_general(.x, "latin-ascii")) %>%
mutate(speech = ifelse(!is.na(background), background, speech)) %>%
filter(!str_count(speech)==0) %>%
mutate(speech = ifelse(!is.na(background), NA, speech)) %>%
mutate(day = case_when(
debate == "1A" ~ 1,
debate == "1B" ~ 2,
debate == "2A" ~ 1,
debate == "2B" ~ 2,
T ~ 1
)) %>%
mutate(debate = case_when(
debate == "1A" ~ 1,
debate == "1B" ~ 1,
debate == "2A" ~ 2,
debate == "2B" ~ 2,
T ~ parse_number(debate)
)) %>%
mutate(gender = case_when(
speaker %in% m_chr ~ "male",
speaker %in% f_chr ~ "female",
T ~ "unknown"
)) %>%
select(speaker ,background, speech, type, gender, debate, day ) %>%
mutate(speaker = ifelse(speaker == "Protestor", "Protester", speaker))
debates <- readRDS("D:/Projects/git_proj/dem_debates2/data/debates9.RDS")
debates <- readRDS("D:/Projects/git_proj/dem_debates2/data/debates10.RDS")
debates <- readRDS("D:/Projects/git_proj/dem_debates2/data/debates11.RDS")
use_data(debates, overwrite = T)
debates %>%
dplyr::filter(debate == 11)
```
# demdebates2020
The goal of `demdebates2020` is to provide access to all transcripts of the Democratic debates of the 2020 Presidential Election.
## Usage
The package includes a single dataset: `debates`. This dataset represents the spoken words of all Democratic candidates for US president at eleven Democratic debates. The following [sources](https://github.com/favstats/demdebates2020#sources) have been used to compile the data: Washington Post, Des Moines Register and rev.com. The dataset has the following eight columns:
+ `speaker`: Who is speaking
+ `background`: Reactions from the audience, includes `(APPLAUSE)` or `(LAUGHTER)`
+ only availabe for the first seven and ninth debates
+ `speech`: Transcribed speech
+ `type`: Candidate, Moderator or Protester
+ `gender`: The gender of the person speaking
+ `debate`: Which debate
+ `day`: Which day of the debate
+ first and second debate were held on two separate days
+ `order`: The order in which the speech acts were delivered
There are two ways in which you can access the dataset.
1. Read .csv file directly from GitHub
```{r eval=F}
debates_url <- "https://raw.githubusercontent.com/favstats/demdebates2020/master/data/debates.csv"
debates <- readr::read_csv(debates_url)
```
2. Install and load the R package like this:
```{r eval=F}
devtools::install_github("favstats/demdebates2020")
library(demdebates2020)
```
## Example
```{r eval=T}
demdebates2020::debates %>%
dplyr::slice(1508:1510) %>%
knitr::kable()
```
## Blog Post
Check out this [blog post](http://www.favstats.eu/post/demdebates/) for some more data exploration.
![](https://raw.githubusercontent.com/favstats/demdebates2020/master/vignettes/images/tfidfplot.png)
## Sources
+ Debate 1, Day 1: https://www.washingtonpost.com/politics/2019/06/27/transcript-night-one-first-democratic-debate-annotated/
+ Debate 1, Day 2: https://www.washingtonpost.com/politics/2019/06/28/transcript-night-first-democratic-debate/
+ Debate 2, Day 1: https://www.washingtonpost.com/politics/2019/07/31/transcript-first-night-second-democratic-debate/
+ Debate 2, Day 2: https://www.washingtonpost.com/politics/2019/08/01/transcript-night-second-democratic-debate/
+ Debate 3: https://www.washingtonpost.com/politics/2019/09/13/transcript-third-democratic-debate/
+ Debate 4: https://www.washingtonpost.com/politics/2019/10/15/october-democratic-debate-transcript/
+ Debate 5: https://www.washingtonpost.com/politics/2019/11/21/transcript-november-democratic-debate/
+ Debate 6: https://www.washingtonpost.com/politics/2019/12/20/transcript-december-democratic-debate/
+ Debate 7: https://www.desmoinesregister.com/story/news/elections/presidential/caucus/2020/01/14/democratic-debate-transcript-what-the-candidates-said-quotes/4460789002/
+ Debate 8: https://www.rev.com/blog/transcripts/new-hampshire-democratic-debate-transcript
+ Debate 9: https://www.nbcnews.com/politics/2020-election/full-transcript-ninth-democratic-debate-las-vegas-n1139546
+ Debate 10: https://www.cbsnews.com/news/south-carolina-democratic-debate-full-transcript-text/
+ Debate 11: https://www.rev.com/blog/transcripts/march-democratic-debate-transcript-joe-biden-bernie-sanders