Mass Media Field Experiment Archive for Replication

This project provides the dataset and analysis files for "Countering Violence Against Women by Encouraging Disclosure: A Mass Media Experiment in Rural Uganda." The published paper is here. The multi-wave randomized controlled trial (RCT) was conducted by Donald Green, Anna Wilke, and Jasper Cooper. I was responsible for building this archive repository to make the large-scale experimental dataset more accessible to the academic community and the general public.

Affiliation: Institute for Social and Economic Research and Policy, Columbia University

Keywords: Randomized Controlled Trials, Replication Archive

Software: R, LaTeX, bash

This repository is also published on Harvard Dataverse.

How to run the code

Install the required packages

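   #!/usr/bin/env bash
   # Install each package at the version pinned in requirements.txt
   # (one "package version" pair per line; the devtools package must already be installed).
   # The "|| [ -n ... ]" test keeps a final line that lacks a trailing newline.
   while IFS=" " read -r package version || [ -n "$package" ]; do
     Rscript -e "devtools::install_version('${package}', version = '${version}')"
   done < "requirements.txt"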
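Before running the install loop, it can help to see which pinned packages are missing or installed at a different version. The following is a minimal sketch in R, assuming requirements.txt holds one whitespace-separated "package version" pair per line, as the loop above implies. The helper names read_requirements and needs_install are hypothetical and are not part of the archive.

```r
# Hypothetical helpers, not part of the archive.
# Read "package version" pairs, one per line. Versions are kept as character
# so that a version such as "1.10" is not turned into the number 1.1.
read_requirements <- function(path = "requirements.txt") {
  read.table(path, col.names = c("package", "version"),
             colClasses = "character", stringsAsFactors = FALSE)
}

# TRUE when the package is absent or installed at a different version.
needs_install <- function(package, version) {
  if (!nzchar(system.file(package = package))) {
    return(TRUE)
  }
  as.character(utils::packageVersion(package)) != version
}

# List the rows of requirements.txt that still need installing.
if (file.exists("requirements.txt")) {
  req <- read_requirements()
  pending <- vapply(
    seq_len(nrow(req)),
    function(i) needs_install(req$package[i], req$version[i]),
    logical(1)
  )
  print(req[pending, ])
}
```

Only the rows printed here would need to go through devtools::install_version.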

Open IPV_replication.Rproj so that all file paths resolve relative to the replication archive.
Open __main_script.R and run all other scripts from there.
TRUE/FALSE switches turn scripts that take a long time to run on or off.
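The switch pattern described above can be sketched as follows. This is an illustration only: the flag names, the run_if helper, and the script file name are hypothetical, and the actual switches in __main_script.R may be named and organized differently.

```r
# Hypothetical switches; the real names in __main_script.R may differ.
run_covariate_selection <- FALSE  # slow step, skipped in this sketch
run_appendix_analyses   <- TRUE

# Source a script only when its switch is TRUE.
# Returns TRUE (invisibly) if the script was run, FALSE otherwise.
run_if <- function(flag, path) {
  if (!isTRUE(flag)) {
    message("Switched off, skipping: ", path)
    return(invisible(FALSE))
  }
  if (!file.exists(path)) {
    message("File not found, skipping: ", path)
    return(invisible(FALSE))
  }
  source(path, local = FALSE)
  invisible(TRUE)
}

# The folder name comes from the content menu; the file name is a placeholder.
run_if(run_covariate_selection, "02_code/05_covariate_selection/example_script.R")
```

Setting a switch to FALSE skips the corresponding script, which is useful when its stored output (for example an .Rdata file in 01_data/) is already available.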

Content Menu

01_data/ = Contains raw data
- boostrap_extrapolation.Rdata = Stored output of the bootstrap extrapolation
- cluster_level_data/ = Village-level data
  - film_festival.csv = Data on intervention
  - location_data.csv = Data that link cluster IDs to districts
  - sampling_radius.csv = Radii for sampling of respondents (Rs)
  - treatment_assignment.csv = Random assignment
- codebooks/ = ODK tablet coding
  - endline_choices.csv = Endline survey answer choices
  - endline_Qs.csv = Endline survey questions
  - midline_choices.csv = Midline survey answer choices
  - midline_Qs.csv = Midline survey questions
  - vht_endline_choices.csv = VHT endline survey answer choices
  - vht_endline_Qs.csv = Endline VHT survey questions
  - vht_midline_choices.csv = VHT midline survey answer choices
  - vht_midline_Qs.csv = Midline VHT survey questions
- household/ = Household level data
  - distance_data.csv = Distance of Rs to video hall
  - endline_1.csv = Endline HH survey
  - endline_2.csv = Endline HH survey
  - endline_3.csv = Endline HH survey
  - endline_4.csv = Endline HH survey
  - midline_1.csv = Midline HH survey
  - midline_2.csv = Midline HH survey
  - midline_3_1.csv = Midline HH survey
  - midline_3_2.csv = Midline HH survey
  - midline_3.csv = Midline HH survey
  - midline_4.csv = Midline HH survey
  - midline_5.csv = Midline HH survey
  - random_sampling_1.csv = Random sampling of Rs in midline
  - random_sampling_2.csv = Random sampling of Rs in midline
- lasso_selected_covariates.Rdata = Covariates selected through lasso
- UG_VAW_data.Rdata = Stored full data
- vht/ = VHT datasets
  - vht_el_1.csv = VHT endline survey
  - vht_ml_1.csv = VHT midline survey
  - vht_ml_2.csv = VHT midline survey
  - vht_ml_3.csv = VHT midline survey
  - vht_ml_4.csv = VHT midline survey
02_code/ = Code scripts
- __main_script.R = Main script that runs others
- 00_useful_functions/ = Functions used throughout
- 01_codebook/ = Scripts that build codebooks
- 02_load_and_clean_data/ = Loading and cleaning datasets
- 03_variable_coding/ = Coding outcomes and covariates
- 04_merging/ = Merging datasets
- 05_covariate_selection/ = Lasso covariate selection
- 06_analyses_paper/ = Analysis scripts
- 07_analyses_appendix/ = Robustness check scripts
03_tables/ = Tables are output here
04_figures/ = Figures are output here
IPV_replication.Rproj = Run everything from here

Notes

Most analyses are based on a panel of "compliers" interviewed in both the midline and endline surveys. This subset can be identified with the condition respondent_category == "Complier".
We did not re-ask every question of panel respondents at endline; for questions that were not re-asked, their endline values are merged in from the midline survey.
The multiple versions of the raw data correspond to the different datasets output by ODK / CSO when a change is made to the survey. Each change requires a new survey version, thus producing a new dataset.
The only change made to the raw data files is the removal of PII and variables that are not used in the analysis.
All other modifications made to the data in the cleaning scripts (changes to values, etc.) were implemented by the field manager over the course of the fieldwork.
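Two of the notes above lend themselves to a short illustration: stacking the several versions of one survey wave, and restricting the result to compliers. The sketch below uses toy data frames rather than the archive's files. Only respondent_category and the "Complier" label come from the notes; every other column name and the stack_versions helper are hypothetical, and the archive's own cleaning scripts may combine versions differently.

```r
# Toy stand-ins for two versions of one survey wave.
version_1 <- data.frame(
  respondent_id = c(1, 2),
  respondent_category = c("Complier", "Other"),
  stringsAsFactors = FALSE
)
version_2 <- data.frame(
  respondent_id = c(3, 4),
  respondent_category = c("Complier", "Complier"),
  new_question = c("yes", "no"),  # question added in the later survey version
  stringsAsFactors = FALSE
)

# Stack data frames whose columns differ, filling absent columns with NA.
stack_versions <- function(frames) {
  all_cols <- unique(unlist(lapply(frames, names)))
  padded <- lapply(frames, function(df) {
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- NA
    }
    df[all_cols]
  })
  do.call(rbind, padded)
}

survey <- stack_versions(list(version_1, version_2))

# Keep only the panel of compliers, using the condition from the notes.
compliers <- survey[survey$respondent_category == "Complier", ]
```

With the real data, the same condition would be applied to the full dataset stored in UG_VAW_data.Rdata once it has been loaded.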
