README.Rmd

---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
library(LexOPS)
```

# LexOPS <img src="man/figures/hex.png" align="right" style="padding-left:10px;background-color:white" />

<!-- badges: start -->
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://www.tidyverse.org/lifecycle/#stable)
[![Version: 0.3.1](https://img.shields.io/badge/version-0.3.1-blue.svg)](https://github.com/JackEdTaylor/LexOPS/releases)
[![DOI: 10.3758/s13428-020-01389-1](https://zenodo.org/badge/DOI/10.3758/s13428-020-01389-1.svg)](https://doi.org/10.3758/s13428-020-01389-1)
<!-- badges: end -->

LexOPS is an R package for generating matched stimuli for factorial design experiments. You can use the functions on any dataframe, but there is an inbuilt database of example features for English words for psycholinguistics studies in English (`LexOPS::lexops`).

## Installation

LexOPS can be installed as an R package with:

```{r, eval=FALSE}
devtools::install_github("JackEdTaylor/LexOPS@*release")
```

## How to Use

:book: In-depth walkthrough of the package: https://jackedtaylor.github.io/LexOPSdocs/

:mortar_board: Paper about the package: [Taylor, Beith, and Sereno (2020)](https://doi.org/10.3758/s13428-020-01389-1)

## TL;DR

LexOPS makes it easy to generate matched stimuli in a reproducible way. The functions work on any dataframe, but there is an inbuilt dataset, `LexOPS::lexops`, containing psycholinguistic variables for English words.

### The "Generate Pipeline"

The following example pipeline generates 50 words per condition (200 in total), for a study with a 2 x 2, syllables (1, 2) by concreteness (low, high) design. Words are matched by length exactly, and by word frequency within a tolerance of ±0.2 Zipf.

```{r, include=FALSE}
set.seed(100)
```

```{r, eval = FALSE}
library(LexOPS)

stim <- lexops |>
  split_by(Syllables.CMU, 1:1 ~ 2:2) |>
  split_by(CNC.Brysbaert, 1:2 ~ 4:5) |>
  control_for(Zipf.SUBTLEX_UK, -0.2:0.2) |>
  control_for(Length) |>
  generate(n = 50, match_null = "balanced")
```

```{r, warning = FALSE, echo = FALSE}
stim <- lexops |>
  split_by(Syllables.CMU, 1:1 ~ 2:2) |>
  split_by(CNC.Brysbaert, 1:2 ~ 4:5) |>
  control_for(Zipf.SUBTLEX_UK, -0.2:0.2) |>
  control_for(Length) |>
  generate(n = 50, match_null = "balanced", silent = TRUE)
cat(sprintf("Generated 50/50 (100%%). 157 total iterations, 0.32 success rate.\n"))  # pseudoprint progress
```

A preview of what was generated:

```{r}
# create a table of the first 5 rows of the output
stim |>
  head(5) |>
  knitr::kable()
```

### Review Generated Stimuli

The `plot_design()` function produces a plot summarising the generated stimuli.

```{r fig1, dpi=400}
plot_design(stim)
```

### Convert to Long Format

The `long_format()` function coerces the generated stimuli into long format.

```{r}
# present the same 20 words as in the earlier table
long_format(stim) |>
  head(20) |>
  knitr::kable()
```

### Shiny App

The package has an interactive shiny app, which supports most code functionality, with useful additional features like visualising distributions and relationships. It's a friendly front-end to the package's functions. A demo version of the LexOPS shiny app is available online at [https://jackt.shinyapps.io/lexops/](https://jackt.shinyapps.io/lexops/), but it is faster and more reliable to run it locally, with:

```{r, eval=FALSE}
LexOPS::run_shiny()
```

![](man/figures/shiny-preview.png)

### Matching on Custom Dataframes

As well as the built-in dataframe `LexOPS::lexops`, you can generate matches from any dataframe object.

Here is an example using `mtcars`. We pick five automatic and five manual models of car, matched for acceleration (within ±5 `qsec`) and the number of carburetor barrels (`carb`; exactly).

```{r, eval=FALSE}
mtcars |>
  tibble::as_tibble(rownames = "car_id") |>
  set_options(id_col = "car_id") |>
  split_by(am, 0:0 ~ 1:1) |>
  control_for(qsec, -5:5) |>
  control_for(carb, 0:0) |>
  generate(5)
```

```{r, warning=FALSE, echo=FALSE}
mtcars_res <- mtcars |>
  tibble::as_tibble(rownames = "car_id") |>
  set_options(id_col = "car_id") |>
  split_by(am, 0:0 ~ 1:1) |>
  control_for(qsec, -5:5) |>
  control_for(carb, 0:0) |>
  generate(5, silent=TRUE)

print(mtcars_res)
```