planet is an R package for inferring ethnicity (1), gestational
age (2), and cell composition (3) from placental DNA methylation
data.
See full documentation at https://victor.rbind.io/planet
Install the latest Bioconductor release:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("planet")
Or install the development version of planet from GitHub:
devtools::install_github('wvictor14/planet')
See vignettes for more detailed usage.
All functions in this package take DNAm data from the 450k or EPIC
DNAm microarray as input. For best performance, I suggest providing
unfiltered data normalized with noob and BMIQ. A processed example
dataset, plBetas, is provided to show the format this data should be in.
The output of planet functions is generally a data.frame; in the examples
below, predictAge is an exception and returns a numeric vector of ages.
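Before running any predictor, it can help to confirm that your own data matches the plBetas layout. The sketch below is a minimal check in base R, assuming the layout is a numeric matrix of beta values with CpGs in rows and samples in columns. `check_betas()` is a hypothetical helper written for this illustration, not a planet function, and the toy matrix is made up.

```r
# Hypothetical helper (NOT part of planet): basic sanity checks on a beta matrix.
# Assumed layout: numeric matrix, CpGs in rows, samples in columns.
check_betas <- function(betas) {
  stopifnot(
    is.matrix(betas),
    is.numeric(betas),
    !is.null(rownames(betas)),  # CpG identifiers
    !is.null(colnames(betas))   # sample identifiers
  )
  rng <- range(betas, na.rm = TRUE)
  if (rng[1] < 0 || rng[2] > 1) {
    stop("beta values must lie between 0 and 1")
  }
  invisible(TRUE)
}

# Toy matrix in the assumed layout: 3 CpGs (rows) x 2 samples (columns)
toy <- matrix(
  c(0.10, 0.85, 0.50, 0.12, 0.80, 0.47),
  nrow = 3,
  dimnames = list(
    c("cg00000029", "cg00000108", "cg00000109"),
    c("sample_1", "sample_2")
  )
)
check_betas(toy)
```

With planet installed, the same check can be pointed at the bundled data via `check_betas(plBetas)`.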
A quick example of each major function is illustrated with this example data:
library(minfi)
library(planet)
# load example data
data(plBetas)
data(plPhenoData) # sample information
predictEthnicity(plBetas) %>%
  head()
#> 1860 of 1860 predictors present.
#> # A tibble: 6 × 7
#> Sample_ID Predicted_ethnicity_n…¹ Predicted_ethnicity Prob_African Prob_Asian
#> <chr> <chr> <chr> <dbl> <dbl>
#> 1 GSM1944936 Caucasian Caucasian 0.00331 0.0164
#> 2 GSM1944939 Caucasian Caucasian 0.000772 0.000514
#> 3 GSM1944942 Caucasian Caucasian 0.000806 0.000699
#> 4 GSM1944944 Caucasian Caucasian 0.000883 0.000792
#> 5 GSM1944946 Caucasian Caucasian 0.000885 0.00130
#> 6 GSM1944948 Caucasian Caucasian 0.000852 0.000973
#> # ℹ abbreviated name: ¹Predicted_ethnicity_nothresh
#> # ℹ 2 more variables: Prob_Caucasian <dbl>, Highest_Prob <dbl>
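The prediction table can be summarized with base R. The sketch below relies only on the column names visible in the output above. My understanding is that Predicted_ethnicity differs from Predicted_ethnicity_nothresh only for samples whose highest class probability falls below a confidence threshold; treat that as an assumption and check ?predictEthnicity.

```r
library(planet)
data(plBetas)

eth <- predictEthnicity(plBetas)

# Count samples per predicted class (thresholded call)
table(eth$Predicted_ethnicity)

# Samples whose call changes once the threshold is applied
# (assumed to be the low-confidence predictions)
changed <- eth$Predicted_ethnicity != eth$Predicted_ethnicity_nothresh
eth[changed, c("Sample_ID", "Predicted_ethnicity_nothresh", "Highest_Prob")]
```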
There are 3 gestational age clocks for placental DNA methylation data
from Lee Y. et al. 2019 (2). To use a specific one, we can use the
type argument in predictAge:
predictAge(plBetas, type = 'RPC') %>%
  head()
#> 558 of 558 predictors present.
#> [1] 38.46528 33.09680 34.32520 35.50937 37.63910 36.77051
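To compare the clocks side by side, each one can be run in turn and the results collected into a matrix. Only 'RPC' appears in this README; the sketch assumes the other two clocks are selected with type = 'CPC' and type = 'RRPC', so confirm the accepted values in ?predictAge before relying on it.

```r
library(planet)
data(plBetas)

# Clock names: 'RPC' is taken from the example above;
# 'CPC' and 'RRPC' are assumed values of the type argument.
clocks <- c("RPC", "CPC", "RRPC")

# One column of predicted gestational ages per clock, one row per sample
ages <- sapply(clocks, function(clock) predictAge(plBetas, type = clock))
rownames(ages) <- colnames(plBetas)

head(ages)
```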
Reference data to infer cell composition on placental villi DNAm samples
(3) can be used with cell deconvolution from minfi or EpiDISH. These are
provided in this package as plCellCpGsThird and plCellCpGsFirst for
third trimester (term) and first trimester samples, respectively.
data('plCellCpGsThird')
minfi:::projectCellType(
  # subset your data to the cell-type CpGs
  plBetas[rownames(plCellCpGsThird), ],
  # input the reference CpG matrix
  plCellCpGsThird,
  lessThanOne = FALSE) %>%
  head()
#> Trophoblasts Stromal Hofbauer Endothelial nRBC
#> GSM1944936 0.1091279 0.04891919 0.000000e+00 0.08983998 0.05294062
#> GSM1944939 0.2299918 0.00000000 9.725560e-19 0.07888007 0.03374149
#> GSM1944942 0.1934287 0.03483540 0.000000e+00 0.09260353 0.02929310
#> GSM1944944 0.2239896 0.06249135 1.608645e-03 0.11040693 0.04447951
#> GSM1944946 0.1894152 0.07935955 0.000000e+00 0.10587439 0.05407587
#> GSM1944948 0.2045124 0.07657717 0.000000e+00 0.09871149 0.02269798
#> Syncytiotrophoblast
#> GSM1944936 0.6979477
#> GSM1944939 0.6377822
#> GSM1944942 0.6350506
#> GSM1944944 0.5467642
#> GSM1944946 0.6022329
#> GSM1944948 0.6085825
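The same reference can be passed to EpiDISH instead of minfi. This is a sketch assuming the EpiDISH package is installed and that its epidish() function is called with the beta.m, ref.m, and method arguments as I understand them from the EpiDISH documentation; the choice of method = "RPC" here is illustrative.

```r
library(planet)
library(EpiDISH)

data(plBetas)
data(plCellCpGsThird)

# Robust partial correlations (RPC) deconvolution with the third trimester reference
epidish_rpc <- epidish(
  # subset your data to the cell-type CpGs
  beta.m = plBetas[rownames(plCellCpGsThird), ],
  # input the reference CpG matrix
  ref.m = plCellCpGsThird,
  method = "RPC"
)

# Estimated cell-type fractions: samples in rows, cell types in columns
head(epidish_rpc$estF)
```

For first trimester samples, plCellCpGsFirst would take the place of plCellCpGsThird in both arguments.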