Skip to content

Commit

Permalink
Add pruning function
Browse files Browse the repository at this point in the history
  • Loading branch information
Jakob Russel committed Jul 13, 2020
1 parent da877f5 commit e813a01
Show file tree
Hide file tree
Showing 6 changed files with 112 additions and 18 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Package: MicEco
Title: Various functions for microbial community data
Version: 0.9.10
Version: 0.9.11
Authors@R: person("Jakob", "Russel", email = "russel2620@gmail.com", role = c("aut", "cre"))
Description: Collection of functions for microbiome analyses. E.g. fitting neutral models and standardized effect sizes of phylogenetic beta diversities, and much more.
Depends: R (>= 3.2.5)
Expand Down
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ export(neutral.rand)
export(proportionality)
export(ps_euler)
export(ps_pheatmap)
export(ps_prune)
export(ps_refactor)
export(ps_venn)
export(rarefy_rrna)
Expand Down
2 changes: 1 addition & 1 deletion R/ps_pheatmap.R
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ ps_pheatmap <- function(ps, annot_samp=NULL, annot_taxa=NULL, relative=TRUE, log

# Filter
if(min_samples > 1 | min_reads > 1 | min_abundance > 0){
try(ps <- DAtest::preDA(ps, min.samples = min_samples, min.reads = min_reads, min.abundance = min_abundance),
try(ps <- ps_prune(ps, min.samples = min_samples, min.reads = min_reads, min.abundance = min_abundance),
silent = TRUE)
}

Expand Down
65 changes: 65 additions & 0 deletions R/ps_prune.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
#' Prune taxa from phyloseq object by their prevalence or abundance
#'
#' @param data \code{phyloseq} object.
#' @param min.samples Minimum number of samples the features should be present in. Default 0
#' @param min.reads Minimum number of total reads the features should have. Default 0
#' @param min.abundance Minimum mean relative abundance features should have. Default 0
#' @return Similar to input, but with features not reaching the criteria given grouped as "Others"
#' @export

ps_prune <- function(data, min.samples = 0, min.reads = 0, min.abundance = 0){

# Extract from phyloseq
if(class(data) == "phyloseq"){
loadNamespace("phyloseq")
count_table <- phyloseq::otu_table(data)
if(!phyloseq::taxa_are_rows(data)) count_table <- t(count_table)
} else {
stop("Input should be a phyloseq object")
}

# Check
if(any("Others" %in% rownames(count_table))) stop("There is already a feature named 'Others' in data. Maybe data has been pre-processed already")

# Exclude by samples
exclude.samples <- which(rowSums(count_table > 0) < min.samples)

# Exclude by reads
exclude.reads <- which(rowSums(count_table) < min.reads)

# Exclude by abundance
count_rel <- apply(count_table, 2, function(x) x/sum(x))
exclude.abundance <- which(rowMeans(count_rel) < min.abundance)

# Which to exclude
exclude <- unique(c(exclude.samples,exclude.reads,exclude.abundance))
if(length(exclude) == 0) stop("No features to group!")
if(length(exclude) == nrow(count_table)) stop("All features would be grouped as 'Others'!")
message(paste(length(exclude),"features grouped as 'Others' in the output"))

# Make new count_table
count.keep <- count_table[-exclude,]
count.other <- count_table[exclude,]
count.other <- colSums(count.other)
count.new <- rbind(count.keep,count.other)
rownames(count.new)[nrow(count.new)] <- "Others"

# Output
# Fix tax_table
tax <- as.data.frame(unclass(phyloseq::tax_table(data)))
tax.keep <- tax[-exclude,]
tax.new <- rbind(tax.keep,NA)
rownames(tax.new)[nrow(tax.new)] <- "Others"
if(phyloseq::taxa_are_rows(data)){
data.new <- phyloseq::phyloseq(phyloseq::otu_table(count.new, taxa_are_rows = TRUE),
phyloseq::sample_data(data),
phyloseq::tax_table(as.matrix(tax.new)))
}
if(!phyloseq::taxa_are_rows(data)){
data.new <- phyloseq::phyloseq(phyloseq::otu_table(t(count.new), taxa_are_rows = FALSE),
phyloseq::sample_data(data),
phyloseq::tax_table(as.matrix(tax.new)))
}
return(data.new)

}
37 changes: 21 additions & 16 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,11 @@ MicEco: Various functions for analysis for microbial community data
### Citation
[![DOI](https://zenodo.org/badge/83547545.svg)](https://zenodo.org/badge/latestdoi/83547545)

## Phyloseq extensions
#### ps_prune

Prune taxa (ASVs, OTUs) from a phyloseq object based on their abundance and/or prevalence

#### ps_venn

Make Venn diagram of shared taxa (ASVs, OTUs) across sample groups from a phyloseq object. Overlap can be weighted by relative abundance
Expand All @@ -24,6 +29,11 @@ Make Euler diagram of shared taxa (ASVs, OTUs) across sample groups from a phylo

Make pretty heatmap directly from a phyloseq object. Built-in agglomoration, filtering, ordering, scaling, transformation, and annotation.

#### rcurve

Rarefaction curve (theoretical and fast) from a phyloseq object. Output ready for plotting in ggplot2

## Miscellaneous functions
#### adonis_OmegaSq

Calculate the unbiased effect size estimation (partial) omega-squared for adonis (PERMANOVA) models
Expand All @@ -32,20 +42,20 @@ Calculate the unbiased effect size estimation (partial) omega-squared for adonis

Wd* - robust distance-based multivariate analysis of variance (https://doi.org/10.1186/s40168-019-0659-9). This code is taken from https://github.com/alekseyenko/WdStar/. An alternative to PERMANOVA.

#### rcurve

Rarefaction curve (theoretical and fast) from a phyloseq object. Output ready for plotting in ggplot2

#### ps_refactor

Relevel the Sample variable in a psmelted phyloseq object, such that similar samples are plotted together with ggplot barcharts.

#### UniFrac.multi

With unrooted phylogenies UniFrac sets the root randomly on the tree.
The position of the root affects the results.
This function runs UniFrac multiple times in parallel, with different roots, and takes the average to smooth potential bias.

#### proportionality

Calculate proportionality on a phyloseq object or otu-table. Proposed by
Lovell et al. 2016 Proportionality: a valid alternative to correlation
for relative data
(<http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004075>)

## 16S rRNA gene copy number analyses
#### community\_rrna

Calculate the average 16S rRNA copy number of the OTUs in a
Expand All @@ -60,6 +70,7 @@ counts with a probability of the inverse 16S rRNA copy number, such that
besides rarefying the read counts the otu-table will be corrected for
the varying 16S rRNA copy numbers of the OTUs.

## Neutral model
#### neutral.fit

Fit neutral model developed by Sloan et al. (2006, Environ Microbiol 8(4):732-740) and implemented by Burns et al. (2015, ISME J 10(3):655-664).
Expand All @@ -68,13 +79,7 @@ Fit neutral model developed by Sloan et al. (2006, Environ Microbiol 8(4):732-74

Fit neutral model developed by Sloan et al. (2006, Environ Microbiol 8(4):732-740) and implemented by Burns et al. (2015, ISME J 10(3):655-664) several times on ramdomly picked samples and with 16S rRNA gene copy number corrected rarefaction (rarefy_rrna).

#### proportionality

Calculate proportionality on a phyloseq object or otu-table. Proposed by
Lovell et al. 2016 Proportionality: a valid alternative to correlation
for relative data
(<http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004075>)

## Beta diversity null models
#### ses.UniFrac

Standardized effect size of UniFrac, based on null models created with
Expand Down Expand Up @@ -127,7 +132,7 @@ A parallel version of the ses.mntd function from the picante package for signifi

Permutation test of z-matrix from `ses.comdist`, `ses.comdist2`, `ses.comdistnt`, `ses.comdistnt2` and `ses.UniFrac`.

### Copyright notice
## Copyright notice

`rarefy_rrna`: Some code is from vegan licensed under GPL-2
(<https://github.com/vegandevs/vegan>)
Expand Down
23 changes: 23 additions & 0 deletions man/ps_prune.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

0 comments on commit e813a01

Please sign in to comment.