Merge pull request #306 from rformassspectrometry/processingChunkSize
Enable chunk-wise processing for all peaks data functions
jorainer authored Nov 30, 2023
2 parents e299eb5 + 3ba908b commit 9ac911b
Showing 22 changed files with 755 additions and 203 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: Spectra
Title: Spectra Infrastructure for Mass Spectrometry Data
-Version: 1.13.1
+Version: 1.13.2
Description: The Spectra package defines an efficient infrastructure
for storing and handling mass spectrometry spectra and functionality to
subset, process, visualize and compare spectra data. It provides different
4 changes: 4 additions & 0 deletions NAMESPACE
@@ -1,5 +1,6 @@
# Generated by roxygen2: do not edit by hand

export("processingChunkSize<-")
export(MsBackendCached)
export(MsBackendDataFrame)
export(MsBackendHdf5Peaks)
@@ -27,6 +28,8 @@ export(plotMzDelta)
export(plotSpectra)
export(plotSpectraOverlay)
export(ppm)
export(processingChunkFactor)
export(processingChunkSize)
export(processingLog)
export(reduceSpectra)
export(scalePeaks)
@@ -63,6 +66,7 @@ exportMethods(addProcessing)
exportMethods(backendBpparam)
exportMethods(backendInitialize)
exportMethods(backendMerge)
exportMethods(backendParallelFactor)
exportMethods(bin)
exportMethods(c)
exportMethods(centroided)
12 changes: 12 additions & 0 deletions NEWS.md
@@ -1,5 +1,17 @@
# Spectra 1.13

## Changes in 1.13.2

- Add support for chunk-wise (parallel) processing of `Spectra` objects: the
new functions `processingChunkSize`, `backendParallelFactor` and
`processingChunkFactor` set or get the definition of chunks for parallel
processing. All functions working on peaks data use this mechanism, which
is implemented in the internal `.peaksapply` function. The `Spectra` object
gains a new slot `"processingChunkSize"` that defines the size of the
processing chunks for the `Spectra`. This also allows very large data sets
to be processed. See also [issue
#304](https://github.com/rformassspectrometry/Spectra/issues/304).
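
The chunking idea described in this NEWS entry can be illustrated with a minimal base R sketch. It is not the package's `.peaksapply` implementation: the helper `chunk_factor` and the toy `peaks` list are hypothetical names introduced only for illustration, and the sketch assumes chunks are consecutive runs of spectra.

```r
# Hypothetical helper (not part of Spectra): assign each of `n` elements
# to a consecutive chunk holding at most `chunk_size` elements.
chunk_factor <- function(n, chunk_size) {
    factor(ceiling(seq_len(n) / chunk_size))
}

# Toy stand-in for peaks data: one intensity vector per "spectrum".
peaks <- lapply(1:10, function(i) c(i, 2 * i, 3 * i))

f <- chunk_factor(length(peaks), chunk_size = 4)

# Process one chunk at a time, so only that chunk is worked on at once.
# Here each intensity vector is scaled to a maximum of 1.
res_chunks <- lapply(split(peaks, f), function(chunk) {
    lapply(chunk, function(p) p / max(p))
})

# Chunks are consecutive, so concatenating them restores the input order.
res <- unlist(res_chunks, recursive = FALSE, use.names = FALSE)
```

Because each chunk is processed independently, the inner `lapply` over chunks could in principle be swapped for a parallel apply function; that choice is left out of the sketch.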

## Changes in 1.13.1

- Fix issue with `bin` function (see
3 changes: 3 additions & 0 deletions R/AllGenerics.R
@@ -11,6 +11,9 @@ setGeneric("backendMerge", def = function(object, ...)
standardGeneric("backendMerge"),
valueClass = "MsBackend")
#' @rdname hidden_aliases
setGeneric("backendParallelFactor", def = function(object, ...)
standardGeneric("backendParallelFactor"))
#' @rdname hidden_aliases
setMethod("bin", "numeric", MsCoreUtils::bin)
setGeneric("combinePeaks", function(object, ...)
standardGeneric("combinePeaks"))
21 changes: 20 additions & 1 deletion R/MsBackend.R
@@ -9,6 +9,8 @@
#' @aliases supportsSetBackend
#' @aliases backendBpparam
#' @aliases backendInitialize
#' @aliases backendParallelFactor,MsBackendMzR-method
#' @aliases backendParallelFactor,MsBackendHdf5Peaks-method
#'
#' @description
#'
@@ -212,7 +214,9 @@
#' because they contain a connection to a database that can not be
#' shared across processes) should extend this method to return only
#' `SerialParam()` and hence disable parallel processing for (most)
#' methods and functions.
#' methods and functions. See also `backendParallelFactor` for a
#' function to provide a preferred splitting of the backend for parallel
#' processing.
#'
#' - `backendInitialize`: initialises the backend. This method is
#' supposed to be called right after creating an instance of the
@@ -233,6 +237,14 @@
#' instance. All objects to be merged have to be of the same type (e.g.
#' [MsBackendDataFrame()]).
#'
#' - `backendParallelFactor`: returns a `factor` defining the preferred
#' way to split the backend for parallel processing. It is used by all
#' peak data accessor and data manipulation functions.
#' The default implementation returns a factor of length 0 (`factor()`)
#' and thus provides no default splitting. `backendParallelFactor` for
#' `MsBackendMzR`, on the other hand, returns `factor(dataStorage(object))`,
#' suggesting that the object be split by data file.
#'
#' - `dataOrigin`: gets a `character` of length equal to the number of spectra
#' in `object` with the *data origin* of each spectrum. This could e.g. be
#' the mzML file from which the data was read.
@@ -849,6 +861,13 @@ setMethod("backendMerge", "MsBackend", function(object, ...) {
stop("Not implemented for ", class(object), ".")
})

#' @exportMethod backendParallelFactor
#'
#' @rdname MsBackend
setMethod("backendParallelFactor", "MsBackend", function(object, ...) {
factor()
})
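
The pattern used here (a default method returning an empty factor, with file-based backends overriding it) can be sketched with toy S4 classes. All names below (`splitHint`, `ToyBackend`, `ToyFileBackend`) are hypothetical stand-ins and not part of Spectra; the sketch only assumes that callers treat a factor of length 0 as "no preferred splitting".

```r
library(methods)

# Hypothetical generic standing in for `backendParallelFactor`.
setGeneric("splitHint", function(object, ...) standardGeneric("splitHint"))

# Toy backend classes; `storage` mimics a per-spectrum data file name.
setClass("ToyBackend", slots = c(storage = "character"))
setClass("ToyFileBackend", contains = "ToyBackend")

# Default: no preferred splitting, signalled by a factor of length 0.
setMethod("splitHint", "ToyBackend", function(object, ...) factor())

# File-based backend: suggest one chunk per data file.
setMethod("splitHint", "ToyFileBackend", function(object, ...) {
    factor(object@storage, levels = unique(object@storage))
})

generic_hint <- splitHint(new("ToyBackend", storage = c("a", "b")))
file_hint <- splitHint(new("ToyFileBackend", storage = c("b", "b", "a")))

# A caller can test the length to decide whether to split at all.
has_hint <- length(file_hint) > 0
```

A backend that cannot be split meaningfully simply inherits the default and needs no method of its own.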

#' @rdname MsBackend
setMethod("export", "MsBackend", function(object, ...) {
stop(class(object), " does not support export of data; please provide a ",
4 changes: 4 additions & 0 deletions R/MsBackendHdf5Peaks.R
@@ -306,3 +306,7 @@ setMethod("backendMerge", "MsBackendHdf5Peaks", function(object, ...) {
validObject(res)
res
})

setMethod("backendParallelFactor", "MsBackendHdf5Peaks", function(object) {
factor(dataStorage(object), levels = unique(dataStorage(object)))
})
4 changes: 4 additions & 0 deletions R/MsBackendMzR.R
@@ -210,3 +210,7 @@ setMethod("export", "MsBackendMzR", function(object, x, file = tempfile(),
MoreArgs = list(format = format, copy = copy),
BPPARAM = BPPARAM)
})

setMethod("backendParallelFactor", "MsBackendMzR", function(object) {
factor(dataStorage(object), levels = unique(dataStorage(object)))
})
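
Both methods pass `levels = unique(dataStorage(object))` rather than relying on the default levels of `factor()`. One visible effect, shown in the base R sketch below, is that chunks follow the order in which files first appear instead of alphabetical order. The file names are made up, and the commit itself does not state the motivation for this choice.

```r
# Made-up per-spectrum storage values; "b.mzML" appears before "a.mzML".
storage <- c("b.mzML", "b.mzML", "a.mzML", "a.mzML", "c.mzML")

f_sorted <- factor(storage)                            # levels sorted
f_order  <- factor(storage, levels = unique(storage))  # order of appearance

# Splitting by the sorted factor reorders the spectra when chunks are
# concatenated again; splitting by order of appearance does not.
idx_sorted <- unlist(split(seq_along(storage), f_sorted), use.names = FALSE)
idx_order  <- unlist(split(seq_along(storage), f_order), use.names = FALSE)
```

With `f_order`, concatenating the per-file chunks gives back the original index sequence, whereas with `f_sorted` the spectra from `a.mzML` would come first.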