-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.qmd
218 lines (162 loc) · 4.88 KB
/
README.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
---
title: "massSight"
format:
gfm:
toc: true
---
<!-- README.md is generated from README.qmd. Please edit that file -->
```{r, include = FALSE}
devtools::load_all()
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
options(cli.hyperlink = FALSE)
```
<img src="man/figures/massSight.png" align="right" width="30%"/></a>
[![](https://zenodo.org/badge/608216683.svg)](https://zenodo.org/badge/latestdoi/608216683)
`massSight` is an R package for combining and scaling LC-MS metabolomics data.
- Citation: if you use `massSight`, please cite our manuscript: Chiraag Gohel and Ali Rahnavard. (2023). massSight: Metabolomics meta-analysis through multi-study data scaling, integration, and harmonization. <https://github.com/omicsEye/massSight>
## Examples
Examples and extensive documentation can be found [here](omicseye.github.io/massSight/)
## Description
## Installation
First, if you don't have it installed, install `devtools` using:
```{r, eval = F}
install.packages("devtools")
```
Then, in an `R` console, run:
```{r, eval = F}
devtools::install_github("omicsEye/massSight")
```
You can then load the library using:
```{r, eval = F}
library(massSight)
```
## Data Preparation
`massSight` works with the output of LC-MS experiments, which should contain columns corresponding to:
1. Compound ID
2. Retention Time
3. Mass to Charge Ratio
4. (Optional) Average Intensity across all samples
5. (Optional) Metabolite Name
```{r, eval = T, echo = F}
data(hp1)
head(hp1) |>
knitr::kable()
```
### The `massSight` Object
`massSight` creates and uses the `MSObject` class to store data and results pertaining to individual LC-MS experiments. Prior to alignment, LC-MS data frames or tibbles should be converted into an `MSObject` using `create_ms_obj`:
```{r}
data(hp1)
data(hp2)
ms1 <-
create_ms_obj(
df = hp1,
name = "hp1",
id_name = "Compound_ID",
rt_name = "RT",
mz_name = "MZ",
int_name = "Intensity"
)
ms2 <-
create_ms_obj(
df = hp2,
name = "hp2",
id_name = "Compound_ID",
rt_name = "RT",
mz_name = "MZ",
int_name = "Intensity"
)
```
An `MSObject` provides the following functions:
* `raw_df()` to access the experiment's raw LC-MS data
* `isolated()` to access the experiment's isolated metabolites, which is important for downstream alignment tasks
* `scaled_df()` to access the experiment's scaled LC-MS data
* `consolidated()` to access the experiment's consolidated data
* `metadata()` to access the experiment's metadata
```{r}
ms2 |>
raw_df() |>
head() |>
knitr::kable(format = "simple")
```
## Alignment
### `auto_combine()`
Alignment is performed using `auto_combine()`
```{r, eval = T}
aligned <- auto_combine(
ms1,
ms2,
rt_lower = -.5,
rt_upper = .5,
mz_lower = -15,
mz_upper = 15,
smooth_method = "gam",
log = NULL
)
```
More information on the `auto_combine()` function can be found in the [package documentation](https://omicseye.github.io/massSight/reference/auto_combine.html)
### `ml_match()`
The `ml_match()` function is an alternative method for merging LC-MS experiments with semi-annotated data sets.
```{r, eval = F}
ml_match_aligned <- ml_match(
ms1,
ms2,
mz_thresh = 15,
rt_thresh = 0.5,
prob_thresh = .5,
seed = 72
)
```
## Results
Results from an alignment function are stored as a `MergedMSObject`. This object contains the following slots:
- `all_matched()`: All of the final matched metabolites between the two datasets. This is the main result of the various matching functions.
```{r}
all_matched(aligned) |>
head() |>
knitr::kable()
```
- `iso_matched()`: The matched isolated metabolites between the two datasets.
```{r}
iso_matched(aligned) |>
head() |>
knitr::kable()
```
### Plotting results from alignment
The `final_plots()` function returns plots containing information on RT and MZ drift for pre isolation, isolation, and final matching results. These plots can be used for diagnostic purposes.
```{r, eval = F}
plots <- final_plots(aligned,
rt_lim = c(-.5, .5),
mz_lim = c(-15, 15)
)
plots
```
![](man/figures/final_plot_out.png)
This plot can be saved locally using `ggsave()` from the `ggplot2` package:
```{r, eval = F}
ggplot2::ggsave(
filename = "plot.png",
plot = plots
)
```
### Using `massSight` to annotate unknown metabolites
```{r}
merged_df <- all_matched(aligned)
hp2_annotated <- merged_df |>
dplyr::select(Compound_ID_hp2, rep_Metabolite) |>
dplyr::inner_join(hp2, by = c("Compound_ID_hp2" = "Compound_ID"))
hp2_annotated |>
dplyr::filter(rep_Metabolite != "") |>
dplyr::arrange(rep_Metabolite) |>
head(10) |>
knitr::kable()
```
Here, `rep_Metabolite` is the metabolite name from the reference dataset.
## Dev Instructions
### Installation
1. Clone/pull `massSight`
2. Open the R project `massSight.Rproj`
3. Build package using `devtools::build()`
4. Install package using `devtools::install()`