input data normalisation #48

bjstewart1 · 2023-10-25T14:25:10Z

Can you clarify what preprocessing is expected for the RNA assay (gene expression) data?
Is the expected input integer (raw) counts, or normalised and log-transformed counts, or similar?

This isn't totally clear in the vignette.

zhenzuo2 · 2023-11-01T17:12:04Z

I also have a similar question. I am not sure if the returned Seurat object contains the data used for the generalized linear regression, such as the vectors from the RNA assay or the ATAC assay.

zhenzuo2 · 2023-11-01T19:41:17Z

Here is an example. I can access the normalized expression level of PAX6, but I am not sure how to access the value of "chr1-911275-911316".

zhenzuo2 · 2023-11-01T19:43:10Z

@bjstewart1 Using this as an example, your question is what the values for PAX6 are. Are they normalized values, scaled values, or raw counts?

bjstewart1 · 2023-11-01T20:56:59Z

No, my question is what the input data for the tool are. Are the RNA integer counts meant to be processed to normalised/log-transformed values?

joschif · 2023-11-09T08:41:33Z

Hi @bjstewart1, currently Pando expects log-normalized data (for RNA) and TF-IDF-normalized data (for ATAC) as input, and it will generally use that by default if it's in the data slot of your assay. I've thought about implementing an option to run it on raw counts, though. Essentially, that would require other noise models for the GLMs and accounting for library size covariates in the models.
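For readers wondering what "log-normalized" means in practice, here is a minimal, language-agnostic sketch in plain Python. It is not Pando's code. It assumes the common recipe of dividing each cell's counts by that cell's total, multiplying by a scale factor (10,000 here, an assumed default), and applying log1p. The helper names `log_normalize` and `looks_like_raw_counts` are hypothetical and exist only for this illustration.

```python
import math

def log_normalize(counts, scale_factor=10_000):
    """Log-normalize a genes x cells matrix given as a list of rows.

    Each count is divided by its cell (column) total, multiplied by
    scale_factor, and passed through log1p. Cells with zero total stay 0.
    """
    n_cells = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_cells)]
    return [
        [
            math.log1p(row[j] / totals[j] * scale_factor) if totals[j] else 0.0
            for j in range(n_cells)
        ]
        for row in counts
    ]

def looks_like_raw_counts(matrix):
    """Heuristic check: raw counts are non-negative whole numbers."""
    return all(v >= 0 and float(v).is_integer() for row in matrix for v in row)

# Toy example: 3 genes (rows) x 2 cells (columns).
raw = [
    [0, 3],
    [5, 1],
    [5, 0],
]
norm = log_normalize(raw)

print(looks_like_raw_counts(raw))   # whole numbers, so this looks like raw counts
print(looks_like_raw_counts(norm))  # fractional values, so this looks normalized
```

A check like `looks_like_raw_counts` is only a heuristic, but it can be a quick way to see whether a matrix still holds integer counts before passing it to a tool that expects normalized values.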

bjstewart1 · 2023-11-09T09:36:19Z

Thanks @joschif, really helpful. Can I suggest that you make it a bit clearer what these input requirements are in the README/vignettes?

elhaam · 2024-04-17T16:13:19Z

Hi @joschif,

Thanks for your response above! I have a follow-up question. When pre-processing data, do you suggest standard QC and filtering (for example min.cells = 3, min.features = 200) for the RNA-seq data? I believe the tutorial did not perform QC steps to filter out genes, since I see ~31k genes in the RNA data. Could you please clarify whether we need to keep all the genes, then normalize and apply log1p?

Thanks,
Elham
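To make the thresholds in the question concrete, here is a small plain-Python sketch of what filters like min.cells and min.features do to a genes x cells count matrix. It is an illustration only, and it does not say whether such filtering is recommended for Pando. The function name `filter_counts` is hypothetical, and the order of the two filters (cells first, then genes) is an assumption of this sketch.

```python
def filter_counts(counts, gene_names, min_cells=3, min_features=200):
    """Filter a genes x cells count matrix given as a list of rows.

    A cell is kept if at least `min_features` genes are detected in it.
    A gene is kept if it is detected in at least `min_cells` kept cells.
    "Detected" means a count greater than zero.
    """
    n_cells = len(counts[0]) if counts else 0

    # Step 1 (assumed order): drop cells with too few detected genes.
    genes_per_cell = [sum(1 for row in counts if row[j] > 0) for j in range(n_cells)]
    kept_cells = [j for j in range(n_cells) if genes_per_cell[j] >= min_features]

    # Step 2: drop genes detected in too few of the remaining cells.
    kept_rows, kept_genes = [], []
    for name, row in zip(gene_names, counts):
        sub = [row[j] for j in kept_cells]
        if sum(1 for v in sub if v > 0) >= min_cells:
            kept_rows.append(sub)
            kept_genes.append(name)

    return kept_rows, kept_genes, kept_cells

# Toy example with small thresholds so the effect is visible.
toy = [
    [1, 0, 2, 0],  # geneA
    [3, 1, 1, 0],  # geneB
    [0, 0, 4, 1],  # geneC
]
filtered, genes, cells = filter_counts(
    toy, ["geneA", "geneB", "geneC"], min_cells=2, min_features=2
)
print(genes, cells, filtered)
```

In the toy example, only the first and third cells have two or more detected genes, and geneC is then dropped because it is detected in just one of those two cells.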

bjstewart1 closed this as completed Nov 1, 2023

bjstewart1 reopened this Nov 1, 2023

JABioinf mentioned this issue Jan 24, 2024

Questions on integration and interpretation #55

Closed
