scSubtype-related code: is it possible to call it directly? #16

Open
zhangjl-work opened this issue Jun 6, 2022 · 7 comments
Comments

zhangjl-work commented Jun 6, 2022

Hi, I have a batch of breast cancer single-cell data. Can I call the Highest_calls.R script directly?

I ran this code directly with the NatGen_Supplementary_table_S4.csv file, and the results don't seem right: the cancer cells extracted from each sample are divided among the four subtypes, and this happens in all of the dozen or so samples. Do you know why? Is this result reasonable? At present it is not consistent with the clinical results, but I don't know where the problem is.

Looking forward to your reply!

zhangjl-work changed the title from "scSubtype ,Is it possible to call directly?" to "scSubtype related code,Is it possible to call directly?" on Jun 6, 2022
Collaborator

dlroden commented Jun 27, 2022

Hi, thanks for your interest in our work. Could you please provide more detail to allow us to understand your question more clearly? At a minimum, a reproducible example of the input data, the specific code that you are running and the output would be needed to help further. Thanks

Author

zhangjl-work commented Jun 27, 2022 via email

@RegnerM2015

Hi @zhangjl-work,

You actually have to integrate your data with their training data first. Then you apply the scSubtyping script to the rescaled integrated expression data (your data + their training data).

Applying the scSubtyping script directly to your data alone will result in a random or even distribution of subtype calls.
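The effect described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' pipeline: the expression values are synthetic, the signature size, the `simulate_cell`, `zscore_genes` and `call_subtype` helpers are hypothetical, the score is a plain mean of scaled signature genes, and "integration" is reduced to scaling the query cells together with reference cells (no batch correction). It shows why scaling a single-subtype dataset on its own spreads the highest-score calls across all four subtypes, while scaling it together with a reference that contains every subtype recovers the expected call.

```python
import random
import statistics

SUBTYPES = ["LumA", "LumB", "Her2", "Basal"]
GENES_PER_SIGNATURE = 10  # hypothetical toy signature size


def simulate_cell(subtype, rng):
    """Toy expression vector: the signature genes of `subtype` are shifted up."""
    cell = []
    for s in SUBTYPES:
        mean = 3.0 if s == subtype else 0.0
        cell.extend(rng.gauss(mean, 1.0) for _ in range(GENES_PER_SIGNATURE))
    return cell


def zscore_genes(cells):
    """Scale each gene (column) to mean 0 and standard deviation 1 across cells."""
    columns = list(zip(*cells))
    means = [statistics.fmean(col) for col in columns]
    sds = [statistics.pstdev(col) for col in columns]
    return [[(x - m) / sd for x, m, sd in zip(cell, means, sds)] for cell in cells]


def call_subtype(scaled_cell):
    """Score each signature as the mean of its scaled genes; return the highest."""
    scores = {}
    for i, s in enumerate(SUBTYPES):
        block = scaled_cell[i * GENES_PER_SIGNATURE:(i + 1) * GENES_PER_SIGNATURE]
        scores[s] = statistics.fmean(block)
    return max(scores, key=scores.get)


def fraction_called(scaled_cells, subtype):
    """Fraction of cells whose highest-scoring signature is `subtype`."""
    calls = [call_subtype(cell) for cell in scaled_cells]
    return calls.count(subtype) / len(calls)


rng = random.Random(0)
# Query dataset: every cell is truly LumA (assumption of the toy example).
query = [simulate_cell("LumA", rng) for _ in range(200)]
# Reference ("training") data: 100 cells of each subtype.
reference = [simulate_cell(s, rng) for s in SUBTYPES for _ in range(100)]

# Case 1: scale the query alone, then call subtypes.
frac_alone = fraction_called(zscore_genes(query), "LumA")

# Case 2: scale query + reference together, then call only the query cells.
combined = zscore_genes(query + reference)
frac_integrated = fraction_called(combined[:len(query)], "LumA")

print(f"LumA calls, query scaled alone:          {frac_alone:.2f}")
print(f"LumA calls, query scaled with reference: {frac_integrated:.2f}")
```

When the query is scaled alone, every gene is centered on the query's own mean, so the subtype signal is subtracted out and the four scores differ only by noise. Scaling together with the reference keeps the signal, because the gene means then reflect all four subtypes.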

Author

zhangjl-work commented Jun 28, 2022 via email


RegnerM2015 commented Jun 28, 2022

Hi @zhangjl-work,

Question 1: Yes, use the training set data from the article.

Question 2: The format of the training set should be four Seurat objects (saved as RDS files). There should be one rds file for each subtype (LumA, LumB, Her2, and Basal). I am not familiar with the file TNBCmerged_Training_SwarbrickInferCNV.txt, perhaps @dlroden can clarify.

Question 3: I think SinglecellMolecularSubtypesignaturesonlytumor_SEPTEMBER2019.txt holds the gene signatures for each molecular subtype and Calculatingscoresandplotting.R computes the signature enrichment scores for these gene signatures.

I hope @dlroden and @sunnyzwu can clarify these.

@zhangjl-work
Author

Thank you very much for your reply, I will communicate with the person you recommended! Thanks again!

@shufanzhang

Hi @zhangjl-work,
Is your problem solved? I also want to know the format of the input data for scSubtype and where to get the training set file.
