Monopogen treats multiple samples as single sample in germline SNV calling #72

yu-tong-wang · 2024-08-14T20:13:09Z

I've encountered an issue where Monopogen appears to process multiple samples as a single sample during germline SNV calling, even though the preprocessing step handles them correctly as separate samples (one sample per row).

Key observations:
Preprocessing correctly identifies 4 samples.
Germline calling reports "1 samples in 4 input files".
All filtered BAM files have identical RG tags (SM:atac_possorted_bam).
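
To make the last observation concrete, here is a minimal sketch (not Monopogen code) of how the distinct SM values can be counted from SAM header text, for example as printed by `samtools view -H`. The function name `sample_names` and the header strings are illustrative assumptions. My understanding is that mpileup groups reads by the SM value of their read group, which would explain the "1 samples" message.

```python
def sample_names(header_text):
    """Return the set of SM values found on @RG lines of a SAM header."""
    names = set()
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # @RG fields are tab-separated TAG:VALUE pairs
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                names.add(field[3:])
    return names


# Illustrative headers for four filtered BAM files (assumed, not real output).
# All of them carry the same SM value, as in the report above.
headers = [
    "@HD\tVN:1.6\tSO:coordinate\n@RG\tID:rg1\tSM:atac_possorted_bam\n"
    for _ in range(4)
]

distinct = set().union(*(sample_names(h) for h in headers))
print(f"{len(distinct)} distinct SM value(s) in {len(headers)} input files")
```

If the four headers carried four different SM values, the same count would come out as 4.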

I suspect the identical RG tags may cause this behavior. Is there a way to force Monopogen to treat each input file as a separate sample? Alternatively, should we modify the RG tags in our original BAM files? Any guidance on resolving this issue would be appreciated. Let me know if you need any additional information.

Update: the RG tags are unique in my original BAM files. The preprocessing module treated the 4 samples as separate samples, but it also rewrote the RG information so that it is identical in all of the filtered BAM files. The BamFilter function in germline.py should therefore be modified to preserve the original read group information when it creates the filtered BAM files. The problem most likely arose because my original BAM files have the same name in different folders, which is very common.
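
As a possible workaround until BamFilter is changed, the SM value in each filtered BAM header could be replaced with a name that is unique per input file. The sketch below is only an assumption about how that might look and is not part of Monopogen. The helper names `unique_sample_name` and `set_sample_name` and the folder name `donorA` are hypothetical. It only edits header text; the edited header would still have to be written back to the BAM, for example with `samtools reheader`.

```python
from pathlib import Path


def unique_sample_name(bam_path):
    """Build a sample name from the parent folder and the file stem.

    This keeps files with the same name in different folders distinct.
    """
    p = Path(bam_path)
    parent = p.parent.name
    return f"{parent}_{p.stem}" if parent else p.stem


def set_sample_name(header_text, sample):
    """Return the header with SM on every @RG line set to `sample`."""
    out = []
    for line in header_text.splitlines():
        if line.startswith("@RG"):
            # Drop any existing SM field, then append the new one
            fields = [f for f in line.split("\t") if not f.startswith("SM:")]
            fields.append(f"SM:{sample}")
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical path and header, for illustration only
name = unique_sample_name("donorA/atac_possorted_bam.bam")
header = "@HD\tVN:1.6\n@RG\tID:rg1\tSM:atac_possorted_bam\n"
print(set_sample_name(header, name), end="")
```

Using the parent folder in the name is a design choice that fits my layout, where each sample lives in its own folder. Any other scheme that yields one unique SM value per input file should work as well.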

bash atac_out/Script/runGermline_chr20.sh
[mpileup] 1 samples in 4 input files
(mpileup) Max depth is above 1M. Potential memory hog!
Lines total/split/realigned/skipped: 61434549/866633/85422/0
[2024-08-14 12:29:23,079] INFO Monopogen.py Success! See instructions above.

jinzhuangdou · 2024-09-14T14:44:28Z

Good suggestion. We will add different RG groups during the bamFilter step soon.

SiyuanHuang1 · 2024-11-07T18:42:24Z

Same issue! Are there any updates? @jinzhuangdou

SiyuanHuang1 · 2024-11-07T18:46:31Z

Hello Yutong, have you resolved this problem? @yu-tong-wang
