Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convergence plot showing bizarre reading, data okay to use? #241

Open
jcshuy opened this issue Feb 14, 2024 · 2 comments
Open

Convergence plot showing bizarre reading, data okay to use? #241

jcshuy opened this issue Feb 14, 2024 · 2 comments

Comments

@jcshuy
Copy link

jcshuy commented Feb 14, 2024

Hi, sorry if this was already addressed in a previous issue-- I was not able to source a solution to this. I ran Harmony on a relatively large (~21k cells) merged Seurat v4 object with the following command:
seurObj <- RunHarmony(seurObj, 'orig.ident', plot_convergence = T, early_stop = F, max_iter = 40)
and afterwards the convergence plot showed a result like this:
image
I also checked the UMAP and believe the two did not converge correctly. However, I am confused why the result turned out like this. I have used Harmony in the past and usually they will converge without issue if setting a larger max_iter value along with early_stop = F. Does anybody have a solution to remedy this?

Session info:

R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.2.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeq2_1.42.0               reticulate_1.35.0           pheatmap_1.0.12            
 [4] reshape2_1.4.4              viridis_0.6.5               viridisLite_0.4.2          
 [7] topGO_2.54.0                SparseM_1.81                GO.db_3.18.0               
[10] graph_1.80.0                plyr_1.8.9                  biomaRt_2.58.2             
[13] magrittr_2.0.3              zinbwave_1.24.0             Matrix_1.6-5               
[16] edgeR_4.0.15                limma_3.58.1                cowplot_1.1.3              
[19] UpSetR_1.4.0                org.Hs.eg.db_3.18.0         org.Mm.eg.db_3.18.0        
[22] AnnotationDbi_1.64.1        clusterProfiler_4.10.0      randomcoloR_1.1.0.1        
[25] DropletUtils_1.22.0         patchwork_1.2.0             RColorBrewer_1.1-3         
[28] EnhancedVolcano_1.20.0      ggrepel_0.9.5               lubridate_1.9.3            
[31] forcats_1.0.0               stringr_1.5.1               dplyr_1.1.4                
[34] purrr_1.0.2                 readr_2.1.5                 tidyr_1.3.1                
[37] tibble_3.2.1                tidyverse_2.0.0             SCpubr_2.0.2               
[40] harmony_1.2.0               Rcpp_1.0.12                 scRNAseq_2.16.0            
[43] scater_1.30.1               ggplot2_3.4.4               scran_1.30.2               
[46] scuttle_1.12.0              SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0
[49] Biobase_2.62.0              GenomicRanges_1.54.1        GenomeInfoDb_1.38.6        
[52] IRanges_2.36.0              S4Vectors_0.40.2            MatrixGenerics_1.14.0      
[55] matrixStats_1.2.0           BiocGenerics_0.48.1         SeuratObject_4.1.4         
[58] Seurat_4.4.0               

loaded via a namespace (and not attached):
  [1] R.methodsS3_1.8.2             progress_1.2.3                goftest_1.2-3                
  [4] Biostrings_2.70.2             HDF5Array_1.30.0              vctrs_0.6.5                  
  [7] spatstat.random_3.2-2         digest_0.6.34                 png_0.1-8                    
 [10] deldir_2.0-2                  parallelly_1.36.0             MASS_7.3-60.0.1              
 [13] httpuv_1.6.14                 qvalue_2.34.0                 withr_3.0.0                  
 [16] ggrastr_1.0.2                 ggfun_0.1.4                   ellipsis_0.3.2               
 [19] survival_3.5-7                memoise_2.0.1                 ggbeeswarm_0.7.2             
 [22] gson_0.1.0                    tidytree_0.4.6                zoo_1.8-12                   
 [25] V8_4.4.1                      pbapply_1.7-2                 R.oo_1.26.0                  
 [28] prettyunits_1.2.0             KEGGREST_1.42.0               promises_1.2.1               
 [31] httr_1.4.7                    restfulr_0.0.15               globals_0.16.2               
 [34] fitdistrplus_1.1-11           rhdf5filters_1.14.1           rhdf5_2.46.1                 
 [37] rstudioapi_0.15.0             miniUI_0.1.1.1                generics_0.1.3               
 [40] DOSE_3.28.2                   curl_5.2.0                    zlibbioc_1.48.0              
 [43] ScaledMatrix_1.10.0           ggraph_2.1.0                  polyclip_1.10-6              
 [46] GenomeInfoDbData_1.2.11       ExperimentHub_2.10.0          SparseArray_1.2.4            
 [49] interactiveDisplayBase_1.40.0 xtable_1.8-4                  S4Arrays_1.2.0               
 [52] BiocFileCache_2.10.1          hms_1.1.3                     irlba_2.3.5.1                
 [55] colorspace_2.1-0              filelock_1.0.3                ROCR_1.0-11                  
 [58] spatstat.data_3.0-4           lmtest_0.9-40                 later_1.3.2                  
 [61] ggtree_3.10.0                 lattice_0.22-5                spatstat.geom_3.2-8          
 [64] future.apply_1.11.1           genefilter_1.84.0             scattermore_1.2              
 [67] XML_3.99-0.16.1               shadowtext_0.1.3              RcppAnnoy_0.0.22             
 [70] pillar_1.9.0                  nlme_3.1-164                  compiler_4.3.2               
 [73] beachmat_2.18.0               stringi_1.8.3                 tensor_1.5                   
 [76] GenomicAlignments_1.38.2      crayon_1.5.2                  abind_1.4-5                  
 [79] BiocIO_1.12.0                 gridGraphics_0.5-1            locfit_1.5-9.8               
 [82] sp_2.1-3                      graphlayouts_1.1.0            bit_4.0.5                    
 [85] fastmatch_1.1-4               codetools_0.2-19              BiocSingular_1.18.0          
 [88] plotly_4.10.4                 mime_0.12                     splines_4.3.2                
 [91] dbplyr_2.4.0                  sparseMatrixStats_1.14.0      HDO.db_0.99.1                
 [94] blob_1.2.4                    utf8_1.2.4                    BiocVersion_3.18.1           
 [97] AnnotationFilter_1.26.0       fs_1.6.3                      listenv_0.9.1                
[100] DelayedMatrixStats_1.24.0     ggplotify_0.1.2               statmod_1.5.0                
[103] tzdb_0.4.0                    tweenr_2.0.2                  pkgconfig_2.0.3              
[106] tools_4.3.2                   cachem_1.0.8                  RhpcBLASctl_0.23-42          
[109] RSQLite_2.3.5                 DBI_1.2.1                     fastmap_1.1.1                
[112] scales_1.3.0                  grid_4.3.2                    ica_1.0-3                    
[115] Rsamtools_2.18.0              AnnotationHub_3.10.0          BiocManager_1.30.22          
[118] RANN_2.6.1                    farver_2.1.1                  tidygraph_1.3.1              
[121] scatterpie_0.2.1              yaml_2.3.8                    rtracklayer_1.62.0           
[124] cli_3.6.2                     leiden_0.4.3.1                lifecycle_1.0.4              
[127] uwot_0.1.16                   bluster_1.12.0                BiocParallel_1.36.0          
[130] annotate_1.80.0               timechange_0.3.0              gtable_0.3.4                 
[133] rjson_0.2.21                  ggridges_0.5.6                progressr_0.14.0             
[136] parallel_4.3.2                ape_5.7-1                     softImpute_1.4-1             
[139] jsonlite_1.8.8                bitops_1.0-7                  bit64_4.0.5                  
[142] Rtsne_0.17                    yulab.utils_0.1.4             spatstat.utils_3.0-4         
[145] BiocNeighbors_1.20.2          metapod_1.10.1                GOSemSim_2.28.1              
[148] dqrng_0.3.2                   R.utils_2.12.3                lazyeval_0.2.2               
[151] shiny_1.8.0                   htmltools_0.5.7               enrichplot_1.22.0            
[154] sctransform_0.4.1             rappdirs_0.3.3                ensembldb_2.26.0             
[157] glue_1.7.0                    XVector_0.42.0                RCurl_1.98-1.14              
[160] treeio_1.26.0                 gridExtra_2.3                 igraph_2.0.1.1               
[163] R6_2.5.1                      labeling_0.4.3                GenomicFeatures_1.54.3       
[166] cluster_2.1.6                 Rhdf5lib_1.24.2               aplot_0.2.2                  
[169] DelayedArray_0.28.0           tidyselect_1.2.0              vipor_0.4.7                  
[172] ProtGenerics_1.34.0           ggforce_0.4.1                 xml2_1.3.6                   
[175] future_1.33.1                 rsvd_1.0.5                    munsell_0.5.0                
[178] KernSmooth_2.23-22            data.table_1.15.0             htmlwidgets_1.6.4            
[181] fgsea_1.28.0                  rlang_1.1.3                   spatstat.sparse_3.0-3        
[184] spatstat.explore_3.2-6        fansi_1.0.6                   beeswarm_0.4.0     ```
@pati-ni
Copy link
Collaborator

pati-ni commented Feb 16, 2024

The convergence plot looks strange.

Can you share some extra information about the datasets (number of batches, differences expected between batches)? From the top of my head I would encourage you to set lambda=NULL to see whether this improves things.

Also, how many PCs are you using for this?

@jcshuy
Copy link
Author

jcshuy commented Feb 22, 2024

Sorry for the late response. To be honest I am still quite new to single cell analysis and I am not too familiar of how this dataset was processed; I believe it may be two batches (there are 2 control samples and two treated samples).

After QC I have run 75 PCs for this dataset
seurObj <- NormalizeData(seurObj) %>% FindVariableFeatures(selection.method = "vst") %>% ScaleData(vars.to.regress = c('nCount_RNA','nFeature_RNA')) %>% RunPCA(npcs = 75, verbose = T)
and the resulting PCA plot looks like this:
Screenshot 2024-02-21 at 9 18 05 PM

Running harmony with lambda set to NULL results in this following convergence plot:
Screenshot 2024-02-21 at 9 23 32 PM

It also appears that the harmony plot is unchanged from the original unintegrated plot.
Screenshot 2024-02-21 at 9 25 11 PM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants