Continuous pancreas notebook not reproducible #4

rrydbirk · 2023-10-25T10:41:58Z

I just downloaded and ran the entire notebook for pancreas using the continuous model and CPUs instead of GPU. The velocities are reversed.

Complete notebook as PDF: veloVAE_panc… - JupyterLab.pdf

/work/01_notebooks via 🅒 velovae 
[ 12:36:44 ] ➜  pip freeze
anndata==0.9.2
anyio==4.0.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.2.3
asttokens==2.4.0
async-lru==2.0.4
attrs==23.1.0
Babel==2.12.1
backcall==0.2.0
beautifulsoup4==4.12.2
bleach==6.0.0
certifi==2023.7.22
cffi==1.16.0
charset-normalizer==3.2.0
click==8.1.7
cmake==3.27.5
comm==0.1.4
contourpy==1.1.1
cycler==0.11.0
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
executing==1.2.0
fastjsonschema==2.18.0
filelock==3.12.4
fonttools==4.42.1
fqdn==1.5.1
h5py==3.9.0
hnswlib==0.7.0
idna==3.4
igraph==0.10.8
ipykernel==6.25.2
ipython==8.15.0
ipython-genutils==0.2.0
ipywidgets==8.1.1
isoduration==20.11.0
jedi==0.19.0
Jinja2==3.1.2
joblib==1.3.2
json5==0.9.14
jsonpointer==2.4
jsonschema==4.19.1
jsonschema-specifications==2023.7.1
jupyter-contrib-core==0.4.2
jupyter-contrib-nbextensions==0.7.0
jupyter-events==0.7.0
jupyter-highlight-selected-word==0.2.0
jupyter-lsp==2.2.0
jupyter-nbextensions-configurator==0.6.3
jupyter_client==8.3.1
jupyter_core==5.3.2
jupyter_server==2.7.3
jupyter_server_terminals==0.4.4
jupyterlab==4.0.6
jupyterlab-pygments==0.2.2
jupyterlab-widgets==3.0.9
jupyterlab_server==2.25.0
kiwisolver==1.4.5
lit==17.0.1
llvmlite==0.41.0
loess==2.1.2
loompy==3.0.7
lxml==4.9.3
MarkupSafe==2.1.3
matplotlib==3.5.1
matplotlib-inline==0.1.6
mistune==3.0.1
mpmath==1.3.0
natsort==8.4.0
nbclient==0.8.0
nbconvert==7.8.0
nbformat==5.9.2
nest-asyncio==1.5.8
networkx==3.1
notebook==7.0.4
notebook_shim==0.2.3
numba==0.58.0
numpy==1.23.5
numpy-groupies==0.10.1
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.2.10.91
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusparse-cu11==11.7.4.91
nvidia-nccl-cu11==2.14.3
nvidia-nvtx-cu11==11.7.91
overrides==7.4.0
packaging==23.1
pandas==1.4.1
pandocfilters==1.5.0
parso==0.8.3
patsy==0.5.3
pexpect==4.8.0
pickleshare==0.7.5
Pillow==10.0.1
platformdirs==3.10.0
plotbin==3.1.5
prometheus-client==0.17.1
prompt-toolkit==3.0.39
psutil==5.9.5
ptyprocess==0.7.0
pure-eval==0.2.2
pycparser==2.21
Pygments==2.16.1
pynndescent==0.5.10
pyparsing==3.1.1
pyspark==3.5.0
python-dateutil==2.8.2
python-json-logger==2.0.7
pytz==2023.3.post1
PyYAML==6.0.1
pyzmq==25.1.1
referencing==0.30.2
requests==2.31.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.10.3
scanpy==1.9.5
scikit-learn==1.3.1
scipy==1.11.3
scvelo==0.2.5
seaborn==0.12.2
Send2Trash==1.8.2
session-info==1.0.0
six==1.16.0
sniffio==1.3.0
soupsieve==2.5
stack-data==0.6.2
statsmodels==0.14.0
stdlib-list==0.9.0
sympy==1.12
tbb==2021.10.0
tensorly==0.8.1
terminado==0.17.1
texttable==1.6.7
threadpoolctl==3.2.0
tinycss2==1.2.1
torch==2.0.1
tornado==6.3.3
tqdm==4.62.3
traitlets==5.10.1
triton==2.0.0
typing_extensions==4.8.0
tzdata==2023.3
umap-learn==0.5.4
uri-template==1.3.0
urllib3==2.0.5
velovae @ file:///work/02_data/VeloVAE
wcwidth==0.2.6
webcolors==1.13
webencodings==0.5.1
websocket-client==1.6.3
widgetsnbextension==4.0.9

/work/01_notebooks via 🅒 velovae 
[ 12:36:46 ] ➜  python --version
Python 3.11.5


g-yichen · 2023-10-27T14:16:08Z

Thank you for the feedback. I can reproduce similar results with package versions different from the ones you listed, so your package versions are unlikely to be the cause.

The model performs poorly mainly because the default hyperparameters do not fit some datasets well. In my experience, the learning rate plays an important role. VeloVAE has two optimizers with different learning rates, one for the neural network weights and one for the ODE parameters. In this example, I changed the neural network learning rate to 1e-4 and the ODE parameter learning rate to 5e-3, and the model generated the expected results (see the figure below). Here's what you can try. When you call the training function, do:

config = {
    'learning_rate': 1e-4,
    'learning_rate_ode': 5e-3,
    'learning_rate_post': 1e-4
}
vae.train(adata,
          config=config,
          plot=False,
          gene_plot=gene_plot,
          figure_path=figure_path,
          embed="umap")

Here, the extra learning rate 'learning_rate_post' is the neural network learning rate in the second training stage. We usually set it to the same value as learning_rate.

Unfortunately, the model does not always perform well with default parameters. Here are some general suggestions about tuning the model:

Increasing learning_rate_ode and decreasing learning_rate usually improves performance when the model behaves poorly.
Consider using a VAE with a rate prior by setting full_vb=True when you create a VAE object. This helps regularize the ODE rate parameters by including a prior distribution on them.
If the cell latent time is reversed, try setting reverse_gene_mode to True when you create a VAE object.
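
As a rough sketch of the first suggestion, the snippet below builds a small set of candidate configs that lower learning_rate and raise learning_rate_ode, starting from the values that worked above. The helper name candidate_configs, the factor of 2, and the number of steps are assumptions made for illustration and are not part of VeloVAE. Only the three config keys come from this thread.

```python
from itertools import product


def candidate_configs(base_lr=1e-4, base_lr_ode=5e-3, steps=2, factor=2.0):
    """Yield config dicts that lower the network rate and raise the ODE rate.

    Hypothetical helper: the step count and factor are arbitrary choices.
    """
    lrs = [base_lr / factor**i for i in range(steps)]
    lr_odes = [base_lr_ode * factor**i for i in range(steps)]
    for lr, lr_ode in product(lrs, lr_odes):
        yield {
            "learning_rate": lr,
            "learning_rate_ode": lr_ode,
            # Second-stage network rate, kept equal to learning_rate as suggested above.
            "learning_rate_post": lr,
        }


configs = list(candidate_configs())
for cfg in configs:
    print(cfg)
    # For each candidate, create a fresh VAE object and train it, e.g.
    # vae.train(adata, config=cfg, plot=False, gene_plot=gene_plot,
    #           figure_path=figure_path, embed="umap")
```

Each candidate should be trained on a newly created VAE object so that runs do not share state. Comparing the resulting velocity plots then shows which setting gives the expected direction.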

We still have engineering work to do to find the best default hyperparameters. I hope this helps!

jw156605 · 2023-10-27T14:46:56Z

To clarify, Yichen only tested the model on single-precision GPUs. Moving to double precision on CPU likely changes when the early stopping criterion is triggered or something about the relative balance of ODE and neural net losses.
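
To illustrate the general mechanism only, here is a toy sketch of how floating-point precision can shift the epoch at which a "no further improvement" check fires. The loss sequence, the stopping rule, and the function names are invented for this example and are not VeloVAE's actual early stopping criterion.

```python
import struct


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def first_stalled_epoch(cast, max_epochs=100):
    """Return the first epoch at which the toy loss stops improving."""
    prev = cast(2.0)
    for epoch in range(1, max_epochs + 1):
        # Toy loss that decays toward 1.0, stored at the chosen precision.
        loss = cast(1.0 + 2.0 ** -epoch)
        if prev - loss <= 0.0:
            return epoch
        prev = loss
    return None


stop32 = first_stalled_epoch(to_float32)  # single precision
stop64 = first_stalled_epoch(float)       # double precision
print("float32 stalls at epoch", stop32)
print("float64 stalls at epoch", stop64)
```

In this toy, the single-precision run stalls earlier because small improvements are rounded away sooner. A real model differs in many ways, but the same kind of effect could move the point at which training stops.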

rrydbirk · 2023-10-30T09:04:25Z

Thanks for the comments.

Perhaps you could include this in the vignette/notebook, or just label this issue as good for beginners. Otherwise, feel free to close.
