💡 [Feature] How to launch dask workers on a second machine from a Jupyter environment #350

Open
huard opened this issue Jul 6, 2023 · 0 comments
Labels
enhancement New feature or request


huard commented Jul 6, 2023

Description

At Ouranos we have a second server that could provide additional CPUs for large computations. The idea is to create a dask client in a notebook, with an option to start the dask workers on the second machine.

Use case:

  • User opens Jupyter notebook
  • User launches Scheduler and n workers on 2nd machine
  • User creates client connected to Scheduler
  • User launches job using client
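One possible way to sketch the use case above is with `SSHCluster` from `dask.distributed`, which starts the scheduler on the first listed host and workers on the remaining ones. This is only a sketch under assumptions: it presumes passwordless SSH from the Jupyter machine to the second machine, `asyncssh` installed, and a compatible Python environment on the remote side. The helper names `build_ssh_cluster_kwargs` and `launch_remote_cluster` are hypothetical.

```python
def build_ssh_cluster_kwargs(second_host, n_workers, threads_per_worker=1):
    """Build keyword arguments for dask.distributed.SSHCluster.

    SSHCluster runs the scheduler on hosts[0] and workers on the other
    hosts, so listing the second machine twice places both there.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return {
        "hosts": [second_host, second_host],
        # n_workers asks for that many worker processes on the worker host.
        "worker_options": {
            "n_workers": n_workers,
            "nthreads": threads_per_worker,
        },
    }


def launch_remote_cluster(second_host, n_workers):
    """Start scheduler and workers over SSH, then connect a client.

    Assumption: SSH keys and the remote environment are already set up.
    """
    # Imported here so the kwargs helper works without dask installed.
    from dask.distributed import Client, SSHCluster

    cluster = SSHCluster(**build_ssh_cluster_kwargs(second_host, n_workers))
    client = Client(cluster)
    return cluster, client
```

From a notebook cell this might be called as `cluster, client = launch_remote_cluster("second-machine", 8)`, where `"second-machine"` is a placeholder hostname, after which jobs are submitted through `client` as usual.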

Questions

If jobs are I/O-bound, the network between the two machines might become the bottleneck.
This could eventually be improved by installing a 10 Gb network, or a private 1 Gb connection between the two machines.
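To get a rough feel for the bottleneck, the ideal transfer time for a dataset can be estimated from the link speed. The sketch below assumes the figures above are link speeds in gigabits per second and ignores protocol overhead, disk throughput, and contention, so real transfers would be slower. The function name `transfer_seconds` and the 100 GB example size are illustrative only.

```python
def transfer_seconds(size_bytes, link_gbps):
    """Ideal time to move size_bytes over a link of link_gbps gigabits/s.

    Assumes 1 Gb/s = 1e9 bits/s and no overhead of any kind.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9)


# Illustrative comparison for a hypothetical 100 GB dataset.
dataset_bytes = 100e9
for speed in (1, 10):
    print(f"{speed} Gb/s: {transfer_seconds(dataset_bytes, speed):.0f} s")
```

Under these assumptions, a tenfold faster link cuts the ideal transfer time tenfold, which only helps if the workers actually spend most of their time waiting on data.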

References

Dask cluster deployment documentation: https://docs.dask.org/en/stable/deploying.html

I suspect the workers need to live in the same conda environment as the notebook. This might mean running a second docker container on the 2nd machine, and creating a scheduler and a bunch of workers.
Docker deployment: https://docs.dask.org/en/stable/deploying-docker.html
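Whether the environments really match could be checked by comparing package versions between the notebook and the workers. The helper below is a minimal sketch that works on plain dictionaries mapping package name to version string; `find_version_mismatches` is a hypothetical name, and how those dictionaries are collected is left open. `distributed` also provides `Client.get_versions(check=True)`, which is intended to flag mismatches between client, scheduler, and workers.

```python
def find_version_mismatches(notebook_pkgs, worker_pkgs):
    """Return {package: (notebook_version, worker_version)} for differences.

    A package missing on one side is reported with None for that side.
    """
    mismatches = {}
    for name in sorted(set(notebook_pkgs) | set(worker_pkgs)):
        local = notebook_pkgs.get(name)
        remote = worker_pkgs.get(name)
        if local != remote:
            mismatches[name] = (local, remote)
    return mismatches


# Made-up version numbers, for illustration only.
notebook = {"dask": "2023.6.0", "numpy": "1.24.3", "xarray": "2023.5.0"}
worker = {"dask": "2023.6.0", "numpy": "1.23.5"}
print(find_version_mismatches(notebook, worker))
```

An empty result would suggest the two environments agree on the packages listed, though it says nothing about packages that were not included in the comparison.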

SSH deployment: https://docs.dask.org/en/stable/deploying-ssh.html

Concerned Organizations

@tlvu (Ouranos)

@huard huard added the enhancement New feature or request label Jul 6, 2023