Building software on compute nodes #146

Open
pmenstrom opened this issue Oct 15, 2024 · 2 comments

Comments

@pmenstrom
Collaborator

I am not sure whether the documentation mentions that, for large, compute-intensive compilations, users are probably better off compiling their code on a compute node. In ticket SUP-6253, the user's compilation was much faster on a compute node.

@lhelms2
Collaborator

lhelms2 commented Oct 15, 2024

@pmenstrom, I can't view tickets in the new Jira. Can you please include a screenshot of the most relevant portion of the discussion?

@pmenstrom
Collaborator Author

The user thought the compile wasn't working. I had him watch for files in his temp directory, and he saw them being created and increasing in size. I responded:

Peter Enstrom: Yeah, I think that it is working but just taking a long time. You can get more power on a compute node instead of sharing with everyone else on the login node. If you use srun to start an interactive job, you will get a shell on a compute node, and it will have access to all of the same files and your home directory.
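
For anyone following along, here is a rough sketch of what such an interactive `srun` request might look like, written as a small Python helper that only assembles and prints the command line. The account and partition names are placeholders, and the CPU, memory, and time values are illustrative assumptions rather than recommended Delta settings; check the site documentation for the real values.

```python
import shlex


def build_srun_command(account, partition, cpus=16, mem_gb=32, minutes=60):
    """Assemble an srun command that opens an interactive shell on one compute node.

    All argument values are assumptions supplied by the caller; nothing here
    is specific to a particular cluster.
    """
    args = [
        "srun",
        f"--account={account}",      # allocation to charge (placeholder)
        f"--partition={partition}",  # partition name (placeholder)
        "--nodes=1",
        "--ntasks=1",
        f"--cpus-per-task={cpus}",   # cores available to the build
        f"--mem={mem_gb}G",          # memory for the job
        f"--time={minutes}",         # wall-clock limit, in minutes
        "--pty",                     # attach a pseudo-terminal
        "bash",                      # the shell to run on the compute node
    ]
    return shlex.join(args)


# Hypothetical account and partition names, for illustration only.
print(build_srun_command("my_account", "my_partition"))
```

Running the printed command on the login node should, once the job is scheduled, give a shell on a compute node where the build can be started as usual.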

Deep Chatterjee: Need to sign off for the day, but I wanted to point out that the max disk usage of the build below, following my comment, was 17G:

17G /scratch/bcse/deep1018/build-temp-2864596409

And the build did proceed, albeit taking a very long time. However, that still does not explain why the 26G limit on the tmpfs partition gave a space error. It would be good to investigate that.

Peter Enstrom: I am thinking that there might be some intermediary files?
Glad that it is working.

Deep Chatterjee: I am letting you know that retrying the builds on a compute node works without issues with the default APPTAINER_TMPDIR, i.e. /tmp. It also takes less time (or at least as much time as I had seen before I hit the error, when I created this issue). So I am happy to mark this as resolved. But I wanted to suggest that it would be nice to have this suggestion of trying the build on a compute node added to the Delta documentation. Many thanks for your help.
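
As a possible aid for the documentation, a minimal sketch of a pre-build check follows: it looks up the temporary directory the build would use and reports whether enough space is free. It assumes, as described in the ticket, that the build writes to `APPTAINER_TMPDIR` and falls back to `/tmp` when that variable is unset. The 17 GiB threshold is just the peak usage reported above, and the helper names are made up for illustration.

```python
import os
import shutil


def build_tmpdir(env=None):
    """Return the directory the build is assumed to use for temporary files."""
    env = os.environ if env is None else env
    # Assumption from the ticket: /tmp is the default when the variable is unset.
    return env.get("APPTAINER_TMPDIR", "/tmp")


def has_room(path, needed_gib):
    """Return (enough_space, free_gib) for the filesystem that holds `path`."""
    free_gib = shutil.disk_usage(path).free / 2**30
    return free_gib >= needed_gib, free_gib


if __name__ == "__main__":
    tmpdir = build_tmpdir()
    ok, free_gib = has_room(tmpdir, needed_gib=17)  # 17G peak seen in this ticket
    status = "enough" if ok else "NOT enough"
    print(f"{tmpdir}: {free_gib:.1f} GiB free, {status} for a 17 GiB build")
```

A check like this only measures free space at one moment, so it would not by itself explain the space error seen against the 26G tmpfs limit.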
