
Feature/nl format #96

Closed · wants to merge 16 commits

Conversation

drnimbusrain (Member) commented Nov 7, 2023

@zmoon @angehung5 Since we have over 40 NL options and growing, we need to organize them better. I also added some Python/README changes for running global data and an example Slurm script to help. Please let me know ASAP what you think so we can update develop going forward.

angehung5 (Collaborator)

I changed the required CPU time and memory in the Slurm script and README based on my Slurm settings on Hopper. The time might be a little off since I set ntasks=12, but the memory should be accurate (it could be less depending on the case, but it is safer to request 60G).

drnimbusrain (Member Author) left a comment

@angehung5 Please see my comments on your suggested memory updates here.

python/README.md (outdated)
 #SBATCH --nodes=1            # Request N nodes
 #SBATCH --ntasks=1           # Request n tasks
-#SBATCH --mem-per-cpu=1000GB # Request nGB RAM per core
+#SBATCH --mem-per-cpu=60GB   # Request nGB RAM per core

@angehung5 A single node/task with your suggested 60 GB of memory may not work; please test this example.

I don't think we really need to increase ntasks because it's not parallel code.

From my experience, a single node/task requires about 256 GB of memory.

#SBATCH --nodes=1 # Request N nodes
#SBATCH --exclude=hop006,hop010,hop011 # Exclude some nodes (optional)
#SBATCH --ntasks=1 # Request n tasks
#SBATCH --mem-per-cpu=60GB # Request nGB RAM per core
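
For reference, this is roughly what a job header sized the way drnimbusrain describes would look: one node, one task, and on the order of 256 GB of memory. The memory and wall-time values below are illustrative assumptions drawn from this thread, not tested settings from the PR.

#!/bin/bash
#SBATCH --nodes=1             # Request N nodes; the code is not parallel, so one is enough
#SBATCH --ntasks=1            # Request n tasks; no need to increase, per the comment above
#SBATCH --mem-per-cpu=256GB   # ~256 GB per drnimbusrain's experience (assumption, verify on Hopper)
#SBATCH --time=02:00:00       # Wall time is an assumption; adjust to the case

With a single task and one CPU per task, --mem-per-cpu and --mem request the same total, so either form works here.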

angehung5 (Collaborator) left a comment

Tested on Hopper. For three time steps with nodes=1 and ntasks=1:

Cores: 1
CPU Utilized: 01:12:51
CPU Efficiency: 95.37% of 01:16:23 core-walltime
Job Wall-clock time: 01:16:23
Memory Utilized: 9.57 GB

I put 12G just in case.
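
Those figures look like output from Slurm's seff utility. Assuming seff is available on Hopper, the same numbers can be pulled for any completed job by its ID, which is a convenient way to verify whether 12G is actually enough:

# Submit the job; sbatch prints the job ID (script name here is a placeholder)
sbatch run_global.slurm
# After completion, summarize usage: Cores, CPU Utilized/Efficiency, Wall-clock time, Memory Utilized
seff <jobid>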

drnimbusrain (Member Author)

> 1:16

@angehung5 OK, thank you. Is this for running both the python global process script and canopy-app for those times?

angehung5 (Collaborator)

> 1:16
> @angehung5 OK, thank you. Is this for running both the python global process script and canopy-app for those times?

global process script only.

drnimbusrain (Member Author) commented Nov 8, 2023

> 1:16
> @angehung5 OK, thank you. Is this for running both the python global process script and canopy-app for those times?
> global process script only.

@angehung5 Right, then the memory suggestion is not correct and would need to be increased significantly for canopy-app too. We need to test running both in the same job.
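
A possible way to test that is a single batch script that runs the global processing step and then canopy-app in the same allocation. This is only a sketch: the script path, executable name, and the memory/wall-time values are assumptions based on this thread, not part of the PR.

#!/bin/bash
#SBATCH --nodes=1             # Single node; neither step is parallel
#SBATCH --ntasks=1            # Single task
#SBATCH --mem-per-cpu=256GB   # Assumption: sized for canopy-app, per the discussion above
#SBATCH --time=04:00:00       # Assumption: global processing alone took ~1:16, so allow extra

# Step 1: run the global data processing script (hypothetical path/name)
python python/global_data_process.py

# Step 2: run canopy-app on the processed inputs (hypothetical executable name)
./canopy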

drnimbusrain (Member Author)

Closing this PR in favor of a new PR from @angehung5 built on the latest commit.

drnimbusrain deleted the feature/nl_format branch on November 9, 2023, 23:44.