Add option to install.sh to just produce the parallelcluster config file. #273

Open
gwolski opened this issue Oct 27, 2024 · 2 comments

@gwolski
gwolski commented Oct 27, 2024

I have spent countless hours trying to figure out the proper syntax for the config file to get a working ParallelCluster deployment. Thinking about the use case of aws-eda-slurm-cluster: once the cluster is up and all the nice infrastructure is in place, one really just wants to adjust the cluster config.

One could do that by exporting the config with the PCUI, modifying it, and giving the modified file back to pcluster, but then my aws-eda-slurm-cluster config is out of sync.

When I have done something wrong in the aws-eda-slurm-cluster config file and submitted it via CloudFormation, I have ended up hurting my stack (see #271). Being able to review the config file that would be fed to pcluster would help.

I would like to be able to put modifications into the aws-eda-slurm-cluster config file and have install.sh just create a new ParallelCluster config file that I can review or feed to pcluster update-cluster.

@cartalla
Contributor

That's a great idea. I was thinking about this just a couple of weeks ago as I was debugging issues and had the same thought.
The behavior would be the same as it is now, except that it wouldn't update the ParallelCluster resource; it would only update the config file.
The only tricky thing here is that if the cluster has already been deployed, I'm assuming you don't want it deleted.
You just don't want it updated.
So I have to pass a flag to the custom resource telling it to not do a create/update.
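
To make the flag idea concrete, here is a minimal sketch of how a custom resource handler could branch on it. This is not the project's actual code: the `ConfigOnly` property name and the returned action strings are hypothetical, and it assumes that a stack Delete should still delete the cluster. The only parts taken from CloudFormation itself are the `RequestType` and `ResourceProperties` fields of the event.

```python
def decide_action(event: dict) -> str:
    """Pick what the custom resource should do for one CloudFormation event."""
    request_type = event["RequestType"]  # "Create", "Update" or "Delete"
    props = event.get("ResourceProperties", {})

    # "ConfigOnly" is a hypothetical property name used for illustration.
    # Property values may arrive as strings, so compare text, not booleans.
    config_only = str(props.get("ConfigOnly", "false")).lower() == "true"

    if request_type == "Delete":
        # Assumption: deleting the stack still deletes the cluster.
        return "delete_cluster"
    if config_only:
        # Regenerate the ParallelCluster config but leave the cluster alone.
        return "write_config_only"
    # Normal path: write the config, then create or update the cluster.
    return "write_config_and_" + request_type.lower() + "_cluster"


if __name__ == "__main__":
    example = {"RequestType": "Update", "ResourceProperties": {"ConfigOnly": "true"}}
    print(decide_action(example))
```

Keeping the decision in a small pure function like this would make it easy to test without touching a real stack.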

@cartalla cartalla self-assigned this Oct 28, 2024
@gwolski
Author

gwolski commented Oct 29, 2024

It sounds like you want to put the new config file in S3? I think that might be bad, because what is in the user's S3 bucket could then be out of sync with what is deployed. I was thinking of dumping it to a local file. Maybe the S3 bucket should only be updated when there is a create/update command? It looks like the S3 bucket has versioning enabled, but I would be very concerned that the user (i.e. me) will forget what's deployed, unless there is some way to tell?

That said, if you dump it locally and then I apply it with pcluster, now my S3 bucket is out of sync. Hmm.

I had envisioned a workflow in which I create my cluster with aws-eda-slurm-cluster and get the infra all set up. Then, given the issues that I seem to have with CloudFormation and its errors, I would just modify my aws-eda-slurm-cluster config, make sure the generated config file is good, and feed that to pcluster directly so I could get better/easier error messages. I would of course keep the revisions in my aws-eda-slurm-cluster config file.
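
As a rough sketch of the review step in that workflow, the snippet below builds a pcluster update-cluster command with --dryrun true, which asks ParallelCluster to validate the config without changing the cluster. The cluster name, file path, and region are placeholders, and the generated config file is assumed to already exist locally; how install.sh would produce it is exactly what this issue is asking for.

```python
import shlex


def pcluster_dryrun_command(cluster_name: str, config_path: str, region: str) -> list[str]:
    """Build a pcluster command that validates a config without applying it."""
    return [
        "pcluster", "update-cluster",
        "--cluster-name", cluster_name,
        "--cluster-configuration", config_path,
        "--region", region,
        "--dryrun", "true",  # validate only; do not update the cluster
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = pcluster_dryrun_command("my-cluster", "parallelcluster-config.yml", "us-east-1")
    # Print the command for review; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Dropping the --dryrun argument from the list would turn the same command into the real update once the validation output looks right.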
