Jobs for microsalt analyses not tracked correctly #2896
Comments
For example, for one case a job ids file path like this is stored:
Decided fix

Changing the output dir in microsalt is not viable, given the lack of tests and that the entire pipeline is being replaced. The least error-prone path is to revert to the old logic and just create the missing directory on Hasta for the slurm job id files. This is the relevant logic creating the slurm job ids file:

    try:
        #Generates file with all slurm ids
        slurmname = "{}_slurm_ids.yaml".format(self.name)
        slurmreport_storedir = Path(self.config["folders"]["reports"],
                                    "trailblazer", slurmname)
        slurmreport_workdir = Path(self.finishdir, slurmname)
        yaml.safe_dump(
            data={"jobs": [str(job) for job in joblist]},
            stream=open(slurmreport_workdir, "w"))
        shutil.copyfile(slurmreport_workdir, slurmreport_storedir)
        self.logger.info(
            "Saved Trailblazer slurm report file to %s and %s",
            slurmreport_storedir,
            slurmreport_workdir,
        )
    except Exception as e:
        self.logger.info("Unable to generate Trailblazer slurm report file")
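As an aside, the same failure could in principle be avoided in code by creating the destination directory before copying. This is a minimal sketch of that idea only, not microsalt's implementation and not the decided fix (which is to create the directory on Hasta); `copy_job_ids_file` is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def copy_job_ids_file(workdir_file: Path, store_file: Path) -> Path:
    """Copy the slurm job ids file, creating the target directory if needed.

    Hypothetical helper for illustration; not part of microsalt.
    """
    # Create the parent directory (for example <reports>/trailblazer)
    # if it does not exist yet; do nothing if it is already there.
    store_file.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(workdir_file, store_file)
    return store_file
```

With `exist_ok=True` the call is safe to repeat on every analysis, so it would not matter whether the directory had already been created by hand.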
If a microsalt analysis is re-run, will the old slurm ids be overwritten in the trailblazer directory?
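For reference, `shutil.copyfile` replaces the destination file if it already exists, so a re-run that reaches the copy step with the same file name would overwrite the stored file. A small self-contained check of that behaviour, using throwaway file names that are purely illustrative:

```python
import shutil
import tempfile
from pathlib import Path


def copy_over_existing(old_content: str, new_content: str) -> str:
    """Return the destination's content after copying over an existing file."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "new_slurm_ids.yaml")
        dst = Path(tmp, "stored_slurm_ids.yaml")
        dst.write_text(old_content)  # file left behind by a previous run
        src.write_text(new_content)  # file produced by the re-run
        shutil.copyfile(src, dst)    # replaces dst if it already exists
        return dst.read_text()
```

This only covers the copy itself; whether a re-run produces the same file name depends on the naming logic in microsalt.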
The name used for the job ids file seems to differ depending on the number of samples in the case 🤢 😭

    if isinstance(self.sampleinfo, list) and len(self.sampleinfo) > 1:
        self.name = self.sampleinfo[0].get("CG_ID_project")
        self.sample = self.sampleinfo[0]
        for entry in self.sampleinfo:
            if entry.get("CG_ID_sample") == self.name:
                raise Exception(
                    "Mixed projects in samples_info file. Do not know how to proceed"
                )
    else:
        if isinstance(self.sampleinfo, list):
            self.sampleinfo = self.sampleinfo[0]
        self.name = self.sampleinfo.get("CG_ID_sample")
        self.sample = self.sampleinfo

I'm going to disregard this since cases with one sample are rare in microsalt. And why would you even use different paths? 🤦

Added to backlog in microsalt: Clinical-Genomics/microSALT#170
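To make the consequence for the file name concrete, here is a rough standalone sketch that mirrors the naming branch quoted above and combines it with the `"{}_slurm_ids.yaml"` pattern from the earlier snippet. `job_ids_file_name` is a hypothetical helper and the ids in the usage example are made up.

```python
from typing import Union


def job_ids_file_name(sampleinfo: Union[dict, list]) -> str:
    """Mirror the quoted naming logic (hypothetical helper, not in microsalt)."""
    if isinstance(sampleinfo, list) and len(sampleinfo) > 1:
        # Several samples: the file is named after the project id.
        name = sampleinfo[0].get("CG_ID_project")
    else:
        # A single sample: the file is named after the sample id.
        if isinstance(sampleinfo, list):
            sampleinfo = sampleinfo[0]
        name = sampleinfo.get("CG_ID_sample")
    return "{}_slurm_ids.yaml".format(name)


# Made-up ids, for illustration only.
multi = [
    {"CG_ID_project": "PROJ1", "CG_ID_sample": "SAMPLE1"},
    {"CG_ID_project": "PROJ1", "CG_ID_sample": "SAMPLE2"},
]
single = [{"CG_ID_project": "PROJ1", "CG_ID_sample": "SAMPLE1"}]
print(job_ids_file_name(multi))
print(job_ids_file_name(single))
```

Any consumer that always looks for the project-based name would therefore miss the file for single-sample cases.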
Description
For some microsalt analyses, the slurm jobs are not tracked. When checking the path to the job ids file for those analyses, it does not exist, which explains why no jobs show up.
It turns out that for the analyses where jobs were being reported (over the past month), the jobs displayed actually belonged to the previous analysis of the case.
The underlying issue is that microsalt writes its output to a timestamped directory, which is only created once the analysis is completed. So the pending analysis in trailblazer cannot be provided with the correct path.
Suggested solution
After digging in the microsalt codebase, it was discovered that it attempts to write a job ids file to /microbial/results/reports/trailblazer/<project_id>_slurm_ids.yaml. The trailblazer directory does not exist, so it fails. This is the file we need to use.
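Because this path has no timestamp in it, it can be computed before the analysis starts. A minimal sketch of building it, assuming the reports directory and project id are known up front; `trailblazer_job_ids_path` is a hypothetical helper and `PROJ1` is a made-up project id.

```python
from pathlib import Path


def trailblazer_job_ids_path(reports_dir: Path, project_id: str) -> Path:
    """Build the stable job ids path described above (hypothetical helper)."""
    return Path(reports_dir, "trailblazer", "{}_slurm_ids.yaml".format(project_id))


# Reports directory as given in the issue; the project id is illustrative.
path = trailblazer_job_ids_path(Path("/microbial/results/reports"), "PROJ1")
print(path)
```

Note that this assumes the file is named after the project id, which per the discussion in the comments holds only for cases with more than one sample.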