@tcutts tcutts commented Nov 17, 2025

This patch adds support for the squeue --only-job-state option, which allows admins of recent SLURM versions to enable job state caching and thereby reduce the RPC call load on SLURM. Because not all SLURM versions support this option, and because it causes the other filtering options (-p and -u) to be ignored, the patch adds a new executor config parameter, onlyJobState, which defaults to "false" to preserve current Nextflow behaviour.
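As a rough illustration of the behaviour described above, the command construction could be sketched as follows. This is a minimal Python sketch, not the Groovy code in the PR: the function name `queue_status_command`, its parameters, and the base squeue flags are assumptions made for the example. Only `--only-job-state`, `-p`, `-u`, and the default of "false" come from the description.

```python
from typing import List, Optional


def queue_status_command(
    queue: Optional[str],
    user: Optional[str],
    only_job_state: bool = False,
) -> List[str]:
    """Build a squeue argument list (hypothetical helper, for illustration only)."""
    # Base flags are an assumption for this sketch: job ID and state, all states.
    cmd = ["squeue", "--noheader", "-o", "%i %t", "-t", "all"]

    if only_job_state:
        # With --only-job-state the partition and user filters are left out,
        # matching the description that -p and -u are ignored in this mode.
        cmd.append("--only-job-state")
        return cmd

    # Default behaviour (onlyJobState = false): filter by partition and user.
    if queue:
        cmd += ["-p", queue]
    if user:
        cmd += ["-u", user]
    return cmd


if __name__ == "__main__":
    print(queue_status_command("long", "alice"))
    print(queue_status_command("long", "alice", only_job_state=True))
```

Leaving the filters out in the cached mode means the caller may see jobs from other users or partitions, so any job-ID matching has to happen on the caller's side.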

I have written unit tests for the code, included in the PR, plus some docs for the users.

I have run the code on AstraZeneca's SLURM cluster, and it appears to behave correctly, at least for a very simple test workflow.

bentsherman and others added 30 commits July 11, 2025 09:16
Signed-off-by: Christopher Hakkaart <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Co-authored-by: Ben Sherman <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
…-io#6272)

Signed-off-by: Christopher Hakkaart <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Co-authored-by: Ben Sherman <[email protected]>

* Bump org.apache.commons:commons-lang3 from 3.12.0 to 3.18.0

Bumps org.apache.commons:commons-lang3 from 3.12.0 to 3.18.0.

---
updated-dependencies:
- dependency-name: org.apache.commons:commons-lang3
  dependency-version: 3.18.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: Christopher Hakkaart <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]>
Co-authored-by: Ben Sherman <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Christopher Hakkaart <[email protected]>
Co-authored-by: Ben Sherman <[email protected]>
…nextflow-io#6284)

Signed-off-by: Author Name <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Co-authored-by: Nathan Johnson <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Christopher Hakkaart <[email protected]>
…i fast]

Changed Azure API call from JSON-embedded content to direct binary download
using 'download: true' parameter. This prevents binary data corruption that
occurred when converting JSON-escaped strings back to bytes.

Signed-off-by: Paolo Di Tommaso <[email protected]>
Co-authored-by: Nathan Johnson <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
Co-authored-by: Claude <[email protected]>
---------

Signed-off-by: Ben Sherman <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
) [ci fast]

Signed-off-by: jorgee <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: jorgee <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
… [ci fast]

Signed-off-by: Nathan Johnson <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Co-authored-by: Nathan Johnson <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
---------

Signed-off-by: jorgee <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
---------

Signed-off-by: Robrecht Cannoodt <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Co-authored-by: Ben Sherman <[email protected]>
Signed-off-by: Tim Cutts <[email protected]>
Add synchronized to writeObject method to prevent concurrent
modification of the shared stack field when multiple executor
threads serialize JSON simultaneously.

Signed-off-by: Tim Cutts <[email protected]>
…xtflow-io#6618) [ci fast]

* Add stageFileEnabled flag to control .command.stage file creation (nextflow-io#4279)

Improvement of nextflow-io#6558 providing:
1. More declarative approach
2. Better control on enabling/disabling stage file capability
3. Fix Google Batch stage file method

- Add stageFileEnabled field to TaskBean
- Add isStageFileEnabled() to TaskRun delegating to executor
- Add isStageFileEnabled() to Executor with NXF_WRAPPER_STAGE_FILE_ENABLED
  env var support, defaulting to true for AbstractGridExecutor
- Enable stageFileEnabled for GoogleBatchExecutor
- Add comprehensive tests

Signed-off-by: Tim Cutts <[email protected]>
Signed-off-by: Tim Cutts <[email protected]>
Signed-off-by: Tim Cutts <[email protected]>
Signed-off-by: Tim Cutts <[email protected]>
Signed-off-by: Tim Cutts <[email protected]>
Signed-off-by: Tim Cutts <[email protected]>
Signed-off-by: Tim Cutts <[email protected]>
Contributor

@jorgee jorgee left a comment

The new property must also be added to ExecutorConfig, as was done for perCpuMemAllocation. It would also be good to add a test for this property in ExecutorConfigTest.

@tcutts tcutts requested a review from jorgee December 2, 2025 09:14
Contributor

@jorgee jorgee left a comment

I have tested with a local Slurm installation, and it works fine.

The squeue documentation doesn't mention that --only-job-state can't be combined with -u or -p. I tested it: the command doesn't fail, but it returns no output. So ignoring these flags when using --only-job-state is correct.

The changes look fine to me. I have just added some suggestions in the docs: pointing the versionadded blocks at the next edge release and adding links to the Slurm documentation. @christopher-hakkaart and @tcutts, please look at them and accept them if they are correct.

@tcutts, there is a commit that is not signed off. Could you fix it?

@tcutts tcutts force-pushed the 6570-slurm-job-state-query branch from e34ef6f to 3c36143 Compare December 13, 2025 08:20
@tcutts tcutts requested a review from a team as a code owner December 13, 2025 08:20
Author

tcutts commented Dec 13, 2025

I've made a mess in my git trying to update the attestations. I'll close this pull request and start again.

@tcutts tcutts closed this Dec 13, 2025

Development

Successfully merging this pull request may close these issues.

Accelerate SLURM job state queries with --only-job-state option to squeue