Fix: requeue a run if the DB is unavailable during dequeuing #2938

matt-aitken · 2026-01-24T19:59:12Z

In the DequeueSystem, if the database was unavailable, we dequeued the run from Redis and then failed to requeue it in the error handler, because the requeue itself required DB access.

Now, if we hit a DB error in the `catch`, we requeue directly using Redis, putting the run back in the queue.

changeset-bot · 2026-01-24T19:59:16Z

⚠️ No Changeset found

Latest commit: 3d2bd66

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

coderabbitai · 2026-01-24T19:59:44Z

Walkthrough

The dequeue system is modified to handle database lookup failures more robustly. A tryCatch wrapper is added around the Prisma query for finding the task run, and error handling is updated to immediately nack messages via Redis when the database lookup fails or the run is not found, bypassing subsequent retry logic. A corresponding end-to-end test is added to verify the concurrency reset and recovery flow following a direct nack.
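
The flow described in this walkthrough can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `InMemoryRunQueue`, `nackDirect`, `dequeueOne`, and `FindRun` are hypothetical names, the queue is an in-memory stand-in for the Redis-backed one, and the local `tryCatch` only assumes a tuple-returning helper in the spirit of the one named above.

```typescript
// Hypothetical stand-ins throughout: none of these names are the real run-engine API.
type QueueMessage = { runId: string };
type RunRecord = { id: string };

// Assumed shape of a tuple-returning helper: [error, null] on failure, [null, value] on success.
async function tryCatch<T>(promise: Promise<T>): Promise<[Error, null] | [null, T]> {
  try {
    return [null, await promise];
  } catch (error) {
    return [error instanceof Error ? error : new Error(String(error)), null];
  }
}

// Minimal in-memory queue standing in for the Redis-backed run queue.
class InMemoryRunQueue {
  private messages: QueueMessage[] = [];

  enqueue(message: QueueMessage): void {
    this.messages.push(message);
  }

  dequeue(): QueueMessage | undefined {
    return this.messages.shift();
  }

  // "Direct nack": put the message back without touching the database.
  nackDirect(message: QueueMessage): void {
    this.messages.push(message);
  }

  get length(): number {
    return this.messages.length;
  }
}

// Stand-in for the database lookup of the task run.
type FindRun = (runId: string) => Promise<RunRecord | null>;

type DequeueResult =
  | { kind: "dequeued"; run: RunRecord }
  | { kind: "requeued"; reason: "db_error" | "run_not_found" }
  | { kind: "empty" };

async function dequeueOne(queue: InMemoryRunQueue, findRun: FindRun): Promise<DequeueResult> {
  const message = queue.dequeue();
  if (!message) {
    return { kind: "empty" };
  }

  const [error, run] = await tryCatch(findRun(message.runId));

  if (error) {
    // The database is unreachable: requeue using the queue alone and stop here,
    // skipping any retry logic that would itself need the database.
    queue.nackDirect(message);
    return { kind: "requeued", reason: "db_error" };
  }

  if (!run) {
    // The lookup succeeded but found nothing: also nack directly.
    queue.nackDirect(message);
    return { kind: "requeued", reason: "run_not_found" };
  }

  return { kind: "dequeued", run };
}
```

The point of the shape is that the failure branch depends only on the queue, so a message that was already dequeued is not lost while the database is down.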

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description explains the problem and solution but lacks required template sections: no issue reference, incomplete checklist, missing Testing section details, and no Changelog section. | Add issue reference (Closes `#XXXX`), complete the checklist items, describe testing steps in the Testing section, and add a Changelog entry summarizing the changes. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: handling database unavailability during dequeuing by requeuing the run, which directly matches the pull request's core objective. |

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional flags.

vibe-kanban-cloud · 2026-01-24T20:02:23Z

Review Complete

Your review story is ready!

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@internal-packages/run-engine/src/engine/systems/dequeueSystem.ts`:
- Line 3: The import in dequeueSystem.ts currently pulls assertExhaustive and
tryCatch from the package root; update the import to use the subpath export by
importing those symbols from "@trigger.dev/core/utils" instead of
"@trigger.dev/core" so the file imports assertExhaustive and tryCatch from the
utils subpath.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4093883 and 3d2bd66.

📒 Files selected for processing (2)

internal-packages/run-engine/src/engine/systems/dequeueSystem.ts
internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

🧰 Additional context used

📓 Path-based instructions (7)

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

**/*.{ts,tsx}: Always import tasks from @trigger.dev/sdk, never use @trigger.dev/sdk/v3 or deprecated client.defineJob pattern
Every Trigger.dev task must be exported and have a unique id property with no timeouts in the run function

Files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts
internal-packages/run-engine/src/engine/systems/dequeueSystem.ts

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Import from @trigger.dev/core using subpaths only, never import from root

Files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts
internal-packages/run-engine/src/engine/systems/dequeueSystem.ts

**/*.{test,spec}.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use vitest for all tests in the Trigger.dev repository

Files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts
internal-packages/run-engine/src/engine/systems/dequeueSystem.ts

**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts
internal-packages/run-engine/src/engine/systems/dequeueSystem.ts

**/*.test.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.test.{ts,tsx,js,jsx}: Test files should live beside the files under test and use descriptive describe and it blocks
Tests should avoid mocks or stubs and use the helpers from @internal/testcontainers when Redis or Postgres are needed
Use vitest for running unit tests

**/*.test.{ts,tsx,js,jsx}: Use vitest exclusively for testing and never mock anything - use testcontainers instead
Place test files next to source files with naming pattern: source file (e.g., MyService.ts) → MyService.test.ts

Files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use testcontainers helpers (redisTest, postgresTest, containerTest) from @internal/testcontainers for Redis/PostgreSQL testing instead of mocks

Files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

🧠 Learnings (13)

📓 Common learnings

Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2264
File: apps/webapp/app/services/runsRepository.server.ts:172-174
Timestamp: 2025-07-12T18:06:04.133Z
Learning: In apps/webapp/app/services/runsRepository.server.ts, the in-memory status filtering after fetching runs from Prisma is intentionally used as a workaround for ClickHouse data delays. This approach is acceptable because the result set is limited to a maximum of 100 runs due to pagination, making the performance impact negligible.

📚 Learning: 2025-11-27T16:26:44.496Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/executing-commands.mdc:0-0
Timestamp: 2025-11-27T16:26:44.496Z
Learning: For running tests, navigate into the package directory and run `pnpm run test --run` to enable single-file test execution (e.g., `pnpm run test ./src/engine/tests/ttl.test.ts --run`)

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

📚 Learning: 2026-01-15T11:50:06.067Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-15T11:50:06.067Z
Learning: Applies to **/*.test.{ts,tsx} : Use testcontainers helpers (`redisTest`, `postgresTest`, `containerTest`) from `internal/testcontainers` for Redis/PostgreSQL testing instead of mocks

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

📚 Learning: 2026-01-15T10:48:02.687Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-15T10:48:02.687Z
Learning: Applies to **/*.test.{ts,tsx,js,jsx} : Tests should avoid mocks or stubs and use the helpers from `internal/testcontainers` when Redis or Postgres are needed

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

📚 Learning: 2026-01-15T11:50:06.067Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-15T11:50:06.067Z
Learning: Applies to **/*.test.{ts,tsx,js,jsx} : Use vitest exclusively for testing and never mock anything - use testcontainers instead

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

📚 Learning: 2025-11-27T16:26:37.432Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to **/*.{test,spec}.{ts,tsx} : Use vitest for all tests in the Trigger.dev repository

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

📚 Learning: 2025-10-08T11:48:12.327Z

Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2593
File: packages/core/src/v3/workers/warmStartClient.ts:168-170
Timestamp: 2025-10-08T11:48:12.327Z
Learning: The trigger.dev runners execute only in Node 21 and 22 environments, so modern Node.js APIs like AbortSignal.any (introduced in v20.3.0) are supported.

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

📚 Learning: 2025-11-27T16:26:37.432Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to packages/trigger-sdk/**/*.{ts,tsx} : In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

📚 Learning: 2026-01-15T11:50:06.067Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-15T11:50:06.067Z
Learning: Applies to apps/webapp/**/*.test.{ts,tsx} : For testable code in the webapp, never import env.server.ts in test files - pass configuration as options instead

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

📚 Learning: 2026-01-15T11:50:06.067Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-15T11:50:06.067Z
Learning: Applies to **/*.{ts,tsx} : Every Trigger.dev task must be exported and have a unique `id` property with no timeouts in the run function

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

📚 Learning: 2026-01-15T10:48:02.687Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-15T10:48:02.687Z
Learning: Applies to **/*.test.{ts,tsx,js,jsx} : Use vitest for running unit tests

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts

📚 Learning: 2025-07-12T18:06:04.133Z

Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2264
File: apps/webapp/app/services/runsRepository.server.ts:172-174
Timestamp: 2025-07-12T18:06:04.133Z
Learning: In apps/webapp/app/services/runsRepository.server.ts, the in-memory status filtering after fetching runs from Prisma is intentionally used as a workaround for ClickHouse data delays. This approach is acceptable because the result set is limited to a maximum of 100 runs due to pagination, making the performance impact negligible.

Applied to files:

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts
internal-packages/run-engine/src/engine/systems/dequeueSystem.ts

📚 Learning: 2024-10-18T15:41:52.352Z

Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 1418
File: packages/core/src/v3/errors.ts:364-371
Timestamp: 2024-10-18T15:41:52.352Z
Learning: In `packages/core/src/v3/errors.ts`, within the `taskRunErrorEnhancer` function, `error.message` is always defined, so it's safe to directly call `error.message.includes("SIGTERM")` without additional checks.

Applied to files:

internal-packages/run-engine/src/engine/systems/dequeueSystem.ts

🧬 Code graph analysis (1)

internal-packages/run-engine/src/engine/systems/dequeueSystem.ts (2)

packages/core/src/v3/tryCatch.ts (1)

tryCatch (8-15)

apps/webapp/app/db.server.ts (1)

prisma (101-101)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)

GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
GitHub Check: typecheck / typecheck

🔇 Additional comments (2)

internal-packages/run-engine/src/engine/systems/dequeueSystem.ts (1)

614-635: Solid DB-unavailable fallback.
The direct Redis nack on lookup failure prevents losing dequeued runs when Postgres is down.

internal-packages/run-engine/src/engine/tests/dequeuing.test.ts (1)

84-213: Great recovery-flow coverage.
The new test exercises direct nack, concurrency cleanup, and stalled-system recovery end-to-end.
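
The test body is not shown on this page, so the following is only a sketch of the property the comment names: a nack has to release the run's concurrency slot, otherwise a queue at its limit stays stalled. `ConcurrencyLimitedQueue` and its members are hypothetical, and the state is kept in memory rather than in Redis.

```typescript
// Hypothetical in-memory model; the real queue keeps this state in Redis.
class ConcurrencyLimitedQueue {
  private pending: string[] = [];
  private inFlight = new Set<string>();
  private limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  enqueue(runId: string): void {
    this.pending.push(runId);
  }

  // Returns undefined when the queue is empty or the concurrency limit is reached.
  dequeue(): string | undefined {
    if (this.inFlight.size >= this.limit) {
      return undefined;
    }
    const runId = this.pending.shift();
    if (runId !== undefined) {
      this.inFlight.add(runId);
    }
    return runId;
  }

  // Releases the concurrency slot and puts the run back at the end of the queue.
  nack(runId: string): void {
    if (this.inFlight.delete(runId)) {
      this.pending.push(runId);
    }
  }

  get currentConcurrency(): number {
    return this.inFlight.size;
  }
}

const limited = new ConcurrencyLimitedQueue(1);
limited.enqueue("run_1");
limited.enqueue("run_2");

const first = limited.dequeue(); // takes the only slot
const blocked = limited.dequeue(); // limit reached, nothing is handed out

if (first !== undefined) {
  limited.nack(first); // frees the slot, so dequeuing can resume
}
const next = limited.dequeue();
```

In this model, skipping the `nack` call leaves `currentConcurrency` at the limit and every later `dequeue` returns `undefined`, which is the stalled state a recovery test would guard against.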

devin-ai-integration bot reviewed Jan 24, 2026

coderabbitai bot reviewed Jan 24, 2026

mpcgrid approved these changes Jan 24, 2026

matt-aitken merged commit 825219a into main Jan 25, 2026
35 checks passed

matt-aitken deleted the fix-dequeue-database-unavailable branch January 25, 2026 19:23
