@pmachapman pmachapman commented Jan 29, 2026

Fixes #834

To test locally, stop the serval-machine-engine container and start a build (for example, using the API example). Messages will then retry with exponential backoff.

Restarting serval-machine-engine will cause the messages to succeed on retry once the timeout is hit. Previously, messages would not retry during a service outage like this, which blocked the queue.

Due to the way the subscription appears to work (watching the first message in the queue for update or deletion), the timeout is in effect as long as there are messages in the queue, and the sending of the first message in the queue has failed. This works correctly now that the attempts counter for that item is no longer incremented (which triggered the subscription, which ran process messages, which failed, which triggered the subscription... ad infinitum).

When the queue is empty, the arrival of the first message in the queue will start processing, as before.
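The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RetryBackoff`, `DeliveryRetryState`, and the base/maximum delay parameters are hypothetical names. It also assumes the failure count is held in memory, so that recording a failure does not write to the stored message and therefore does not re-trigger the subscription.

```csharp
using System;

// Hypothetical helper: the names and delay bounds are illustrative assumptions,
// not taken from OutboxDeliveryService.
public static class RetryBackoff
{
    // Delay to wait before the next attempt after `consecutiveFailures` failed sends.
    // The delay doubles with each failure and is capped at `maxDelay`.
    public static TimeSpan GetDelay(int consecutiveFailures, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        if (consecutiveFailures <= 0)
            return TimeSpan.Zero;

        // Clamp the exponent so the shift cannot overflow.
        int exponent = Math.Min(consecutiveFailures - 1, 30);
        double milliseconds = baseDelay.TotalMilliseconds * (1L << exponent);
        return TimeSpan.FromMilliseconds(Math.Min(milliseconds, maxDelay.TotalMilliseconds));
    }
}

// Assumption: failures are tracked in memory rather than on the message document,
// so a failed send causes no update that the subscription could observe.
public sealed class DeliveryRetryState
{
    private int _consecutiveFailures;

    // Records a failed send and returns how long to wait before retrying.
    public TimeSpan RecordFailure(TimeSpan baseDelay, TimeSpan maxDelay)
    {
        _consecutiveFailures++;
        return RetryBackoff.GetDelay(_consecutiveFailures, baseDelay, maxDelay);
    }

    // Resets the backoff once a send succeeds.
    public void RecordSuccess() => _consecutiveFailures = 0;
}
```

With a 1 second base delay and a 5 minute cap, successive failures would wait 1 s, 2 s, 4 s, and so on, up to 5 minutes. A successful send resets the delay.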



@codecov-commenter

Codecov Report

❌ Patch coverage is 67.85714% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.41%. Comparing base (b73ab6b) to head (04a6b85).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...L.ServiceToolkit/Services/OutboxDeliveryService.cs | 67.85% | 7 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #862      +/-   ##
==========================================
- Coverage   66.42%   66.41%   -0.01%     
==========================================
  Files         382      382              
  Lines       20782    20797      +15     
  Branches     2717     2723       +6     
==========================================
+ Hits        13805    13813       +8     
- Misses       6007     6012       +5     
- Partials      970      972       +2     



@Enkidu93 Enkidu93 left a comment

Thank you, Peter!

I believe the subscription should be waiting for an insert to the collection, not an update or deletion. If it isn't, I think that's a bug, but I could be wrong.

Also, I may be misunderstanding, but this back-off you've implemented, won't it affect the processing of all messages? I would have thought that we'd want to have a retry mechanism per message group so that we don't hold up all builds if one is having problems. It may be that I'm just not following the code correctly.
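For illustration, a per-group retry of the kind suggested here could look roughly like the sketch below. It is not taken from the PR: `GroupBackoff`, the string group ID, and the delay parameters are all hypothetical, and how the outbox actually identifies message groups is not shown in this thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: tracks backoff separately for each message group so that
// one failing group does not delay delivery for the others.
public sealed class GroupBackoff
{
    private readonly Dictionary<string, (int Failures, DateTimeOffset NextAttempt)> _groups = new();
    private readonly TimeSpan _baseDelay;
    private readonly TimeSpan _maxDelay;

    public GroupBackoff(TimeSpan baseDelay, TimeSpan maxDelay)
    {
        _baseDelay = baseDelay;
        _maxDelay = maxDelay;
    }

    // A group may be attempted if it has no recorded failures
    // or its backoff window has elapsed.
    public bool CanAttempt(string groupId, DateTimeOffset now) =>
        !_groups.TryGetValue(groupId, out var state) || now >= state.NextAttempt;

    // Doubles the group's delay on each consecutive failure, capped at the maximum.
    public void RecordFailure(string groupId, DateTimeOffset now)
    {
        int failures = _groups.TryGetValue(groupId, out var state) ? state.Failures + 1 : 1;
        int exponent = Math.Min(failures - 1, 30); // clamp to avoid overflow
        double milliseconds = Math.Min(
            _baseDelay.TotalMilliseconds * (1L << exponent),
            _maxDelay.TotalMilliseconds);
        _groups[groupId] = (failures, now + TimeSpan.FromMilliseconds(milliseconds));
    }

    // Clears the group's backoff after a successful send.
    public void RecordSuccess(string groupId) => _groups.Remove(groupId);
}
```

A delivery loop using this would skip any group for which `CanAttempt` returns false and carry on with the remaining groups, rather than pausing the whole queue.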

@Enkidu93 reviewed 2 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @ddaspit).


@ddaspit ddaspit left a comment

@ddaspit reviewed 2 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @pmachapman).


src/ServiceToolkit/src/SIL.ServiceToolkit/Services/OutboxDeliveryService.cs line 170 at r1 (raw file):

    {
        // log error
        await messages.UpdateAsync(m => m.Id == message.Id, b => b.Inc(m => m.Attempts, 1));

Is Attempts no longer needed? Can we remove it?

Development

Successfully merging this pull request may close these issues.

Outbox is incorrectly retrying to process messages

5 participants