Implement exponential backoff to replace error_timeout #834

@AbdulRahmanAlHamali

Currently, error_timeout determines how long the circuit stays open before it transitions to half-open. It also serves as the fallback for error_threshold_timeout when the latter is not defined.

There are two problems here:

  1. It is far from great that error_timeout acts as a fallback when error_threshold_timeout is not set. They serve extremely different purposes...
  2. It is pretty arbitrary to decide to open the circuit for 10 seconds vs. 60 seconds.

This issue focuses on problem 2, but I also believe that wider adoption of this feature will help teams get rid of problem 1 faster.


One replacement for error_timeout is a default behaviour that implements exponential backoff. The old behaviour is as follows:

Circuit receives `error_threshold` number of errors ---> transfer to open ---> wait for error_timeout ---> transfer to half open ---> still see errors? ---> transfer to open ---> wait for error_timeout ---> transfer to half open ---> no more errors? ---> transfer to closed

The new exponential backoff implementation would be:

Circuit receives `error_threshold` number of errors ---> transfer to open ---> wait for 500ms ---> transfer to half open ---> still see errors? ---> transfer to open ---> wait for 1s ---> transfer to half open ---> still see errors? ---> transfer to open ---> wait for 2s ---> transfer to half open...
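
To make the proposed flow concrete, here is a rough sketch of the state transitions. It is not the library's actual implementation: `BackoffCircuit`, its method names, and the reset-on-success behaviour are all assumptions made for illustration. Only the 500ms starting value and the doubling come from the flow above.

```ruby
# Hypothetical sketch of the proposed flow; class and method names are
# illustrative and do not correspond to any existing API.
class BackoffCircuit
  INITIAL_WAIT = 0.5 # seconds; the 500ms starting value from the flow above

  attr_reader :state, :wait

  def initialize(error_threshold:)
    @error_threshold = error_threshold
    @errors = 0
    @state = :closed
    @wait = INITIAL_WAIT
  end

  # Called for a failed request while closed, or a failed probe while half-open.
  def record_error
    case @state
    when :closed
      @errors += 1
      @state = :open if @errors >= @error_threshold
    when :half_open
      @wait *= 2 # still seeing errors: double the wait before the next probe
      @state = :open
    end
  end

  # Called once the current wait has elapsed.
  def wait_elapsed
    @state = :half_open if @state == :open
  end

  # Called when a probe succeeds while half-open.
  # Assumption: a recovery resets the wait back to its initial value.
  def record_success
    return unless @state == :half_open

    @state = :closed
    @errors = 0
    @wait = INITIAL_WAIT
  end
end
```

The caller is assumed to drive the timer and call `wait_elapsed` itself; a real implementation would more likely compare timestamps when a request arrives.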

This would allow us to detect recovery much faster, and would not require guesswork about what this value should be.

Note: there should probably be a maximum for the backoff, 60 seconds or so
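
As a rough illustration of the capped schedule, the wait could be computed from the number of consecutive failed half-open probes. The constant and method names here are hypothetical, and the 60-second cap is just the ballpark figure from the note above, not a decided value.

```ruby
# Hypothetical helper; names are illustrative only.
BACKOFF_INITIAL = 0.5 # seconds (500ms, as in the proposed flow)
BACKOFF_MAX = 60.0    # seconds (assumed cap, "60 seconds or so")

# failed_probes: how many consecutive half-open probes have failed so far.
def backoff_wait(failed_probes)
  [BACKOFF_INITIAL * (2**failed_probes), BACKOFF_MAX].min
end

(0..8).map { |n| backoff_wait(n) }
# => [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

With these numbers the cap is reached on the eighth wait, so a dependency that stays down is probed at most once a minute from then on.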
