Make TCP keepalive parameters configurable #102

@ansd

TCP keepalive is optional and off by default.

Its purpose is to:

  1. Detect dead peers. If a peer is not alive, close the socket to save resources.
  2. Prevent the connection from being closed by a firewall or NAT proxy due to inactivity.

A real-world case of purpose 2 was reported in https://rabbitmq.slack.com/archives/C1EDN83PA/p1656489674972399.

Therefore, in Osiris, TCP keepalive can optionally be enabled for stream data replication. Both client and server have to opt in by setting the Osiris parameter replica_keepalive.

This issue is about whether we should make the TCP keepalive parameters configurable:

  1. Keepalive time is the duration between two keepalive transmissions while the connection is idle. RFC 1122 requires this period to be configurable and to default to no less than 2 hours.
  2. Keepalive interval is the duration between two successive keepalive retransmissions when the previous keepalive transmission has not been acknowledged.
  3. Keepalive retry is the number of retransmissions carried out before the remote end is declared unavailable.
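
To make the three parameters concrete, here is a minimal sketch of how they map onto socket options. It is written in Python rather than Erlang purely for illustration and is not Osiris code. The function name enable_keepalive is hypothetical, and the option names TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are the Linux-style spellings, which not every platform defines, so the sketch skips any that are missing.

```python
import socket


def enable_keepalive(sock, time_s, interval_s, retries):
    """Turn on TCP keepalive and set the three parameters where available.

    Returns the names of the options that were actually applied.
    (Hypothetical helper for illustration; not part of Osiris.)
    """
    applied = []

    # SO_KEEPALIVE switches the mechanism on; it is off by default.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied.append("SO_KEEPALIVE")

    # Linux-style option names. Look each one up defensively because
    # some platforms do not expose all of them.
    for name, value in (
        ("TCP_KEEPIDLE", time_s),       # 1. keepalive time
        ("TCP_KEEPINTVL", interval_s),  # 2. keepalive interval
        ("TCP_KEEPCNT", retries),       # 3. keepalive retry
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied.append(name)

    return applied


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # The values 60, 10 and 3 are arbitrary example settings.
        print(enable_keepalive(s, time_s=60, interval_s=10, retries=3))
```

The values passed in the usage example are arbitrary. Which defaults would suit Osiris is exactly the open question of this issue.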

Specifically, it may be desirable to decrease the keepalive time (parameter 1) to a value lower than 2 hours.
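
As a rough illustration of why the keepalive time dominates, the approximate worst-case time to detect a dead peer on an idle connection is the keepalive time plus one interval per unanswered probe. The sketch below (Python, for illustration only) compares the usual Linux defaults of 7200 s, 75 s and 9 probes with a hypothetical tuned setting. The function name and the tuned values are assumptions, not proposed Osiris defaults.

```python
def worst_case_detection_seconds(time_s, interval_s, retries):
    """Approximate time until an idle connection to a dead peer is dropped."""
    # Idle period before the first probe, then one interval per unanswered probe.
    return time_s + interval_s * retries


# Usual Linux defaults: 2 hours idle, 75 s between probes, 9 probes.
print(worst_case_detection_seconds(7200, 75, 9))  # 7875 (about 2 h 11 min)

# Hypothetical tuned values, chosen only for illustration.
print(worst_case_detection_seconds(60, 10, 3))    # 90
```

For purpose 2, the same reasoning suggests that the keepalive time has to be shorter than the idle timeout of the firewall or NAT device in the path. That timeout is deployment specific and is not given in this issue.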

See https://github.com/emqx/emqx/blob/6d5ad97528072e7b9186cb35e2eab7695dd0393a/apps/emqx/src/emqx_connection.erl#L269-L272 for an Erlang example.

Note, however, this caveat from the Erlang inet documentation on raw socket options:

Code such as these [raw socket option] examples is inherently non-portable, even different versions of the same OS on the same platform can respond differently to this kind of option manipulation. Use with care.
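
One way to cope with this non-portability is to probe which options the current platform exposes before offering them as configuration. The following is a sketch under assumptions, in Python for illustration: the names CANDIDATES and supported_keepalive_options are hypothetical, and the list of alternative spellings is not exhaustive (TCP_KEEPALIVE is, to my knowledge, the macOS name for the idle time and is exposed by Python 3.10 and later).

```python
import socket

# Candidate constant names for each logical parameter, in order of preference.
# The mapping is an assumption for illustration and is not exhaustive.
CANDIDATES = {
    "time": ("TCP_KEEPIDLE", "TCP_KEEPALIVE"),
    "interval": ("TCP_KEEPINTVL",),
    "retries": ("TCP_KEEPCNT",),
}


def supported_keepalive_options():
    """Map each logical parameter to the first constant this platform defines.

    A value of None means the parameter cannot be set through this module here.
    """
    return {
        param: next((n for n in names if hasattr(socket, n)), None)
        for param, names in CANDIDATES.items()
    }


if __name__ == "__main__":
    print(supported_keepalive_options())
```

A configuration layer could then accept only the parameters that resolve to a real option and ignore or reject the rest, rather than passing raw option numbers blindly.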
