@jeqo jeqo commented Jan 22, 2026

No description provided.

majialoong and others added 30 commits October 6, 2025 22:07
[In this PR](apache/kafka#20334), we added some
validation checks for the connect config, such as ensuring that
`plugin.path` cannot be empty.

However, Connect currently loads the plugins first and then creates the
configuration. Even if `plugin.path` is empty, it still attempts to load
the plugins, and only then throws an exception when creating the
configuration.

The correct approach is to create the configuration first, validate that
it meets the requirements, and only load the plugins once validation
passes. This allows early detection of problems and avoids unnecessary
plugin loading.
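A minimal sketch of this ordering, with hypothetical class and method names (not the actual Connect code):

```java
import java.util.Map;

// Illustrative fail-fast ordering: validate the configuration before
// doing any expensive plugin loading. All names here are assumptions.
public class StartupOrder {

    static boolean isPluginPathValid(Map<String, String> props) {
        String pluginPath = props.get("plugin.path");
        return pluginPath != null && !pluginPath.isEmpty();
    }

    static void validateConfig(Map<String, String> props) {
        if (!isPluginPathValid(props)) {
            throw new IllegalArgumentException("plugin.path must not be empty");
        }
    }

    static void loadPlugins(Map<String, String> props) {
        // Expensive plugin-path scanning would happen here; with the
        // reordering it is never reached for an invalid config.
        System.out.println("scanning " + props.get("plugin.path"));
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of("plugin.path", "");
        try {
            validateConfig(props);   // fails fast, before plugin loading
            loadPlugins(props);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```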

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
We can rewrite this class from Scala to Java and move it to the
server-common module. To maintain backward compatibility, we keep the
logger name `state.change.logger`.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Simplify the last-known ELR update logic to make it more robust.

Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
Added test cases for the consumer rebalance metrics manager.

Reviewers: Lianet Magrans <lmagrans@confluent.io>, TengYao Chi
 <frankvicky@apache.org>, Hong-Yi Chen <apalan60@gmail.com>
…1.0 (#20639)

Added a note regarding the memory leak bug in the documentation.

Reviewers: Matthias J. Sax <matthias@confluent.io>
List of changes:
- prerequisite Jira ticket:
  - [KAFKA-19591](https://issues.apache.org/jira/browse/KAFKA-19591)
- mandatory version upgrades:
  - Gradle version: 8.14.3 → 9.1.0
  - Gradle Shadow plugin: 8.3.6 → 8.3.9
  - Gradle dependencycheck plugin: 8.2.1 → 12.1.3
  - Gradle spotbugs plugin: 6.2.3 → 6.2.5
  - Gradle spotless plugin: 6.25.0 → 7.2.1
- build logic will be refactored to accommodate Gradle 9 breaking
changes
- (optional): about a dozen Gradle plugin versions are also upgraded
- other JIRA tickets that had to be solved all along:
  - [KAFKA-16801](https://issues.apache.org/jira/browse/KAFKA-16801)
  - [KAFKA-19654](https://issues.apache.org/jira/browse/KAFKA-19654)

 **Related links:**
- https://gradle.org/whats-new/gradle-9
- https://github.com/gradle/gradle/releases/tag/v9.0.0
- https://docs.gradle.org/9.0.0/release-notes.html
- https://docs.gradle.org/9.0.0/userguide/upgrading_major_version_9.html
- https://docs.gradle.org/9.1.0/release-notes.html

Notes:
- new Gradle version brings up some breaking changes, as always 😃
- the Kafka build with Gradle 9 has the same issues as other projects:
  - redhat-developer/vscode-java#4018
  - gradle/gradle#32597

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Since the 4.0 branch does not include
[KAFKA-18748](https://issues.apache.org/jira/browse/KAFKA-18748), the
related scan reports cannot be found, but the ci-complete workflow is
still triggered on the 4.0 branch. Disable it there, as its reports can
be safely ignored.

See apache/kafka#20616 (comment).

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
…#19955)

In Kafka Streams, configuration classes typically follow a fluent API
pattern to ensure a consistent and intuitive developer experience.
However, the current implementation of
`org.apache.kafka.streams.KafkaStreams$CloseOptions` deviates from this
convention by exposing a public constructor, breaking the uniformity
expected across the API.

To address this inconsistency, we propose introducing a new
`CloseOptions` class that adheres to the fluent API style, replacing the
existing implementation. The new class will retain the existing
`timeout(Duration)` and `leaveGroup(boolean)` methods but will enforce
fluent instantiation and configuration. Given the design shift, we will
not maintain backward compatibility with the current class.

This change aligns with the goal of standardizing configuration objects
across Kafka Streams, offering developers a more cohesive and
predictable API.
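A minimal sketch of the fluent style described above; the actual Kafka Streams `CloseOptions` may differ in naming and defaults:

```java
import java.time.Duration;

// Illustrative fluent configuration object: no public constructor,
// a static factory, and setters that return `this` for chaining.
public class CloseOptionsSketch {
    private Duration timeout = Duration.ofMillis(Long.MAX_VALUE);
    private boolean leaveGroup = false;

    private CloseOptionsSketch() { }                 // no public constructor

    public static CloseOptionsSketch closeOptions() {
        return new CloseOptionsSketch();
    }

    public CloseOptionsSketch timeout(Duration timeout) {
        this.timeout = timeout;
        return this;                                 // fluent: return this
    }

    public CloseOptionsSketch leaveGroup(boolean leaveGroup) {
        this.leaveGroup = leaveGroup;
        return this;
    }

    public Duration timeoutValue() { return timeout; }
    public boolean leavesGroup() { return leaveGroup; }
}
```

Callers would then write `closeOptions().timeout(Duration.ofSeconds(30)).leaveGroup(true)` instead of invoking a public constructor.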

Reviewers: Bill Bejeck <bbejeck@apache.org>
…equest (#20632)

Test the `StreamsGroupDescribeRequest` RPC and corresponding responses
for situations where
- `streams.version` not upgraded to 1
- `streams.version` enabled, multiple groups listening to the same
topic.

Reviewers: Lucas Brutschy <lucasbru@apache.org>
…erver module (#20636)

It moves the `ReconfigurableQuorumIntegrationTest` class to the
`org.apache.kafka.server` package and consolidates two related tests,
`RemoveAndAddVoterWithValidClusterId` and
`RemoveAndAddVoterWithInconsistentClusterId`, into a single file. This
improves code organization and reduces redundancy.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
…ome issues with Gradle 9+ are solved) (#20652)

Extends from: #19513

Note: in Gradle 9+ we have to add a switch like this:
```bash
./gradlew dependencyUpdates --no-parallel
```
Related link:
https://github.com/ben-manes/gradle-versions-plugin/releases/tag/v0.53.0

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This patch updates the Apache Kafka project's build, test, and
dependency configurations.

- Java Version Update: The build and test processes have been upgraded
from Java 24 to Java 25.
- Scala Version Update: The Scala library version has been updated from
2.13.16 to 2.13.17.
-  Dependency Upgrades: Several dependencies have been updated to newer
versions, including mockito (5.14.2 to 5.20.0), zinc (1.10.8 to 1.11.0),
and scala-library/reflect (2.13.16 to 2.13.17).
- Code and Configuration Changes: The patch modifies build.gradle to
exclude certain Spotbugs tasks for Java 25 compatibility. It also
changes the default signing algorithm in TestSslUtils.java from
SHA1withRSA to SHA256withRSA, enhancing security.
- Documentation: The README.md file has been updated to reflect the new
Java 25 requirement.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Liam Clarke-Hutchinson
 <liam@steelsky.co.nz>, Gaurav Narula <gaurav_narula2@apple.com>,
 Chia-Ping Tsai <chia7712@gmail.com>
Updates all GitHub Actions to their latest versions.

----
**Upgraded Actions:**

* **Gradle setup**:
     * `gradle/actions/setup-gradle` **v4.4.4 → v5.0.0**
* **Trivy security scanner**:
     * `aquasecurity/trivy-action` **v0.24.0 → v0.33.1**
* **Docker build tools:**

    * `docker/setup-qemu-action` **v3.2.0 → v3.6.0**
    * `docker/setup-buildx-action` **v3.6.1 → v3.11.1**
    * `docker/login-action` **v3.3.0 → v3.6.0**
* **GitHub utilities:**

    * `actions/github-script` **v7 → v8**

    * `actions/stale` **v9 → v10**

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
…0399)

the constructor is error-prone when migrating code, since metrics could
get unintentionally changed. We should remove the constructor and use
constant strings instead to avoid issues like KAFKA-17876, KAFKA-19150,
and others.

Reviewers: Ken Huang <s7133700@gmail.com>, Jhen-Yung Hsu
 <jhenyunghsu@gmail.com>, KuoChe <kuoche1712003@gmail.com>, Chia-Ping
 Tsai <chia7712@gmail.com>
Clear pendingTasksToInit when tasks are cleared. This matters when we
are shutting down a thread in the PARTITIONS_ASSIGNED state. In this
case we may have locked some unassigned task directories (see
TaskManager#tryToLockAllNonEmptyTaskDirectories) and then been assigned
one or more of those tasks. In this scenario, we will not release the
locks for the unassigned task directories (see
TaskManager#releaseLockedUnassignedTaskDirectories), because
TaskManager#allTasks includes pendingTasksToInit, which has not been
cleared.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Lucas Brutschy
 <lbrutschy@confluent.io>
from: apache/kafka#20637 (comment)

Previously, the method used LOGGER.info() instead of LOGGER.trace().
This patch corrects the logging level used in the trace method of
StateChangeLogger.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, TengYao Chi
 <kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>

Co-authored-by: Ubuntu <ubuntu@jim.infra.cloudnative.tw>
Fix the incorrect package name in metrics and revert the comment

see:   apache/kafka#20399 (comment)
apache/kafka#20399 (comment)

Reviewers: Ken Huang <s7133700@gmail.com>, Manikumar Reddy
 <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Stores the existing values for both the fields in a local variable for
logging.

Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>
…he flakiness of the test. (#20664)

MINOR: changed the condition to only check the test topic to reduce the
flakiness of the test.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Lianet Magrans
 <lmagrans@confluent.io>
…ling (#20661)

When a push telemetry request fails, any exception is treated as fatal,
increasing the retry interval to `Integer.MAX_VALUE` and effectively
turning telemetry off. This PR updates the error handling to check
whether the exception is a transient one with expected recovery, and
keeps the telemetry interval unchanged in those cases since a recovery
is expected.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Matthias Sax
 <mjsax@apache.org>
…rTest [2/N] (#20544)

Changes made
- Additional `setUpTaskManager()` overloaded method -- Created this
temporarily to pass the CI pipelines so that I can work on the failing
tests incrementally
- Rewrote 3 tests to use stateUpdater thread

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
The auto-topic creation in `AutoTopicCreationManager` currently retries
creating internal topics on every heartbeat.

A simple back-off mechanism was implemented: if there is an unexpired
error for the topic in the error cache, or the topic is already in the
in-flight set, the topic creation request is not sent.

Unit tests are added as well.
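The gating condition can be sketched as follows (names and shapes are illustrative, not the actual `AutoTopicCreationManager` internals):

```java
import java.util.Map;
import java.util.Set;

// Illustrative back-off gate for internal topic creation requests.
public class CreationBackoff {

    // Send a creation request only if there is no unexpired cached error
    // for the topic and the topic is not already in flight.
    static boolean shouldSend(String topic,
                              Map<String, Long> errorExpiryMs,
                              Set<String> inflight,
                              long nowMs) {
        Long expiry = errorExpiryMs.get(topic);
        if (expiry != null && nowMs < expiry) {
            return false;                  // still backing off after an error
        }
        return !inflight.contains(topic);  // skip if a request is pending
    }
}
```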

Reviewers: Lucas Brutschy <lucasbru@apache.org>
We missed a branch in #20671.

This PR handles the else branch where we log about skipping the follower
state change.

Also updated the doc for the method as it was out of date.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
The comment stating "Additionally adds a license header to the wrapper
while editing the file contents" is no longer accurate. This
functionality was removed in PR #7742 (commit 45842a3) since Gradle
5.6+ automatically handles license headers in wrapper files.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
…ded with a `--no-parallel` switch (#20683)

Prologue:
apache/kafka#19513 (comment)

Also related to: #20652

@chia7712 I double-checked custom gradle commands in `README.md` and
AFAIK we should be fine now.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
…-->> 9 (#20684)

from: apache/kafka#19513 (comment)

1. Fix the tasks `unitTest` and `integrationTest`;
2. Change the `name` to a method call `name()` for `KeepAliveMode`.

Reviewers: Ken Huang <s7133700@gmail.com>, dejan2609
 <dejan2609@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Clarify the Javadoc for `Node#isFenced` to align with KIP-1073, which
introduced the “fenced” field in `DescribeCluster` for brokers. The
“fenced” flag applies only to broker nodes returned by
`DescribeCluster`. For controller quorum nodes, it doesn’t apply and is
always `false`. This clarifies that not every node has a meaningful
fenced state.

Reviewers: TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
This PR is a follow-up to KAFKA-18193. It addresses the need for a null
check and an improved error message. Please refer to the previous
comments and the review of
apache/kafka#19955 (review)
for more context.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
There are some calls to `TestUtils::waitForCondition` that eagerly
evaluate the actual value of the condition for the failure message.
These calls should use the overload that takes a `Supplier` for
`conditionDetails`, so the actual value is evaluated lazily at the time
the condition fails.
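The difference can be shown with a small, self-contained example (not the actual `TestUtils` code):

```java
import java.util.function.Supplier;

// An eagerly built failure message captures a stale value, while a
// Supplier defers building the message until the condition actually fails.
public class LazyDetails {

    static String eagerMessage(int actual) {
        return "actual value was " + actual;       // computed up front
    }

    static Supplier<String> lazyMessage(int[] box) {
        return () -> "actual value was " + box[0]; // computed on demand
    }

    public static void main(String[] args) {
        int[] box = {1};
        String eager = eagerMessage(box[0]);
        Supplier<String> lazy = lazyMessage(box);
        box[0] = 42;                                // value changes before failure
        System.out.println(eager);                  // stale: actual value was 1
        System.out.println(lazy.get());             // fresh: actual value was 42
    }
}
```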

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
…eMode (#20695)

In Scala/Java joint compilation, builds that run with
`keepAliveMode=SESSION` can fail in the `:core:compileScala` task. This
happens when `javac` is given a non-existent Java output directory on
its classpath.

```
Unexpected javac output: warning: [path] bad path element
".../core/build/classes/java/main": no such file or directory
error: warnings found and -Werror specified
```

This issue occurs after a clean build because
`core/build/classes/java/main` does not exist, as Java classes are
written to `classes/scala/main` during joint compilation. With `-Werror`
enabled, the resulting `[path]` warning is treated as an error and stops
the build.

This PR adds a workaround scoped to **SESSION** builds, while continuing
to investigate the underlying classpath assembly logic during joint
compilation.

Related discussion:
apache/kafka#20684 (comment)
Tracked by:
[KAFKA-19786](https://issues.apache.org/jira/browse/KAFKA-19786)

Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai
 <chia7712@gmail.com>
ShivsundarR and others added 30 commits November 19, 2025 11:36
…#20922)

*What*

- Currently, when there is a backoff wait period, we are retrying the
acknowledgements until the backoff period completes and then these acks
are sent.

- But as this is a unit test, we can use time.sleep() to forward the
`currentTime`, which will allow the backoff period to be over.

- PR fixes the 4 tests in `ShareConsumeRequestManagerTest` to use
`MockTime::sleep`. This makes the tests faster as we do not actually
need to wait for the backoff. We can just update the value.
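The idea can be sketched with a minimal stand-in for `MockTime` (illustrative only, not the real Kafka test class):

```java
// Sleeping on a fake clock advances it instantly, so a retry backoff can
// elapse without the test actually waiting.
public class FakeTime {
    private long nowMs = 0L;

    public long milliseconds() { return nowMs; }

    public void sleep(long ms) { nowMs += ms; }      // no real wait

    // Has the retry backoff elapsed since startMs?
    static boolean backoffElapsed(FakeTime time, long startMs, long backoffMs) {
        return time.milliseconds() - startMs >= backoffMs;
    }

    public static void main(String[] args) {
        FakeTime time = new FakeTime();
        long start = time.milliseconds();
        System.out.println(backoffElapsed(time, start, 100)); // false: too early
        time.sleep(100);                                      // skip ahead
        System.out.println(backoffElapsed(time, start, 100)); // true: acks can go
    }
}
```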

Reviewers: Andrew Schofield <aschofield@confluent.io>
… (#20914)

Fix to avoid initializing positions for partitions being revoked, as it
is unneeded (we do not allow fetching from partitions being revoked),
and could lead to NoOffsetForPartitionException on a partition that is
already being revoked (this is confusing for applications).

This is the behaviour we already had for fetch, just applying it to
update positions to align.

This gap was surfaced on edge cases of partitions being assigned and
revoked right away.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Lucas Brutschy
 <lbrutschy@confluent.io>
…ck [3/3] (#20882)

Last part of KIP-1228. Here we implement the code path on the partition
leader's side: all the required steps to handle the WriteTxnMarkers
request, extracting the version information from the request and
propagating it to the producer-epoch check, where we validate the
producer epoch based on the transaction version.

```
The transactionVersion flows as follows:

1. Extracted from TxnMarkerEntry in
KafkaApis.handleWriteTxnMarkersRequest
2. There is one append per TransactionMarker in the request
3. Passed through ReplicaManager.appendRecords → appendRecordsToLeader →
appendToLocalLog
4. Passed to Partition.appendRecordsToLeader
5. Passed to UnifiedLog.appendAsLeader → append →
analyzeAndValidateProducerState
6. Passed to UnifiedLog.updateProducers
7. Passed to ProducerAppendInfo.append → appendEndTxnMarker
8. Finally used in checkProducerEpoch for epoch validation

```
The transactionVersion is preserved through the call chain and used in
checkProducerEpoch to determine whether to apply TV2 (strict > check) or
legacy (>= check) validation.
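A hedged sketch of the version-gated check described above (the real logic lives in `checkProducerEpoch`; names and exact semantics here are simplified for illustration):

```java
// Illustrative version-gated epoch validation: TV2 applies a strict >
// comparison, while legacy transaction versions accept >=.
public class EpochCheck {

    static boolean epochValid(short transactionVersion, int markerEpoch, int currentEpoch) {
        if (transactionVersion >= 2) {
            return markerEpoch > currentEpoch;   // TV2: strict > check
        }
        return markerEpoch >= currentEpoch;      // legacy: >= check
    }
}
```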

Reviewers: Justine Olshan <jolshan@confluent.io>, Artem Livshits
 <alivshits@confluent.io>
## Summary
When working on

[KIP-1091](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1091%3A+Improved+Kafka+Streams+operator+metrics),
we mistakenly applied the `process-id` tag to all client-level metrics,
rather than just the `client-state`, `thread-state`, and
`recording-level` metrics as specified in the KIP. This issue came to
light while working on KIP-1221, which aimed to add the `application-id`
as a tag to the `client-state` metric introduced by KIP-1091. This PR
removes these tags from all metrics by default, and adds them to only
the `client-state` (application-id + process-id) and the
`recording-level` (process-id only)

Reviewers: Matthias Sax <mjsax@apache.org>, Bill
 Bejeck <bbejeck@apache.org>

## Tests
Unit tests in `ClientMetricsTest.java` and `StreamsMetricsImplTest.java`
Client unit tests are failing because of OOM. If the
`ConsumerNetworkThread` instantiated in the constructor for
`ApplicationEventHandler` fails to start in a timely fashion, it can be
leaked.

Reviewers: Lianet Magrans <lmagrans@confluent.io>, Kirk True
 <ktrue@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
…` (#20899)

While running Kafka e2e tests, various tests were failing with
`TimeoutError('Kafka node failed to stop in 60 seconds')`. In Kafka e2e
tests, we check the PID to ensure the Kafka server has shut down. After
investigating this issue, I found that the Kafka process was a zombie
process in the container:
```bash
ducker@ducker05:/$ jcmd
285 kafka.Kafka /mnt/kafka/kafka.properties
18207 jdk.jcmd/sun.tools.jcmd.JCmd

ducker@ducker05:/$ cat /proc/285/status | grep -i state
State:  Z (zombie)
```

This issue is related to [this
change](https://github.com/apache/kafka/pull/17554/files#r1845737954).
When using `CMD ["sudo", "service", "ssh", "start", "-D"]`, PID 1
becomes the SSH service, which does not handle `SIGCHLD` signals and
therefore won't reap zombie processes:
```bash
ducker@ducker05:/$ cat /proc/1/cmdline | tr '\0' ' '
sudo service ssh start -D
```

However, with the old syntax `CMD sudo service ssh start && tail -f
/dev/null`, PID 1 is `/bin/sh`, which is a shell that properly reaps
zombie processes:
```bash
ducker@ducker05:/$ cat /proc/1/cmdline | tr '\0' ' '
/bin/sh -c sudo service ssh start && tail -f /dev/null
```

Use `tini` as PID 1 to properly manage processes and avoid zombie
processes from remaining in the system.
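An illustrative Dockerfile fragment (package name and paths are assumptions, not the actual ducker image) showing tini as PID 1:

```dockerfile
# tini runs as PID 1, reaps zombie children, and forwards signals;
# the ssh service then runs as an ordinary child process.
RUN apt-get update && apt-get install -y tini
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["sudo", "service", "ssh", "start", "-D"]
```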

Reviewers: PoAn Yang <payang@apache.org>, TaiJuWu <tjwu1217@gmail.com>,
 Chia-Ping Tsai <chia7712@gmail.com>
Javadoc for the `KafkaShareConsumer` updated for AK 4.2.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
Mark KIP-932 and KIP-1043 interfaces as stable. Tighten up deprecation
annotations for interfaces whose removal is planned in AK 5.0.

Reviewers: Lianet Magrans <lmagrans@confluent.io>
…eConsumer (#19886)

https://issues.apache.org/jira/browse/KAFKA-17853  -

- There is an issue with the console share consumer where if the broker
is unavailable, even after force terminating using ctrl-c, the consumer
does not shut down immediately. It takes around ~30 seconds to close
once the broker shuts down.

- The console consumer on the other hand, was supposedly shutting down
immediately once we press ctrl-c. On reproducing the issue with a local
kafka server, I observed the issue was present in both the console
consumer and the console share consumer.

Issue :
- Looking at the client debug logs, this issue seemed related to the
network thread sending repeated `FindCoordinator` requests until the
timer expired. This was happening in both the console-consumer and
console-share-consumer.

- Debug logs showed that when the broker is shut down, the heartbeat
fails with a `DisconnectException`(which is retriable), this triggers a
`findCoordinator` request on the network thread which retries until the
default timeout expires.

- This request is sent even before we trigger a close on the consumer,
so once we press ctrl-c, although the `ConsumerNetworkThread::close()`
is triggered, it waits for the default timeout until all the requests
are sent out for a graceful shutdown.

PR aims to fix this issue by adding a check in `NetworkClientDelegate`
to remove any pending unsent requests (with empty node values) during
close. This avoids unnecessary retries, and the consumers shut down
immediately upon termination.
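The cleanup can be sketched as follows (illustrative names, not the real `NetworkClientDelegate` internals):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Pending requests that never resolved a target node are dropped at close
// time instead of being retried until the default timeout expires.
public class UnsentQueue {

    static final class Unsent {
        final String name;
        final Optional<String> node;   // empty: no coordinator/node found yet
        Unsent(String name, Optional<String> node) {
            this.name = name;
            this.node = node;
        }
    }

    static List<Unsent> pruneOnClose(List<Unsent> unsent) {
        List<Unsent> kept = new ArrayList<>();
        for (Unsent u : unsent) {
            if (u.node.isPresent()) {
                kept.add(u);           // already routed: let it drain normally
            }                          // else: dropped, the client is closing
        }
        return kept;
    }
}
```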

Share consumers shutting down after the fix.
```
[2025-06-03 16:23:42,175] DEBUG [ShareConsumer
clientId=console-share-consumer, groupId=console-share-consumer]
Removing unsent request
UnsentRequest{requestBuilder=FindCoordinatorRequestData(key='console-share-consumer',
keyType=0, coordinatorKeys=[]),
handler=org.apache.kafka.clients.consumer.internals.NetworkClientDelegate$FutureCompletionHandler@2b351de8,
node=Optional.empty, remainingMs=28565} because the client is closing
(org.apache.kafka.clients.consumer.internals.NetworkClientDelegate)
[2025-06-03 16:23:42,175] DEBUG [ShareConsumer
clientId=console-share-consumer, groupId=console-share-consumer]
FindCoordinator request failed due to retriable exception
(org.apache.kafka.clients.consumer.internals.CoordinatorRequestManager)
org.apache.kafka.common.errors.NetworkException: The server disconnected
before a response was received.
[2025-06-03 16:23:42,176] DEBUG [ShareConsumer
clientId=console-share-consumer, groupId=console-share-consumer] Closing
RequestManagers
(org.apache.kafka.clients.consumer.internals.RequestManagers)
[2025-06-03 16:23:42,177] DEBUG [ShareConsumer
clientId=console-share-consumer, groupId=console-share-consumer]
RequestManagers has been closed
(org.apache.kafka.clients.consumer.internals.RequestManagers)
[2025-06-03 16:23:42,179] DEBUG [ShareConsumer
clientId=console-share-consumer, groupId=console-share-consumer] Closed
the consumer network thread
(org.apache.kafka.clients.consumer.internals.ConsumerNetworkThread)
[2025-06-03 16:23:42,181] DEBUG [ShareConsumer
clientId=console-share-consumer, groupId=console-share-consumer] Kafka
share consumer has been closed
(org.apache.kafka.clients.consumer.internals.ShareConsumerImpl)
Processed a total of 0 messages
```

Regular consumers shutting down after the fix.

```
[2025-06-03 16:24:27,196] DEBUG [Consumer clientId=console-consumer,
groupId=console-consumer-5671] Removing unsent request
UnsentRequest{requestBuilder=FindCoordinatorRequestData(key='console-consumer-5671',
keyType=0, coordinatorKeys=[]),
handler=org.apache.kafka.clients.consumer.internals.NetworkClientDelegate$FutureCompletionHandler@3770591b,
node=Optional.empty, remainingMs=29160} because the client is closing
(org.apache.kafka.clients.consumer.internals.NetworkClientDelegate)
[2025-06-03 16:24:27,196] DEBUG [Consumer clientId=console-consumer,
groupId=console-consumer-5671] FindCoordinator request failed due to
retriable exception
(org.apache.kafka.clients.consumer.internals.CoordinatorRequestManager)
org.apache.kafka.common.errors.NetworkException: The server disconnected
before a response was received.
[2025-06-03 16:24:27,197] DEBUG [Consumer clientId=console-consumer,
groupId=console-consumer-5671] Closing RequestManagers
(org.apache.kafka.clients.consumer.internals.RequestManagers)
[2025-06-03 16:24:27,197] DEBUG [Consumer clientId=console-consumer,
groupId=console-consumer-5671] Removing test-topic-23-0 from buffered
fetch data as it is not in the set of partitions to retain ([])
(org.apache.kafka.clients.consumer.internals.FetchBuffer)
[2025-06-03 16:24:27,197] DEBUG [Consumer clientId=console-consumer,
groupId=console-consumer-5671] RequestManagers has been closed
(org.apache.kafka.clients.consumer.internals.RequestManagers)
[2025-06-03 16:24:27,200] DEBUG [Consumer clientId=console-consumer,
groupId=console-consumer-5671] Closed the consumer network thread
(org.apache.kafka.clients.consumer.internals.ConsumerNetworkThread)
[2025-06-03 16:24:27,202] DEBUG [Consumer clientId=console-consumer,
groupId=console-consumer-5671] Kafka consumer has been closed
(org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer)
Processed a total of 0 messages
```

Reviewers: Lianet Magrans <lmagrans@confluent.io>, Kirk True
 <ktrue@confluent.io>, Andrew Schofield <aschofield@confluent.io>
Updates to documentation for KIP-932.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>
This PR bumps the version of docker/setup-qemu-action from 3.6 to 3.7.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This patch marks the OffsetCommit and OffsetFetch APIs as stable.

Reviewers: Lianet Magrans <lmagrans@confluent.io>
Rewrote more tests in `TaskManagerTest.java`

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
* The field `AcquisitionLockTimeoutMs` in
`ShareAcknowledgeResponse.json` should be marked `ignorable` for
compatibility with older client versions.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Shivsundar R
 <shr@confluent.io>
This PR is part of

[KIP-1226](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1226%3A+Introducing+Share+Partition+Lag+Persistence+and+Retrieval).

This PR adds integration tests to ShareConsumerTest.java file to verify
the share partition lag is reported successfully in various scenarios.

Reviewers: Andrew Schofield <aschofield@confluent.io>
We only updated the docs version on the `4.1` branch, but not on `trunk`.

Cf
apache/kafka@abeebb3

Reviewers: Matthias J. Sax <matthias@confluent.io>
Sorted the names of the packages for which javadoc is generated. Removed
`org/apache/kafka/common/security/oauthbearer/secured` which no longer
exists.

Reviewers: Lianet Magrans <lmagrans@confluent.io>
Updating the docs for the client state metric (KIP-1221).

Reviewers: Matthias Sax <mjsax@apache.org>, Bill
 Bejeck <bbejeck@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
*What*
PR fixes some typos/nits in `ShareConsumerTest` and fixes a test
assertion in `ConsumerNetworkThreadTest`.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai
 <chia7712@gmail.com>
For records with a delivery count exceeding 2, there is suspicion that
delivery failures stem from underlying issues rather than natural retry
scenarios. The batching of such records should be reduced.

Solution: determining which offset is bad is not possible at the
broker's end, but the broker can restrict the acquired records to a
subset so that only the bad record is skipped. We can do the following:

- If the delivery count of a batch is >= 3, then only acquire half of
the batch records, i.e. for a batch of 0-499 (500 records) with delivery
count 3, start offset tracking and acquire 0-249 (250 records)
- If the delivery count is bumped again, keep acquiring half of the
previously acquired offsets until the last delivery attempt, i.e. 0-124
(125 records)
- For the last delivery attempt, acquire only 1 offset; then only the
bad record will be skipped.
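The halving schedule above can be sketched as follows (method and parameter names are hypothetical, not the actual share-partition code):

```java
// Illustrative acquisition back-off: from the third delivery attempt
// onward, halve the number of acquired records; on the final attempt,
// acquire a single record so only the bad one is skipped.
public class AcquireSubset {

    static int recordsToAcquire(int batchSize, int deliveryCount, int maxDeliveryCount) {
        if (deliveryCount >= maxDeliveryCount) {
            return 1;                    // last attempt: isolate the bad record
        }
        if (deliveryCount < 3) {
            return batchSize;            // normal path: acquire the whole batch
        }
        int n = batchSize;
        for (int i = 3; i <= deliveryCount; i++) {
            n = Math.max(1, n / 2);      // halve on each suspicious redelivery
        }
        return n;
    }
}
```

For a 500-record batch with a max delivery count of 5, this yields 500, 500, 250, 125, 1 records across attempts 1 through 5.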

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield
 <aschofield@confluent.io>, Abhinav Dixit <adixit@confluent.io>

---------

Co-authored-by: d00791190 <dinglan6@huawei.com>
…0942)

Changing log level from ERROR to WARN, as it's expected that this can
happen, and we should not incorrectly make users worry about it.

Reviewers: PoAn Yang <payang@apache.org>, Colt McNealy
 <colt@littlehorse.io>, Lucas Brutschy <lbrutschy@confluent.io>,
 Chia-Ping Tsai <chia7712@gmail.com>
…(#20937)

`jacksonDatabindYaml` does not exist; it should be
`jacksonDataformatYaml`. I was a bit confused when I first saw the
mention, and I imagine others might be too. Also moved the entry in the
dependencies one row down, so the ordering is closer to alphabetical.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>