Agent
Prelude
Released on: 2025-12-10
- Please refer to the 7.73.0 tag on integrations-core for the list of changes on the Core Checks
Upgrade Notes
-
Replace batch processor with exporter helper for OTLP ingest due to the upcoming end-of-life for batch processor. More details: open-telemetry/opentelemetry-collector#8122
-
Remote Agent Management now creates a new directory that is at the same level as the current Agent configuration directory.
- Linux: /etc/datadog-agent-exp
- Windows: C:\ProgramData\Datadog-exp
This directory is used during remote configuration updates and is deleted after the update is complete.
New Features
-
Added a new core check to send raw Kubelet configuration manifests to the Kubernetes Orchestrator.
-
Added comprehensive support for AWS ECS Managed Instances, including automatic deployment mode detection, hostname resolution for sidecar deployments, and validation logic to prevent misconfigured deployments.
-
Configure filtering for collection of autodiscovered metrics and logs through CEL-based rules using cel_workload_exclude.
-
Collect container metrics for ECS Managed Instances when running in sidecar mode.
-
APM: A more efficient trace payload encoding through the /v1.0/traces endpoint has been added.
-
Update JMXFetch to 0.51.0 to add configuration-level dynamic tags for JMX attribute values via dynamic_tags
-
[Remote Agent Management](https://docs.datadoghq.com/agent/fleet_automation/remote_management) is now enabled by default for Agents running on Linux and Windows hosts. This feature allows you to remotely upgrade and configure the Agent from the Datadog UI in Fleet Automation.
To disable, set remote_updates to false in the Agent configuration file.
-
The Datadog Installer now supports installing the datadog-apm-inject package on Windows systems.
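The Remote Agent Management opt-out described above can be sketched as a single entry. This assumes the Agent configuration file is datadog.yaml and that remote_updates is a top-level key; the note names the option but not its position in the file.

```yaml
# Sketch of a datadog.yaml entry (file name and top-level placement are assumptions).
# Opts this host out of Remote Agent Management, which is now enabled by default.
remote_updates: false
```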
Enhancement Notes
-
Adds kubernetes_state.daemonset.rollout_duration metric to the KSM check.
-
Implement check filtering in the scheduler and CLI to enforce infrastructure basic mode restrictions. When running in basic mode (infrastructure_mode: "basic"), only core system checks (cpu, disk, memory, network, uptime, load, io, file_handle, ntp, system_core, telemetry) are allowed to execute. Additional checks can be allowlisted via the allowed_additional_integrations configuration option.
-
The Agent's embedded Python has been upgraded from 3.13.7 to 3.13.10
-
Network Path Collector (network traffic paths) now performs traceroutes using domain names instead of IP addresses.
-
This release refactors the ECS workloadmeta collector architecture to clearly separate ECS launch type (EC2 vs Fargate) from agent deployment mode (daemon vs sidecar). This improves code organization, reduces duplication, and helps future Managed Instances support.
-
KSM now supports using a wildcard to collect all resource labels/annotations as tags on metrics.
-
Added CCRID (Canonical Cloud Resource ID) support for Oracle Cloud Infrastructure hosts.
-
Add helpers for translating OTLP duration histograms to DDSketch in the pkg/opentelemetry-mapping-go/otlp/metrics package.
-
APM: The Trace Agent now omits infrequently used statistics when their values are zero, reducing overhead.
This can be overridden by setting the new configuration option apm_config.send_all_internal_stats to true.
-
Agents are now built with Go 1.24.8.
-
Agents are now built with Go 1.24.9.
-
The Cluster Agent now enables both DD_CLUSTER_CHECKS_ADVANCED_DISPATCHING_ENABLED and DD_CLUSTER_CHECKS_REBALANCE_WITH_UTILIZATION by default. These options are now set to true in both the configuration template and the code, improving cluster check dispatching and balancing based on node utilization out of the box. To disable these features, explicitly set them to false with the following config options:
    - name: DD_CLUSTER_CHECKS_ADVANCED_DISPATCHING_ENABLED
      value: "false"
    - name: DD_CLUSTER_CHECKS_REBALANCE_WITH_UTILIZATION
      value: "false"
-
Enable the Go disk and network core checks by default for Windows and Linux. These are direct ports of the existing Python disk and network checks and allow the Python runtime to be lazy loaded when other integrations are enabled. To use the Python versions instead, disable the Go checks by setting use_diskv2_check and use_networkv2_check respectively, along with the loader, in your configuration.
-
The Python runtime is now lazy loaded when there are no Python integrations configured. This can be disabled by setting python_lazy_loading: false in your configuration.
-
Allow check configurations to be matched to services using CEL selectors in Autodiscovery. This allows for more granular targeting of configurations to services based on their metadata.
-
Adds the count of total GPU devices to the telemetry metrics emitted to Datadog.
-
GPU: emit count metrics for NVIDIA Xid errors
-
Adds the DD_INFRASTRUCTURE_MODE install option to the datadog-installer-x86_64.exe installer and the Windows MSI installer. Set DD_INFRASTRUCTURE_MODE to configure the infrastructure_mode configuration option at installation.
-
The infraattributes processor can now be run when the Datadog Exporter is not configured.
-
Add --enable and --disable commands to the IIS .NET APM instrumentation management script on Windows
-
Windows: Adds a PURGE argument to the MSI to remove all OCI packages during uninstallation.
-
The Workload Protection's activity dump functionality on Linux has been improved to reduce its impact on processes that use very large amounts of memory.
-
Cache result of TagsToString() in serverless-init to improve CPU performance.
-
The DDOT service runs as ddagentuser.
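Several of the options named in the notes above are set in the Agent configuration file. The following is a minimal sketch, assuming that file is datadog.yaml and that allowed_additional_integrations takes a list of check names (its type is not stated in these notes). The entry example_check is a placeholder, not a real integration name.

```yaml
# Sketch of datadog.yaml entries for options named in these notes.
# Key names come from the notes; value shapes marked "assumed" are not confirmed here.

# Restrict the Agent to core system checks.
infrastructure_mode: "basic"

# Assumed to be a list of check names; "example_check" is a placeholder.
allowed_additional_integrations:
  - example_check

# Opt out of lazy loading of the Python runtime.
python_lazy_loading: false

apm_config:
  # Send infrequently used statistics even when their values are zero.
  send_all_internal_stats: true
```

These entries are independent of one another. Set only the ones that apply to your deployment.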
Bug Fixes
- Applies a fix to the hacky_dev_image_build script to copy new check configurations.
- [DBM] Bump go-sqllexer to v0.1.9 to fix the following bugs:
- Fixes a nil pointer when normalizing CTE queries with collectTables=false.
- Fixes normalizing MySQL UNION ALL statements.
- Fixes obfuscating MySQL double quoted string literals.
- Lock down dynamic symbol exports in the Linux Agent binary to prevent unexpected symbol conflicts.
- Changed the log level of the "Too many errors for endpoint '*': retrying later" log message from ERROR to WARN. This message is emitted when the forwarder temporarily suppresses sending transactions to an endpoint that has recently failed, in order to avoid flooding it whilst in an error state.
- Fix an issue preventing the Agent from starting on kernels older than 4.13 because of AmbientCapabilities.
- Fix duplicated logs in Azure App Services after application restart.
- Prevent the file launcher scan from blocking autodiscovery by moving the scan function to a goroutine.
- Make the context expire after a configurable timeout when selecting the log source type. Previously, an infinite context masked errors caused by missing runtime sockets. With this change, an expired context is eventually reported as a log source error in the logs section of the Agent status output. The timeout can be changed by setting logs_config.container_runtime_waiting_timeout, in seconds, in the Agent configuration file.
- Fix Podman log collection without Docker socket being mapped in the container.
- Fix cloudRunPrefix from 'gpc.run' to 'gcp.run'.
- Fixes an issue where DogStatsD replay did not enrich metrics with the tag state found in the capture file. Replayed metrics are now enriched using the expected tag state.
- The backoff behavior for the default forwarder was fixed to work properly now that a worker sends multiple transactions concurrently.
- Fixes live process and containers for Agents running as a sidecar in Amazon ECS Managed Instances.
- APM: Fix issue where errors on the debugger or symdb reverse proxy could trigger a panic.
- gpu: the workloadmeta collector will no longer send multiple warn logs if the driver is not loaded
- All internally rebuilt x86_64 dependencies now uniformly target the documented macOS 11.0 minimal ABI. Previously, some still targeted macOS 10.12 or 10.13, even though support for 10.x was dropped in Agent 7.62.0 and numerous x86_64 dependencies were already targeting newer ABI versions.
- Windows: windows_certificate now populates the certificate_thumbprint tag when certificate_subjects filters are used. Previously, the tag was empty, making it impossible to uniquely scope monitors in environments with duplicate subjects.
- Fix issue introduced in 7.70.0 that caused the Windows Event Log check and tailer to fail to load with the error "EvtNext failed: This operation returned because the timeout period expired".
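The log source timeout mentioned in the fixes above is set under logs_config. The following is a minimal sketch, assuming the Agent configuration file is datadog.yaml. The value 10 is purely illustrative, since these notes do not state the default.

```yaml
# Sketch of a datadog.yaml entry (file name is an assumption).
logs_config:
  # Illustrative value only, in seconds; not a recommended or default setting.
  container_runtime_waiting_timeout: 10
```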
Other Notes
- libarchive and its tools are no longer bundled with the Agent.
- Adds origin tag to APM traces for agents running as a sidecar in AWS ECS Managed Instances.
- During Windows MSI uninstallation, OCI packages are now uninstalled by default. To retain OCI packages, set KEEP_INSTALLED_PACKAGES=1 when running the MSI uninstall.
- libxcrypt is no longer bundled with the Agent.
- Only a minimal set of RPM libraries consisting of librpmio & librpm is now bundled with the agent. The command line tools aren't bundled anymore.
- libmagic and the magic database aren't bundled with the agent anymore.
- libdb tools and libraries aren't bundled with the agent anymore
- elfutils tools and libraries aren't bundled with the agent anymore
- linux: Update libdbus to 1.16.2
- libexpat is no longer bundled with the Agent.
Datadog Cluster Agent
Prelude
Released on: 2025-12-10
Pinned to datadog-agent v7.73.0: CHANGELOG.
Upgrade Notes
-
This change removes support for v1 of the auto-instrumentation webhook used for Single Step Instrumentation. The v2 implementation, which has been the default since Agent v7.57.0, is a drop-in replacement. This setting was never exposed in Helm or the Datadog Operator. If you previously set the DD_APM_INSTRUMENTATION_VERSION environment variable on the Cluster Agent, it is now ignored.
If you use a private registry, add the apm-inject container to your registry before upgrading. No action is required for other users. For details on using private registries, see [Use a private container registry](https://docs.datadoghq.com/tracing/trace_collection/automatic_instrumentation/single-step-apm/kubernetes/?tab=agentv764recommended#use-a-private-container-registry).
New Features
-
Customers using Single Step Instrumentation with target-based workload selection can now use language detection. Language detection greatly reduces startup time when all default libraries are configured for a target.
A target is eligible for language detection if it has no defined ddTraceVersions or if its ddTraceVersions matches the default set of SDKs. Once a language has been determined for a deployment, subsequent deploys only use the SDKs necessary for the detected language.
Enhancement Notes
- Added namespace selectors excluding system namespace (kube-system and the Datadog Agent's namespace) resources from Admission Controller mutation webhooks. This prevents mutation webhooks from unnecessarily intercepting system namespace resources, reducing misleading warnings or logs, and improving clarity about which resources are actually mutated.
Bug Fixes
- The Cluster Agent Admission Controller now logs a warning instead of failing the webhook when the Admission Controller lacks permissions to access a pod’s owner.
- Fix default value of automountServiceAccountToken on ServiceAccounts when not set.
- Fixes several bugs affecting customers who use Single Step Instrumentation with target-based workload selection together with local SDK injection. Previously, when targets were defined, the Cluster Agent didn't respect the admission.datadoghq.com/enabled annotation or the admission_controller.mutate_unlabelled configuration option, and only respected the language annotations.