A Kubernetes controller that provides fine-grained, declarative readiness for nodes. It ensures nodes only accept workloads when all required components eg: network agents, GPU drivers, storage drivers or custom health-checks, are fully ready on the node.
This project implements the proposed NodeReadinessGates API (KEP 5416) as an out-of-band solution and brings it to any Kubernetes cluster.
Use it to orchestrate complex bootstrap steps in your node-init workflow, enforce node health and improve workload reliability.
The Node Readiness Controller extends Kubernetes' node readiness model by allowing you to define additional pre-requisites for nodes (as readiness rules) in node conditions. It automatically manages node taints to prevent scheduling until all specified conditions are satisfied.
Kubernetes node has a simple "Ready" condition. Modern workloads need more crtical infrastructure dependencies before they can run.
With this controller you can:
- Define custom readiness for your workload
- Automatically taint and untaint nodes
- Support continuous readiness enforcement to block scheduling for fuse break scenarios
- Integrate with existing problem-detectors like NPD or any custom daemons / node plugins for reporting readiness
- Multi-condition Rules: Define rules that require ALL specified conditions to be satisfied
- Flexible Enforcement: Support for bootstrap-only and continuous enforcement modes
- Conflict Prevention: Validation webhook prevents conflicting taint configurations
- Dry Run Mode: Preview rule impact before applying changes
- Comprehensive Status: Detailed observability into rule evaluation and node readiness status
- Node Targeting: Use label selectors to target specific node types
- Bootstrap Completion Tracking: Prevents re-evaluation once bootstrap conditions are met
Node Readiness Controller in Kind cluster
Example Rule
apiVersion: readiness.node.x-k8s.io/v1alpha1
kind: NodeReadinessRule
metadata:
name: network-readiness-rule
spec:
conditions:
- type: "tigera.io/CalicoReady"
requiredStatus: "True"
taint:
key: "readiness.k8s.io/NetworkReady"
effect: "NoSchedule"
value: "pending"
enforcementMode: "bootstrap-only"
nodeSelector:
matchLabels:
node-role.kubernetes.io/worker: ""https://drive.google.com/file/d/1Q2vCU7FYUrEkHDeQV5NneCH_MNoFshvH/view?usp=sharing
- Add documentation capturing design details
- Metrics and alerting integration
- Validation Webhook for rules
- Improve logging and add debugging pointers
- Performance optimizations for large clusters
- Scale testing 1000+ nodes
- Release v0.1.0
If you're interested in participating in future discussions or development related to Node Readiness Controller, you can reach the maintainers of the project at:
- Slack: #sig-node-readiness-controller. (visit slack.k8s.io for a workspace invitation)
Open Issues / PRs / Discussions here:
- Issues: GitHub Issues
- Discussions: GitHub Discussions
See the Kubernetes community on the community page. You can also engage with SIG Node at #sig-node and mailing List
Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.