Description
Problem Description
Revisions with initialScale > 1 that are no longer referenced by any Route (i.e., routingState = "reserve") cannot scale down to 0, causing resource waste when old revisions are replaced by new ones.
Expected Behavior
When a revision's routingState becomes "reserve" (meaning it is no longer referenced by any Route), it should be able to scale down to 0 immediately, regardless of initialScale configuration. This is important because:
- When a new revision becomes ready and replaces the old one, the old revision's routingState changes to "reserve"
- These old revisions should free up resources quickly by scaling down to 0
- The initialScale constraint should not prevent this cleanup
Actual Behavior
Revisions with routingState = "reserve" and initialScale > 1 cannot scale down to 0 because:
- The initialScale logic in scaler.go forces min = initialScale even when the revision is no longer referenced by any Route
- This prevents the autoscaler from scaling down below initialScale, even though ScaleBounds() already returns min=0 for unreachable revisions (which includes routingState = "reserve")
Steps to Reproduce
1. Deploy a Knative Service with minScale=1 and initialScale=2:
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld
    spec:
      template:
        metadata:
          annotations:
            autoscaling.knative.dev/minScale: "1"
            autoscaling.knative.dev/initialScale: "2"
        spec:
          containers:
            - image: gcr.io/knative-samples/helloworld-go
2. Deploy a new revision with an invalid image (e.g., one whose pods end up in ImagePullBackOff):
    spec:
      template:
        spec:
          containers:
            - image: invalid-image:latest
3. The new revision will have 2 pods in ImagePullBackOff state and will be marked as Unreachable
4. Deploy a third revision with a valid image
5. Observed: The second revision (helloworld-00002, with ImagePullBackOff pods) remains at 2 pods and cannot scale down to 0, even though its routingState is "reserve" (it is no longer referenced by any Route):
    $ kubectl get po -n paas-uat
    NAME                                                  READY   STATUS             RESTARTS   AGE
    helloworld-nodejs-00002-deployment-564896c9fc-v7ntx   0/2     ImagePullBackOff   0          3h
    helloworld-nodejs-00002-deployment-564896c9fc-vsqrh   0/2     ImagePullBackOff   0          3h
    helloworld-nodejs-00003-deployment-847f88dbd8-6vfll   2/2     Running            0          168m
Root Cause Analysis
The issue is in serving/pkg/reconciler/autoscaling/kpa/scaler.go, in the scale() method:
    // Line 343-349
    if initialScale > 1 && !pa.Status.IsScaleTargetInitialized() {
        // Ignore initial scale if minScale >= initialScale.
        if min < initialScale {
            logger.Debugf("Adjusting min to meet the initial scale: %d -> %d", min, initialScale)
        }
        min = intMax(initialScale, min)
    }
This code forces min = initialScale regardless of whether the revision's routingState is "reserve". However:
- When a revision's routingState = "reserve", it means it's no longer referenced by any Route
- ScaleBounds() already returns min=0 for unreachable revisions (see pa_lifecycle.go:90), and routingState = "reserve" results in Reachability = Unreachable (see revision/resources/pa.go:77-78)
- The initialScale logic should respect the routingState = "reserve" condition and not override the min=0 value
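To make the interaction concrete, here is a minimal standalone sketch of the min adjustment described above. It is not the Knative code itself: effectiveMin and its parameters are hypothetical names, where boundsMin stands in for the min returned by ScaleBounds() and initialized stands in for pa.Status.IsScaleTargetInitialized().

```go
package main

import "fmt"

// effectiveMin is a simplified, hypothetical stand-in for the min adjustment
// in scale(). boundsMin plays the role of the min from ScaleBounds(), and
// initialized the role of pa.Status.IsScaleTargetInitialized().
func effectiveMin(boundsMin, initialScale int32, initialized bool) int32 {
	if initialScale > 1 && !initialized {
		// Same effect as min = intMax(initialScale, min).
		if initialScale > boundsMin {
			return initialScale
		}
	}
	return boundsMin
}

func main() {
	// Unreachable revision whose scale target never initialized:
	// the bounds say min=0, but initialScale=2 overrides it.
	fmt.Println(effectiveMin(0, 2, false)) // 2

	// Once the scale target is initialized, min=0 is respected.
	fmt.Println(effectiveMin(0, 2, true)) // 0
}
```

In the reproduction above, the pods of the second revision never become ready, so (assuming the scale target is therefore never marked initialized) the first case applies indefinitely.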
Relationship Between routingState and Reachability
According to serving/pkg/reconciler/revision/resources/pa.go:
- routingState = "active" → Reachability = Reachable
- routingState = "reserve" → Reachability = Unreachable
- routingState = "pending" or unset → Reachability = Unknown
So routingState = "reserve" is equivalent to Reachability = Unreachable for the purpose of determining whether a revision should be allowed to scale down.
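The mapping above can be sketched as a small standalone function. This only illustrates the three cases listed here: reachabilityFor and the plain string values are hypothetical and do not reproduce the types or any additional checks in pa.go.

```go
package main

import "fmt"

// reachabilityFor is a hypothetical helper that mirrors only the
// routingState -> Reachability mapping listed above, using plain strings
// instead of the Knative API types.
func reachabilityFor(routingState string) string {
	switch routingState {
	case "active":
		return "Reachable"
	case "reserve":
		return "Unreachable"
	default:
		// "pending" or unset.
		return "Unknown"
	}
}

func main() {
	for _, s := range []string{"active", "reserve", "pending", ""} {
		fmt.Printf("%q -> %s\n", s, reachabilityFor(s))
	}
}
```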
Proposed Solution
Modify the initialScale check to ignore initialScale when the revision's routingState = "reserve" (i.e., when pa.Spec.Reachability == ReachabilityUnreachable):
    if initialScale > 1 && !pa.Status.IsScaleTargetInitialized() && pa.Spec.Reachability != autoscalingv1alpha1.ReachabilityUnreachable {
        // Ignore initial scale if minScale >= initialScale.
        if min < initialScale {
            logger.Debugf("Adjusting min to meet the initial scale: %d -> %d", min, initialScale)
        }
        min = intMax(initialScale, min)
    }
This change ensures that:
- Revisions with routingState = "reserve" (no longer referenced by Route) can scale down to 0 immediately
- The initialScale logic only applies to revisions that are still referenced by Routes (routingState = "active")
- Resources are freed promptly when old revisions are replaced by new ones
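The intended effect of the extra condition can be checked with a small standalone sketch. Everything in it is hypothetical and simplified (the reachability type, its constants, and proposedMin are illustration-only names, not the Knative API); it only models how the added Unreachable check changes the resulting min.

```go
package main

import "fmt"

// reachability is a hypothetical stand-in for the PodAutoscaler
// reachability value; it is not the Knative type.
type reachability string

const (
	reachable   reachability = "Reachable"
	unreachable reachability = "Unreachable"
)

// proposedMin models the proposed condition: initialScale raises min only
// while the scale target is uninitialized AND the revision is not unreachable.
func proposedMin(boundsMin, initialScale int32, initialized bool, r reachability) int32 {
	if initialScale > 1 && !initialized && r != unreachable {
		if initialScale > boundsMin {
			return initialScale
		}
	}
	return boundsMin
}

func main() {
	// Unreachable revision (routingState = "reserve"): min stays at 0.
	fmt.Println(proposedMin(0, 2, false, unreachable)) // 0

	// Reachable revision that is still starting up: initialScale applies.
	fmt.Println(proposedMin(1, 2, false, reachable)) // 2
}
```

A table-driven unit test over the same four inputs (bounds min, initialScale, initialized, reachability) would be one way to cover this in the real scaler tests.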
Environment
- Knative Serving version: Knative v1.19.6
- Autoscaler configuration: initialScale=2, minScale=1