Commit 954bbbc

Decouple Startup CPU Boost from VPA modes
1 parent fabcbe5 commit 954bbbc

1 file changed, +133 -56 lines changed
vertical-pod-autoscaler/enhancements/7862-cpu-startup-boost/README.md

Lines changed: 133 additions & 56 deletions
Original file line number | Diff line number | Diff line change
@@ -38,10 +38,6 @@ the pod startup and to scale the CPU resources back down when the pod is
3838
`Ready` or after certain time has elapsed, leveraging the
3939
[in-place pod resize Kubernetes feature](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources).
4040

41-
> [!NOTE]
42-
> This feature depends on the new `InPlaceOrRecreate` VPA mode:
43-
> [AEP-4016: Support for in place updates in VPA](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/enhancements/4016-in-place-updates-support/README.md)
44-
4541
### Goals
4642

4743
* Allow VPA to boost the CPU request and limit of a pod's containers during the
@@ -61,17 +57,14 @@ time.
6157

6258
## Proposal
6359

64-
* To extend [`ContainerResourcePolicy`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L191)
60+
* To extend [`VerticalPodAutoscalerSpec`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L75)
6561
with a new `StartupBoost` field to allow users to configure the CPU startup
6662
boost.
6763

68-
* To extend [`ContainerScalingMode`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L231-L236)
69-
with a new `StartupBoostOnly` mode to allow users to only enable the startup
70-
boost feature and not vanilla VPA altogether.
64+
* To extend [`ContainerResourcePolicy`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L191)
65+
with a new `StartupBoost` field to allow users to optionally customize the startup boost behavior for individual containers.
7166

72-
* To allow CPU startup boost if a `StartupBoost` config is specified in `Auto`
73-
[`ContainerScalingMode`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L231-L236)
74-
container policies.
67+
* To enable only startup boost (if the `StartupBoost` config is present in the VPA object) and not regular VPA altogether.
7568

7669
## Design Details
7770

@@ -95,8 +88,15 @@ down the CPU resources to the appropriate non-boosted value:
9588

9689
### API Changes
9790

98-
The new `StartupBoost` parameter will be added to the [`ContainerResourcePolicy`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L191)
99-
and contain the following fields:
91+
The new `StartupBoost` parameter will be added to both:
92+
* [`VerticalPodAutoscalerSpec`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L75):
93+
Will allow users to specify the default CPU startup boost for all containers of the pod targeted by the VPA object.
94+
* [`ContainerResourcePolicy`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L191):
95+
Will allow users to optionally customize the startup boost behavior for individual containers.
96+
97+
`StartupBoost` will contain the following fields:
98+
* [Optional] `StartupBoost.CPU.Mode`: whether CPU boost is enabled (`"Auto"`)
99+
or not (`"Off"`). If not specified, it defaults to `"Auto"`.
100100
* `StartupBoost.CPU.Factor`: the factor by which to multiply the initial
101101
resource request and limit of the containers targeted by the VPA object.
102102
* `StartupBoost.CPU.Value`: the target value of the CPU request or limit
@@ -121,22 +121,15 @@ and contain the following fields:
121121
> section for more details on this feature's behavior for different combinations
122122
> of probers + `StartupBoost.CPU.Duration`.
123123
124-
We will also add a new mode to the [`ContainerScalingMode`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L231-L236):
125-
* **NEW**: `StartupBoostOnly`: new mode that will allow users to only enable
126-
the startup boost feature for a container and not vanilla VPA altogether.
127-
* **NEW**: `Auto`: we will modify the existing `Auto` mode to enable both
128-
vanilla VPA and CPU Startup Boost (when `StartupBoost` parameter is
129-
specified).
130-
131124
#### Priority of `StartupBoost`
132125

133-
The new `StartupBoost` field will take precedence over the rest of the container
134-
resource policy configurations. Functioning independently from all other fields
135-
in [`ContainerResourcePolicy`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L191),
126+
The new `StartupBoost` field will take precedence over the rest of the fields
127+
in [`VerticalPodAutoscalerSpec`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L75)
128+
and [`ContainerResourcePolicy`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L191),
136129
**except for**:
137-
* [`ContainerName`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L192-L195)
138-
* [`Mode`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L196-L198)
139-
* [`ControlledValues`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L214-L217)
130+
* [`VerticalPodAutoscalerSpec.TargetRef`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L88)
131+
* [`ContainerResourcePolicy.ContainerName`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L192-L195)
132+
* [`ContainerResourcePolicy.ControlledValues`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L214-L217)
140133

141134
This means that a container's CPU request/limit can be boosted during startup
142135
beyond [`MaxAllowed`](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.3.0/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L203-L206),
@@ -149,12 +142,11 @@ excluded from [`ControlledResources`](https://github.com/kubernetes/autoscaler/b
149142

150143
* We will check that the `startupBoost` configuration is valid when VPA objects
151144
are created/updated:
152-
* The VPA autoscaling mode must be `InPlaceOrRecreate` (since it does not
153-
make sense to use this feature with disruptive modes of VPA).
154145
* The boost factor is >= 1 (via CRD validation rules)
155146
* Only one of `StartupBoost.CPU.Factor` or `StartupBoost.CPU.Value` is
156147
specified
157-
* The [feature enablement](#feature-enablement) flags must be on.
148+
* The [feature enablement](#feature-enablement-and-rollback) flags must be
149+
on.
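
The static checks listed above can be sketched in Go as follows. This is only an illustration under stated assumptions: `CPUStartupBoost`, its field types, and `validateCPUStartupBoost` are hypothetical names, not the VPA API. The proposal states that the factor bound is enforced via CRD validation rules, so the in-code check here merely mirrors that rule. Whether a config with neither `Factor` nor `Value` is valid is not stated in this diff, so the sketch accepts it.

```go
package main

import (
	"errors"
	"fmt"
)

// CPUStartupBoost is a hypothetical stand-in for the proposed
// StartupBoost.CPU fields. Names and value types are assumptions.
type CPUStartupBoost struct {
	Factor *float64 // multiplier for the initial CPU request/limit
	Value  *float64 // absolute CPU target (unit assumed to be cores)
}

// validateCPUStartupBoost mirrors the static validation bullets:
// the feature gate must be on, at most one of Factor/Value may be
// set, and a factor, when present, must be >= 1.
func validateCPUStartupBoost(b CPUStartupBoost, gateEnabled bool) error {
	if !gateEnabled {
		return errors.New("startupBoost requires the CPUStartupBoost feature gate")
	}
	if b.Factor != nil && b.Value != nil {
		return errors.New("only one of factor or value may be specified")
	}
	if b.Factor != nil && *b.Factor < 1 {
		return fmt.Errorf("factor must be >= 1, got %v", *b.Factor)
	}
	return nil
}

func main() {
	half := 0.5
	// A factor below 1 is rejected by the sketch.
	fmt.Println(validateCPUStartupBoost(CPUStartupBoost{Factor: &half}, true))
}
```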
158150

159151

160152
#### Dynamic Validation
@@ -166,7 +158,7 @@ are created/updated:
166158

167159
The VPA Updater **will not** evict a pod if it attempted to scale the pod down
168160
in place (to unboost its CPU resources) and the update failed (see the
169-
[scenarios](https://github.com/kubernetes/autoscaler/blob/0a34bf5d3a71b486bdaa440f1af7f8d50dc8e391/vertical-pod-autoscaler/enhancements/4016-in-place-updates-support/README.md?plain=1#L164-L169 ) where the VPA
161+
[scenarios](https://github.com/kubernetes/autoscaler/blob/0a34bf5d3a71b486bdaa440f1af7f8d50dc8e391/vertical-pod-autoscaler/enhancements/4016-in-place-updates-support/README.md?plain=1#L164-L169) where the VPA
170162
updater will consider that the update failed). This is to avoid an eviction
171163
loop:
172164

@@ -179,37 +171,33 @@ the pod in-place and it fails.
179171

180172
#### How can this feature be enabled / disabled in a live cluster?
181173

182-
* Feature gates names: `CPUStartupBoost` and `InPlaceOrRecreate` (from
183-
[AEP-4016](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/enhancements/4016-in-place-updates-support/README.md#feature-enablement-and-rollback))
174+
* Feature gate name: `CPUStartupBoost`
184175
* Components depending on the feature gates:
185176
* admission-controller
186177
* updater
187178

188-
Enabling of feature gates `CPUStartupBoost` AND `InPlaceOrRecreate` will cause
189-
the following to happen:
179+
Enabling the `CPUStartupBoost` feature gate will cause the following to happen:
190180
* admission-controller to **accept** new VPA objects being created with
191-
`StartupBoostOnly` configured.
181+
`StartupBoost` configured.
192182
* admission-controller to **boost** CPU resources.
193183
* updater to **unboost** the CPU resources.
194184

195-
Disabling of feature gates `CPUStartupBoost` OR `InPlaceOrRecreate` will cause
196-
the following to happen:
185+
Disabling the `CPUStartupBoost` feature gate will cause the following to happen:
197186
* admission-controller to **reject** new VPA objects being created with
198-
`StartupBoostOnly` configured.
187+
`StartupBoost` configured.
199188
* A descriptive error message should be returned to the user letting them
200189
know that they are using a feature gated feature.
201190
* admission-controller **to not** boost CPU resources, should it encounter a
202-
VPA configured with a `StartupBoost` config and `StartupBoostOnly` or `Auto`
203-
`ContainerScalingMode`.
191+
VPA configured with a `StartupBoost` config.
204192
* updater **to not** unboost CPU resources when pods meet the scale down
205193
requirements, should it encounter a VPA configured with a `StartupBoost`
206-
config and `StartupBoostOnly` or `Auto` `ContainerScalingMode`.
194+
config.
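
The enable/disable behavior described above might look roughly like the following Go sketch. The `VPA` struct, `admitVPA`, and `shouldBoost` are hypothetical names introduced for illustration only; the real admission-controller and updater code paths are not shown in this diff.

```go
package main

import "fmt"

// VPA is a hypothetical, heavily reduced view of a VPA object.
type VPA struct {
	Name            string
	HasStartupBoost bool // true when a startupBoost config is present
}

// admitVPA rejects new VPA objects that configure startupBoost while
// the gate is off, returning a descriptive error for the user.
func admitVPA(vpa VPA, gateEnabled bool) error {
	if vpa.HasStartupBoost && !gateEnabled {
		return fmt.Errorf(
			"VPA %q configures startupBoost, which requires the CPUStartupBoost feature gate",
			vpa.Name)
	}
	return nil
}

// shouldBoost reports whether CPU resources are boosted (and later
// unboosted) for pods matched by an already existing VPA object.
func shouldBoost(vpa VPA, gateEnabled bool) bool {
	return gateEnabled && vpa.HasStartupBoost
}

func main() {
	vpa := VPA{Name: "example-vpa", HasStartupBoost: true}
	fmt.Println(admitVPA(vpa, false))
	fmt.Println(shouldBoost(vpa, true))
}
```

Note that the second function covers the case where a VPA with a `StartupBoost` config already exists when the gate is turned off: the object is left alone, but no boosting or unboosting takes place.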
207195

208196
### Kubernetes Version Compatibility
209197

210198
Similarly to [AEP-4016](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4016-in-place-updates-support#kubernetes-version-compatibility),
211-
`StartupBoost` configuration and `StartupBoostOnly` mode are built assuming that
212-
VPA will be running on a Kubernetes 1.33+ with the beta version of
199+
`StartupBoost` configuration is built assuming that VPA will be running on a
200+
Kubernetes 1.33+ with the beta version of
213201
[KEP-1287: In-Place Update of Pod Resources](https://github.com/kubernetes/enhancements/issues/1287)
214202
enabled. If this is not the case, VPA's attempt to unboost pods may fail and the
215203
pods may remain boosted for their whole lifecycle.
@@ -242,11 +230,47 @@ down.
242230
Here are some examples of the VPA CR incorporating CPU boosting for different
243231
scenarios.
244232

245-
### CPU Boost Only
233+
### Per-pod configurations (`startupBoost` configured in `VerticalPodAutoscalerSpec`)
246234

247-
All containers under `example` deployment will receive "regular" VPA updates,
248-
**except for** `boosted-container-name`. `boosted-container-name` will only be
249-
CPU boosted/unboosted, because it has a `StartupBoostOnly` container policy.
235+
#### Startup CPU Boost Enabled & VPA Disabled
236+
237+
```yaml
238+
apiVersion: "autoscaling.k8s.io/v1"
239+
kind: VerticalPodAutoscaler
240+
metadata:
241+
name: example-vpa
242+
spec:
243+
targetRef:
244+
apiVersion: "apps/v1"
245+
kind: Deployment
246+
name: example
247+
updatePolicy:
248+
# This only disables VPA actuations. It doesn't disable
249+
# startup boost configurations.
250+
updateMode: "Off"
251+
startupBoost:
252+
cpu:
253+
value: 3.0
254+
duration: 10s
255+
```
256+
257+
#### Startup CPU Boost Disabled & VPA Enabled
258+
259+
```yaml
260+
apiVersion: "autoscaling.k8s.io/v1"
261+
kind: VerticalPodAutoscaler
262+
metadata:
263+
name: example-vpa
264+
spec:
265+
targetRef:
266+
apiVersion: "apps/v1"
267+
kind: Deployment
268+
name: example
269+
updatePolicy:
270+
updateMode: "Auto"
271+
```
272+
273+
#### Startup CPU Boost Enabled & VPA Enabled
250274
251275
```yaml
252276
apiVersion: "autoscaling.k8s.io/v1"
@@ -259,23 +283,77 @@ spec:
259283
kind: Deployment
260284
name: example
261285
updatePolicy:
262-
# VPA Update mode must be InPlaceOrRecreate
263-
updateMode: "InPlaceOrRecreate"
286+
updateMode: "Auto"
287+
startupBoost:
288+
cpu:
289+
value: 3.0
290+
duration: 10s
291+
```
292+
293+
### Per-container configurations (`startupBoost` configured in `ContainerPolicies`)
294+
295+
#### Startup CPU Boost Enabled & VPA Disabled
296+
297+
All containers under `example` deployment will receive "regular" VPA updates
298+
(VPA is in `"Auto"` mode in this example), **except for**
299+
`boosted-container-name`. `boosted-container-name` will only be CPU
300+
boosted/unboosted (`StartupBoost` is enabled and VPA `Mode` is set to `Off`).
301+
302+
```yaml
303+
apiVersion: "autoscaling.k8s.io/v1"
304+
kind: VerticalPodAutoscaler
305+
metadata:
306+
name: example-vpa
307+
spec:
308+
targetRef:
309+
apiVersion: "apps/v1"
310+
kind: Deployment
311+
name: example
264312
resourcePolicy:
265313
containerPolicies:
266314
- containerName: "boosted-container-name"
267-
mode: "StartupBoostOnly"
315+
# VPA mode is set to Off, so it never changes pod resources for this
316+
# container. This setting is independent from the startup boost mode.
317+
# CPU startup boost changes will still be applied (mode Auto).
318+
mode: "Off"
268319
startupBoost:
320+
mode: "Auto"
269321
cpu:
270322
factor: 2.0
271323
```
272324

273-
### CPU Boost and Vanilla VPA
325+
#### Startup CPU Boost Disabled & VPA Enabled
326+
327+
All containers under `example` deployment will receive "regular" VPA updates
328+
and be CPU boosted/unboosted, except for `disable-cpu-boost-for-this-container`.
329+
It has a `containerPolicy` `startupBoost` overriding the global VPA config.
330+
331+
```yaml
332+
apiVersion: "autoscaling.k8s.io/v1"
333+
kind: VerticalPodAutoscaler
334+
metadata:
335+
name: example-vpa
336+
spec:
337+
targetRef:
338+
apiVersion: "apps/v1"
339+
kind: Deployment
340+
name: example
341+
startupBoost:
342+
cpu:
343+
factor: 2.0
344+
resourcePolicy:
345+
containerPolicies:
346+
- containerName: "disable-cpu-boost-for-this-container"
347+
startupBoost:
348+
mode: "Off"
349+
```
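
One way to read the override in the example above is sketched below in Go. It assumes that a per-container `startupBoost` replaces the spec-level config as a whole and that an unset mode means `"Auto"`; the diff does not specify field-by-field merging, and `StartupBoost` and `effectiveBoost` are hypothetical names rather than the actual VPA types.

```go
package main

import "fmt"

// StartupBoost is a hypothetical, reduced form of the proposed config.
type StartupBoost struct {
	Mode   string  // "Auto" or "Off"; empty is treated as "Auto"
	Factor float64 // CPU multiplier; unused when Mode is "Off"
}

// effectiveBoost picks the config that applies to one container.
// A per-container entry wins over the spec-level default, and a
// nil result means the container is not boosted.
func effectiveBoost(specLevel *StartupBoost, perContainer map[string]*StartupBoost, container string) *StartupBoost {
	chosen := specLevel
	if override, ok := perContainer[container]; ok && override != nil {
		chosen = override
	}
	if chosen == nil || chosen.Mode == "Off" {
		return nil
	}
	return chosen
}

func main() {
	spec := &StartupBoost{Factor: 2.0}
	overrides := map[string]*StartupBoost{
		"disable-cpu-boost-for-this-container": {Mode: "Off"},
	}
	for _, name := range []string{"disable-cpu-boost-for-this-container", "some-other-container"} {
		fmt.Println(name, effectiveBoost(spec, overrides, name) != nil)
	}
}
```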
350+
351+
#### Startup CPU Boost Enabled & VPA Enabled
274352

275353
All containers under `example` deployment will receive "regular" VPA updates,
276354
**including** `boosted-container-name`. Additionally, `boosted-container-name`
277355
will be CPU boosted/unboosted, because it has a `StartupBoost` config in its
278-
container policy and `Auto` container policy mode.
356+
container policy.
279357

280358
```yaml
281359
apiVersion: "autoscaling.k8s.io/v1"
@@ -287,13 +365,9 @@ spec:
287365
apiVersion: "apps/v1"
288366
kind: Deployment
289367
name: example
290-
updatePolicy:
291-
# VPA Update mode must be InPlaceOrRecreate
292-
updateMode: "InPlaceOrRecreate"
293368
resourcePolicy:
294369
containerPolicies:
295370
- containerName: "boosted-container-name"
296-
mode: "Auto" # Vanilla VPA mode + Startup Boost
297371
minAllowed:
298372
cpu: "250m"
299373
memory: "100Mi"
@@ -308,5 +382,8 @@ spec:
308382

309383
## Implementation History
310384

385+
* 2025-05-27: Decouple Startup CPU Boost from InPlaceOrRecreate mode, allow
386+
users to specify a `startupBoost` config in `VerticalPodAutoscalerSpec` and in
387+
`ContainerPolicies` to make the API simpler and add more yaml examples.
311388
* 2025-03-20: Initial version.
312389
