
chris-sanders
Member

Addresses 13 of 14 high-severity dependabot alerts including unsafe deserialization, path traversal, and privilege escalation vulnerabilities.

The one remaining alert does not appear to have a fix available upstream.

…abilities

Addresses 13 of 14 high-severity dependabot alerts including unsafe deserialization,
path traversal, and privilege escalation vulnerabilities.
- Upgrade scikit-learn to >=1.5.0 to fix sensitive data leakage vulnerability
- Change pinning strategy from exact/upper-bound to minimum versions
- This allows automatic security updates while documenting tested versions
- Maintains compatibility bounds for numpy (<2.0.0) and pandas (<3.0.0)
- Fix infra-crd-check image: use specific kubectl version (1.31.2) instead of :latest
- Fix postgres-backup deployment: make image configurable and update to postgresql 16.6.0
- Previous hardcoded images (kubectl:latest, postgresql:15.3.0-debian-11-r0) were causing ImagePullBackOff errors
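
To illustrate the pinning change, here is a minimal sketch of which versions a minimum pin with an optional upper bound accepts. The helper names `parse` and `allowed` are hypothetical, and the tuple comparison is a toy stand-in for pip's real version handling (it only understands plain dotted numeric versions).

```python
from typing import Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '1.5.0' into (1, 5, 0)."""
    return tuple(int(part) for part in version.split("."))


def allowed(version: str,
            minimum: Optional[str] = None,
            below: Optional[str] = None) -> bool:
    """Return True if minimum <= version < below; either bound may be omitted."""
    parsed = parse(version)
    if minimum is not None and parsed < parse(minimum):
        return False
    if below is not None and parsed >= parse(below):
        return False
    return True


# scikit-learn>=1.5.0: newer releases are accepted, older ones are not.
print(allowed("1.6.1", minimum="1.5.0"))
print(allowed("1.4.2", minimum="1.5.0"))
# numpy<2.0.0: the compatibility bound still excludes the next major version.
print(allowed("2.0.0", below="2.0.0"))
```

With a minimum pin, a patched release published later is picked up without editing the requirements file, while the upper bounds keep numpy 2.x and pandas 3.x out.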

These changes resolve CI test failures in both kots-install-test and helm-install-test.
- Update infra chart to use bitnamilegacy/kubectl:1.33.4-debian-12-r0
- Update postgres-backup to use bitnamilegacy/postgresql:17.6.0-debian-12-r4
- Bump infra chart version 0.2.0 -> 0.2.1
- Bump mlflow chart version 0.4.0 -> 0.4.1

Bitnami moved their container images to the bitnamilegacy registry as of August 2025.
Chart version bumps are required for CI to pick up the new image configurations.
- Update infra HelmChart chartVersion 0.2.0 -> 0.2.1
- Update mlflow HelmChart chartVersion 0.4.0 -> 0.4.1
- Update KOTS manifests to use bitnamilegacy registry for kubectl and mlflow images

This fixes the version consistency check that compares Chart.yaml with HelmChart resources.
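
For reference, a check of this kind could look roughly like the sketch below. This is not the repository's actual CI script: the function names and the sample file contents are assumptions, and a regex is used instead of a YAML parser only to keep the example dependency-free.

```python
import re


def chart_version(chart_yaml: str) -> str:
    """Return the top-level `version:` from Chart.yaml text (not appVersion)."""
    match = re.search(r"^version:\s*[\"']?([^\s\"']+)", chart_yaml, re.MULTILINE)
    if match is None:
        raise ValueError("no top-level version field found")
    return match.group(1)


def helmchart_version(manifest: str) -> str:
    """Return `chartVersion:` from a KOTS HelmChart manifest."""
    match = re.search(r"^\s*chartVersion:\s*[\"']?([^\s\"']+)", manifest, re.MULTILINE)
    if match is None:
        raise ValueError("no chartVersion field found")
    return match.group(1)


def versions_match(chart_yaml: str, manifest: str) -> bool:
    """True when the chart and the HelmChart resource agree on the version."""
    return chart_version(chart_yaml) == helmchart_version(manifest)


# Illustrative inputs only; the appVersion value is a placeholder.
CHART = """apiVersion: v2
name: mlflow
version: 0.4.1
appVersion: "0.0.0"
"""

STALE_MANIFEST = """apiVersion: kots.io/v1beta2
kind: HelmChart
spec:
  chart:
    name: mlflow
    chartVersion: 0.4.0
"""

print(versions_match(CHART, STALE_MANIFEST))  # the stale manifest fails the check
```

Bumping `version` in Chart.yaml without also updating `chartVersion` in the HelmChart resource is the kind of mismatch such a check would flag.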
- Fix hardcoded postgresql:15.3.0-debian-11-r0 image in deployment.yaml init container
- Use postgres.backup.image configuration instead for consistency
- Bump mlflow chart version 0.4.1 -> 0.4.2

This fixes remaining ImagePullBackOff errors in helm-install-test.
- Update mlflow image from bitnami to bitnamilegacy
- Update mlflow image tag from 2.12.2-debian-12-r1 to 3.3.2-debian-12-r0
- Bump mlflow chart version 0.4.2 -> 0.4.3
- Update appVersion to 3.3.2 to match image version

This fixes ImagePullBackOff errors for the main mlflow container.
- Remove --dev flag from extraArgs as it cannot be used with --app-name in MLflow 3.x+
- Bump mlflow chart version 0.4.3 -> 0.4.4

This fixes the "Error: '--dev' cannot be used with '--app-name'" error in MLflow 3.3.2.
- Downgrade from mlflow 3.3.2 to 2.22.1-debian-12-r1 for configuration compatibility
- Restore --dev flag which works in MLflow 2.x
- Bump mlflow chart version 0.4.4 -> 0.4.5
- Update appVersion to 2.22.1

MLflow 3.x has breaking changes that prevent the server from starting with the current configuration.
Using the latest 2.x version from bitnamilegacy provides security updates while maintaining compatibility.
- Upgrade mlflow image from 2.22.1 to 3.3.2-debian-12-r0
- Add MLFLOW_FLASK_SERVER_SECRET_KEY environment variable (required for MLflow 3.x with basic auth)
- Remove --dev flag (incompatible with --app-name in MLflow 3.x)
- Bump mlflow chart version 0.4.5 -> 0.4.7
- Update appVersion to 3.3.2

Tested successfully on local k3s cluster with mlflow server running without errors.
This resolves the server startup failures in CI.
MLflow 3.x with basic auth requires the MLFLOW_FLASK_SERVER_SECRET_KEY
environment variable. The CI test values file did not set it, so the
server failed to start during CI tests even though it worked locally
with the default values.
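
As a rough sketch of supplying this variable, a random key can be generated with Python's `secrets` module and added to the server's environment. The helper names below are hypothetical, and how the value reaches the container (values file, secret, and so on) is not shown here.

```python
import secrets
from typing import Dict


def make_flask_secret_key(num_bytes: int = 32) -> str:
    """Generate a random hex string suitable for use as a secret key."""
    return secrets.token_hex(num_bytes)


def server_env(base_env: Dict[str, str], secret_key: str) -> Dict[str, str]:
    """Return a copy of base_env with the MLflow secret key variable set."""
    env = dict(base_env)
    env["MLFLOW_FLASK_SERVER_SECRET_KEY"] = secret_key
    return env


key = make_flask_secret_key()
env = server_env({}, key)
print(sorted(env))  # only the variable name is printed, not the key itself
```

The key should stay stable across restarts of the same deployment, so generating it once and storing it is preferable to regenerating it on every start.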
MLflow 3.x with basic authentication requires database migration
to initialize auth tables. Enabling databaseUpgrade ensures the
database schema is properly initialized before the server starts.

Bump chart version to 0.4.8.
The secret key should only be set in the main MLflow server container,
not in init containers. Moving it from env.configMap to env.container
prevents the mlflow-database-upgrade init container from failing.

Bump chart version to 0.4.9.
MLflow 3.x with basic auth automatically creates the necessary
auth tables on first run. Disable the explicit database upgrade
to avoid init container failures.

Bump chart version to 0.4.10.
MLflow's basic_auth.ini parser doesn't expect quoted values.
The quote filter was causing the username and password to be
treated as literal strings with quotes, breaking authentication.

Removed | quote filter from adminUsername and adminPassword
in the basic_auth.ini template.

Bump chart version to 0.4.11.
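
The quoting problem can be reproduced with Python's standard `configparser`, which keeps quote characters as part of the value. This sketch assumes the file is read that way and that the section and key names are `[mlflow]`, `admin_username`, and `admin_password`; the credentials are placeholders.

```python
import configparser
from typing import Tuple

# What a template using the quote filter would render (placeholder values).
QUOTED = """[mlflow]
admin_username = "admin"
admin_password = "password"
"""

# What the template renders once the quote filter is removed.
UNQUOTED = """[mlflow]
admin_username = admin
admin_password = password
"""


def read_credentials(ini_text: str) -> Tuple[str, str]:
    """Parse INI text and return (username, password) from the [mlflow] section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["mlflow"]
    return section["admin_username"], section["admin_password"]


print(read_credentials(QUOTED))    # quotes are kept as literal characters
print(read_credentials(UNQUOTED))  # plain values
```

With the quoted form, the stored username is `"admin"` including the quote characters, so a login as `admin` does not match.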
BREAKING CHANGE: Basic authentication is now disabled due to
compatibility issues between MLflow 3.x basic-auth and various
deployment configurations.

Security fixes:
- Upgraded MLflow from 2.11.0 to 3.3.2 (resolves 13/14 HIGH alerts)
- Upgraded scikit-learn to >=1.5.0 (resolves 1 MEDIUM alert)
- All medium and low severity alerts addressed

Infrastructure updates:
- Migrated container images to bitnamilegacy registry
- Updated kubectl image to bitnamilegacy/kubectl:1.33.4-debian-12-r0
- Updated postgres image to bitnamilegacy/postgresql:17.6.0-debian-12-r4
- Made postgres backup image configurable

MLflow 3.x compatibility:
- Removed --dev flag (incompatible with MLflow 3.x)
- Fixed template indentation for extraInitContainers and extraVolumes
- Fixed basic_auth.ini template to remove quotes from credentials
- Added SQLite auth database path (though auth is disabled)

Test updates:
- Removed authentication from tests
- Updated test to work with MLflow 3.x without basic auth

Known limitations:
- Basic auth disabled (MLflow 3.x experimental feature has issues)
- Dependabot alert #22 has no patch available

Chart version bumped to 0.5.0 reflecting the breaking change.
…ith basic auth

The mlflow-waitfor-postgres secret is needed by the postgres-backup
deployment regardless of whether basic auth is enabled. Changed the
conditional from basicAuth.enabled to postgres.embedded.enabled.
…asicAuth

The wait-for-postgresql init container was only being created when
basicAuth was enabled, but PostgreSQL needs to be ready regardless
of authentication method. This caused MLflow to fail to start in
KOTS deployments where basicAuth was disabled.

Changed conditional from .Values.mlflow.trackingServer.basicAuth.enabled
to .Values.postgres.embedded.enabled.
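
The effect of the changed conditional can be modeled in a few lines. This is only a sketch of the logic, not the Helm template itself; the function names are hypothetical and the dictionary mirrors the two value paths named above.

```python
from typing import Any, Dict


def old_condition(values: Dict[str, Any]) -> bool:
    """Previous behavior: the init container depended on basic auth."""
    return bool(values["mlflow"]["trackingServer"]["basicAuth"]["enabled"])


def new_condition(values: Dict[str, Any]) -> bool:
    """New behavior: the init container depends on embedded PostgreSQL."""
    return bool(values["postgres"]["embedded"]["enabled"])


# A deployment with basic auth off but embedded PostgreSQL on.
values = {
    "mlflow": {"trackingServer": {"basicAuth": {"enabled": False}}},
    "postgres": {"embedded": {"enabled": True}},
}

print(old_condition(values))  # init container was skipped
print(new_condition(values))  # init container is now rendered
```

In that configuration the old condition skipped wait-for-postgresql, so MLflow could start before the database was ready.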

Tested on local k3s cluster: HTTP 200 ✓

Chart version: 0.5.1
Member

@diamonwiggins diamonwiggins left a comment


lgtm

@chris-sanders chris-sanders merged commit 1225e7b into main Oct 1, 2025
7 checks passed
@chris-sanders chris-sanders deleted the dependabot branch October 1, 2025 18:20
2 participants