Skip to content

Conversation

nirs
Copy link
Contributor

@nirs nirs commented May 8, 2025

Testing revealed that vment-helper sudoers configuration may be hard to get working for some users, and we need better validation when validating the --network vmnet-shared option.

This change:

  • Improve validation, providing suggestions for resolving issues when vmnet-helper sudoers configuration does not work for the user.
  • Change vfkit to use vment-helper --socket mode, simplifying sudoers configuration so we don't depend on the --close-from= option (it did not work for @medyagh).
  • Switching to --socket also makes it possible to run vmnet-helper without sudoers configuration if we can interact with the user. We fallback to interactive sudo in this case.

With this change the only requirement is being able to run vment-helper with sudo. The sudoers configuration is required only if you want to run minikube with --interactive=false.

Before

% minikube start --driver vfkit --network vmnet-shared
😄  minikube v1.35.0 on Darwin 15.4.1 (arm64)
✨  Using the vfkit (experimental) driver based on user configuration

🙈  Exiting due to NOT_FOUND_VMNET_HELPER:


💡  Suggestion:

    vmnet-helper was not found on the system, resolve by:

    Option 1) Installing vment-helper:

    https://github.com/nirs/vmnet-helper#installation

    Option 2) Using the nat network:

    minikube start<no value> --driver vfkit --network nat

After

% minikube start --driver vfkit --network vmnet-shared
😄  minikube v1.35.0 on Darwin 15.4.1 (arm64)
✨  Using the vfkit (experimental) driver based on user configuration
💡  Unable to run vmnet-helper without a password
    To configure vment-helper to run without a password, please check the documentation:
    https://github.com/nirs/vmnet-helper/#granting-permission-to-run-vmnet-helper
Password:
👍  Starting "minikube" primary control-plane node in "minikube" cluster
🔥  Creating vfkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.33.0 on Docker 27.4.0 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 8, 2025
@k8s-ci-robot k8s-ci-robot requested review from medyagh and spowelljr May 8, 2025 20:08
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 8, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @nirs. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 8, 2025
@minikube-bot
Copy link
Collaborator

Can one of the admins verify this patch?

@medyagh
Copy link
Member

medyagh commented May 8, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 8, 2025
@minikube-pr-bot

This comment has been minimized.

@nirs nirs marked this pull request as draft May 9, 2025 19:43
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 9, 2025
nirs added 3 commits May 9, 2025 22:50
Previously we did not check that the helper can run with the
--close-from=4 option, so the command could succeed when incorrect
sudoers configuration. For example a user with liberal NOPASSWD rule,
but without the closefrom_override option.

When the check failed, we log unhelpful log:

    libmachine: Failed to run vmnet-helper:
    %!w(*exec.ExitError=&{0x14000135e30 [115 117 100 111 58 32 97 ... 101 100 10]})

And we returned a bool, so the caller could not provide a suggestion how
to resolve the issue.

Fix by:

- Rename vment.HelperAvaialble to vment.ValidateHelper
- Return an error describing the issue, including a reason.Kind that can
  be used to provide a suggestion for resolving the issue.
- Include the ExitError.Stderr int the error. This includes helpful
  error messages from sudo.
- Add new reason.NotConfiguredVmnetHelper error
- Improve log when vment.ValidateHelper() succeeded

Example error flow - vment-helper not installed:

    % minikube start --driver vfkit --network vmnet-shared
    😄  minikube v1.35.0 on Darwin 15.4.1 (arm64)
    ✨  Using the vfkit (experimental) driver based on user configuration

    🙈  Exiting due to NOT_FOUND_VMNET_HELPER: failed to validate vmnet-shared network:
    stat /opt/vmnet-helper/bin/vmnet-helper: no such file or directory
    💡  Suggestion:

        vmnet-helper was not found on the system, resolve by:

        Option 1) Installing vmnet-helper:

        https://github.com/nirs/vmnet-helper#installation

        Option 2) Using the nat network:

        minikube start<no value> --driver vfkit --network nat

I resolved the issue by installing vmnet-helper but I did not configured
the sudoers rule:

    % minikube start --driver vfkit --network vmnet-shared
    😄  minikube v1.35.0 on Darwin 15.4.1 (arm64)
    ✨  Using the vfkit (experimental) driver based on user configuration

    🙈  Exiting due to NOT_CONFIGURED_VMNET_HELPER: failed to validate vmnet-shared network:
    exit status 1: sudo: you are not permitted to use the -C option
    💡  Suggestion:

        Configure vmnet-helper to run without a password.

        Please install a vmnet-helper sudoers rule using these instructions:

        https://github.com/nirs/vmnet-helper#granting-permission-to-run-vmnet-helper

After installing the sudoers rule minikube could start.
The --fd option avoids the need to manage a bound unix sockets, in
particular the limit on unix socket length. It is also more secure;
only the process inheriting the socket can access the helper. However it
requires the sudo --close-from= option, which may not work for some
users. We don't understand why it does not work, and debugging it is
hard since users are not happy to share their local sudoers
configuration.

Avoid the trouble by switching to the --socket option. In this case we
pass a unix socket path to the helper and vfkit. The helper creates a
bound unix datagram socket in the specified path, and waits until vfkit
connects to the socket. When vfkit connects to the unix socket the
programs are connected in the same way they are connected by passing
file descriptors.

When running minikube we will see 3 new files in the machine directory:

- `vfkit-fb64-7802.sock`: vfkit unix datagram socket
- `vmnet-helper.sock`: vmnet-helper unix datagram socket
- `vmnet-helper.sock.lock`: lockfile for vment-helper socket

The files are deleted when vmnet-helper and vfkit are terminated
gracefully. If they are killed the stale files are replaced on the next
start.

Issues:
- If the path exceeds the limit (104 characters), opening the socket
  will fail. We have the sames issue with vfkit management socket.
If vmnet-helper sudoers rule is not configured or does not work for the
user, maybe because the user has disabled the NOPASSWD option, we used
to fail, recommending to configure vmnet sudoers rule. This does not
help a user that cannot fix the sudoers configuration.

Since we switched to --socket mode, we can work without a sudoers rule.
If we can interact with the user, we fall back to interactive sudo. The
user can enter a password to start the machine.

Example run with --interactive=false:

    % minikube start --driver vfkit --network vmnet-shared --interactive=false
    😄  minikube v1.35.0 on Darwin 15.4.1 (arm64)
    ✨  Using the vfkit (experimental) driver based on user configuration

    🙈  Exiting due to NOT_CONFIGURED_VMNET_HELPER: failed to validate vmnet-shared network:
    exit status 1: sudo: a password is required
    💡  Suggestion:

        Configure vmnet-helper to run without a password.

        Please install a vmnet-helper sudoers rule using these instructions:

        https://github.com/nirs/vmnet-helper#granting-permission-to-run-vmnet-helper

Example run with --interactive (default):

    % minikube start --driver vfkit --network vmnet-shared
    😄  minikube v1.35.0 on Darwin 15.4.1 (arm64)
    ✨  Using the vfkit (experimental) driver based on user configuration
    💡  Unable to run vmnet-helper without a password
        To configure vment-helper to run without a password, please check the documentation:
        https://github.com/nirs/vmnet-helper/#granting-permission-to-run-vmnet-helper
    Password:
    👍  Starting "minikube" primary control-plane node in "minikube" cluster
    🔥  Creating vfkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
    🐳  Preparing Kubernetes v1.33.0 on Docker 27.4.0 ...
        ▪ Generating certificates and keys ...
        ▪ Booting up control plane ...
        ▪ Configuring RBAC rules ...
    🔗  Configuring bridge CNI (Container Networking Interface) ...
    🔎  Verifying Kubernetes components...
        ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
    🌟  Enabled addons: storage-provisioner, default-storageclass
    🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
@nirs nirs force-pushed the vmnet-unix-socket branch from 641c06f to aa22260 Compare May 9, 2025 20:01
@nirs nirs changed the title vfkit: Use helper --socket instead of --fd vmnet: Support running without sudoers configuration May 9, 2025
@nirs nirs marked this pull request as ready for review May 9, 2025 20:21
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 9, 2025
@minikube-pr-bot
Copy link

kvm2 driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 20719) |
+----------------+----------+---------------------+
| minikube start | 51.3s    | 52.7s               |
| enable ingress | 17.0s    | 16.9s               |
+----------------+----------+---------------------+

Times for minikube (PR 20719) start: 51.8s 50.9s 57.4s 50.5s 52.8s
Times for minikube start: 51.9s 53.1s 49.6s 50.3s 51.4s

Times for minikube ingress: 14.6s 18.6s 15.0s 17.6s 19.0s
Times for minikube (PR 20719) ingress: 19.1s 19.6s 15.1s 16.1s 14.6s

docker driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 20719) |
+----------------+----------+---------------------+
| minikube start | 24.0s    | 24.6s               |
| enable ingress | 13.0s    | 12.8s               |
+----------------+----------+---------------------+

Times for minikube (PR 20719) start: 22.6s 26.2s 24.2s 25.9s 24.3s
Times for minikube start: 23.2s 24.0s 25.6s 24.7s 22.6s

Times for minikube ingress: 13.3s 12.4s 12.9s 13.4s 13.3s
Times for minikube (PR 20719) ingress: 13.3s 12.3s 12.4s 12.3s 13.5s

docker driver with containerd runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 20719) |
+----------------+----------+---------------------+
| minikube start | 23.2s    | 22.8s               |
| enable ingress | 29.7s    | 26.1s               |
+----------------+----------+---------------------+

Times for minikube start: 24.5s 22.2s 22.9s 22.6s 23.8s
Times for minikube (PR 20719) start: 23.1s 22.8s 23.5s 22.7s 22.1s

Times for minikube ingress: 23.9s 38.9s 22.9s 23.8s 38.9s
Times for minikube (PR 20719) ingress: 22.9s 38.9s 22.8s 22.9s 22.8s

Copy link
Member

@medyagh medyagh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you this look good to me

vmnet-unix-socket ✓
$ mk start --driver vfkit --network vmnet-shared
😄  minikube v1.35.0 on Darwin 15.4.1 (arm64)
✨  Using the vfkit (experimental) driver based on user configuration
💡  Unable to run vmnet-helper without a password
    To configure vment-helper to run without a password, please check the documentation:
    https://github.com/nirs/vmnet-helper/#granting-permission-to-run-vmnet-helper
Password: 
👍  Starting "minikube" primary control-plane node in "minikube" cluster
🔥  Creating vfkit VM (CPUs=2, Memory=4000MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.33.0 on Docker 27.4.0 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: default-storageclass, storage-provisioner
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
14:02:45 medya/workspace/minikube
vmnet-unix-socket ✓
$ kubectl get pods -A
NAMESPACE     NAME                               READY   STATUS    RESTARTS      AGE
kube-system   coredns-674b8bbfcf-84tg4           1/1     Running   0             48s
kube-system   etcd-minikube                      1/1     Running   0             54s
kube-system   kube-apiserver-minikube            1/1     Running   0             54s
kube-system   kube-controller-manager-minikube   1/1     Running   0             54s
kube-system   kube-proxy-qhc52                   1/1     Running   0             48s
kube-system   kube-scheduler-minikube            1/1     Running   0             54s
kube-system   storage-provisioner                1/1     Running   1 (18s ago)   52s
14:03:36 medya/workspace/minikube

this is an exciting new improvement to minikube experience on macos and I look forward to use this on macos !

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 9, 2025
Copy link
Member

@medyagh medyagh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @nirs do you mind adding Before/After this PR output in the PR description

@nirs
Copy link
Contributor Author

nirs commented May 12, 2025

thank you @nirs do you mind adding Before/After this PR output in the PR description

sure

@medyagh
Copy link
Member

medyagh commented May 12, 2025

thank you @nirs look forward to see this in the new release

@medyagh
Copy link
Member

medyagh commented May 12, 2025

/lgtm

@medyagh medyagh merged commit e4bdba7 into kubernetes:master May 12, 2025
30 of 37 checks passed
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 12, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: medyagh, nirs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nirs nirs deleted the vmnet-unix-socket branch May 23, 2025 20:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants