@jmwoliver

This is a prototype experimenting with adding the PPM SSO flow into keyring as a new backend. Here is how to use it (this assumes a Package Manager instance running locally with OIDC configured):

R

# install the keyring version locally
install.packages('devtools')
devtools::document()
devtools::install()

# set the server address and select the ppm backend
Sys.setenv(PACKAGEMANAGER_ADDRESS = "http://localhost:4242")
options(keyring_backend = "ppm")

# you can call `key_get` even if no token has been stored yet. It sees that the
# token is not in the keyring and goes through the auth flow to get one
keyring::key_get(service = "http://localhost:4242", username = "__token__")

Please open the following URL in your browser:
   https://dev-513394.oktapreview.com/activate?user_code=XDHFZLFX

And enter the following code when prompted:
   XDHFZLFX

Waiting for authorization...
[1] "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJwYWNrYWdlbWFuYWdlciIsInN1YiI6ImphY29iLndvbGl2ZXJAcG9zaXQuY28iLCJhdWQiOlsicGFja2FnZW1hbmFnZXIiXSwiZXhwIjoxNzU0NDk5MDc4LCJpYXQiOjE3NTQ0OTU0NzksImp0aSI6IjRlOGEwOWUyLTgwZWEtNDg4OC05ZTg0LTNmZDE2M2Q4OGM1NSIsInNjb3BlcyI6eyJyZXBvcyI6eyJweXBpIjoicmVhZCJ9fSwicHBtX3R5cGUiOiJzaG9ydC1saXZlZCJ9.LK2Q-biROgIeOp1E3PfTFvxRdR2LbErkljDS8UYpwV8"

# the token gets saved to ~/.ppm/tokens.toml
cat ~/.ppm/tokens.toml
[[connection]]
url = "http://localhost:4242"
token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJwYWNrYWdlbWFuYWdlciIsInN1YiI6ImphY29iLndvbGl2ZXJAcG9zaXQuY28iLCJhdWQiOlsicGFja2FnZW1hbmFnZXIiXSwiZXhwIjoxNzU0NDk5MDc4LCJpYXQiOjE3NTQ0OTU0NzksImp0aSI6IjRlOGEwOWUyLTgwZWEtNDg4OC05ZTg0LTNmZDE2M2Q4OGM1NSIsInNjb3BlcyI6eyJyZXBvcyI6eyJweXBpIjoicmVhZCJ9fSwicHBtX3R5cGUiOiJzaG9ydC1saXZlZCJ9.LK2Q-biROgIeOp1E3PfTFvxRdR2LbErkljDS8UYpwV8"
method = "sso"

This opens a new browser window for the auth flow.

@gaborcsardi Are we able to add a Package Manager specific backend? And is this set up so that pak can consume it easily once pak is updated to a keyring version that includes the new ppm backend?

@gaborcsardi
Member

Thanks! Sorry for the really long wait.

I think this is going to work just fine in the end, but I do have some big picture questions and it'll also need a bunch of small changes to make it appropriate for pak.

1. Does this need to be in keyring?

This is probably in keyring because that was a good way to implement it for the Python client (pip), which has built-in support for the Python keyring package. But in R we have direct access to the package manager client (pak), so we might as well implement it there because it is not really a keyring backend.

2. How do I test this?

Is the protocol a standard OAuth 2.0 workflow? It would be great to have a way to test this against a real PPM instance. That does not have to happen on every test suite run; for those I can hopefully use webfakes::oauth2_resource_app(). But refactoring is much simpler if I have tests.

3. Smaller issues

To use this in pak, we'd need to trim it down and get rid of most of the new dependencies. Plus there are some more changes we need. This is a TODO list for myself:

  • We probably don't need the httr2 package and curl will suffice. This will make the oauth2 code more cumbersome to write, but I did this before and it is certainly possible.
  • We probably don't need the jsonlite package; several packages already include JSON parsers and encoders written in base R, e.g. https://github.com/r-lib/remotes/blob/main/R/json.R and https://github.com/r-lib/pkgdepends/blob/main/R/tojson.R. Eventually pak will probably use https://github.com/gaborcsardi/tsjson
  • We probably don't need the openssl package, the base64_encode(), rand_bytes() and sha256() functions you use are implemented in many packages, e.g. processx::base64_encode(), keyring:::rand_bytes() (although this uses libsodium, but that's probably OK), cli::hash_sha256().
  • We cannot use the RcppTOML package, it uses C++ that is much harder to compile into a static pak binary. Maybe we can store the token in a JSON file instead, until we get a new TOML parser? Does Python need a TOML file?
  • Need to remove \() lambda functions to support older R.
  • Do not import packages, qualify calls explicitly with ::, this is needed to be embedded in pak.
  • Maybe use the (equivalent of the) rappdirs package to save the token at a standard place. pak/pkgcache/etc. already has base R code for this, no need to depend on rappdirs. In Python do you always write it to ~/.ppm?
  • Remove the |> pipes to support older R versions.
  • Does the server support a way (e.g. server side events?) to avoid having to poll for the token?
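
To illustrate the syntax items in the list above, here is a minimal sketch of the same operation written without `\()` lambdas, without `|>` pipes, and with an explicit `::` call. The helper name `connection_tokens()` and the sample data are hypothetical and are not taken from the prototype.

```r
# Newer syntax (needs R >= 4.1), shown only as a comment:
#   urls <- connections |> vapply(\(x) x[["url"]], character(1))

# Hypothetical helper, written so that it also parses on older R versions.
connection_tokens <- function(connections) {
  # function(x) instead of \(x), and nested calls instead of |>
  urls <- vapply(connections, function(x) x[["url"]], character(1))
  tokens <- vapply(connections, function(x) x[["token"]], character(1))
  # explicit stats:: qualification instead of importing the package
  stats::setNames(tokens, urls)
}

# Made-up sample data for illustration
connections <- list(
  list(url = "http://localhost:4242", token = "abc"),
  list(url = "https://ppm.example.com", token = "def")
)
connection_tokens(connections)
```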

@jmwoliver
Author

@gaborcsardi Thanks for all your comments! I have a few thoughts in response:

Does this need to be in keyring?

No, it doesn't. I think adding it directly to pak makes more sense too. We added it to keyring because pip already had a built-in mechanism to take custom backends easily. I prototyped it in keyring here on the R side for parity, but I agree that getting it directly into pak is the better option since, as you said, it is not really a keyring backend.

Is the protocol a standard OAuth 2.0 workflow?

This is a standard OAuth 2.0 workflow, but with some Package Manager specific endpoints. A Package Manager admin configures the desired identity provider with the client ID / client secret, then clients (pip, pak, rspm, browsers) call PPM endpoints, letting PPM route to the configured identity provider. The endpoints we call are:

  • /__api__/device - initiate the device flow
  • /__api__/device_access - get the ID token
  • /__api__/token - exchange the ID token for a PPM token

That's why this is considered a PPM-specific implementation rather than a generic OAuth 2.0 backend.
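
As a rough sketch, the three endpoint URLs listed above could be derived from the server address like this. The helper name `ppm_endpoints()` is hypothetical, and the request and response payloads are deliberately left out because they are not described in this thread.

```r
# Hypothetical helper: build the PPM auth endpoint URLs from the server address.
# Only the three paths come from the discussion; everything else is assumed.
ppm_endpoints <- function(address = Sys.getenv("PACKAGEMANAGER_ADDRESS")) {
  # drop trailing slashes so that the paths join cleanly
  base <- sub("/+$", "", address)
  list(
    device = paste0(base, "/__api__/device"),               # initiate the device flow
    device_access = paste0(base, "/__api__/device_access"), # get the ID token
    token = paste0(base, "/__api__/token")                  # exchange it for a PPM token
  )
}

ppm_endpoints("http://localhost:4242")
```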

How do I test this?

We test the posit-keyring package like this:
https://github.com/posit-dev/posit-keyring/blob/main/.github/workflows/integration-tests.yml

What it does is:

  • Generates an ID token from a GitHub Action
  • Configures a Package Manager container with identity federation to allow exchanging GHA ID tokens for PPM tokens
  • Creates an authenticated repo
  • Configures the posit-keyring backend
  • export PACKAGEMANAGER_IDENTITY_TOKEN_FILE=${{ runner.temp }}/OIDC_TOKEN_FILE
    • We added the ability to bypass user intervention in the auth flow for testing. We'll probably want this environment variable respected in pak too for testing. This bypasses the device flow and just calls /__api__/token for the token exchange
  • Calls pip install against an authenticated repository

Would this testing strategy be helpful in pak as well? It would be all the same setup steps but we call pak::pkg_install("dplyr") at the end to verify it can download from an authenticated repository.
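
If pak were to respect the same environment variable, the lookup might look roughly like the sketch below. The function name `read_identity_token()` is hypothetical, and the assumption that the file contains only the raw ID token is mine, not something confirmed in this thread.

```r
# Hypothetical helper: return the identity token from the file named by
# PACKAGEMANAGER_IDENTITY_TOKEN_FILE, or NULL if the variable is not usable.
# Assumes the file holds just the raw token text.
read_identity_token <- function() {
  path <- Sys.getenv("PACKAGEMANAGER_IDENTITY_TOKEN_FILE", "")
  if (!nzchar(path) || !file.exists(path)) {
    return(NULL)
  }
  token <- paste(readLines(path, warn = FALSE), collapse = "")
  # strip surrounding whitespace without relying on newer helpers
  token <- gsub("^\\s+|\\s+$", "", token)
  if (nzchar(token)) token else NULL
}
```

A caller could skip the device flow when this returns a non-NULL value and go straight to the token exchange, falling back to the interactive flow otherwise.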

... get rid most of the new dependencies.

Any dependencies and design choices made here were just to get a prototype out for discussion. We can remove or change anything you feel needs to be changed, with one exception. We should discuss the TOML file more, I'll describe that requirement more below.

We cannot use the RcppTOML package ... does Python need a TOML file?

We write to the ~/.ppm/tokens.toml file for both the posit-keyring backend and the rspm sso login CLI. The intent here was to have all three tools (pak, pip, and rspm) write the PPM token to the same location. So if you log in with rspm sso login and then do pak::pkg_install("dplyr"), you have already authenticated once and pak will use the same token (or vice versa).
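
For discussion, here is a minimal base-R sketch of how a client could look up an existing token in that file. It only understands the simple shape shown earlier in this thread (`[[connection]]` headers followed by `key = "value"` lines) and is not a general TOML parser; both function names are hypothetical.

```r
# Hypothetical reader for the simple tokens.toml shape shown in this thread.
# Not a general TOML parser: no escapes, no nested tables, no other value types.
read_ppm_tokens <- function(path = file.path("~", ".ppm", "tokens.toml")) {
  path <- path.expand(path)
  if (!file.exists(path)) {
    return(list())
  }
  connections <- list()
  for (line in readLines(path, warn = FALSE)) {
    line <- gsub("^\\s+|\\s+$", "", line)
    if (identical(line, "[[connection]]")) {
      # start a new, empty connection entry
      connections <- c(connections, list(list()))
      next
    }
    m <- regmatches(line, regexec('^([A-Za-z0-9_-]+)\\s*=\\s*"(.*)"$', line))[[1]]
    if (length(m) == 3L && length(connections) > 0L) {
      connections[[length(connections)]][[m[2]]] <- m[3]
    }
  }
  connections
}

# Hypothetical lookup: return the stored token for a server URL, or NULL.
find_ppm_token <- function(connections, url) {
  for (conn in connections) {
    if (identical(conn[["url"]], url)) {
      return(conn[["token"]])
    }
  }
  NULL
}
```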

If there is no way to write to a TOML file from pak right now, we have a few options:

  • Move everything to write to a JSON file instead of TOML. This would require Package Manager server changes, rspm CLI changes, posit-keyring changes, etc.
  • Have pak write to a different location/file than the other tools. It would be unfortunate but not the end of the world.

Does the server support a way (e.g. server side events?) to avoid having to poll for the token?

No server side events. I think polling is the standard OAuth 2.0 way of doing the device access token request and response. I dug around for the RFC to double check and that seems to be the case:
https://www.rfc-editor.org/rfc/rfc8628#section-3.4
https://www.rfc-editor.org/rfc/rfc8628#section-3.5

This section defines the polling behavior:

authorization_pending
      The authorization request is still pending as the end user hasn't
      yet completed the user-interaction steps ([Section 3.3](https://www.rfc-editor.org/rfc/rfc8628#section-3.3)).  The
      client SHOULD repeat the access token request to the token
      endpoint (a process known as polling).  Before each new request,
      the client MUST wait at least the number of seconds specified by
      the "interval" parameter of the device authorization response (see
      [Section 3.2](https://www.rfc-editor.org/rfc/rfc8628#section-3.2)), or 5 seconds if none was provided, and respect any
      increase in the polling interval required by the "slow_down"
      error.
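
The polling rules quoted above can be sketched as a small loop. This is a simplified illustration, not the prototype's code: `request` is a hypothetical function that performs one token request and returns the parsed response as a list, with an `error` element while authorization is incomplete. The 5 second default and the 5 second increase on `slow_down` follow RFC 8628; the `expires_in` default is an arbitrary placeholder.

```r
# Sketch of RFC 8628 style polling. `request` is a hypothetical function that
# makes one token request and returns a list; `sleep` is injectable for tests.
poll_for_token <- function(request, interval = 5, expires_in = 300,
                           sleep = Sys.sleep) {
  deadline <- Sys.time() + expires_in
  while (Sys.time() < deadline) {
    # wait at least `interval` seconds before each request
    sleep(interval)
    resp <- request()
    err <- resp[["error"]]
    if (is.null(err)) {
      return(resp)
    }
    if (identical(err, "slow_down")) {
      # RFC 8628: increase the interval by 5 seconds for all later requests
      interval <- interval + 5
    } else if (!identical(err, "authorization_pending")) {
      stop("Device authorization failed: ", err)
    }
  }
  stop("Device code expired before authorization completed")
}
```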

@jmwoliver
Author

@gaborcsardi - I remembered that, after I made this prototype, we changed the payload for the auth endpoints to better match the OAuth spec. I just pushed a small commit to get this working again so you have a better reference if you are the one who will be implementing this in pak. I can demo this for you so you have an idea of how the flow works.
