Skip to content

Conversation

anonrig
Copy link
Member

@anonrig anonrig commented Apr 8, 2024

Improves the performance of the legacy btoa function. It's not recommended to be used in production.

                                            confidence improvement accuracy (*)    (**)   (***)
buffers/buffer-btoa.js n=1000000 size=1024        ***   1089.00 %      ±15.27% ±20.57% ±27.29%
buffers/buffer-btoa.js n=1000000 size=128         ***    571.99 %       ±6.85%  ±9.22% ±12.22%
buffers/buffer-btoa.js n=1000000 size=16          ***    306.04 %       ±4.24%  ±5.70%  ±7.53%
buffers/buffer-btoa.js n=1000000 size=256         ***    870.77 %      ±13.85% ±18.67% ±24.78%
buffers/buffer-btoa.js n=1000000 size=32          ***    350.58 %       ±4.61%  ±6.18%  ±8.15%
buffers/buffer-btoa.js n=1000000 size=64          ***    462.57 %       ±5.60%  ±7.51%  ±9.91%

Disclaimer

"btoa" is a legacy functionality, and not recommended to be used in production applications.

fast and deprecated

Thanks Matthieu Riegler for the poster

@nodejs-github-bot nodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. labels Apr 8, 2024
@anonrig anonrig requested a review from addaleax April 8, 2024 18:59
@lemire
Copy link
Member

lemire commented Apr 9, 2024

@anonrig I recommend trying something like this... We have three cases: one external one-byte string (easy), one non-external one byte string (we use WriteOneByte) and the general case where we go to 16-bit words, we try converting to latin1 and then we see what happens. In this scenario, there might be an error condition, currently in the code I suggest, it returns the empty string.

static void Btoa(const FunctionCallbackInfo<Value>& args) {
  CHECK_EQ(args.Length(), 1);
  Environment* env = Environment::GetCurrent(args);
  THROW_AND_RETURN_IF_NOT_STRING(env, args[0], "argument");

  Local<String> input = args[0].As<String>();
  MaybeStackBuffer<char> buffer;
  size_t written;

  if (input->IsExternalOneByte()) {  // 8-bit case
    auto ext = input->GetExternalOneByteStringResource();
    size_t expected_length = simdutf::base64_length_from_binary(ext->length());
    buffer.AllocateSufficientStorage(expected_length + 1);
    buffer.SetLengthAndZeroTerminate(expected_length);
    written = simdutf::binary_to_base64(
        ext->data(), ext->length(), buffer.out(), simdutf::base64_default);
  } if(input->IsOneByte()) {
    MaybeStackBuffer<uint8_t> stack_buf(input->Length());
        input->WriteOneByte(env->isolate(),
                           stack_buf.out(),
                           0,
                           input->Length(),
                           String::NO_NULL_TERMINATION);

    size_t expected_length = simdutf::base64_length_from_binary(input->Length());
    buffer.AllocateSufficientStorage(expected_length + 1);
    buffer.SetLengthAndZeroTerminate(expected_length);
    written = simdutf::binary_to_base64(
        reinterpret_cast<const char*>(*stack_buf), input->Length(), buffer.out());
  } else {
    String::Value value(env->isolate(), input);
    MaybeStackBuffer<char> stack_buf(value.length());
    size_t out_len = simdutf::convert_utf16_to_latin1(reinterpret_cast<const char16_t *>(*value), value.length(), stack_buf.out());
    if(out_len == 0) { // error
      return args.GetReturnValue().SetEmptyString();
    }
    size_t expected_length = simdutf::base64_length_from_binary(out_len);
    buffer.AllocateSufficientStorage(expected_length + 1);
    buffer.SetLengthAndZeroTerminate(expected_length);
    written = simdutf::binary_to_base64(
        *stack_buf, out_len, buffer.out());
  }

  auto value =
      String::NewFromOneByte(env->isolate(),
                             reinterpret_cast<const uint8_t*>(buffer.out()),
                             NewStringType::kNormal,
                             written)
          .ToLocalChecked();
  return args.GetReturnValue().Set(value);
}

This should be considered ChatGTP-quality code: I did not even try to test it... It is only as an idea.

@lemire
Copy link
Member

lemire commented Apr 9, 2024

@anonrig Lint. Lint. Lint.

@anonrig
Copy link
Member Author

anonrig commented Apr 9, 2024

@lemire I've updated the benchmark results as well :-)

@lemire
Copy link
Member

lemire commented Apr 10, 2024

@anonrig My proposed code would return the empty string on error, but we need to throw, I pushed a commit that does that. (Very minor change.)

@lemire lemire requested a review from joyeecheung April 10, 2024 17:42
@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 10, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 10, 2024
@nodejs-github-bot
Copy link
Collaborator

@anonrig anonrig requested a review from benjamingr April 11, 2024 01:08
@nodejs-github-bot
Copy link
Collaborator

@anonrig anonrig added the commit-queue Add this label to land a pull request using GitHub Actions. label Apr 11, 2024
@nodejs-github-bot nodejs-github-bot added commit-queue-failed An error occurred while landing this pull request using GitHub Actions. and removed commit-queue Add this label to land a pull request using GitHub Actions. labels Apr 11, 2024
@nodejs-github-bot
Copy link
Collaborator

Commit Queue failed
- Loading data for nodejs/node/pull/52427
✔  Done loading data for nodejs/node/pull/52427
----------------------------------- PR info ------------------------------------
Title      buffer: improve `btoa` performance (#52427)
Author     Yagiz Nizipli  (@anonrig)
Branch     anonrig:improve-btoa -> nodejs:main
Labels     buffer, c++, needs-ci
Commits    2
 - buffer: improve `btoa` performance
 - fix: return exception
Committers 2
 - Yagiz Nizipli 
 - Daniel Lemire 
PR-URL: https://github.com/nodejs/node/pull/52427
Reviewed-By: Daniel Lemire 
Reviewed-By: James M Snell 
------------------------------ Generated metadata ------------------------------
PR-URL: https://github.com/nodejs/node/pull/52427
Reviewed-By: Daniel Lemire 
Reviewed-By: James M Snell 
--------------------------------------------------------------------------------
   ℹ  This PR was created on Mon, 08 Apr 2024 17:24:56 GMT
   ✔  Approvals: 2
   ✔  - Daniel Lemire (@lemire): https://github.com/nodejs/node/pull/52427#pullrequestreview-1992265694
   ✔  - James M Snell (@jasnell) (TSC): https://github.com/nodejs/node/pull/52427#pullrequestreview-1993089038
   ⚠  GitHub cannot link the author of 'fix: return exception' to their GitHub account.
   ⚠  Please suggest them to take a look at https://github.com/nodejs/node/blob/99b1ada/doc/guides/contributing/pull-requests.md#step-1-fork
   ✔  Last GitHub CI successful
   ℹ  Last Full PR CI on 2024-04-11T02:25:23Z: https://ci.nodejs.org/job/node-test-pull-request/58258/
- Querying data for job/node-test-pull-request/58258/
   ✔  Last Jenkins CI successful
--------------------------------------------------------------------------------
   ✔  No git cherry-pick in progress
   ✔  No git am in progress
   ✔  No git rebase in progress
--------------------------------------------------------------------------------
- Bringing origin/main up to date...
From https://github.com/nodejs/node
 * branch                  main       -> FETCH_HEAD
✔  origin/main is now up-to-date
- Downloading patch for 52427
From https://github.com/nodejs/node
 * branch                  refs/pull/52427/merge -> FETCH_HEAD
✔  Fetched commits as ee4fa77624f1..6900b0dce7cf
--------------------------------------------------------------------------------
[main 50c914c521] buffer: improve `btoa` performance
 Author: Yagiz Nizipli 
 Date: Mon Apr 8 13:23:56 2024 -0400
 3 files changed, 81 insertions(+), 7 deletions(-)
 create mode 100644 benchmark/buffers/buffer-btoa.js
[main de7c8dcb6a] fix: return exception
 Author: Daniel Lemire 
 Date: Wed Apr 10 09:44:11 2024 -0400
 2 files changed, 6 insertions(+), 2 deletions(-)
   ✔  Patches applied
There are 2 commits in the PR. Attempting autorebase.
Rebasing (2/4)

Executing: git node land --amend --yes
--------------------------------- New Message ----------------------------------
buffer: improve btoa performance

PR-URL: #52427
Reviewed-By: Daniel Lemire [email protected]
Reviewed-By: James M Snell [email protected]

[detached HEAD f1e7db2933] buffer: improve btoa performance
Author: Yagiz Nizipli [email protected]
Date: Mon Apr 8 13:23:56 2024 -0400
3 files changed, 81 insertions(+), 7 deletions(-)
create mode 100644 benchmark/buffers/buffer-btoa.js
Rebasing (3/4)
Rebasing (4/4)

Executing: git node land --amend --yes
--------------------------------- New Message ----------------------------------
fix: return exception

PR-URL: #52427
Reviewed-By: Daniel Lemire [email protected]
Reviewed-By: James M Snell [email protected]

[detached HEAD a285a09324] fix: return exception
Author: Daniel Lemire [email protected]
Date: Wed Apr 10 09:44:11 2024 -0400
2 files changed, 6 insertions(+), 2 deletions(-)

Successfully rebased and updated refs/heads/main.

ℹ Add commit-queue-squash label to land the PR as one commit, or commit-queue-rebase to land as separate commits.

https://github.com/nodejs/node/actions/runs/8645651485

@panva panva removed the commit-queue-failed An error occurred while landing this pull request using GitHub Actions. label Apr 11, 2024
@panva panva added commit-queue Add this label to land a pull request using GitHub Actions. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. labels Apr 11, 2024
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Apr 11, 2024
@nodejs-github-bot nodejs-github-bot merged commit 21211a3 into nodejs:main Apr 11, 2024
@nodejs-github-bot
Copy link
Collaborator

Landed in 21211a3

marco-ippolito pushed a commit that referenced this pull request May 2, 2024
PR-URL: #52427
Reviewed-By: Daniel Lemire <[email protected]>
Reviewed-By: James M Snell <[email protected]>
marco-ippolito pushed a commit that referenced this pull request May 3, 2024
PR-URL: #52427
Reviewed-By: Daniel Lemire <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants