[Executorch][LLM] Use caching allocator for runner #15730
base: gh/kimishpatel/213/base
Conversation
We observed that on iOS it improves perf by 6%, because the SDPA op does temp allocations. No significant difference on Android, though. Differential Revision: [D86120038](https://our.internmc.facebook.com/intern/diff/D86120038/) [ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15730
Note: Links to docs will display an error until the docs builds have been completed.
❌ 122 New Failures as of commit 5cecbfc with merge base 7600df8.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull Request Overview
This pull request integrates a CPU caching allocator into the LLM runner to improve performance by reducing temporary memory allocation overhead during inference. According to the PR description, this change provides a 6% performance improvement on iOS for operations like SDPA that perform temporary allocations, though no significant difference was observed on Android.
Key changes:
- Added CPUCachingAllocator as a temporary memory allocator for Module instances
- Updated build system dependencies in targets.bzl, CMakeLists.txt files
- Configured the allocator with a 10MB cache size for temporary allocations (see the sketch below)
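To make the mechanism concrete, here is a minimal, hypothetical sketch of the size-bucketed caching idea: freed blocks are kept in per-size free lists up to a byte budget (10MB in this PR) and handed back on the next same-size request, so ops like SDPA that repeatedly allocate short-lived scratch buffers skip the system allocator. The class name `SimpleCachingAllocator` and its interface are illustrative only and do not mirror ExecuTorch's `CPUCachingAllocator` API.

```cpp
// Hypothetical sketch of a size-bucketed caching allocator; not the
// ExecuTorch CPUCachingAllocator implementation.
#include <cstddef>
#include <cstdlib>
#include <unordered_map>
#include <vector>

class SimpleCachingAllocator {
 public:
  explicit SimpleCachingAllocator(size_t max_cached_bytes)
      : max_cached_bytes_(max_cached_bytes) {}

  ~SimpleCachingAllocator() {
    // Return all cached blocks to the system on destruction.
    for (auto& bucket : free_lists_) {
      for (void* ptr : bucket.second) {
        std::free(ptr);
      }
    }
  }

  void* allocate(size_t size) {
    auto it = free_lists_.find(size);
    if (it != free_lists_.end() && !it->second.empty()) {
      // Cache hit: reuse a previously freed block of the same size.
      void* ptr = it->second.back();
      it->second.pop_back();
      cached_bytes_ -= size;
      return ptr;
    }
    // Cache miss: fall back to the system allocator.
    void* ptr = std::malloc(size);
    if (ptr != nullptr) {
      sizes_[ptr] = size;
    }
    return ptr;
  }

  void deallocate(void* ptr) {
    const size_t size = sizes_.at(ptr);
    if (cached_bytes_ + size <= max_cached_bytes_) {
      // Keep the block for reuse instead of freeing it immediately.
      free_lists_[size].push_back(ptr);
      cached_bytes_ += size;
    } else {
      // Cache budget exhausted: actually release the memory.
      sizes_.erase(ptr);
      std::free(ptr);
    }
  }

 private:
  size_t max_cached_bytes_;
  size_t cached_bytes_ = 0;
  std::unordered_map<size_t, std::vector<void*>> free_lists_;
  std::unordered_map<void*, size_t> sizes_;
};
```

With a workload that repeatedly requests the same buffer sizes, as SDPA does per token, most allocations become pointer pops from a vector, which is consistent with the iOS win reported in the description.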
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| extension/llm/runner/targets.bzl | Added dependency on cpu_caching_allocator for Buck build system |
| extension/llm/runner/llm_runner_helper.cpp | Instantiated CPUCachingAllocator with 10MB cache for Module temp allocations |
| extension/llm/runner/CMakeLists.txt | Added extension_memory_allocator to runner dependencies for CMake |
| CMakeLists.txt | Added memory_allocator subdirectory to build when LLM runner is enabled |
In CMakeLists.txt:

```cmake
endif()

if(EXECUTORCH_BUILD_EXTENSION_LLM_RUNNER)
  add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/extension/memory_allocator/runner)
```
Copilot AI commented on Nov 17, 2025:
The path `extension/memory_allocator/runner` does not exist in the repository. The memory allocator CMakeLists.txt is located at `extension/memory_allocator/CMakeLists.txt`. This line should be:

```cmake
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/extension/memory_allocator)
```
In extension/llm/runner/llm_runner_helper.cpp:

```cpp
// Create the Module
std::unique_ptr<Module> module;
uint32_t max_cached_memory_size_bytes_ = 1024 * 1024 * 10; // 10MB
```
Copilot AI commented on Nov 17, 2025:
The hardcoded value of 10MB for the caching allocator size should be documented or made configurable. According to the PR description, this improves performance by 6% on iOS for SDPA op temp allocations, but different models or use cases may benefit from different cache sizes. Consider:
- Adding a comment explaining why 10MB was chosen (see the sketch after this list)
- Making this value configurable through a parameter or constant
- Documenting the performance implications in code comments
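One possible shape for the first two suggestions is sketched below; the constant name `kMaxCachedMemorySizeBytes` is hypothetical and not part of this PR, and the comment text only restates what the PR description reports.

```cpp
#include <cstdint>

// Hypothetical constant name; illustrative only.
//
// Cache budget for the Module's temp allocator. Per the PR description,
// routing the SDPA op's temporary allocations through a caching allocator
// gave roughly a 6% speedup on iOS and no significant change on Android.
// Models with larger scratch buffers may warrant a bigger budget, so this
// could also be exposed as a parameter with this value as the default.
constexpr uint32_t kMaxCachedMemorySizeBytes = 10 * 1024 * 1024; // 10MB
```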
Stack from ghstack (oldest at bottom):
We observed that on iOS it improves perf by 6%, because the SDPA op does temp allocations.
No significant difference on Android, though.
Differential Revision: D86120038