-
Notifications
You must be signed in to change notification settings - Fork 678
Description
Problem description
When running multiple gRPC requests asynchronously with the same gRPC client, gRPC client at some point starts cancelling thoe calls with the following error message:
Received RST_STREAM with code 2 triggered by internal client error: Session closed with error code 2
I've enabled the following ENV variables GRPC_TRACE=all
and GRPC_VERBOSITY=DEBUG
and these are the logs for one of the calls that received the RST_STREAM error.
D 2023-11-28T11:52:30.785Z | subchannel_call | [5567] write() called with message of length 433
D 2023-11-28T11:52:30.785Z | subchannel_call | [5567] sending data chunk of length 433
D 2023-11-28T11:52:31.297Z | subchannel_call | [5567] end() called
D 2023-11-28T11:52:31.297Z | subchannel_call | [5567] calling end() on HTTP/2 stream
D 2023-11-28T11:52:35.382Z | subchannel_call | [5567] Node error event: message=Session closed with error code 2 code=ERR_HTTP2_SESSION_ERROR errno=Unknown system error undefined syscall=undefined
D 2023-11-28T11:52:35.393Z | subchannel_call | [5567] HTTP/2 stream closed with code 2
D 2023-11-28T11:52:35.393Z | subchannel_call | [5567] ended with status: code=13 details="Received RST_STREAM with code 2 triggered by internal client error: Session closed with error code 2"
Right before the calls start triggering the ERR_HTTP2_SESSION_ERROR
, i found the following a log explaining the client got closed by GOAWAY with code 2. 0.0.0.0:3000 connection closed by GOAWAY with code 2
.
D 2023-11-28T11:52:35.354Z | resolving_call | [6070] Received status
D 2023-11-28T11:52:35.354Z | resolving_call | [6070] ended with status: code=0 details="OK"
D 2023-11-28T11:52:35.354Z | subchannel_call | [6202] HTTP/2 stream closed with code 0
D 2023-11-28T11:52:35.354Z | subchannel_call | [6424] Received server trailers:
grpc-status: 0
grpc-message: OK
D 2023-11-28T11:52:35.354Z | subchannel_call | [6424] received status code 0 from server
D 2023-11-28T11:52:35.354Z | subchannel_call | [6424] received status details string "OK" from server
D 2023-11-28T11:52:35.354Z | subchannel_call | [6424] close http2 stream with code 0
D 2023-11-28T11:52:35.354Z | subchannel_call | [6424] ended with status: code=0 details="OK"
D 2023-11-28T11:52:35.354Z | load_balancing_call | [6338] Received status
D 2023-11-28T11:52:35.354Z | load_balancing_call | [6338] ended with status: code=0 details="OK"
D 2023-11-28T11:52:35.354Z | retrying_call | [6337] Received status from child [6338]
D 2023-11-28T11:52:35.354Z | retrying_call | [6337] state=COMMITTED handling status with progress PROCESSED from child [6338] in state ACTIVE
D 2023-11-28T11:52:35.354Z | retrying_call | [6337] ended with status: code=0 details="OK"
D 2023-11-28T11:52:35.354Z | resolving_call | [6247] Received status
D 2023-11-28T11:52:35.354Z | resolving_call | [6247] ended with status: code=0 details="OK"
D 2023-11-28T11:52:35.354Z | subchannel_call | [6424] HTTP/2 stream closed with code 0
D 2023-11-28T11:52:35.355Z | transport | (12) 0.0.0.0:3000 connection closed by GOAWAY with code 2
D 2023-11-28T11:52:35.355Z | subchannel | (11) 0.0.0.0:3000 READY -> IDLE
D 2023-11-28T11:52:35.355Z | subchannel_refcount | (11) 0.0.0.0:3000 refcount 2 -> 1
D 2023-11-28T11:52:35.355Z | pick_first | READY -> IDLE
D 2023-11-28T11:52:35.355Z | resolving_load_balancer | dns:0.0.0.0:3000 READY -> IDLE
D 2023-11-28T11:52:35.355Z | connectivity_state | (1) dns:0.0.0.0:3000 READY -> IDLE
You can find the full logs in the attached file:
grpcLogs.txt
This error happened after upgrading our NodeJS project to version 20.10.0. This problem is only replicable when there are a ''large'' (2361 calls from the logs) amount of gRPC calls triggered. When there is a reduced amout of calls, I couldn't replicate the error.
Moreover, the ERR_HTTP2_SESSION_ERROR
is not replicable with lower versions Node versions ( issue didn't appear when runnin on node 20.5.1 ).
Reproduction steps
Use Nodejs 20.10.0 and install grpc/grpc-js 1.9.12.
Run multiple gRPC calls with the same client asynchornously. Issue is also replicable with other older versions of
grpc/grpc-js such as 1.8.14 .
Environment
- OS name, version and architecture: Arch Linux amd64
- Node version 20.10.0/20.9.0
- Node installation method: nvm
- Package name and version: 1.9.12