Problem description
Hello, and thanks for all your hard work on this library!
I want to share a situation that can look like a memory leak in @grpc/grpc-js, but I want to stress that this isn't the library's fault. It is caused by the way the client is being re-initialized in application code.
I'm creating this issue to help others who might run into something similar and to suggest potential improvements that could make this kind of leak easier to detect or prevent.
We noticed an ongoing increase in memory usage whenever our application handled UNAVAILABLE or DEADLINE_EXCEEDED errors. In those cases, we were re-initializing a new client without closing the existing one. Over time, the detached references to the old clients accumulated in memory, eventually resulting in a crash.
It was difficult to pinpoint the source of the leak because the retaining references aren't always obvious in DevTools, especially if your monitoring tools (like Datadog) don't highlight "detached nodes". As soon as we added client.close() before creating a new client, memory usage stabilized and the issue was resolved.
Reproduction steps
server.js
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');
const path = require('path');

// Load proto definition
const PROTO_PATH = path.join(__dirname, 'ping.proto');
const packageDefinition = protoLoader.loadSync(PROTO_PATH);
const pingProto = grpc.loadPackageDefinition(packageDefinition).pingpong;

let callCount = 0;

function ping(call, callback) {
  callCount++;
  console.log('Received ping request #', callCount);

  // Every 10th call, return an error with code = UNAVAILABLE
  if (callCount % 10 === 0) {
    const error = {
      code: grpc.status.UNAVAILABLE,
      message: 'Simulated server error: UNAVAILABLE',
    };
    console.log('Sending error for request #', callCount);
    return callback(error, null);
  }

  // For normal requests, just echo back a "pong"
  callback(null, { message: `pong: ${call.request.message}` });
}

function main() {
  const server = new grpc.Server();
  server.addService(pingProto.PingService.service, { Ping: ping });

  const address = '0.0.0.0:50051';
  server.bindAsync(address, grpc.ServerCredentials.createInsecure(), (err, port) => {
    if (err) {
      return console.error(err);
    }
    console.log(`Server running at http://${address}`);
    server.start();
  });
}

main();
client.js
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');
const path = require('path');

// Load proto definition
const PROTO_PATH = path.join(__dirname, 'ping.proto');
const packageDefinition = protoLoader.loadSync(PROTO_PATH);
const pingProto = grpc.loadPackageDefinition(packageDefinition).pingpong;

// For convenience
const { status } = grpc;

let client;

function initClient() {
  console.log('Initializing client');
  client = new pingProto.PingService(
    'localhost:50051',
    grpc.credentials.createInsecure()
  );
}

function makePingCall(message) {
  return new Promise((resolve, reject) => {
    client.Ping({ message }, (err, response) => {
      if (err) {
        return reject(err);
      }
      resolve(response);
    });
  });
}

async function pingWithRetry(message, retries = 3) {
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      const response = await makePingCall(message);
      return response;
    } catch (err) {
      const code = err.code;
      console.error(
        `Ping call failed [attempt=${attempt + 1}] with code=${code}, msg="${err.message}"`
      );
      if (
        code === status.UNAVAILABLE ||
        code === status.DEADLINE_EXCEEDED
      ) {
        console.log('Re-initializing client and retrying...');
        initClient();
      } else {
        // Non-retryable error, just throw
        throw err;
      }
    }
  }
  throw new Error('Max retries reached.');
}

async function main() {
  initClient();

  let count = 0;
  // Ping every 50 ms so the leak becomes visible quickly
  setInterval(async () => {
    count++;
    try {
      const response = await pingWithRetry(`Hello #${count}`, 5);
      console.log('Got response:', response.message);
    } catch (err) {
      console.error('Failed after retries:', err.message);
    }
  }, 50);
}

main();
ping.proto
syntax = "proto3";
package pingpong;
service PingService {
rpc Ping(PingRequest) returns (PingResponse);
}
message PingRequest {
string message = 1;
}
message PingResponse {
string message = 1;
}
Now install the dependencies:
$ npm install @grpc/grpc-js @grpc/proto-loader
Run the server
node server.js
Run the client
node --inspect client.js
The client will start logging responses, with the simulated UNAVAILABLE error showing up every tenth call.
Observe memory usage over time (e.g., using DevTools): open DevTools, keep taking heap snapshots, and you will see memory grow continuously.
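If DevTools isn't convenient, heap growth can also be observed from inside the process. This is a minimal sketch, not part of the reproduction above; the 5-second interval is an arbitrary choice:

// Paste into client.js: logs V8 heap usage periodically so growth is visible without DevTools.
setInterval(() => {
  const { heapUsed, heapTotal } = process.memoryUsage();
  console.log(
    `heapUsed=${(heapUsed / 1024 / 1024).toFixed(1)} MB, ` +
    `heapTotal=${(heapTotal / 1024 / 1024).toFixed(1)} MB`
  );
}, 5000);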
The memory leak originates in this part of client.js:
function initClient() {
  console.log('Initializing client');
  client = new pingProto.PingService(
    'localhost:50051',
    grpc.credentials.createInsecure()
  );
}

...

if (
  code === status.UNAVAILABLE ||
  code === status.DEADLINE_EXCEEDED
) {
  console.log('Re-initializing client and retrying...');
  initClient(); // <--- this causes client reinitialization
} else {
  // Non-retryable error, just throw
  throw err;
}
The problem is that initClient creates a brand-new client without closing the previous one.
When the previous client is not closed, application-level code stops holding references to it, but its internal objects become "detached", mostly because Node.js timers keep holding references to them. As the official documentation puts it: "Objects retained by detached nodes: objects that are kept alive because a detached DOM/object node references them."
These detached nodes stay in memory indefinitely. In our case, they were adding up to roughly 100 MB per day until the server crashed.
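To make the mechanism concrete, here is a small, gRPC-free illustration (hypothetical code, not taken from the library) of how an active timer alone keeps an otherwise unreferenced object alive:

// An object captured by a live timer cannot be garbage-collected,
// even after application code drops every reference to it.
function leakOne() {
  const big = new Array(1_000_000).fill('x'); // a few MB on the JS heap
  setInterval(() => {
    // The closure keeps `big` reachable for as long as the timer is active.
    void big.length;
  }, 1000);
}

// Each call pins another few MB; clearing the interval (or, in the gRPC case,
// calling client.close()) is what releases the memory.
leakOne();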
I know this has been discussed previously here.
The obvious fix for this problem is to close the previous client before initializing a new one:
function initClient() {
  if (client) {
    console.log('Closing previous client');
    client.close();
  }

  console.log('Initializing client');
  client = new pingProto.PingService(
    'localhost:50051',
    grpc.credentials.createInsecure()
  );
}
After closing the previous client, memory usage becomes stable.
Environment
- OS name, version and architecture: Apple M4 Max
- Node version: v20.18.0
- Node installation method: nvm
- Package name and version: "@grpc/grpc-js": "^1.12.4"
Additional context
Although this is not a bug in @grpc/grpc-js, it could be helpful if the library:
- Logged a warning when multiple active clients are detected (optionally suppressible).
- Offered a singleton-like pattern or documented best practices for client lifecycle management (see the sketch below).
These measures could prevent unintentional client reinitialization without proper cleanup, which can be hard to track down in large applications.
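For illustration only, the kind of wrapper that could be documented might look like the sketch below. The getClient/resetClient names and the module layout are hypothetical, not part of @grpc/grpc-js; the point is simply that the old client is always closed before a replacement can be created:

// client-manager.js (hypothetical module; names are illustrative only)
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');
const path = require('path');

const packageDefinition = protoLoader.loadSync(path.join(__dirname, 'ping.proto'));
const pingProto = grpc.loadPackageDefinition(packageDefinition).pingpong;

let client = null;

// Returns the single shared client, creating it lazily on first use.
function getClient() {
  if (!client) {
    client = new pingProto.PingService(
      'localhost:50051',
      grpc.credentials.createInsecure()
    );
  }
  return client;
}

// Closes the current client (if any) so the next getClient() call creates a fresh one.
function resetClient() {
  if (client) {
    client.close();
    client = null;
  }
}

module.exports = { getClient, resetClient };

With such a wrapper, the retry path calls resetClient() instead of constructing a second client directly, so at most one client exists at a time.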
Our debugging process was complicated by the fact that profiling tools like Datadog don't highlight detached nodes. It took some trial and error to trace the leak to unclosed gRPC client instances. We hope this helps others who might face similar issues.
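If a production profiler doesn't surface detached nodes, one fallback is to capture heap snapshots straight from the running process and diff them in Chrome DevTools. A minimal sketch using Node's built-in v8 module (the choice of SIGUSR2 as the trigger is arbitrary):

// Add to the process you want to inspect; trigger with: kill -USR2 <pid>
const v8 = require('v8');

process.on('SIGUSR2', () => {
  // Writes a .heapsnapshot file into the current working directory and returns its name.
  const file = v8.writeHeapSnapshot();
  console.log(`Heap snapshot written to ${file}`);
});

Loading two snapshots taken a few minutes apart into the DevTools Memory tab and comparing them is usually enough to see which objects keep accumulating.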
If there's anything more we can clarify or test, please let us know. Thank you again for maintaining this library and for considering these suggestions.