Open
Labels
bug: Something isn't working
Description
Have you tried latest version of polars?
- [yes]
What version of polars are you using?
0.7.3
What operating system are you using polars on?
macOS 13.0
What node version are you using?
node 18.6.0
Describe your bug.
Passing data to nodejs-polars is unreasonably slow and uses an unreasonably large amount of memory.
With the example code below, each run takes about 20 seconds and memory usage balloons to around 3.5 GB.
When doing a simple copy of the same data in plain JS (see the JS replacement further down), the operation takes 10-20 ms and peak memory usage is around 325 MB.
What are the steps to reproduce the behavior?
const { DataFrame, Series } = require("nodejs-polars")

function runBuggyCode(entries) {
  let df = new DataFrame(entries)
  const timestampSeries = new Series("created_at", new Array(df.height).fill(Date.now()))
  df = df.withColumn(timestampSeries)
  df.writeParquet()
}

async function main() {
  const data = Array(50000)
    .fill(null)
    .map((_, i) =>
      Array(100)
        .fill(0)
        .map((_, ii) => i * 100 + ii)
    )
  let count = 0
  let peakMemUsage = 0
  while (true) {
    const start = Date.now()
    runBuggyCode(data)
    console.log("process time:", Date.now() - start, "ms")
    peakMemUsage = Math.max(process.memoryUsage().rss, peakMemUsage)
    console.log("peak rss mem usage:", Math.round(peakMemUsage / 1024 ** 2), "MB")
    console.log("run:", count++)
    await new Promise((res) => setTimeout(res, 100))
  }
}

if (require.main === module) {
  void main().catch((err) => console.error(err))
}
Example JS replacement for comparison:
function runBuggyCode(entries) {
  const copy = entries.map(row => row.slice())
}
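To compare the two variants under identical conditions, a small timing helper could wrap either version of the function. This is only a sketch: `measure` is a hypothetical helper that is not part of nodejs-polars, it uses only Node built-ins, and the RSS delta it reports is a rough indicator (it can even be negative if memory is released during the call).

```javascript
// Hypothetical helper (not part of nodejs-polars): times a synchronous
// function and reports how much the process RSS changed while it ran.
function measure(fn) {
  const rssBefore = process.memoryUsage().rss
  const start = process.hrtime.bigint()
  fn()
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6
  const rssDeltaMB = (process.memoryUsage().rss - rssBefore) / 1024 ** 2
  return { elapsedMs, rssDeltaMB }
}

// Smaller data set than in the report, just to exercise the helper.
const sampleData = Array.from({ length: 1000 }, (_, i) =>
  Array.from({ length: 100 }, (_, ii) => i * 100 + ii)
)

// Measure the plain-JS copy; the nodejs-polars variant could be passed the same way.
const sampleResult = measure(() => sampleData.map((row) => row.slice()))
console.log(sampleResult)
```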
What is the actual behavior?
The code runs as expected, but it uses what seems to be an unreasonable amount of memory.
What is the expected behavior?
Memory usage should be roughly comparable to the plain-JS baseline, perhaps 2x the usage for the same data when not using the library.
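For a sense of scale, the raw size of the test data can be estimated with simple arithmetic. The sketch below assumes each value is stored as an 8-byte number (for example Float64 or Int64); the actual dtype chosen by the library is not confirmed here, and `estimateRawSizeMB` is a name made up for illustration.

```javascript
// Rough estimate of the raw columnar size of the test data.
// Assumption: every value occupies 8 bytes (e.g. Float64 or Int64).
const ROWS = 50000
const COLS = 100
const BYTES_PER_VALUE = 8

function estimateRawSizeMB(rows, cols, bytesPerValue) {
  return (rows * cols * bytesPerValue) / 1024 ** 2
}

const rawSizeMB = estimateRawSizeMB(ROWS, COLS, BYTES_PER_VALUE)
console.log("raw data size:", rawSizeMB.toFixed(1), "MB") // prints "raw data size: 38.1 MB"
```

Under that assumption the 5,000,000 values amount to roughly 38 MB, well below both the 325 MB plain-JS peak and the 3.5 GB observed with the library.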