Improve performance of loading json graph profile #4
I had an issue where profiling my project with mkcheck generated a JSON graph file of about 40 MB, and even the simplest of the Python tool operations, like `list`, would take over 8 min 27 s to complete.
I profiled the complete Python invocation using Scalene and found that it wasn't a CPU-only bottleneck, but a memory one in `parse_graph()`. The `inputs` and `outputs` variables are sets, and they are filled in through the loop with the `|` (union) operator, which returns a new set with elements from the set and all others. The complete object was copied into a new one at each iteration. This is why the Scalene profile showed a peak memory allocation of 149 GB for that single line (it probably wasn't all in use at once, but accumulated over the repeated assignments throughout the loop iterations).
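The copying behavior described above can be illustrated with a minimal sketch (hypothetical sets, not the actual mkcheck code):

```python
inputs = set()
for i in range(3):
    before = inputs
    # `|` builds a brand-new set on every iteration; the old object's
    # contents are copied over and the old object is discarded
    inputs = inputs | {i}
    assert inputs is not before

assert inputs == {0, 1, 2}
```

With large sets this turns each loop iteration into a full copy of everything accumulated so far, which explains the memory churn.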
The solution is to use the `update()` method, which updates the set in place, adding elements from all the others: the same object is reused and simply gains the new elements. Also, since the `proc_in` variable wasn't used anywhere else, I inlined the call, removing an extra set instantiation.

After this, I made a small change that doesn't help as much: using `json.load(f)` instead of `json.loads(f.read())`. The function `json.loads()` parses a string (the one returned by `f.read()`), while `json.load()` reads JSON directly from the file object.

With these changes, the runtime went from 8 min 27 s to 26 s, which is far more manageable.