Performance regression (>10x) from 4.1.2 to 4.1.3

Our users observed a significant performance regression (10x and more) and it turns out to be introduced somewhere between 4.1.2 and 4.1.3 (release on 29th September 2021). The `awkward` version is the latest `1.8.0`.

I went through the release notes in https://github.com/scikit-hep/uproot4/releases/tag/4.1.3 and I first reverted https://github.com/scikit-hep/uproot4/pull/441 which also needed the `objectarray1d` function from `uproot._util` (see https://github.com/scikit-hep/uproot4/pull/442/files) but it did not help.

I also went through the other PRs but they seemed unrelated, so I started a `git bisect` with

Good: `382bb62f47a05f6111d3787e3235f9e56d9e113e` (4.1.2)
Bad: `03cb08ccb1f47ce0b082117f41a112af1ab403ee` (4.1.3)

and identified the https://github.com/scikit-hep/uproot4/commit/007b10014003a74d0e89e0166e21a6218de0d445 from @nsmith- which introduced the performance regression:

```
░ tamasgal@silentbox-(2):uproot4   (git)-[4.1.3~8|bisect]- py-system
░ 13:30:56 > git bisect good
007b10014003a74d0e89e0166e21a6218de0d445 is the first bad commit
commit 007b10014003a74d0e89e0166e21a6218de0d445
Author: Nicholas Smith <nick.smith@cern.ch>
Date:   Mon Sep 20 14:09:59 2021 -0500

    Implement regular array support for TClonesArray (#442)

    * Implement regular array support for TClonesArray

    * Add a small test

    * Fix issue with incidentally regular TRefArray lists

    * Cannot trust dict order

 src/uproot/_util.py                     | 13 +++++++++
 src/uproot/containers.py                | 45 ++++++++++++++++++++++---------
 src/uproot/interpretation/identify.py   |  4 +--
 src/uproot/interpretation/library.py    | 20 +++++++++++---
 src/uproot/interpretation/numerical.py  | 11 ++++++++
 src/uproot/interpretation/objects.py    | 48 +++++++++++++++++++++++++++------
 tests/test_0442-regular-TClonesArray.py | 40 +++++++++++++++++++++++++++
 7 files changed, 155 insertions(+), 26 deletions(-)
 create mode 100644 tests/test_0442-regular-TClonesArray.py
```

I investigated our codes and found the exact call which causes the problem: it's indeed the reading of doubly nested arrays.

With `uproot 4.1.2`, the following code is almost instantaneous whereas with `uproot 4.1.3` it takes about 1.5 minutes (for a 1.5GB file):

```
>>> import uproot
>>> f = uproot.open("doubly_nested_regression.root")
>>> f["E/Evt/trks/trks.rec_stages"].array()
<Array [[[1, 2, 5, 3, 5, 4], ... 1], [1], [1]]] type='158262 * var * var * int32'>
```

Here is the file http://131.188.161.12:30002/doubly_nested_regression.root

@jpivarski you probably remember that you introduced some special hack to squeeze a lot out of doubly-nested structures, see https://github.com/scikit-hep/awkward-1.0/issues/968 and https://github.com/scikit-hep/awkward-1.0/blob/main/src/awkward/_connect/_uproot.py#L35 so my first thought was that these became ineffective with the changes introduced in https://github.com/scikit-hep/uproot4/commit/007b10014003a74d0e89e0166e21a6218de0d445 but I am just speculating `; )`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Performance regression (>10x) from 4.1.2 to 4.1.3 #572

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Performance regression (>10x) from 4.1.2 to 4.1.3 #572

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions