Skip to content

Conversation

@simmsa
Copy link
Contributor

@simmsa simmsa commented Aug 4, 2025

Summary

Fix #408: ZeroDivisionError when reading RDI burst mode data files where sec_between_ping_groups=0 causes division by zero in sampling frequency calculation.

Changes

  • Added check for sec_between_ping_groups == 0 before calculation of fs.

    • When true, set fs to NaN with descriptive warning.
  • Added test using unittest.mock.patch to verify fix behavior with burst mode configuration.

  • Fixed pytest-asyncio deprecation warning in test configuration.

Details

Per #408 RDI instruments in continuous burst mode set sec_between_ping_groups=0 for maximum ping rate, causing ZeroDivisionError in:

cfg["fs"] = 1 / (cfg["sec_between_ping_groups"] * cfg["pings_per_ensemble"])

The fix detects this condition and sets fs=NaN since burst mode operation with variable timing prevents reliable sampling rate determination using standard methods.

Testing

Patch RDIReader.finalize to set sec_between_ping_groups to 0 and verify the fix prevents the ZeroDivisionError, sets fs to np.nan and warns the user.

simmsa added 5 commits August 4, 2025 09:49
MHKiT-Software#408)

Resolves issue MHKiT-Software#408
where RDI Pinnacle 45 instruments in continuous burst mode (sec_between_ping_groups=0)
caused ZeroDivisionError when calculating sampling frequency.

Changes:
- Add check for zero values before division in rdi.py:1047
- Set fs to NaN with descriptive warning when division by zero would occur
- Add test case covering both zero conditions and normal operation

The fix handles the edge case where burst mode instruments output zero for
sec_between_ping_groups or pings_per_ensemble, which indicates variable
ping rates where no meaningful average sampling rate can be calculated.
Add asyncio_default_fixture_loop_scope setting to fix pytest-asyncio
configuration warning during test execution.
* Only check `sec_between_ping_groups` is zero.
* Point users to github issue
@simmsa simmsa requested a review from jmcvey3 August 4, 2025 19:33
@simmsa simmsa changed the title DOLFyN/RDI: Set fs to NaN when typical calculation methods yield error (#408) DOLfYN/RDI: Set fs to NaN when typical calculation methods yield error (#408) Aug 4, 2025
@jmcvey3
Copy link
Contributor

jmcvey3 commented Aug 6, 2025

Since the "fs" attribute is critical for many dolfyn functions, I've assumed it cannot be set to nan. It's annoying that RDI does not explicitly set a sampling rate, but I've been trying to log the different ways that it can be recorded. So if "sec_between_ping_groups" = 0, (as you say, any instrument doing burst measurements), there needs to be another if statement for figuring out what the sampling rate is. Might be worthwhile just adding this as another condition to the previous "if" clause.

@simmsa
Copy link
Contributor Author

simmsa commented Aug 6, 2025

@jmcvey3, understood on not setting cfg['fs'] to np.nan. I was unsure if using an averaging method to calculate sample rate was correct for this edge case, but it sounds like it is. The code was updated accordingly. Does this seem like a better fix?

@simmsa simmsa marked this pull request as ready for review August 6, 2025 20:10
@jmcvey3
Copy link
Contributor

jmcvey3 commented Aug 11, 2025

@jmcvey3, understood on not setting cfg['fs'] to np.nan. I was unsure if using an averaging method to calculate sample rate was correct for this edge case, but it sounds like it is. The code was updated accordingly. Does this seem like a better fix?

Yes, it's the best idea I have for a fix.
I use median instead of mean because it's better at ignoring corrupted pings or instances where the instrument's battery dies (instead of immediately shutting down, it starts slowing down its ping rate), etc. Downside of that is you can get an irrational number for a sampling frequency (e.g., 0.499723937 instead of 0.5)

@simmsa
Copy link
Contributor Author

simmsa commented Aug 13, 2025

@jmcvey3 thanks for the feedback. The median solution seems like a good compromise for this specific situation.

I think this is ready to go, is there anything else needed in this PR?

@jmcvey3
Copy link
Contributor

jmcvey3 commented Aug 15, 2025

@jmcvey3 thanks for the feedback. The median solution seems like a good compromise for this specific situation.

I think this is ready to go, is there anything else needed in this PR?

Let me check to see if any of our current test files actually have this case. I can't remember, but the SentinelV test files I added might.

cfg["fs"] = round(1 / np.median(np.diff(dat["coords"]["time"])), 2)
calculate_sample_rate_from_time_diff = (
cfg.get("source_program", "").lower() in ["vmdas", "winriver", "winriver2"]
or "sentinelv" in cfg["inst_model"].lower()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't pull your branch, but please remove this line 1043. It's no longer necessary and your solution is cleaner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added you as a collaborator, you should be able to push changes now. The latest comments line 1043, are you indicating that line should be removed entirely? I can happily remove that line, I just want to make sure I am doing the right thing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you can remove all of or "sentinelv" in cfg["inst_model"].lower(). The tests should still pass

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, removed that line.

@akeeste akeeste merged commit c0bd56e into MHKiT-Software:develop Aug 18, 2025
54 checks passed
akeeste pushed a commit to akeeste/MHKiT-Python that referenced this pull request Sep 30, 2025
…ror (MHKiT-Software#408) (MHKiT-Software#409)

## Summary

Fix MHKiT-Software#408: ZeroDivisionError when reading RDI burst mode data files where
`sec_between_ping_groups=0` causes division by zero in sampling
frequency calculation.

## Changes

- Added check for `sec_between_ping_groups == 0` before calculation of
`fs`.

  - When true, set `fs` to `NaN` with descriptive warning.

- Added test using `unittest.mock.patch` to verify fix behavior with
burst mode configuration.

- Fixed pytest-asyncio deprecation warning in test configuration.

## Details

Per MHKiT-Software#408 RDI instruments in continuous burst mode set
`sec_between_ping_groups=0` for maximum ping rate, causing
ZeroDivisionError in:

```python
cfg["fs"] = 1 / (cfg["sec_between_ping_groups"] * cfg["pings_per_ensemble"])
```

The fix detects this condition and sets fs=NaN since burst mode
operation with variable timing prevents reliable sampling rate
determination using standard methods.

## Testing

Patch RDIReader.finalize to set `sec_between_ping_groups` to `0` and
verify the fix prevents the `ZeroDivisionError`, sets `fs` to `np.nan`
and warns the user.
@akeeste akeeste mentioned this pull request Sep 30, 2025
akeeste added a commit that referenced this pull request Sep 30, 2025
v1.0.0
# MHKiT v1.0.0
## New Features
* Sound Exposure Level by @jmcvey3 in #388
* Add discharge function to MHKiT by @jmcvey3 in #385

## Functionality enhancements
* Fix for corrupted Nortek files by @jmcvey3 in #372
* Update integral length scale function by @jmcvey3 in #376
* Fix ever-changing RDI RiverPro depth bin ranges by @jmcvey3 in #378
* Allow clean functions to handle _avg variables by @jmcvey3 in #377
* IEC TS 62600 updates by @akeeste in #382
* MLER explanation updates/corrections by @rgcoe in #393
* Improve Nortek2 index file creator functions by @jmcvey3 in #397
* Read Sentinel V specific data packets by @jmcvey3 in #396
* Short list of VMDAS updates by @jmcvey3 in #405
* Allow user to specify universal Kolmogorov constant for TKE dissipation rate function by @jmcvey3 in #406
* Nortek Dual Profile Dataset Rotation by @jmcvey3 in #414

## Source code improvements
* Lint Tidal by @ssolson in #386
* Lint river module by @ssolson in #389
* Lint hindcast by @ssolson in #398
* Modernize Package Configuration by @ssolson in #400
* Configure specific warnings by @ssolson in #401

## Bug fixes
* Avoid failing to scan very large files by @jmcvey3 in #371
* Acoustics SPL bugfix by @jmcvey3 in #379
* DOLfYN/RDI: Set  `fs` to NaN when typical calculation methods yield error (#408) by @simmsa in #409

## Testing and Continuous Integration Updates
* Fix Jupyter Notebook tests running Python 3.13 by @ssolson in #380
* CI Test Clean Up: Mock USGS, Acoustic Tolerances by @ssolson in #404
* Speed up tests with concurrency checks to prevent duplicate workflows on PRs from develop into main or from main into develop by @akeeste
* Define MPLBACKEND to decrease intermittent matplotlib errors in tests by @akeeste

## Documentation and Examples
* Add WEC-Sim power performance example  by @akeeste in #395
* Update dolfyn function docstrings and associated notebooks by @jmcvey3 in #412
* Update examples by @akeeste in #417
* Update installation instructions in README.md by @akeeste
* Adjust acoustics test tolerances by @akeeste in #420

**Full Changelog**: v0.9.0...v1.0.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants