@jmcvey3
Contributor

@jmcvey3 jmcvey3 commented Feb 6, 2025

Makes the ADCP cleaning functions more robust - updated based on latest reader updates for dual profiling instruments. Solution for Issue #373

Contributor

@ssolson ssolson left a comment

@jmcvey3 thanks for submitting this. I think most of these questions are for me but maybe I found something helpful.

Comment on lines +160 to +169
+ # Use "avg" velocty if standard isn't available.
+ # Should not matter which is used.
+ tag = []
+ if hasattr(ds, "vel"):
+     tag += [""]
+ if hasattr(ds, "vel_avg"):
+     tag += ["_avg"]

  # This finds the maximum of the echo profile:
- inds = np.argmax(ds["amp"].values, axis=1)
+ inds = np.argmax(ds["amp" + tag[0]].values, axis=1)
Contributor

tag[0] will raise an IndexError if neither variable is present, since tag is initialized as an empty list.

I think this needs either a fallback, such as an else branch that sets tag = [""] (or initializing tag as an empty string and only adding "_avg"), or explicit error handling:
raise ValueError("Neither 'vel' nor 'vel_avg' found in dataset")

Contributor Author

Good thoughts; however, if "vel" doesn't exist, "amp" and "corr" will not exist either. Signal amplitude and correlation are the quality metrics of the velocity ping's signal, so they only exist alongside it.
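
For reference, the explicit-error option raised in this thread could look like the sketch below. It is hypothetical: `select_tag` is not a function in the PR, and the sketch assumes only that the dataset supports `in` lookups by variable name, as a plain `dict` does (and, to my understanding, an `xarray.Dataset`).

```python
from collections.abc import Mapping


def select_tag(ds: Mapping) -> str:
    """Return "" if "vel" is present, else "_avg" if "vel_avg" is present.

    Hypothetical helper, not part of the PR.
    """
    if "vel" in ds:
        return ""
    if "vel_avg" in ds:
        return "_avg"
    # Neither velocity variable exists, so fail with a clear message
    # instead of an IndexError from indexing an empty list.
    raise ValueError("Neither 'vel' nor 'vel_avg' found in dataset")


# Stand-in for a dataset that only holds the averaged profile.
ds = {"vel_avg": [0.1, 0.2], "amp_avg": [80, 75]}
tag = select_tag(ds)
amp_name = "amp" + tag
```

With this shape, the variable name lookups become `"amp" + tag` rather than `"amp" + tag[0]`, and a dataset with no velocity data fails early with a readable message.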

  ds["depth"] = xr.DataArray(
      d.astype("float32"),
-     dims=["time"],
+     dims=["time" + tag[0]],
Contributor

Are we guaranteed to have a "time_avg" dimension whenever we have "vel_avg"?

Contributor Author

@jmcvey3 jmcvey3 Feb 14, 2025

Yes, I hardcoded it. The data stored under the "averaging" ID has its own timestamp, and I log "time" from that data ID with the "_avg" tag.

Comment on lines 188 to 194
  d = np.median(D, axis=0)

  # Throw out values that do not increase near the surface by *thresh*
- for ip in range(ds["vel"].shape[1]):
+ for ip in range(ds["vel" + tag[0]].shape[1]):
      itmp = np.min(inds[:, ip])
      if (edf[itmp:, :, ip] < thresh).all():
          d[ip] = np.nan
Contributor

On line 188 you take the median of d1 and d2, and I notice that on line 194 you write nan into the array.

Will d1 or d2 ever contain nan? If so, np.median returns nan wherever its input contains one. Is that the behavior you want, or would np.nanmedian be preferred?

Contributor Author

Hmmm, this is a Levi function... I'll switch all of those median, min, and max calls to their nan-aware versions, because if this is called after another QC function, that would be a problem.
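
To illustrate the difference discussed here, a small sketch with made-up numbers: the arrays `d1` and `d2` below are hypothetical stand-ins for the two depth estimates, and only the `np.median` / `np.nanmedian` behavior is the point.

```python
import numpy as np

# Two hypothetical depth estimates; d1 already holds a NaN,
# e.g. from an earlier QC step.
d1 = np.array([10.0, 10.2, np.nan, 10.1])
d2 = np.array([10.4, 10.0, 10.3, 10.5])
D = np.vstack((d1, d2))

# np.median propagates NaN: the third element becomes NaN.
d_plain = np.median(D, axis=0)

# np.nanmedian ignores NaN: the third element falls back to d2's value.
d_nan = np.nanmedian(D, axis=0)
```

Note that `np.nanmedian` still returns NaN (with a warning) for a slice that is entirely NaN, so downstream code should not assume the result is NaN-free.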

  # Density calcation
- P = ds["pressure"].values
  T = ds["temp"].values # temperature, degC
+ P = ds[pressure[0]].values # pressure, dbar
Contributor

Do all instruments use dbar (rather than Pa)? Or is it well described in the examples? Should this be added to the docstring?

Contributor Author

Yes, they all use dbar because it translates nearly 1:1 to meters beneath the surface (assuming the pressure sensor was zeroed before deploying, of course).
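
The "nearly 1:1" relationship can be checked with the hydrostatic relation depth = P / (rho * g). The sketch below is illustrative only: the function name and the nominal density and gravity values are assumptions, not taken from the PR.

```python
RHO_SEAWATER = 1025.0  # kg/m^3, assumed nominal seawater density
G = 9.81               # m/s^2, assumed gravitational acceleration
PA_PER_DBAR = 1.0e4    # 1 dbar = 10 kPa


def dbar_to_depth_m(p_dbar, rho=RHO_SEAWATER, g=G):
    """Hydrostatic depth in meters for a gauge pressure in dbar."""
    return p_dbar * PA_PER_DBAR / (rho * g)


depth = dbar_to_depth_m(1.0)  # roughly 0.99 m per dbar
```

The actual ratio shifts slightly with water density, which is presumably why the function computes density from temperature and salinity rather than assuming a constant.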

Comment on lines 393 to 404
# Fetch cell size
cs = [
    a
    for a in ds.attrs
    if (
        ("cell_size" in a)
        and ("_bt" not in a)
        and ("_alt" not in a)
        and ("wave" not in a)
    )
]

Contributor

Should this raise an error when the list comes back empty, e.g.:
raise KeyError("No valid 'cell_size' attribute found in dataset.")

Or will there always be a "cell_size"?

Contributor Author

@jmcvey3 jmcvey3 Feb 14, 2025

There will always be a "cell_size" if the user doesn't remove it. I'll add a code block for user input if need be.
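
A sketch of the guarded lookup discussed in this thread, for the case where a user has removed the attribute. The helper name and the example attribute names are hypothetical; the filter mirrors the list comprehension quoted above, and a plain `dict` stands in for `ds.attrs`.

```python
def find_cell_size_keys(attrs):
    """Return attribute names that hold the profile cell size.

    Hypothetical helper, not part of the PR. Bottom-track, altimeter,
    and wave entries are skipped, as in the quoted comprehension.
    """
    excluded = ("_bt", "_alt", "wave")
    keys = [
        a
        for a in attrs
        if "cell_size" in a and not any(x in a for x in excluded)
    ]
    if not keys:
        raise KeyError("No valid 'cell_size' attribute found in dataset.")
    return keys


# Stand-in for ds.attrs with made-up attribute names and values.
attrs = {"cell_size": 0.5, "cell_size_bt": 1.0, "wave_cell_size": 2.0, "fs": 4}
keys = find_cell_size_keys(attrs)
```

The same check could instead fall back to a user-supplied value, which is closer to the "code block for user input" mentioned in the reply.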

@akeeste
Contributor

akeeste commented Feb 20, 2025

With Sterling's commits to develop pulled in, tests are now passing.

@akeeste
Contributor

akeeste commented Feb 25, 2025

@jmcvey3 are there any other outstanding changes in this PR from @ssolson's review?

@jmcvey3
Contributor Author

jmcvey3 commented Feb 25, 2025

> @jmcvey3 are there any other outstanding changes in this PR from @ssolson's review?

I was waiting on a response in issue #373, but looks like we're good to go.

@jmcvey3 jmcvey3 merged commit 7e91cea into MHKiT-Software:develop Feb 25, 2025
43 checks passed
@jmcvey3 jmcvey3 deleted the clean_avg branch February 25, 2025 17:11
akeeste added a commit to akeeste/MHKiT-Python that referenced this pull request Sep 30, 2025
Makes the ADCP cleaning functions more robust - updated based on latest
reader updates for dual profiling instruments. Solution for Issue MHKiT-Software#373

---------

Co-authored-by: akeeste <[email protected]>
@akeeste akeeste mentioned this pull request Sep 30, 2025
akeeste added a commit that referenced this pull request Sep 30, 2025
v1.0.0
# MHKiT v1.0.0
## New Features
* Sound Exposure Level by @jmcvey3 in #388
* Add discharge function to MHKiT by @jmcvey3 in #385

## Functionality enhancements
* Fix for corrupted Nortek files by @jmcvey3 in #372
* Update integral length scale function by @jmcvey3 in #376
* Fix ever-changing RDI RiverPro depth bin ranges by @jmcvey3 in #378
* Allow clean functions to handle _avg variables by @jmcvey3 in #377
* IEC TS 62600 updates by @akeeste in #382
* MLER explanation updates/corrections by @rgcoe in #393
* Improve Nortek2 index file creator functions by @jmcvey3 in #397
* Read Sentinel V specific data packets by @jmcvey3 in #396
* Short list of VMDAS updates by @jmcvey3 in #405
* Allow user to specify universal Kolmogorov constant for TKE dissipation rate function by @jmcvey3 in #406
* Nortek Dual Profile Dataset Rotation by @jmcvey3 in #414

## Source code improvements
* Lint Tidal by @ssolson in #386
* Lint river module by @ssolson in #389
* Lint hindcast by @ssolson in #398
* Modernize Package Configuration by @ssolson in #400
* Configure specific warnings by @ssolson in #401

## Bug fixes
* Avoid failing to scan very large files by @jmcvey3 in #371
* Acoustics SPL bugfix by @jmcvey3 in #379
* DOLfYN/RDI: Set `fs` to NaN when typical calculation methods yield error (#408) by @simmsa in #409

## Testing and Continuous Integration Updates
* Fix Jupyter Notebook tests running Python 3.13 by @ssolson in #380
* CI Test Clean Up: Mock USGS, Acoustic Tolerances by @ssolson in #404
* Speed up tests with concurrency checks to prevent duplicate workflows on PRs from develop into main or from main into develop by @akeeste
* Define MPLBACKEND to decrease intermittent matplotlib errors in tests by @akeeste

## Documentation and Examples
* Add WEC-Sim power performance example by @akeeste in #395
* Update dolfyn function docstrings and associated notebooks by @jmcvey3 in #412
* Update examples by @akeeste in #417
* Update installation instructions in README.md by @akeeste
* Adjust acoustics test tolerances by @akeeste in #420

**Full Changelog**: v0.9.0...v1.0.0
3 participants