Hannibal404

This commit expands upon the existing InfiniBand PMDA to collect more metrics to enhance diagnostic and monitoring capabilities.

Authored-by: Mohith Kumar Thummaluru <[email protected]>

Signed-off-by: Pradyumn Rahar <[email protected]>
@Hannibal404 changed the title from "Add collection of more metrics to InfiniBand PMDA" to "PCP: PMDA: Add collection of more metrics to InfiniBand PMDA" on Aug 25, 2025
@natoscott
Member

@Hannibal404 @mohith-kumar-thummaluru hi! Thanks for contributing these changes. It looks like the existing InfiniBand code has no tests :( ... would you be able to add some along with these changes?

Looking at the code for the first time just now, one possible approach would be to create "dummy" IB libraries that provide "fake" data to the PMDA, allowing all the code paths to be tested on any hardware (especially in CI, where InfiniBand hardware is relatively exotic and usually absent). A second, simpler test could also be written that checks for the presence of the required kernel and hardware features, and does a basic agent Install/Remove cycle.
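To make the "dummy library" idea concrete, here is a minimal sketch of what such a stand-in might look like. It assumes the PMDA discovers adapters through libibumad's `umad_init` and `umad_get_cas_names`; the signatures below mirror those functions as commonly declared in `<infiniband/umad.h>`, and should be checked against the installed header. The adapter names and the locally defined `UMAD_CA_NAME_LEN` value are illustrative assumptions, not taken from the PCP tree.

```c
/* Sketch of a stand-in libibumad for tests: canned data, no hardware needed. */
#include <stdio.h>

/* Assumption: mirrors UMAD_CA_NAME_LEN from <infiniband/umad.h>;
 * a real stub would take the value from that header instead. */
#define UMAD_CA_NAME_LEN 20

/* Hypothetical canned adapter names handed to the PMDA under test. */
static const char *fake_cas[] = { "fake_ib0", "fake_ib1" };
#define FAKE_CA_COUNT ((int)(sizeof(fake_cas) / sizeof(fake_cas[0])))

/* Stub: pretend the user MAD layer initialised successfully. */
int umad_init(void)
{
    return 0;
}

/* Stub: copy up to max canned names into cas and return the number copied. */
int umad_get_cas_names(char cas[][UMAD_CA_NAME_LEN], int max)
{
    int i;
    int n = (max < FAKE_CA_COUNT) ? max : FAKE_CA_COUNT;

    for (i = 0; i < n; i++)
        snprintf(cas[i], UMAD_CA_NAME_LEN, "%s", fake_cas[i]);
    return (n < 0) ? 0 : n;
}
```

Built as a shared object and placed ahead of the real library on the loader path (for example via `LD_LIBRARY_PATH` or `LD_PRELOAD`), a stub along these lines would let a test drive the PMDA on a machine with no InfiniBand hardware. The remaining entry points the PMDA calls, such as the port and counter queries, would need similar canned implementations.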

@natoscott
Member

natoscott commented Aug 25, 2025

Looks like the use of <infiniband/verbs.h> is problematic too - not present on some distros? (this is the reason for the CI failures here)

nathans@debian:~/git/pcp$ ls /usr/include/infiniband/
mad.h      umad_cm.h  umad_sa.h      umad_sm.h   umad_types.h
mad_osd.h  umad.h     umad_sa_mcm.h  umad_str.h
nathans@debian:~/git/pcp$ cat /etc/debian_version 
12.11
nathans@debian:~/git/pcp$ 

I expect configure.ac will need some additional macro(s) in the pmda_infiniband section that can toggle aspects of the build.
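On the C side, such a toggle could be consumed roughly as sketched below. This assumes configure.ac gains a header check (for instance `AC_CHECK_HEADERS([infiniband/verbs.h])`) that defines `HAVE_INFINIBAND_VERBS_H` when the header is found; that macro name follows the usual autoconf convention but is an assumption here, and `IB_HAVE_VERBS` and `ib_verbs_device_count` are hypothetical names used only for illustration.

```c
/* Assumption: configure defines HAVE_INFINIBAND_VERBS_H when the header exists. */
#ifdef HAVE_INFINIBAND_VERBS_H
#include <infiniband/verbs.h>
#define IB_HAVE_VERBS 1
#else
#define IB_HAVE_VERBS 0
#endif

/*
 * Hypothetical helper: number of verbs devices visible to the process,
 * or -1 when the PMDA was built without libibverbs support.
 */
int ib_verbs_device_count(void)
{
#if IB_HAVE_VERBS
    int n = 0;
    struct ibv_device **list = ibv_get_device_list(&n);

    if (list == NULL)
        return 0;               /* lookup failed or no devices present */
    ibv_free_device_list(list);
    return n;
#else
    return -1;                  /* verbs-based metrics compiled out */
#endif
}
```

With a guard like this, a build on a distro that ships only the umad/mad headers (as in the Debian 12 listing above) would still compile, and the verbs-based metrics could report "not supported" instead of breaking CI.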
