-
Couldn't load subscription status.
- Fork 449
prov/ucx: Fix completion entries insertion for RMA operations. #11493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
tatarintsevsv
commented
Oct 12, 2025
- Ignore the absence of the FI_COMPLETION flag if CQ configured without FI_SELECTIVE_COMPLETION flag.
- Error completions must generated for all operations, including those for which a completion was not requested
prov/ucx/src/ucx_rma.c
Outdated
| ofi_ep_cntr_inc(&u_ep->ep, CNTR_RD); | ||
|
|
||
| if (flags & FI_COMPLETION) | ||
| if (flags & FI_COMPLETION || u_ep->ep.tx_op_flags & FI_COMPLETION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tx_op_flags is already considered when ucx_proc_rma_msg is called.
prov/ucx/src/ucx_rma.c
Outdated
|
|
||
| if (ucx_req->completion.flags & FI_COMPLETION) | ||
| if (ucx_req->ep->ep.tx_op_flags & FI_COMPLETION || | ||
| ucx_req->completion.flags & FI_COMPLETION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
Should use !(u_ep->ep.tx_op_flags & FI_SELECTIVE_COMPLETION) to check for the case that selective completion is not asked for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should just be able to check the completion.flags which are set when initializing the request. The op flags are set to force include FI_COMPLETION when FI_SELECTIVE_COMPLETION is not requested so it should be ok. See here
I think the missing thing in the current code is the OR'ing in of the msg flags on the *msg APIs to make sure FI_COMPLETION is set in the req->completion.flags for the non FI_SELECTIVE_COMPLETION case.
@tatarintsevsv Can you explain the selective completion path that is incorrectly writing a completion when it shouldn't be? I don't immediately see it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should just be able to check the completion.flags which are set when initializing the request. The op flags are set to force include
FI_COMPLETIONwhenFI_SELECTIVE_COMPLETIONis not requested so it should be ok. See hereI think the missing thing in the current code is the OR'ing in of the msg flags on the *msg APIs to make sure
FI_COMPLETIONis set in the req->completion.flags for the nonFI_SELECTIVE_COMPLETIONcase.@tatarintsevsv Can you explain the selective completion path that is incorrectly writing a completion when it shouldn't be? I don't immediately see it.
According to manual, in this this case cq records must be generated
fi_ep_bind(ep, &cq->fid, FI_TRANSMIT|FI_RECV);
fi_writemsg(ep, &msg, 0);
but it's so only if we pass FI_COMPLETION to fi_writemsg flags.
As far as I understand, the correct usage of FI_COMPLETION / FI_SELECTIVE_COMPLETION is
fi_ep_bind(ep, &cq->fid, FI_TRANSMIT|FI_RECV|FI_SELECTIVE_COMPLETION);
fi_writemsg(ep, &msg, FI_COMPLETION); // completion must be generated
fi_writemsg(ep, &msg, 0); // completion doesn't matter, will not be generated
and
fi_ep_bind(ep, &cq->fid, FI_TRANSMIT|FI_RECV);
fi_writemsg(ep, &msg, FI_COMPLETION); // completion must be generated
fi_writemsg(ep, &msg, 0); // completion must be generated anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
U just need to use flags | util_ep->tx_msg_flags as the internal flags when you implement fi_*msg. tx_msg_flags will be 0 when FI_SELECTIVE_COMPLETION is used when ep bind, it will be FI_COMPLETION when there is no FI_SELECTIVE_COMPLETION
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then you check whether that internal flags have FI_COMPLETION to determine whether you want to generate completions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as it pointed by @j-xiong this is not correct - we must adds flags but not override.
fixed with only FI_COMPLETION is taken into account
|
After looking more closely, it seems this error only occurs with ucx_writemsg and ucx_readmsg. I'll fix the PR. |
9eb350d to
8d6b373
Compare
|
can anyone review last changes? |
| { | ||
| return ucx_proc_rma_msg(ep, msg, flags, UCX_DO_WRITE); | ||
| struct ucx_ep *u_ep = container_of(ep, struct ucx_ep, ep.ep_fid); | ||
| return ucx_proc_rma_msg(ep, msg, flags | u_ep->ep.tx_op_flags, UCX_DO_WRITE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incorrect. The flags passed to the msg version override the default op flags, no adds to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I've fixed it so that only FI_COMPLETION is taken into account in the internal flags.
| { | ||
| return ucx_proc_rma_msg(ep, msg, flags, UCX_DO_READ); | ||
| struct ucx_ep *u_ep = container_of(ep, struct ucx_ep, ep.ep_fid); | ||
| return ucx_proc_rma_msg(ep, msg, flags | u_ep->ep.rx_op_flags, UCX_DO_READ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here.
|
libfabric ci is failing udp fabtests. Can you please rebase to pickup the changes from #11501 to fix this issue? |
1fae2f8 to
0e0953f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also take a look at your commits and remove Merge commits
prov/ucx/src/ucx_rma.c
Outdated
| if (u_ep->ep.tx_op_flags & FI_COMPLETION) | ||
| flags |= FI_COMPLETION; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still incorrect.
The *msg APIs don't take into account the op_flags. They are specifically to override the op_flags so you shouldn't be using the op_flags on the *msg paths. You should only use the u_ep->tx_msg_flags here which only has FI_COMPLETION for the non FI_SELECTIVE_COMPLETION case and are 0 if FI_SELECTIVE_COMPLETION is being used. You should be able to always OR in the tx_msg_flags for the msg APIs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for explaining. looks it just flags|msg_flags
- Ignore the absence of the FI_COMPLETION flag for fi_writemsg/fi_readmsg if CQ configured without FI_SELECTIVE_COMPLETION flag. - Error completions must generated for all operations, including those for which a completion was not requested Signed-off-by: Sergey Tatarintsev <[email protected]>
0e0953f to
b37faa7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other *msg APIs look like they are incorrect as well. Could you fix those while you're here? Thanks!
https://github.com/ofiwg/libfabric/blob/main/prov/ucx/src/ucx_msg.c#L234
https://github.com/ofiwg/libfabric/blob/main/prov/ucx/src/ucx_tagged.c#L114
| { | ||
| return ucx_proc_rma_msg(ep, msg, flags, UCX_DO_WRITE); | ||
| struct ucx_ep *u_ep = container_of(ep, struct ucx_ep, ep.ep_fid); | ||
| return ucx_proc_rma_msg(ep, msg, flags | u_ep->ep.tx_op_flags, UCX_DO_WRITE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be return ucx_proc_rma_msg(ep, msg, flags | u_ep->ep.tx_msg_flags, UCX_DO_WRITE);
| { | ||
| return ucx_proc_rma_msg(ep, msg, flags, UCX_DO_READ); | ||
| struct ucx_ep *u_ep = container_of(ep, struct ucx_ep, ep.ep_fid); | ||
| return ucx_proc_rma_msg(ep, msg, flags | u_ep->ep.rx_op_flags, UCX_DO_READ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be return ucx_proc_rma_msg(ep, msg, flags | u_ep->ep.tx_msg_flags, UCX_DO_READ);
(A read is still a transmitted operation)