Description
Describe the bug
ucx_info and ucx_perftest both report the following error: dc_mlx5.c:329  UCX  ERROR mlx5dv_create_qp(mlx5_0:1, DCI): failed: Invalid argument
Steps to Reproduce
UCX version: UCT version=1.10.0 revision c7add93
UCX build config: --prefix=$PREFIX --enable-debug --enable-assertions --enable-params-check --enable-frame-pointer --enable-backtrace-detail
Setup and versions
lsb_release -a:
LSB Version:	:core-4.1-aarch64:core-4.1-noarch
Distributor ID:	CentOS
Description:	CentOS Linux release 8.1.1911 (Core) 
Release:	8.1.1911
Codename:	Core
ofed_info -s: MLNX_OFED_LINUX-5.1-0.6.6.0
rpm -q rdma-core: rdma-core-51mlnx1-1.51066.aarch64
rpm -q libibverbs: libibverbs-51mlnx1-1.51066.aarch64
Additional information (depending on the issue)
For ucx_info -d, this happens when it tries to print info about the dc_mlx5 transport.
For ucx_perftest, it happens when running any UCP test without any environment variable set.
All issues go away if I add --without-dc to the configure script.
This doesn't happen with UCX 1.9.0, where the dc transport is enabled and works correctly.
This also doesn't happen when built against MLNX_OFED_LINUX-4.5-1.0.1.0 on another ThunderX2 machine, but it looks like dc is automatically disabled there.