Comparative Tests on KME BKM and VWF

The performance of H2Pack strongly depends on the performance of the kernel evaluation functions, i.e., the Kernel Matrix Evaluation (KME) function and Bi-Kernel Matvec (BKM) function. The KME and BKM functions can be either vectorized manually using the provided Vector Wrapper Functions (VWF) or vectorized automatically by the C compiler.

The following numerical results show how these techniques affect the performance of H2-construction and H2-matvec.

Hardware and software configuration

2 * Intel Xeon Gold 6226 CPU @ 2.7GHz (2 * 12 cores, 2 * 12 * 2 threads, hyperthreading disabled)
6 * 32 GB DDR4 memory
Red Hat Enterprise Linux 7.6 (kernel 3.10.0-957.12.1.el7)
Intel Parallel Studio Cluster version 2019.5
ICC optimization flags: -O3 -xHost
OpenMP environment variables:
- OMP_NUM_THREADS=24
- OMP_PLACES=cores
- OMP_PROC_BIND=close

Test settings

Point sets: uniformly and randomly distributed points in a 3D unit ball
Running mode: JIT
Relative error threshold: 1e-6
Kernel: 3D Gaussian with
Comparison of kernel implementations:
- no vectorization ("no-vec")
- ICC automatic vectorization ("auto-vec")
- manual vectorization by VWF ("wrap-vec")

Numerical Results (timings in seconds)

Stage            Kernel function  100,000 points  400,000 points  1,600,000 points
H2-construction  KME no-vec       0.022           0.083           0.440
H2-construction  KME auto-vec     0.020           0.092           0.448
H2-construction  KME wrap-vec     0.023           0.084           0.442
H2-matvec        KME no-vec       0.120           0.313           0.745
H2-matvec        KME auto-vec     0.038           0.101           0.260
H2-matvec        KME wrap-vec     0.028           0.081           0.233
H2-matvec        BKM no-vec       0.161           0.369           0.908
H2-matvec        BKM auto-vec     0.031           0.091           0.265
H2-matvec        BKM wrap-vec     0.020           0.056           0.156

Notes:

Computation in H2-construction is dominated by the column-pivoted QR, so it gains little from vectorizing the KME function: the construction timings above are nearly identical across the three KME variants.
Both automatic and manual vectorization of the KME and BKM functions speed up H2-matvec substantially, by about 3x to 4x for KME and about 3x to 8x for BKM relative to the unvectorized versions. Manual vectorization takes roughly 10% to 40% less time than automatic vectorization.
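The ratios behind these notes can be recomputed directly from the matvec timings in the table. The sketch below does that arithmetic. The array and function names are made up for illustration, and the numbers are copied from the table rather than measured by this code.

```c
#include <stdio.h>

/* Matvec timings in seconds, copied from the results table.
 * Columns: 100,000, 400,000, and 1,600,000 points. */
static const double kme_novec[3] = {0.120, 0.313, 0.745};
static const double kme_auto[3]  = {0.038, 0.101, 0.260};
static const double kme_wrap[3]  = {0.028, 0.081, 0.233};
static const double bkm_novec[3] = {0.161, 0.369, 0.908};
static const double bkm_auto[3]  = {0.031, 0.091, 0.265};
static const double bkm_wrap[3]  = {0.020, 0.056, 0.156};

/* How many times faster the new timing is than the baseline timing. */
double speedup(double t_base, double t_new)
{
    return t_base / t_new;
}

/* Percentage of the baseline time that the new timing saves. */
double time_saved_pct(double t_base, double t_new)
{
    return 100.0 * (t_base - t_new) / t_base;
}

/* Print, for each problem size, the speedup of auto-vec and wrap-vec over
 * no-vec, and the time wrap-vec saves relative to auto-vec. */
void print_matvec_speedups(void)
{
    static const char *sizes[3] = {"100,000", "400,000", "1,600,000"};
    for (int i = 0; i < 3; i++) {
        printf("%s points\n", sizes[i]);
        printf("  KME: auto %.1fx, wrap %.1fx, wrap saves %.0f%% vs auto\n",
               speedup(kme_novec[i], kme_auto[i]),
               speedup(kme_novec[i], kme_wrap[i]),
               time_saved_pct(kme_auto[i], kme_wrap[i]));
        printf("  BKM: auto %.1fx, wrap %.1fx, wrap saves %.0f%% vs auto\n",
               speedup(bkm_novec[i], bkm_auto[i]),
               speedup(bkm_novec[i], bkm_wrap[i]),
               time_saved_pct(bkm_auto[i], bkm_wrap[i]));
    }
}
```

Calling `print_matvec_speedups()` from a `main` function prints one block per problem size. For example, the KME auto-vec speedup at 100,000 points is 0.120 / 0.038, or about 3.2x.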
