DDTBench: Micro-Applications for Communication Data Access Patterns and MPI Datatypes
DDTBench is a suite of micro-apps that captures how parallel scientific applications from many different fields of science access the data they send and receive between processes. MPI Derived Datatypes (DDTs) allow these access patterns to be specified in such a way that no explicit copy operation is needed, in contrast to the pack-unpack loops found in many codes. In DDTBench we compare the packing overhead incurred by such loops to that of MPI DDTs. This is done by performing a ping-pong benchmark twice: once using MPI DDTs to specify how data should be packed, and once using the pack-unpack loops that we found in the applications. The measurement loop of the benchmark is shown below:
Measurement loop of DDTBench. Measurements are taken on process 0, so no global clock is required.
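To make the two variants concrete, here is a minimal sketch of the kind of manual pack-unpack loop that the benchmark compares against MPI DDTs. It is not taken from the DDTBench sources: the function names `pack_strided` and `unpack_strided` and the choice of a simple strided layout are assumptions made for illustration. The comments note how the same layout could instead be described once with the standard `MPI_Type_vector` call, in which case the send would take the original array directly and no copy loop would appear in user code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from DDTBench): copy `count` blocks of
 * `blocklen` doubles, whose starts are `stride` doubles apart in `src`,
 * into the contiguous buffer `packed`.
 *
 * With MPI, the same layout could be described once by
 *   MPI_Type_vector(count, blocklen, stride, MPI_DOUBLE, &newtype);
 *   MPI_Type_commit(&newtype);
 * and `src` could then be sent with count 1 of `newtype`. */
void pack_strided(const double *src, double *packed,
                  size_t count, size_t blocklen, size_t stride)
{
    assert(stride >= blocklen); /* blocks must not overlap */
    for (size_t i = 0; i < count; ++i) {
        for (size_t j = 0; j < blocklen; ++j) {
            packed[i * blocklen + j] = src[i * stride + j];
        }
    }
}

/* Hypothetical inverse of pack_strided: scatter the contiguous buffer
 * `packed` back into the strided positions of `dst`. Elements of `dst`
 * between the blocks are left untouched. */
void unpack_strided(const double *packed, double *dst,
                    size_t count, size_t blocklen, size_t stride)
{
    assert(stride >= blocklen);
    for (size_t i = 0; i < count; ++i) {
        for (size_t j = 0; j < blocklen; ++j) {
            dst[i * stride + j] = packed[i * blocklen + j];
        }
    }
}
```

For example, calling `pack_strided(grid, buf, 3, 1, 4)` on a row-major 3x4 array gathers its first column into `buf`. In the manual variant of the ping-pong this copy is paid explicitly on every send and receive, which is the cost the benchmark sets out to quantify.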
Using the time each operation takes (the colored blocks in the figure above), we can calculate the overhead of packing and unpacking data with both methods. This overhead cannot be measured directly when MPI DDTs are used, because the data repacking is implicit. However, we can calculate the time spent transferring packed data, t_net, by subtracting the time required for manual packing and unpacking from the round-trip time of the ping-pong with manual packing. The data-repacking overhead for either case is then obtained by subtracting t_net from that case's ping-pong round-trip time and dividing the result by the same round-trip time. We did this for some of the micro-apps in the graph shown below:
Packing costs for different test cases
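The overhead calculation described above can be sketched in a few lines of C. This is only an illustration of the arithmetic, not code from DDTBench: the function names `net_time` and `packing_overhead` are hypothetical, and `t_pack` and `t_unpack` are assumed to be the total manual packing and unpacking times within one round trip, in the same unit as the round-trip times.

```c
#include <assert.h>

/* Hypothetical helper: time spent transferring already-packed data,
 * t_net = RTT(manual) - (t_pack + t_unpack). */
double net_time(double rtt_manual, double t_pack, double t_unpack)
{
    assert(rtt_manual >= t_pack + t_unpack);
    return rtt_manual - (t_pack + t_unpack);
}

/* Hypothetical helper: fraction of a round trip that is not spent on
 * the network transfer, (RTT - t_net) / RTT. Pass the round-trip time
 * of the variant of interest (manual packing or MPI DDTs). */
double packing_overhead(double rtt, double t_net)
{
    assert(rtt > 0.0);
    return (rtt - t_net) / rtt;
}
```

With invented numbers, a manual round trip of 100 time units of which 25 are packing and 15 are unpacking gives t_net = 60 and a manual overhead of 40/100. If the DDT round trip took 75 units, its overhead would be 15/75. Note that t_net is always derived from the manual run, since that is the only run in which packing time is visible to the application.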
It can be seen that MPI DDTs can reduce the overhead associated with data packing (e.g., from 40% to 15% in the case of NAS_LU_x, where a contiguous array is needlessly copied by the original code). The large difference between the performance of Open MPI's DDT engine and that of MVAPICH shows that there is still some work to be done in improving MPI DDT implementations. We hope that DDTBench can serve implementers as a guideline on which access patterns deserve special attention. A list of the micro-apps included in DDTBench can be found in the table below.
Application Class | Testname | Access Pattern |
Atmospheric Science | WRF_x_vec, WRF_y_vec, WRF_x_sa, WRF_y_sa | struct of 2D/3D/4D face exchanges in different directions (x,y), using different (semantically equivalent) datatypes: nested vectors (_vec) and subarrays (_sa) |
Quantum Chromodynamics | MILC_su3_zd | 4D face exchange, z direction, nested vectors |
Fluid Dynamics | NAS_MG_x, NAS_MG_y, NAS_MG_z | 3D face exchange in each direction (x,y,z) with vectors (y,z) and nested vectors (x) |
Fluid Dynamics | NAS_LU_x, NAS_LU_y | 2D face exchange in x direction (contiguous) and y direction (vector) |
Matrix Transpose | FFT | 2D FFT, different vector types on send/recv side |
Matrix Transpose | SPECFEM3D_mt | 3D matrix transpose |
Molecular Dynamics | LAMMPS_full, LAMMPS_atomic | unstructured exchange of different particle types (full/atomic), indexed datatypes |
Geophysical Science | SPECFEM3D_oc, SPECFEM3D_cm | unstructured exchange of acceleration data for different earth layers, indexed datatypes |
DDTBench downloads can be found below. The tarballs contain the source files, Makefiles, and more detailed documentation in PDF format. We also provide a sample R script to analyze the output data, since the benchmark itself performs no statistical aggregation.
Version | Date | Changes |
DDTBench-1.2.1.tar.gz - (367 kb) | September 24, 2015 | Licence change |
DDTBench-1.2.tar.gz - (367 kb) | January 27, 2014 | added one-sided scheme, PAPI and high-resolution timer support |
ddtbench-1.1.tar.gz - (44 kb) | June 16, 2012 | added C implementation |
ddtbench-1.0.tar.gz | May 19, 2012 | initial release, Fortran implementation only |
References
[1] T. Schneider, R. Gerstenberger, T. Hoefler:
Application-oriented ping-pong benchmarking: how to assess the real communication overheads
Journal of Computing. Vol 96, Nr. 4, pages 279-292, Springer Vienna, ISSN: 0010-485X, Apr. 2014, Special issue on top picks from EuroMPI'12.
[2] T. Schneider, R. Gerstenberger, T. Hoefler:
Micro-Applications for Communication Data Access Patterns and MPI Datatypes
Vol 7490, In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012. Proceedings, pages 121-131, Springer, ISBN: 978-3-642-33517-4, Sep. 2012, Invited to a journal special issue on top picks from EuroMPI'12.