The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Publications of SPCL
G. Bauer, S. Gottlieb, T. Hoefler:
Performance Modeling and Comparative Analysis of the MILC Lattice QCD Application su3_rmd
(In Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012), presented in Ottawa, Canada, pages 652--659, IEEE Computer Society, ISBN: 978-0-7695-4691-9, May 2012)
Abstract
Application performance modeling is an essential part of application and system development as HPC moves into the petascale and prepares for the exascale. However, performance modeling of parallel systems is a difficult task due to natural variations in measurements and noise effects. In this paper, we give a detailed example of a semi-empirical performance-modeling method applied to the ubiquitous HPC application su3_rmd from the lattice Quantum Chromodynamics field on a variety of parallel computing platforms. We apply statistical techniques that are well known in the natural sciences to model the variance in the input system. Using a simple analytical model to capture the main characteristics of the code, such as numbers and sizes of passed messages and invocation counts of serial code blocks, in conjunction with statistically sound curve-fitting methods, we develop an accurate performance model and use it to characterize application performance on various target architectures. Our fitting techniques allow us to characterize the variance of different performance observations on a given system and show the influence of noise from different sources. The techniques we developed can be applied to a wide class of bulk-synchronous applications. With this detailed example, we aim to motivate the scientific computing community to develop and use similar performance models for software development and maintenance.
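The general idea of a semi-empirical model, an analytical form whose coefficients are fitted to measurements, can be illustrated with a minimal sketch. This is not the model from the paper: the linear form, the three features (serial block invocations, message count, megabytes sent), the cost values, and all names such as `CONFIGS`, `TRUE_PARAMS`, `fit_model`, and `measure` are assumptions made up for illustration, and the timings are synthetic. Ordinary least squares on per-configuration medians stands in for whatever fitting procedure the authors actually use.

```python
import random
import statistics

# Hypothetical per-run counts: (serial block invocations, messages, megabytes sent).
CONFIGS = [
    (100, 8, 4.0),
    (50, 8, 3.0),
    (25, 16, 1.0),
    (200, 4, 6.0),
    (80, 12, 2.5),
    (10, 20, 0.5),
]

# Made-up "true" costs in ms, used only to generate synthetic timings.
TRUE_PARAMS = [2.0, 0.05, 1.5]


def predict(params, config):
    """Assumed analytical model: time is a weighted sum of the counts."""
    return sum(p * x for p, x in zip(params, config))


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_model(configs, times):
    """Ordinary least squares via the normal equations (X^T X) p = X^T t."""
    k = len(configs[0])
    xtx = [[sum(f[i] * f[j] for f in configs) for j in range(k)] for i in range(k)]
    xty = [sum(f[i] * t for f, t in zip(configs, times)) for i in range(k)]
    return solve_linear(xtx, xty)


def measure(config, reps, sigma, rng):
    """Simulate noisy repeated timings; return their median and sample stdev."""
    samples = [predict(TRUE_PARAMS, config) + rng.gauss(0.0, sigma) for _ in range(reps)]
    return statistics.median(samples), statistics.stdev(samples)


rng = random.Random(0)
observations = [measure(c, reps=9, sigma=0.5, rng=rng) for c in CONFIGS]
medians = [m for m, _ in observations]
fitted = fit_model(CONFIGS, medians)
residuals = [t - predict(fitted, c) for c, t in zip(CONFIGS, medians)]

print("fitted costs (ms):", [round(p, 3) for p in fitted])
print("residual spread (ms):", round(statistics.pstdev(residuals), 3))
```

Summarizing repeated runs by the median before fitting is one simple way to limit the influence of noisy outliers. The per-configuration standard deviation returned by `measure` gives a rough view of measurement variance on the simulated system.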