Improving Parallel Computing Platforms
(Presentation at the Technical University of Munich, Munich, Germany, Oct. 2009. Host: Prof. M. Gerndt)
Abstract
Large-scale parallel systems are important for advancing science in many fields. In this talk, we address issues in the programming and design of such systems. We emphasize the importance of collective operations as high-level specifications of data redistribution and discuss new developments in versions 2.2 and 3 of the Message Passing Interface (MPI) standard. We discuss application studies and use cases for the new nonblocking collective operations. We also discuss a proposal for nearest-neighbor (sparse) collective operations to support common stencil communication patterns. Later in the talk, we turn to system issues in the design of large-scale systems. We present a case study based on the InfiniBand network architecture and evaluate the effects of static routing strategies. We also disprove several myths about full bisection bandwidth networks. Based on this discussion, we develop a new routing strategy for InfiniBand networks and, if time permits, finish with a short excursion into adaptive routing. With this work, we show that large-scale systems must be analyzed and optimized as a whole: programming strategies and abstractions, network topologies, and routing have to be considered together.