Publications of SPCL

M. Besta, J. Domke, M. Schneider, M. Konieczny, S. Di Girolamo, T. Schneider, A. Singla, T. Hoefler:

 High-Performance Routing with Multipathing and Path Diversity in Ethernet and HPC Networks

(IEEE Transactions on Parallel and Distributed Systems. Vol 32, Nr. 4, pages 943-959, IEEE, Apr. 2021)

Abstract

The recent line of research into topology design focuses on lowering network diameter. Many low-diameter topologies such as Slim Fly or Jellyfish that substantially reduce cost, power consumption, and latency have been proposed. A key challenge in realizing the benefits of these topologies is routing. On one hand, these networks provide shorter path lengths than established topologies such as Clos or torus, leading to performance improvements. On the other hand, the number of shortest paths between each pair of endpoints is much smaller than in Clos, but there is a large number of non-minimal paths between router pairs. This hampers or even makes it impossible to use established multipath routing schemes such as ECMP. In this article, to facilitate high-performance routing in modern networks, we analyze existing routing protocols and architectures, focusing on how well they exploit the diversity of minimal and non-minimal paths. We first develop a taxonomy of different forms of support for multipathing and overall path diversity. Then, we analyze how existing routing schemes support this diversity. Among others, we consider multipathing with both shortest and non-shortest paths, support for disjoint paths, or enabling adaptivity. To address the ongoing convergence of HPC and Big Data domains, we consider routing protocols developed for both HPC systems and for data centers as well as general clusters. Thus, we cover architectures and protocols based on Ethernet, InfiniBand, and other HPC networks such as Myrinet. Our review will foster developing future high-performance multipathing routing protocols in supercomputers and data centers.

BibTeX

@article{besta-hpcr,
  author={Maciej Besta and Jens Domke and Marcel Schneider and Marek Konieczny and Salvatore Di Girolamo and Timo Schneider and Ankit Singla and Torsten Hoefler},
  title={{High-Performance Routing with Multipathing and Path Diversity in Ethernet and HPC Networks}},
  journal={IEEE Transactions on Parallel and Distributed Systems},
  year={2021},
  month={4},
  pages={943-959},
  volume={32},
  number={4},
  publisher={IEEE},
  doi={10.1109/TPDS.2020.3035761},
}