Copyright Notice:

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Publications of SPCL

T. Hoefler:

 Towards smart(er) High-Performance Networking Driving Future Simulations

(Presentation - presented in Seattle, WA, USA, Aug. 2023)
Invited talk at the MODSIM'23 workshop

Abstract

The network has traditionally been the most crucial component of large-scale high-performance computers (HPC) and, more recently, of datacenters and AI training clusters. HPC networking has always relied on specialized solutions designed for particular systems, such as the Cray interconnect series, including Gemini, Aries, and Slingshot, or NVIDIA's InfiniBand interconnect. On the other hand, Ethernet has long been the elephant in the room for connecting datacenter machines. The current shift towards hyperscale datacenters and AI supercomputers in the cloud is generating a demand for network performance that surpasses traditional HPC deployments. In this context, we predict that future networks will emerge through a convergence of HPC network technologies and Ethernet. We examine various deployment models and discuss the limitations of RDMA (Remote Direct Memory Access). Subsequently, we present the concept of smart networks that can integrate RDMA accesses, protocol processing, and computations within the network card, thereby unifying these functionalities. Our vision of 'streaming processing in the network' aligns with CUDA's role in accelerating networking tasks. Finally, we provide a glimpse into the future, envisioning a converged interconnect for datacenter and HPC networking.

BibTeX

@misc{hoefler-modsim23,
  author={Torsten Hoefler},
  title={{Towards smart(er) High-Performance Networking Driving Future Simulations}},
  year={2023},
  month={8},
  location={Seattle, WA, USA},
  note={},
}