The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Publications of SPCL
Network topologies for large-scale compute centers: It's the diameter, stupid!
(Presentation, presented in San Jose, CA, USA, Aug. 2016; invited talk at IEEE Hot Interconnects 2016)
Abstract: We discuss the history and design tradeoffs of large-scale topologies in high-performance computing (HPC). We observe that datacenters are slowly following suit, driven by the growing demand for low latency and high throughput at the lowest cost. We then introduce Slim Fly, a high-performance, cost-effective network topology that approaches the theoretically optimal network diameter. We analyze Slim Fly and compare it to both traditional and state-of-the-art networks. Our analysis shows that Slim Fly has significant advantages over other topologies in latency, bandwidth, resiliency, cost, and power consumption. Finally, we propose deadlock-free routing schemes and physical layouts for large computing centers, as well as a detailed cost and power model. Slim Fly enables the construction of cost-effective and highly resilient datacenter and HPC networks that offer low latency and high bandwidth under different HPC workloads, such as stencil or graph computations.
Recorded talk (best effort)