The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Publications of SPCL
|Heng Lin, Xiaowei Zhu, Bowen Yu, Xiongchao Tang, Wei Xue, Wenguang Chen, Lufei Zhang, Torsten Hoefler, Xiaosong Ma, Xin Liu, Weimin Zheng, Jingfang Xu:|
|ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds|
(. Vol , Nr. , In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC18) - Gordon Bell Award Finalist, presented in Denver, CO, USA, pages , ACM, ISSN: , ISBN: , Nov. 2018, )
AbstractGraphs are an important abstraction used in many scientific fields. With the magnitude of graph-structured data constantly increasing, effective data analytics requires efficient and scalable graph processing systems. Although HPC systems have long been used for scientific computing, people have only recently started to assess their potential for graph processing, a workload with inherent load imbalance, lack of locality, and access irregularity. We propose ShenTu, the first general-purpose graph processing framework that can efficiently utilize an entire petascale system to process multi-trillion edge graphs in seconds. ShenTu embodies four key innovations: hardware specializing, supernode routing, on-chip sorting, and degree-aware messaging, which together enable its unprecedented performance and scalability. It can traverse an unprecedented 70-trillion-edge graph in seconds. Furthermore, ShenTu enables the processing of a spam detection problem on a 12-trillion edge Internet graph, making it possible to identify trustworthy and spam web pages directly at the fine-grained page level.