Copyright Notice:

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Publications of SPCL

Peer-Reviewed Conference or Journal Articles

SC16
[1] M. Martinasso, G. Kwasniewski, S. R. Alam, T. C. Schulthess, T. Hoefler:
 A PCIe Congestion-Aware Performance Model for Densely Populated Accelerator Servers In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), presented in Salt Lake City, Utah, pages 63:1--63:11, IEEE Press, ISBN: 978-1-4673-8815-3, Nov. 2016, (acceptance rate: 18% (82/446))
SC16
[2] W. Tang, B. Wang, S. Ethier, G. Kwasniewski, T. Hoefler, K. Z. Ibrahim, K. Madduri, S. Williams, L. Oliker, C. Rosales-Fernandez, T. Williams:
 Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), presented in Salt Lake City, Utah, pages 43:1--43:12, IEEE Press, ISBN: 978-1-4673-8815-3, Nov. 2016, (acceptance rate: 18% (82/446))
SC16
[3] J. Domke, T. Hoefler:
 Scheduling-Aware Routing for Supercomputers Nov. 2016, Accepted at The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16) (acceptance rate: 18% (82/446))
SC16
[4] T. Gysi, J. Baer, T. Hoefler:
 dCUDA: Hardware Supported Overlap of Computation and Communication In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), presented in Salt Lake City, Utah, pages 52:1--52:12, IEEE Press, ISBN: 978-1-4673-8815-3, Nov. 2016, (acceptance rate: 18% (82/446))
OOPSLA'16
[5] Andrei Marian Dan, Patrick Lam, Torsten Hoefler, Martin Vechev:
 Modeling and Analysis of Remote Memory Access Programming In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, presented in Amsterdam, Netherlands, pages 129--144, ACM, ISBN: 978-1-4503-4444-9, Nov. 2016, Outstanding Paper Award at OOPSLA'16 (4/52)
Cluster'16
[6] A. Calotoiu, D. Beckingsale, C. W. Earl, T. Hoefler, I. Karlin, M. Schulz, F. Wolf:
 Fast Multi-Parameter Performance Modeling Oct. 2016, Accepted at IEEE International Conference on Cluster Computing (Cluster'16) (acceptance rate: 24% (39/162))
HOTI'16
[7] T. Schneider, O. Bibartiu, T. Hoefler:
 Ensuring Deadlock-Freedom in Low-Diameter InfiniBand Networks In Proceedings of the 24th Annual Symposium on High-Performance Interconnects (HOTI'16), Aug. 2016, Best Student Paper at HOTI'16
IEEE MICRO
[8] S. Di Girolamo, P. Jolivet, K. D. Underwood, T. Hoefler:
 Exploiting Offload Enabled Network Interfaces IEEE MICRO. Vol 36, Nr. 4, IEEE, Jul. 2016
HPDC'16
[9] J. Domke, T. Hoefler, S. Matsuoka:
 Routing on the Dependency Graph: A New Approach to Deadlock-Free High-Performance Routing In Proceedings of the 25th Symposium on High-Performance Parallel and Distributed Computing (HPDC'16), Jun. 2016, (acceptance rate: 16% (20/129))
HPDC'16
[10] P. Schmid, M. Besta, T. Hoefler:
 High-Performance Distributed RMA Locks In Proceedings of the 25th Symposium on High-Performance Parallel and Distributed Computing (HPDC'16), Jun. 2016, (acceptance rate: 16% (20/129)) Karsten Schwan Best Paper Award at HPDC'16 (1/20)
ICS'16
[11] T. Grosser, T. Hoefler:
 Polly-ACC: Transparent compilation to heterogeneous hardware In Proceedings of the 30th International Conference on Supercomputing (ICS'16), Jun. 2016, (acceptance rate: 24% (43/178))
PASC'16
[12] T. Hoefler:
 Selecting Technical Papers for an Interdisciplinary Conference: The PASC Review Process In Proceedings of the 3rd Platform of Advanced Scientific Computing Conference (PASC'16), Jun. 2016
IJHPCA
[13] P. M. Widener, S. Levy, K. B. Ferreira, T. Hoefler:
 On noise and the performance benefit of nonblocking collectives The International Journal of High Performance Computing Applications. Vol 30, Nr. 1, pages 121-133, Sage, ISSN: 1094-3420, Jan. 2016, accepted for publication on Nov. 2nd 2015
IEEE TPDS
[14] S. Ramos, T. Hoefler:
 Cache Line Aware Algorithm Design for Cache-Coherent Architectures IEEE Transactions on Parallel and Distributed Systems (TPDS). Vol PP, Nr. 99, IEEE, Jan. 2016

Invited Talks and Presentations

LLVM-HPC'16
[15] T. Hoefler:
 Polly-ACC: Transparent Compilation to Heterogeneous Hardware. (Presentation) presented in Salt Lake City, UT, Nov. 2016, Invited talk at the LLVM-HPC workshop and TiTech Booth at SC16
CCDSC'16
[16] T. Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Lyon, France, Oct. 2016, Invited talk at the CCDSC meeting
CoDesign'16
[17] T. Hoefler:
 Accelerating weather and climate simulations on heterogeneous architectures (Presentation) presented in Xi'an, China, Oct. 2016, Invited talk at the CoDesign Meeting at HPC China 2016
HPC China'16
[18] T. Hoefler:
 Theory and Practice in HPC: Modeling, Programming, and Networking (Presentation) presented in Xi'an, China, Oct. 2016, Keynote talk at HPC China 2016
Cluster'16
[19] T. Hoefler:
 Theory and Practice in HPC: Modeling, Programming, and Networking (Presentation) presented in Taipei, Taiwan, Sep. 2016, Opening keynote talk at IEEE Cluster 2016
Wuxi'16
[20] T. Hoefler:
 High-Performance Distributed RMA Locks (Presentation) presented in Wuxi, China, Sep. 2016, Seminar talk at Intl. Workshop on High-Performance Systems
Guangzhou'16
[21] T. Hoefler:
 MODESTO: Data-centric Analytic Optimization of Complex Stencil Programs on Heterogeneous Architectures (Presentation) presented in Guangzhou, China, Sep. 2016, Seminar talk at Intl. Workshop on High-Performance Systems
HotI'16
[22] T. Hoefler:
 Network topologies for large-scale compute centers: It's the diameter, stupid! (Presentation) presented in San Jose, CA, USA, Aug. 2016, Invited talk at the IEEE Hot Interconnects 2016
HP
[23] T. Hoefler:
 Towards scalable RDMA locking on a NIC (Presentation) presented in Palo Alto, CA, USA, Aug. 2016
UTK
[24] T. Hoefler:
 Scientific Benchmarking of Parallel Computing Systems (Presentation) presented in Knoxville, TN, USA, Aug. 2016
ISC'16
[25] T. Hoefler:
 An Overview of Static & Dynamic Techniques for Automatic Performance Modeling (Presentation) presented in Frankfurt, Germany, Jun. 2016, Invited talk at International Supercomputing Conference
ISC'16
[26] T. Hoefler:
 The Eighth Green Graph500 (Presentation) presented in Frankfurt, Germany, Jun. 2016
Cetraro'16
[27] T. Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Cetraro, Italy, Jun. 2016, Invited talk at the Cetraro HPC conference
Technion'16
[28] T. Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Haifa, Israel, Jun. 2016, Seminar talk at Israel Institute of Technology (Technion)
HLRS
[29] T. Hoefler:
 Scientific Benchmarking of Parallel Computing Systems (Presentation) presented in Stuttgart, Germany, Apr. 2016
Salishan
[30] T. Hoefler:
 Active RDMA - new tricks for an old dog (Presentation) presented in Gleneden Beach, OR, USA, Apr. 2016, Invited talk at Salishan Meeting