Copyright Notice:

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Publications of SPCL

T. Hoefler:

Benchmarking data science: Twelve ways to lie with statistics and performance on parallel computers

(IEEE Computer. Vol 55, pages 49-56, Aug. 2022)
Cover Feature Research Reproducibility

Abstract

We humorously discuss 12 fallacies when focusing on compute performance that we have frequently observed in practice. We follow each with a recommendation to mitigate the danger and hope to contribute to good benchmarking etiquette for data science.

BibTeX

@article{hoefler-12-ways-data,
  author={Torsten Hoefler},
  title={{Benchmarking data science: Twelve ways to lie with statistics and performance on parallel computers}},
  journal={IEEE Computer},
  year={2022},
  month={08},
  pages={49--56},
  volume={55},
}