ETH Zürich/CS: Design of Parallel and High-Performance Computing

Basic Information

Course Number: 263-2800-00L, 7 Credits
Autumn 2018, lectures: Mon 13-16 CAB G 61; recitation sessions Thu 13-15 LEE D 101
Instructors: T. Hoefler (htor at inf, CAB F 75), M. Püschel (pueschel at inf, CAB H 69.3)
TAs: Salvatore Di Girolamo (digirols at inf, CAB E 64.1)

News

2019-14-01: Report deadline: Friday Jan 18, 2019 11:59pm (Zurich Time, GMT+1). Send via e-mail to salvatore.digirolamo (at) inf.ethz.ch.
2018-20-09: Webpage online

Course overview

This course exposes students to the principal issues involved in software development for parallel computing and discusses a number of approaches to handle the problems and opportunities caused by the increased availability of parallel platforms.

The course includes lectures, assignments, self-study, and a project. 50% of your grade is determined by project work and 50% is determined by a written exam; the exam is given during the official examination period, and there is no makeup exam. Students must be able to program using Java and C/C++.

The course may cover: memory coherence and consistency models, implications for language-specific memory models, Java memory model, models of parallel programming and parallel program execution, performance models for parallel systems, transactional memory, compiler extraction of parallelism, language and compiler support for parallel programming, threads and their execution environment, synchronization, and implementation issues of these topics.

Course schedule

Lectures are given Mondays 13:15 - 16:00 in CAB G 61.

Recitation sessions take place Thursdays 13:15 -- 15:00 in LEE D 101 and take place when announced. Some of the recitation session hours will be devoted to other activities (tutorials, reviews, etc) or will be devoted to group meetings. Please watch this page for updates and announcements.

Tentative schedule of lectures

Week	Monday	Thursday
1	09/17: no lecture	09/20: no recitation
2	09/24: Organization - Introduction	09/27: MPI Tutorial - Code
3	10/01: Caches and Cache Coherence (1pp) (6pp) (unrolled animations)	10/04: Caches recap - Euler cluster - Euler sample job script
4	10/08: Memory models (1pp) (6pp) (unrolled animations)	10/11: Teams and initial projects presentations - OpenMP intro
5	10/15: Languages, Fast Locks, and Lock-Free (1pp) (6pp) (unrolled animations)	10/18: Memory Models - MPI Part 2
6	10/22: Fast Locks and Lock-Free (1pp) (6pp) (unrolled animations)	10/25: SPIN Tutorial - Continuing MPI Part 2
7	10/29: Intermediate Project Presentations	11/01: No recitation session
8	11/05: Amdahl's Law, PRAM, Alpha-Beta Model, Little's Law, Operational Intensity, Roofline Model I	11/08: Amdahl's Law, PRAM
9	11/12: Notes - Roofline Model II - Balance Principles - Scheduling	11/15: Roofline Model - Balance Principles
10	11/19: SIMD Vector Extensions	11/22: Balance Principles, SIMD - Vandermonde matrix determinant vectorized
11	11/26: Finishing consensus, scalable lock study, and oblivious algorithms (1pp) (6pp) (unrolled animations)	11/29: Work-Depth Model - MS thesis proposal
12	12/03: I/O complexity, Red-Blue Pebble Game, and Recomputation in Neural Networks	12/06: Red-Blue Pebble Game
13	12/10: Oblivious and Non-Oblivious Algorithms (1pp) (6pp)	12/13: Prefix-Sum, Network Models
14	12/17: Final Presentations

Mailinglist

If you want to discuss issues related to the class, please subscribe to the class mailinglist. This list is read by the instructors as well as the TA, and any student of the class who voluntarily subscribed to it. So it is a great place to discuss questions about homework or the lecture slides.

Assignments

Assignments are an important part of the course. You will not learn this material from listening to a lecture alone -- you have to do the assignments.

Note: Do not hesitate to write an email to your TA if you have trouble with the assignments!

Number	Assignment	Description	Solution
1	Parallel PI	See last slide of the first recitation session. PI-seq: download	download
2	Caches and Cache Coherence	Assignment 2	Solution - False sharing benchmark
3	Sequential Consistency, Locks	Assignment 3	Solution
4	Locks and Measuring Cache Misses	Assignment 4	Solution - Code
5	Roofline model, Balance Principles	Assignment 5	Solution
6	SIMD	Assignment 5 - Code template	Solution

Teams

Number	Members	Project Description
1	Jack Clark, Genming Bai, Liaowang Huang, Zhifei Yang	Collective Communications for Distributed Machine Learning
2	Lea Fritschi, Roman Haag, Michael Bernasconi, Giovanni Balduzzi	Connected Components
3	Fabian Wolff, David Rohr, Pascal Oberholzer, Elias Stalder	Single-Source Shortest Paths
5	Daniel Maag, Julian Dunskus, Silvan Läubli, Cédric Neukom	Distributed MST
6	Andrey Oblupin, Gabriela Evrova, Clara Brimnes Gardner, Erik Träff	Solving the Single Source Shortest Path (SSSP) problem on distributed graphs
7	Samuel Petitjean, Momchil Peychev, Daniel Zvara, David Ittah	Parallel Strongly Connected Components
8	Anastasios Papageorgiou, Diego Renner, Ramy Tanios, Konstantinos Triantafyllou	Parallel N-Body Simulations
9	Tommaso Bonato, Francesco Forcher, Anouk Paradis, Stefano d'Apolito	Parallel Suffix Tree Construction
10	Thomas Cambier, Raphaël Dang-Nhu, Thibault Dardinier, Clément Trassoudaine	Minimum Spanning Tree: A Parallel Approach
12	Kevin Ni, Fredrik Strupe, Samuele Decarli, Ankur Magdum	Concurrent Skip List

Teams are listed in the order they have been announced.

Template for project report

Hard page limit: 6 pages. Template below.
Everybody reads this and this.
For latex use: report.zip.
For word use (discouraged): report-word.doc.

Grading

50% of your grade is determined by the project, and the other 50% of the grade is determined by a written 2 hr exam. You are not allowed to use any electronic devices or books, notes, etc. to the exam.

Resources

There exist a large number of books on programming multi-processors, multi-core system, or threads. These books may explore some topics in more depth than the lectures, or may provide background information. None of these books is mandatory. Copies of the first two books are "on reserve" the CS library.

Brian Goetz, Joshua Bloch, Joseph Bowbeer, Doug Lea, David Holmes, Tim Peierls, Java Concurrency in Practice Addison-Wesley, 2006.
A good introduction to using threads for Java programmers, assumes proficiency in Java and concepts of object-oriented programming.
Maurice Herlihy and Nir Shavit: The Art of Multiprocessor Programming (Revised Reprint), Morgan Kaufmann June 2012.
A good book for advanced students, thorough treatment of key concepts. Make sure you get the "revised reprint" edition unless you enjoy re-discovering bugs & typos.

263-2800-00L: Design of Parallel and High-Performance Computing

Torsten Hoefler and Markus Püschel