SPCL_Bcast(COMM_WORLD)

What: SPCL_Bcast
is an open, online seminar series that covers a broad range of topics around parallel and high-performance computing, scalable machine learning, and related areas.
Who: We invite top researchers and engineers from all over the world to speak.
Where: Anyone is welcome to join over Zoom! This link will always redirect to the right Zoom meeting. When possible, we make recordings available on our YouTube channel.
Old talks: See the SPCL_Bcast
archive.
Social media: Follow along with #spcl_bcast on Twitter!
When: Every two weeks on Thursdays, in one of two slots (depending on speaker).
14 January – 11 March, 2021:
- Morning: 9 AM Zurich, 5 PM Tokyo, 4 PM Beijing, 3 AM New York, 12 AM (midnight) San Francisco
- Evening: 6 PM Zurich, 2 AM (Friday) Tokyo, 1 AM (Friday) Beijing, 12 PM (noon) New York, 9 AM San Francisco
25 March, 2021:
- Morning: 9 AM Zurich, 5 PM Tokyo, 4 PM Beijing, 4 AM New York, 1 AM San Francisco
- Evening: 6 PM Zurich, 2 AM (Friday) Tokyo, 1 AM (Friday) Beijing, 1 PM New York, 10 AM San Francisco
8 April – 20 May, 2021:
- Morning: 9 AM Zurich, 4 PM Tokyo, 3 PM Beijing, 3 AM New York, 12 AM (midnight) San Francisco
- Evening: 6 PM Zurich, 1 AM (Friday) Tokyo, 12 AM (midnight) Beijing, 12 PM (noon) New York, 9 AM San Francisco
Enabling Rapid COVID-19 Small Molecule Drug Design Through Scalable Deep Learning of Generative Models
Abstract: We improved the quality and reduced the time to produce machine-learned models for use in small molecule antiviral design. Our globally asynchronous multi-level parallel training approach strong scales to all of Sierra with up to 97.7% efficiency. We trained a novel, character-based Wasserstein autoencoder that produces a higher quality model trained on 1.613 billion compounds in 23 minutes while the previous state-of-the-art takes a day on 1 million compounds. Reducing training time from a day to minutes shifts the model creation bottleneck from computer job turnaround time to human innovation time. Our implementation achieves 318 PFLOPS for 17.1% of half-precision peak. We will incorporate this model into our molecular design loop, enabling the generation of more diverse compounds: searching for novel, candidate antiviral drugs improves and reduces the time to synthesize compounds to be tested in the lab.

Homepage
Optimizing CESM-HR on Sunway TaihuLight and An Unprecedented Set of Multi-Century Simulations
Abstract: CESM is one of the very first and most complex scientific codes that gets migrated onto Sunway TaihuLight. Being a community code involving hundreds of different dynamic, physics, and chemistry processes, CESM brings severe challenges for the many-core architecture and the parrallel scale of Sunway TaihuLight. This talk summarizes our continuous effort on enabling efficient run of CESM on Sunway, starting from refactoring of CAM in 2015, redesigning of CAM in 2016 and 2017, and a collaborative effort starting in 2018 to enable highly efficient simulations of the high-resolution (25 km atmosphere and 10 km ocean) Community Earth System Model (CESM-HR) on Sunway Taihu-Light. The refactoring and optimizing efforts have improved the simulation speed of CESM-HR from 1 SYPD (simulation years per day) to 5 SYPD (with output disabled). Using CESM-HR, We manage to provide an unprecedented set of high-resolution climate simulations, consisting of a 500-year pre-industrial control simulation and a 250-year historical and future climate simulation from 1850 to 2100. Overall, high-resolution simulations show significant improvements in representing global mean temperature changes, seasonal cycle of sea-surface temperature and mixed layer depth, extreme events and in relationships between extreme events and climate modes.

Evaluating modern programming models using the Parallel Research Kernels
Abstract: The Parallel Research Kernels were developed to support empirical studies of programming models in a variety of contexts without the porting effort required by proxy or mini-applications. I will describe the project and why it has been a useful tool in a variety of contexts and present some of our findings related to modern C++ parallelism for CPU and GPU architectures.

Homepage
Details to be announced.
Details to be announced.
Details to be announced.
Details to be announced.
Details to be announced.
Details to be announced.
Details to be announced.