Theodoros Rekatsinas
The Scalable Parallel Computing Lab's *SPCL_Bcast* seminar continues with *Theodoros Rekatsinas* of *Axelera AI* presenting on *Data Selection - Data Challenges when Training Generative Models*. Everyone is welcome to attend (over Zoom)!
*When:* Thursday, 8th May, 9AM CET
*Where:* Zoom
Join https://spcl.inf.ethz.ch/Bcast/join
*Abstract:* This talk explores how strategic data selection can improve the efficiency of training generative AI models. I will cover approaches for both pre-training and fine-tuning that achieve comparable performance to full training while using only a fraction of the data. During the talk I will cover key filtering techniques and data selection methods for efficient pre-training as well as the connection between data selection and optimal transport for optimized fine-tuning. I will conclude with promising future directions for adaptive data selection research.
*Biography:* Theo Rekatsinas is the VP of Machine Learning at Axelera AI. Before that, he was a tech lead at Apple working on on-device intelligence and a senior manager in the Apple Knowledge Graph (KG) team, responsible for the KG construction and Graph Machine Learning teams. Theo co-founded Inductiv (acquired by Apple), a company that developed Generative AI solutions for identifying and correcting errors in data. Theo was also a Professor of Computer Science at ETH Zürich and the University of Wisconsin-Madison. Theo's research focuses on scalable machine learning over billion-scale relational and graph-structured data. His research has also explored the fundamental connections between data preparation, data integration, and knowledge management with statistical machine learning and probabilistic inference.
More details & future talks https://spcl.inf.ethz.ch/Bcast/
Scalable Parallel Computing Lab (SPCL)
Department of Computer Science, ETH Zurich
Website https://spcl.inf.ethz.ch
X (Twitter) https://twitter.com/spcl_eth
YouTube https://www.youtube.com/@spcl
GitHub https://github.com/spcl