Genie 2.0 and 3.0: an optimized denoising diffusion probabilistic model for protein design

  • Name: Genie 2.0 and 3.0: an optimized denoising diffusion probabilistic model for protein design
  • EuroHPC machine used: Karolina
  • Topic: Life sciences, Protein design, Vaccine Development

Overview of the project

This project focuses on scaling, evaluating, and optimizing the training of Genie 2.0 and 3.0, a denoising diffusion probabilistic model (DDPM) used in protein engineering to design protein structures, developed by Yeqing Lin in the lab of Mohammed AlQuraishi (Columbia University). Genie, initially developed with a maximum protein sequence length of 256 residues, was extended to support sequences of up to 512 residues. Training was conducted on the IT4I system Karolina, with the objective of optimizing runtime and efficiency for larger protein datasets. Once trained, the model was experimentally validated through the expression, purification, and biophysical characterization of Genie-designed proteins. The results were positive: the designed proteins outperformed those of a non-diffusion-based protein design model used for a similar task.
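
For context, the core of a DDPM like Genie is a denoising objective over noisy backbone coordinates: the model learns to predict the noise added to a clean structure at a random diffusion timestep. The sketch below shows the standard epsilon-prediction DDPM training loss with the maximum length raised from 256 to 512 residues, as in this project. It is a minimal illustration under stated assumptions, not Genie's actual code: the `denoiser` module, the noise schedule values, and the zero-padding scheme are placeholders.

```python
import torch
import torch.nn as nn

# Minimal sketch of the standard DDPM training objective, with the
# maximum sequence length raised from 256 to 512 residues as in this
# project. `denoiser` is a placeholder, not the actual Genie network.

MAX_LEN = 512                                  # was 256 in the original Genie
T = 1000                                       # number of diffusion timesteps
betas = torch.linspace(1e-4, 0.02, T)          # illustrative linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention

def ddpm_loss(denoiser: nn.Module, x0: torch.Tensor) -> torch.Tensor:
    """Epsilon-prediction loss on clean coordinates x0 of shape
    (batch, MAX_LEN, 3), zero-padded to MAX_LEN residues."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)        # random timestep per sample
    eps = torch.randn_like(x0)                             # Gaussian noise
    a = alpha_bar.to(x0.device)[t].view(b, 1, 1)
    x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * eps           # forward noising step
    return nn.functional.mse_loss(denoiser(x_t, t), eps)   # predict the added noise
```

Doubling the maximum length from 256 to 512 roughly quadruples the cost of any attention-style layer in the denoiser, which is why memory and runtime optimization on the GPU nodes became central to the project.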


How did EPICURE support the project and what were the benefits of the support?

“Although we had used HPC systems in the past, we were not experienced in installing new packages and optimizing them on an HPC system, especially for AI training. EPICURE supported us in this process. For example, during the first training sessions, issues occurred due to the limited memory of the A100 GPUs once the number of amino acid residues was doubled. EPICURE helped us explore several options (model compiling, batch parallelism) to address this issue. Secondly, they helped us maximize the speed of the training process by analyzing the optimal number of GPU nodes. Overall, we were trained in using the HPC cluster for AI training, so we can now do it independently.

The benefit of EPICURE’s support was that we could start our project quickly, and we also saved time because the overall training was optimized. Furthermore, because we were using our resources more effectively, we could run more tests within the granted GPU hours.”
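
For illustration, the mitigations mentioned in the quote can be combined in a few lines of PyTorch: `torch.compile` for model compilation, distributed data parallelism to split batches across GPUs, and gradient accumulation to keep per-GPU memory within the A100's limits. The snippet below is a hedged sketch assuming a `torchrun` launch; the model, data loader, and hyperparameters are placeholders, not the project's actual training code.

```python
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def train(model: nn.Module, loader, accum_steps: int = 4) -> None:
    """Sketch of a memory-aware distributed training loop (launch with
    `torchrun`); assumes `model(batch)` returns a per-sample loss."""
    dist.init_process_group("nccl")                  # one process per GPU
    device = torch.device(f"cuda:{dist.get_rank() % torch.cuda.device_count()}")
    model = torch.compile(model.to(device))          # "model compiling"
    model = DDP(model, device_ids=[device.index])    # batch parallelism across GPUs
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)

    opt.zero_grad(set_to_none=True)
    for step, batch in enumerate(loader):
        loss = model(batch.to(device)).mean() / accum_steps
        loss.backward()                              # gradients accumulate in memory
        if (step + 1) % accum_steps == 0:            # smaller micro-batches fit on an A100
            opt.step()                               # one update per accumulation window
            opt.zero_grad(set_to_none=True)
    dist.destroy_process_group()
```

Finding the optimal number of GPU nodes, as described in the quote, then amounts to timing a loop like this at several node counts and choosing the point beyond which per-GPU throughput starts to drop.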

Project website:

www.puxano.com