Predictive Exploration of Correlation in Specific Interactions and Optimization Networks

  • Name: Predictive Exploration of Correlation in Specific Interactions and Optimization Networks (PRECISION)
  • EuroHPC machine used: Leonardo
  • Topic: Computer and information sciences; Biological Science

Overview of the project

The PRECISION (Predictive Exploration of Correlation in Specific Interactions and Optimization Networks) project aims to revolutionize the prediction of absolute Binding Free Energy (BFE) in Protein-Ligand complexes by integrating large-scale molecular dynamics (MD) simulations with machine learning (ML) models. Building on the successes of the LIGATE project, which simulated more than 4,500 protein-ligand complexes with four replicas each, this project seeks to address the critical bottlenecks in accurately predicting BFE. Traditional methods like Molecular Mechanics/Poisson-Boltzmann Surface Area (MM-PBSA) and Molecular Mechanics/Generalized Born Surface Area (MM-GBSA) often suffer from inconsistencies and high computational costs, limiting their scalability and precision. Other existing techniques, such as Free Energy Perturbation (FEP+), are either computationally expensive or impractical for large datasets.

PRECISION leverages the computational power of exascale resources and the novel Binding Free Energy eXscalate (BFEX) pipeline to process MD trajectories and create robust inputs for ML models. The primary goal of this project is to develop a fast and accurate ML-based scoring function for absolute BFE calculations, targeting improved specificity and affinity prediction for Protein-Ligand complexes. Through an in-depth analysis of Protein-Ligand interaction fingerprints, PRECISION has explored the correlation between amino acid interaction profiles at the atomic level and binding energies, ultimately contributing to more efficient drug discovery pipelines. The outcomes of this project have enhanced the speed and accuracy of hit-to-lead identification and lead optimization, bridging critical gaps in current drug development methodologies.

 

How did EPICURE support the project and what were the benefits of the support?

“Dompé farmaceutici S.p.A. has developed an internal software tool called Binding Free Energy Exscalate version 1.0 (BFEx version 1.0), which was originally a Bash script. BFEx version 1.0 was developed to calculate binding free energies. The software applies MM-GBSA/PBSA to a molecular dynamics trajectory and extracts interaction patterns that can then be used to build machine learning models. To increase production capacity, we requested EPICURE's help in creating a parallelised version of BFEx version 1.0.

The support provided through EPICURE was beneficial in several ways.

Firstly, it offered access to HPC expertise. The EPICURE team provided guidance on best practices for parallelisation, helping to identify the main bottlenecks of the original serial workflow and to redesign the analysis pipeline to exploit MPI-based parallelism effectively.

Secondly, EPICURE gave support for porting and refactoring the code. The migration from fragmented Bash/Python scripts to a unified, modular Python codebase was facilitated through EPICURE consultations. This made it possible to integrate mpi4py, restructure the workflow, and prepare the application for scalable execution on HPC systems.

Thirdly, it supported performance optimisation on the LEONARDO DCGP partition. EPICURE support helped validate the new version of the workflow on real HPC hardware, identify optimal job configurations, and test scalability from 1 to 5 nodes.

Fourthly, EPICURE helped with environment and dependency management. The team provided advice on managing scientific environments using mamba, ensuring reproducibility and compatibility with the AMBER ecosystem, NumPy, Pandas, MDAnalysis, pytraj and mpi4py. This greatly streamlined deployment on HPC systems.

Overall, EPICURE support was essential for transforming the BFEx version 1.0 (Property Owner: Dompé farmaceutici S.p.A.) workflow into a scalable, high-performance application capable of processing large datasets efficiently and reliably. The improvements achieved during the project directly enabled high-throughput analysis of thousands of protein–ligand complexes, making the pipeline suitable for large-scale screening efforts within the EuroHPC ecosystem.” – Akash Deep Biswas (Senior scientist, Dompé farmaceutici S.p.A.)
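
To illustrate the kind of MPI-based redesign described in the testimonial above, the following is a minimal sketch of distributing independent per-complex analyses across MPI ranks. It is not the BFEx code: the function names (tasks_for_rank, analyse_complex, run) and the round-robin split are assumptions made for illustration, and the per-complex analysis is a placeholder. The communicator methods used (Get_rank, Get_size, gather) are standard mpi4py calls.

```python
def tasks_for_rank(tasks, rank, size):
    """Round-robin split: rank r gets items r, r + size, r + 2*size, ..."""
    return tasks[rank::size]


def analyse_complex(name):
    """Hypothetical stand-in for the real per-complex analysis."""
    return {"complex": name, "status": "done"}


def run(tasks, comm=None):
    """Process this rank's share of tasks and collect all results on rank 0.

    comm is an mpi4py communicator (for example MPI.COMM_WORLD), or None
    for a plain serial run without MPI.
    """
    rank = comm.Get_rank() if comm is not None else 0
    size = comm.Get_size() if comm is not None else 1

    # Each rank works only on its own slice; no communication is needed here.
    local = [analyse_complex(t) for t in tasks_for_rank(tasks, rank, size)]

    if comm is None:
        return local

    # gather returns a list of per-rank lists on rank 0 and None elsewhere.
    gathered = comm.gather(local, root=0)
    if rank != 0:
        return None
    # Flatten; results come back grouped by rank, not in the input order.
    return [result for part in gathered for result in part]
```

Under an MPI launcher, each process would import MPI from mpi4py and call run(complexes, MPI.COMM_WORLD). A static split like this is the simplest option when the cost per complex is roughly uniform; if run times vary widely, a dynamic work queue may balance the load better.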

 

Additional references

The link to the Git repository can be made available upon request.

Contact the project:

  • Akash Deep Biswas (akashdeep.biswas@dompe.com)