Industrial engineering graduate student powers next-generation open-source AI

Feb 12, 2026

Editor’s Note: A version of this article originally appeared on the Institute for Computational and Data Sciences site. 

UNIVERSITY PARK, Pa. — Solving protein structures and designing de novo proteins poses a critical computational challenge: integrating high-performance computing (HPC), structural bioinformatics and open-source software infrastructure. Meeting that challenge is important for advancing discovery and translational research in the field. Vinay Mathew, a Penn State doctoral student in the Harold and Inge Marcus Department of Industrial and Manufacturing Engineering, has emerged as a key innovator in this area. At the heart of his recent work are two production-grade service modules built for modern HPC environments. 

Mathew is working with Soundar Kumara, Allen E. Pearce and Allen M. Pearce Professor of Industrial Engineering, doctoral adviser and director of the Center for Applications of Artificial Intelligence and Machine Learning to Industry (AIMI); Gretta Kellogg, engineering program manager at the Penn State Institute for Computational and Data Sciences and AIMI assistant director; and William KM Lai, assistant research professor in biology and genetics and computational biology at Cornell University. 

Open OnDemand Integration of AlphaFold 2/3 for On-Demand Prediction 

The standard AlphaFold workflow leaves resources under-utilized: a long central processing unit (CPU)-bound phase that generates the multiple sequence alignment (MSA) is followed by a brief graphics processing unit (GPU)-intensive structure-prediction phase. Recognizing this, Mathew developed a tailored implementation that separates the CPU and GPU phases within a single Open OnDemand instance. His work showed that approximately 75% of runtime was consumed by CPU-heavy tasks; by restructuring how the workflow is scheduled, he enabled far more efficient GPU usage. 
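To make the effect of that 75% figure concrete, the following Python sketch compares how many GPU-hours one prediction reserves when it runs as a single job versus as two separately scheduled phases. It is an illustration only, not Mathew's implementation: the `Phase` class, the function names and the 8-hour total runtime are hypothetical, and only the 75% CPU share comes from the article.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Phase:
    name: str
    hours: float
    needs_gpu: bool


def split_workflow(total_hours: float, cpu_fraction: float) -> List[Phase]:
    """Model one prediction as a CPU-bound phase followed by a GPU phase."""
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    return [
        Phase("msa_generation", total_hours * cpu_fraction, needs_gpu=False),
        Phase("structure_prediction", total_hours * (1.0 - cpu_fraction), needs_gpu=True),
    ]


def gpu_hours_reserved(phases: List[Phase], split: bool) -> float:
    """GPU-hours held by the job.

    split=False: one monolithic job holds the GPU for every phase.
    split=True:  each phase is scheduled on its own, so the GPU is
                 held only while a GPU phase is running.
    """
    if split:
        return sum(p.hours for p in phases if p.needs_gpu)
    return sum(p.hours for p in phases)


# Hypothetical 8-hour run; the 0.75 CPU share is the figure quoted above.
phases = split_workflow(total_hours=8.0, cpu_fraction=0.75)
monolithic = gpu_hours_reserved(phases, split=False)
separated = gpu_hours_reserved(phases, split=True)
print(f"monolithic: {monolithic} GPU-hours, split: {separated} GPU-hours")
```

Under these assumptions the GPU reservation falls from 8 hours to 2 hours per prediction, which is why scheduling the two phases separately frees GPUs for other work.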

The framework was recently presented at the GOOD25 conference at Harvard University and is now being adopted by the National Center for Supercomputing Applications (NCSA) as a production service, enabling researchers to run protein-structure prediction at scale directly through their HPC portal interface. 

ARM64-Optimized Containerized Infrastructure for Protein-Structure & Design 

Building on container technology, Mathew led the development of an Apptainer container stack tailored to both ARM64 architectures, which are found in next-generation HPC nodes, and x86_64 systems. By providing pre-configured, architecture-aware containers for the AlphaFold3, Boltz and Chai-1 models, this project lowers the barrier for HPC centers to deploy leading-edge protein-prediction and design workflows. 
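One way to picture "architecture-aware" deployment is a small launcher that checks the host's CPU architecture and picks the matching container image. The Python sketch below is a hypothetical illustration, not code from the project: the `<model>_<arch>.sif` file naming and the function names are assumptions, while `.sif` is Apptainer's standard image format and `platform.machine()` is part of the Python standard library.

```python
import platform
from typing import Optional

# Map the strings reported by different operating systems onto two labels.
ARCH_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalize_arch(machine: str) -> str:
    """Return 'x86_64' or 'arm64' for a platform.machine()-style string."""
    key = machine.strip().lower()
    if key not in ARCH_ALIASES:
        raise ValueError(f"unsupported architecture: {machine!r}")
    return ARCH_ALIASES[key]


def image_for(model: str, machine: Optional[str] = None) -> str:
    """Pick the image file for a model on the given (or current) host.

    The '<model>_<arch>.sif' naming scheme is hypothetical.
    """
    arch = normalize_arch(machine if machine is not None else platform.machine())
    return f"{model}_{arch}.sif"


# Explicit architecture strings keep the example independent of the host.
print(image_for("boltz", "aarch64"))
print(image_for("alphafold3", "x86_64"))
```

Calling `image_for("chai-1")` with no second argument would resolve the image for whatever machine the script runs on, so the same launcher could serve ARM64 and x86_64 nodes in a mixed cluster.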

By supporting ARM64 and GPU pathways, the project anticipates and matches evolving HPC hardware landscapes, positioning institutions like NCSA for the future of structural-bioinformatics pipelines. 

What’s Next: Launching the Open-Source EDM Platform 

Having built foundational infrastructure, Mathew is now turning to his newest endeavor: the open-source EDM project. This next-generation software platform is designed to support high-throughput ensemble diffusion modeling, generative structural-design workflows and multi-modal integration of sequence, structure and design. 

By leveraging the same infrastructure experience that served the AlphaFold and design-container efforts (containers, HPC service design and efficient scheduling), EDM expands the platform footprint from prediction to design and inference, enabling the broader research community to adopt cutting-edge generative modeling workflows within their HPC portals. 

Why This Matters 

  • Efficiency at Scale: Separating CPU and GPU phases, aligning container architectures with hardware and providing user access through Open OnDemand (OOD) help institutions achieve high utilization and give researchers friendly access to complex workflows. 
  • Open-Source, Reproducible: Both the OOD-AlphaFold3 integration and the container designs are fully open, supporting transparency, community adoption and reproducibility. 
  • Future-Ready Architecture: By embracing ARM64, next-gen GPUs and container stacking, Mathew’s work helps institutions stay ahead of the hardware and algorithmic curve. 
  • Broad Impact for Structural Bioinformatics: Protein-structure prediction and design are foundational in drug discovery, functional genomics and synthetic biology. Deploying these workflows as easily accessible services via HPC portals dramatically broadens their reach. 

If your research group, HPC center or institutional computational resource aims to deploy a production-grade protein-structure or protein-design service, or is interested in generative modeling workflows via EDM, ICDS Fellow Vinay Mathew’s open-source frameworks offer a ready path. 

Contact Penn State ICDS to arrange a demo, discussion or collaborative deployment and consider adoption across your HPC portal, container registry or workflow platform. 

Explore the GitHub repositories: 

Funded by the Penn State ICDS Graduate Fellowship 

This work is supported by the Graduate Student Fellowship of the Institute for Computational & Data Sciences (ICDS) at Penn State University and NIH grant R35GM155380 to William Lai. This fellowship has enabled the development, documentation and deployment of these open-source service modules and container stacks, reinforcing Penn State’s commitment to open tools, reproducible science and scalable HPC infrastructure. 

 

MEDIA CONTACT:

College of Engineering Media Relations

communications@engr.psu.edu