HPC Forge

HPC Forge is a parallel computing research lab at the University of California, Irvine. We aim to advance computational science and engineering using high-performance computing and artificial intelligence. Our target platforms range from single nodes to large-scale systems (i.e., supercomputers). Check out the Projects tab for active projects.

We are always looking for interested and motivated students/postdocs to join our team. If you are interested in joining our research lab, please email your CV and one representative publication (if any) to amowli@uci.edu.

Hi there!

I’m an associate professor in the Department of Electrical Engineering and Computer Science at UC Irvine. My research is in high-performance computing, and I lead the HPC Forge research lab. I received my PhD in Computational Science and Engineering from Georgia Tech in 2013 as a member of the HPC Garage. Prior to joining UCI, I was a research scientist at MIT CSAIL, where I worked on the X-Stack (exascale software stack) project.


Interests

  • High-performance computing
  • Parallel algorithms and applications
  • Performance analysis and modeling
  • Scientific and data-intensive computing
  • AI for science


Education

  • PhD in Computational Science and Engineering, 2013

    Georgia Institute of Technology

  • BE in Computer Science and Engineering, 2007

    Anna University

The Lab

Current Members


Octavi Obiols Sales

PhD candidate, MAE


Mehrnaz Asadi

PhD student, EECS


Sebastian Barschkis

PhD student, EECS



Alumni


Hengjie Wang

MAE PhD (2021)


Shu-Mei Tseng

EECS MS (2021)


Rohit Zambre

EECS PhD (2020)


Behnam Pourghassemi

EECS PhD (2021)


Laleh Beni

CS PhD (2019)


Bahareh Davani

EECS MS (2016)


Ferran Marti

Postdoctoral Scholar (2017 - 2018)

Recent Publications

Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains

SURFNet: Super-resolution of Turbulent Flows with Transfer Learning using Small Datasets

Demystifying asynchronous I/O Interference in HPC applications

Logically Parallel Communication for Fast MPI+Threads Applications

Train Once and Use Forever: Solving Boundary Value Problems in Unseen Domains with Pre-trained Deep Learning Models



Projects

A CFD solver for high-performance turbulent flow simulations

High-performance Communication

A fast MPI+threads library for exascale supercomputers

Machine & Deep Learning

HPC for accelerating ML/DL and DL for science

Web browsers

Scalable dynamic analysis of web browsers

Recent & Upcoming Talks

Transferable Deep Learning Surrogates for Solving PDEs

Only Relative Speed Matters -- Virtual Causal Profiling

Scalable Web Performance Analysis Using Causal Profiling

On the Limits of Parallelizing Convolutional Neural Networks on GPUs

CFDNet - A deep learning-based accelerator for fluid simulations