HPC Forge

HPC Forge is a parallel computing research lab at the University of California, Irvine. We aim to advance computational science and engineering using high-performance computing and artificial intelligence. Our target platforms range from single-node machines to large-scale systems (i.e., supercomputers). Check out the Projects tab for active projects.

We are always looking for motivated students and postdocs to join our team. If you are interested in joining our research lab, please email your CV and one representative publication (if any) to amowli@uci.edu.

Hi there!

I’m an associate professor in the Department of Electrical Engineering and Computer Science at UC Irvine. My research is in the area of high-performance computing, and HPC Forge is my research lab. I received my Ph.D. in Computational Science and Engineering from Georgia Tech in 2013, where I was part of the HPC Garage. Prior to joining UCI, I was a research scientist at MIT CSAIL, where I worked on the X-Stack (exascale software stack) project.


  • High-performance computing
  • Parallel algorithms and applications
  • Performance analysis and modeling
  • Scientific and data-intensive computing
  • AI for science


  • PhD in Computational Science and Engineering, 2013

    Georgia Institute of Technology

  • BE in Computer Science and Engineering, 2007

    Anna University

The Lab

Current Members


Behnam Pourghassemi

PhD Candidate, EECS


Hengjie Wang

PhD Candidate, MAE


Octavi Obiols Sales

PhD Candidate, MAE



Rohit Zambre

EECS PhD (2020)


Shu-Mei Tseng



Laleh Beni

CS PhD (2019)


Bahareh Davani

EECS MS (2016)


Ferran Marti

Postdoctoral Scholar (2017 - 2018)

Recent Publications

adPerf: Characterizing the Performance of Third-party Ads

Optimizing the Hypre solver for manycore and GPU architectures

Only Relative Speed Matters: Virtual Causal Profiling

Pencil: A pipelined algorithm for distributed stencils

Artificial Intelligence and High-Performance Computing: The Drivers of Tomorrow's Science



A CFD solver for high-performance turbulent flow simulations

High-performance Communication

A fast MPI+threads library for exascale supercomputers

Machine & Deep Learning

HPC for accelerating ML/DL and DL for science

Web browsers

Scalable dynamic analysis of web browsers

Recent & Upcoming Talks

Only Relative Speed Matters -- Virtual Causal Profiling

Scalable Web Performance Analysis Using Causal Profiling

On the Limits of Parallelizing Convolutional Neural Networks on GPUs

CFDNet - A deep learning-based accelerator for fluid simulations

How I Learned to Stop Worrying about User-Visible Endpoints and Love MPI