Publications | HPC Forge

BERN-NN-IBF: Enhancing Neural Network Bound Propagation Through Implicit Bernstein Form and Optimized Tensor Operations

Wael Fatnassi, Arthur Feeney, Aparna Chandramowlishwaran, Yasser Shoukry

November 2024 IEEE Transactions on Computer-Aided Design (IEEE TCAD)

PDF Project

Recent progress of artificial intelligence for liquid-vapor phase change heat transfer

Youngjoon Suh, Aparna Chandramowlishwaran, Yoonjin Won

March 2024 npj Computational Materials

PDF Project

BubbleML: A Multiphysics Dataset and Benchmarks for Machine Learning

Sheikh Md Shakeel Hassan, Arthur Feeney, Akash Dhruv, Jihoon Kim, Youngjoon Suh, Jaiyoung Ryu, Yoonjin Won, Aparna Chandramowlishwaran

December 2023 Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS Spotlight)

PDF Project

Breaking Boundaries: Distributed Domain Decomposition with Scalable Physics-Informed Neural PDE Solvers

Arthur Feeney, Zitong Li, Ramin Bostanabad, Aparna Chandramowlishwaran

November 2023 Proc. ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC)

PDF Project

ADARNet: Deep Learning Predicts Adaptive Mesh Refinement

Octavi Obiols-Sales, Abhinav Vishnu, Nicholas Malaya, Aparna Chandramowlishwaran

August 2023 International Conference on Parallel Processing (ICPP)

PDF Project

Lessons Learned on MPI+Threads Communication

Rohit Zambre, Aparna Chandramowlishwaran

November 2022 Proc. ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC)

PDF Project

Mosaic flows: A transferable deep learning framework for solving PDEs on unseen domains

Hengjie Wang, Robert Planas, Aparna Chandramowlishwaran, Ramin Bostanabad

January 2022 Computer Methods in Applied Mechanics and Engineering

PDF Project

SURFNet: Super-resolution of Turbulent Flows with Transfer Learning using Small Datasets

Octavi Obiols-Sales, Abhinav Vishnu, Nicholas Malaya, Aparna Chandramowlishwaran

September 2021 Proc. 30th International Conference on Parallel Architectures and Compilation Techniques (PACT)

PDF Project

Demystifying asynchronous I/O Interference in HPC applications

Shu-Mei Tseng, Bogdan Nicolae, Franck Cappello, Aparna Chandramowlishwaran

May 2021 The International Journal of High Performance Computing Applications (IJHPCA)

PDF

Train Once and Use Forever: Solving Boundary Value Problems in Unseen Domains with Pre-trained Deep Learning Models

Hengjie Wang, Robert Planas, Aparna Chandramowlishwaran, Ramin Bostanabad

April 2021 arXiv

PDF Project

Logically Parallel Communication for Fast MPI+Threads Applications

Rohit Zambre, Damodar Sahasrabudhe, Hui Zhou, Martin Berzins, Aparna Chandramowlishwaran, Pavan Balaji

April 2021 IEEE Transactions on Parallel and Distributed Systems (TPDS)

PDF

adPerf: Characterizing the Performance of Third-party Ads

Behnam Pourghassemi, Jordan Bonecutter, Zhou Li, Aparna Chandramowlishwaran

March 2021 Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS)

PDF Code

Optimizing the Hypre solver for manycore and GPU architectures

Damodar Sahasrabudhe, Rohit Zambre, Aparna Chandramowlishwaran, Martin Berzins

February 2021 Journal of Computational Science

PDF

Pencil: A pipelined algorithm for distributed stencils

Hengjie Wang, Aparna Chandramowlishwaran

November 2020 Proc. ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC)

PDF Project

Only Relative Speed Matters: Virtual Causal Profiling

Behnam Pourghassemi, Ardalan Amiri Sani, Aparna Chandramowlishwaran

November 2020 Proc. ACM Performance Evaluation Review (PER)

PDF Slides Video

Artificial Intelligence and High-Performance Computing: The Drivers of Tomorrow's Science

Aparna Chandramowlishwaran

October 2020 SIAM News

PDF Project

Brief Announcement: On the Limits of Parallelizing Convolutional Neural Networks on GPUs

Behnam Pourghassemi, Chenghao Zhang, Joo Hwan Lee, Aparna Chandramowlishwaran

July 2020 Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA)

PDF Project Video

How I Learned to Stop Worrying about User-Visible Endpoints and Love MPI

Rohit Zambre, Aparna Chandramowlishwaran, Pavan Balaji

June 2020 Proc. ACM International Conference on Supercomputing (ICS)

PDF Slides Video

CFDNet: A deep learning-based accelerator for fluid simulations

Octavi Obiols-Sales, Abhinav Vishnu, Nicholas Malaya, Aparna Chandramowlishwaran

June 2020 Proc. ACM International Conference on Supercomputing (ICS)

PDF Project Video

Towards Portable Online Prediction of Network Utilization using MPI-level Monitoring

Shu-Mei Tseng, Bogdan Nicolae, George Bosilca, Emmanuel Jeannot, Aparna Chandramowlishwaran, Franck Cappello

August 2019 Proc. 25th International Conference on Parallel and Distributed Computing (EuroPar)

PDF Project

Breaking Band: A Breakdown of High-performance Communication

Rohit Zambre, Megan Grodowitz, Aparna Chandramowlishwaran, Pavel Shamis

August 2019 Proc. 48th International Conference on Parallel Processing (ICPP)

PDF Slides

What-If Analysis of Page Load Time in Web Browsers Using Causal Profiling

Behnam Pourghassemi, Ardalan Amiri Sani, Aparna Chandramowlishwaran

June 2019 Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS)

PDF Code Slides

Multi-criteria partitioning of multi-block structured grids

Hengjie Wang, Aparna Chandramowlishwaran

June 2019 Proc. ACM International Conference on Supercomputing (ICS)

PDF Code Project Slides

Portal: A High-Performance Language and Compiler for Parallel N-body Problems

Laleh Aghababaie Beni, Saikiran Ramanan, Aparna Chandramowlishwaran

May 2019 Proc. 33rd IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS)

PDF Code Project Slides

Scalable Communication Endpoints for MPI+Threads Applications

Rohit Zambre, Aparna Chandramowlishwaran, Pavan Balaji

December 2018 Proc. IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS)

PDF Poster

Roofline Guided Design and Analysis of a Multi-stencil CFD Solver for Multicore Performance

Bahareh Mostafazadeh, Ferran Marti, Feng Liu, Aparna Chandramowlishwaran

May 2018 Proc. 32nd IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS)

PDF Project Slides

Sugar: Secure GPU Acceleration in Web Browsers

Zhihao Yao, Zongheng Ma, Yingtong Liu, Ardalan Amiri Sani, Aparna Chandramowlishwaran

February 2018 ACM SIGPLAN Notices

PDF

cudaCR: An in-kernel application-level checkpoint/restart scheme for CUDA-enabled GPUs

Behnam Pourghassemi, Aparna Chandramowlishwaran

September 2017 Proc. IEEE International Conference on Cluster Computing (CLUSTER)

PDF

PASCAL: A Parallel Algorithmic SCALable Framework for N-body Problems

We propose PASCAL, a parallel unified algorithmic framework for generalized N-body problems. PASCAL utilizes tree data structures and …

Laleh Aghababaie Beni, Aparna Chandramowlishwaran

August 2017 Proc. 23rd International Conference on Parallel and Distributed Computing (EuroPar)

PDF Project Slides

Unsteady Navier-Stokes Flow on GPU Architectures

Bahareh Mostafazadeh Davani, Ferran Marti, Behnam Pourghassemi, Feng Liu, Aparna Chandramowlishwaran

June 2017 Proc. 23rd AIAA Computational Fluid Dynamics Conference

PDF Project

Parallel Performance-Energy Predictive Modeling of Browsers: Case Study of Servo

Rohit Zambre, Lars Bergstrom, Laleh A. Beni, Aparna Chandramowlishwaran

December 2016 Proc. IEEE Int’l Conf. on High Performance Computing, Data, and Analytics (HiPC)

PDF Slides

A SystemC model for N-body problems and its Parallel Design Space Exploration

Kasra Moazzemi, Rainer Doemer, Aparna Chandramowlishwaran

November 2016

PDF

A CPU-GPU Hybrid Implementation and Model-Driven Scheduling of the Fast Multipole Method

Jee Whan Choi, Aparna Chandramowlishwaran, Kamesh Madduri, Richard Vuduc

March 2014 Proc. 7th Wkshp. on General-purpose Processing using GPUs (GPGPU-7)

PDF

Brief Announcement: Towards a Communication Optimal Fast Multipole Method and its Implications at Exascale

Aparna Chandramowlishwaran, Jee Whan Choi, Kamesh Madduri, Richard Vuduc

June 2012 Proc. ACM Symposium on Parallel Algorithms and Architectures (SPAA)

PDF

Courses in High-Performance Computing for Scientists and Engineers

Richard Vuduc, Kenneth Czechowski, Aparna Chandramowlishwaran, Jee Whan Choi

May 2012 Proc. NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar)

PDF

Communication-Optimal Parallel N-body Solvers

Aparna Chandramowlishwaran, Richard Vuduc

May 2012 Proc. 26th IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS)

A massively parallel adaptive Fast Multipole Method on heterogeneous architectures

Ilya Lashuk, Aparna Chandramowlishwaran, Harper Langston, Tuan-Anh Nguyen, Rahul Sampath, Aashay Shringarpure, Richard Vuduc, Lexing Ying, Denis Zorin, George Biros

May 2012 Communications of the ACM (CACM)

PDF

Balance principles for algorithm-architecture co-design

Kenneth Czechowski, Casey Battaglino, Chris McClanahan, Aparna Chandramowlishwaran, Richard Vuduc

May 2011 Proc. USENIX Wkshp. on Hot Topics in Parallelism (HotPar)

PDF

Petascale direct numerical simulation of blood flow on 200k cores and heterogeneous architectures

Abtin Rahimian, Ilya Lashuk, Dhairya Malhotra, Aparna Chandramowlishwaran, Logan Moon, Rahul Sampath, Aashay Shringarpure, Shravan Veerapaneni, Jeffrey Vetter, Richard Vuduc, Denis Zorin, George Biros

November 2010 Proc. ACM/IEEE Conf. Supercomputing (SC)

PDF

Diagnosis, Tuning, and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method

Aparna Chandramowlishwaran, Kamesh Madduri, Richard Vuduc

November 2010 Proc. ACM/IEEE Conf. Supercomputing (SC)

PDF

On the limits of GPU acceleration

Richard Vuduc, Aparna Chandramowlishwaran, Jee Whan Choi, Murat Efe Guney, Aashay Shringarpure

June 2010 Proc. USENIX Wkshp. on Hot Topics in Parallelism (HotPar)

PDF

Performance evaluation of Concurrent Collections on high-performance multicore computing systems

Aparna Chandramowlishwaran, Kathleen Knobe, Richard Vuduc

April 2010 Proc. IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS)

PDF

Optimizing and tuning the Fast Multipole Method for state-of-the-art multicore architectures

Aparna Chandramowlishwaran, Samuel Williams, Leonid Oliker, Ilya Lashuk, George Biros, Richard Vuduc

April 2010 Proc. IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS)

PDF

Performance evaluation of Concurrent Collections on high-performance multicore computing systems

Aparna Chandramowlishwaran, Kathleen Knobe, Richard Vuduc

February 2010

Applying the Concurrent Collections Programming Model to Asynchronous Parallel Dense Linear Algebra

Aparna Chandramowlishwaran, Kathleen Knobe, Richard Vuduc

January 2010 Proc. ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP)

PDF

A massively parallel adaptive Fast Multipole Method on heterogeneous architectures

Ilya Lashuk, Aparna Chandramowlishwaran, Harper Langston, Tuan-Anh Nguyen, Rahul Sampath, Aashay Shringarpure, Richard Vuduc, Lexing Ying, Denis Zorin, George Biros

November 2009 Proc. ACM/IEEE Conf. Supercomputing (SC)

PDF

Multi-core implementations of the Concurrent Collections programming model

Zoran Budimlić, Aparna Chandramowlishwaran, Kathleen Knobe, Geoff Lowney, Vivek Sarkar, Leo Treggiari

January 2009 Proc. Wkshp. Compilers for Parallel Computing (CPC)

PDF

Declarative Aspects of Memory Management in the Concurrent Collections Parallel Programming Model

Zoran Budimlić, Aparna Chandramowlishwaran, Kathleen Knobe, Geoff Lowney, Vivek Sarkar, Leo Treggiari

January 2009 Workshop on Declarative Aspects of Multicore Programming (DAMP)

PDF

On the Design of Fast Pseudo-Random Number Generators for the Cell Broadband Engine and an Application to Risk Analysis

David A. Bader, Aparna Chandramowlishwaran, Virat Agarwal

September 2008 Proc. 37th Int’l. Conf. Parallel Processing (ICPP)

PDF

Numerical algorithms with tunable parallelism

Aparna Chandramowlishwaran, Abhinav Karhu, Ketan Umare, Richard Vuduc

April 2008 Proc. Wkshp. Software Tools for Multicore Systems (STMCS), at IEEE/ACM Int’l. Symp. Code Generation and Optimization (CGO)

PDF