publications | Piyush K. Sao

2026

FUNNL: Fast Nonlinear Nonnegative Unmixing for Alternate Energy Systems

Jeffrey A Graves, Thomas F Blum, Piyush Sao, and 2 more authors

In Knowledge-Guided Machine Learning, 2023

Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters

Piyush Sao, Yang Liu, Nan Ding, and 2 more authors

In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2023

Optimizing Communication in 2D Grid-Based MPI Applications at Exascale

Hao Lu, Piyush Sao, Michael Matheson, and 3 more authors

In Proceedings of the 30th European MPI Users’ Group Meeting, 2023

Brief Announcement: Communication Optimal Sparse LU Factorization for Planar Matrices

Piyush Sao and Xiaoye Sherry Li

In Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures, 2023

A single-tree algorithm to compute the Euclidean minimum spanning tree on GPUs

Andrey Prokopenko, Piyush Sao, and Damien Lebrun-Grandie

In Proceedings of the 51st International Conference on Parallel Processing, 2022

Exaflops biomedical knowledge graph analytics

Ramakrishnan Kannan, Piyush Sao, Hao Lu, and 8 more authors

In 2022 SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022

Newly Released Capabilities in Distributed-memory SuperLU Sparse Direct Solver

Xiaoye S Li, Paul Lin, Yang Liu, and 1 more author

ACM Transactions on Mathematical Software, 2022

Resilience Design Patterns: A Structured Approach to Resilience at Extreme Scale (Version 2.0)

Christian Engelmann, Rizwan Ashraf, Saurabh Hukerikar, and 2 more authors

2022


Sparse Binary Matrix-Vector Multiplication on Neuromorphic Computers

Catherine D Schuman, Bill Kay, Prasanna Date, and 3 more authors

In 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2021

Dense semiring linear algebra on modern CUDA hardware

Vijay Thakkar, Ramakrishnan Kannan, Piyush Sao, and 5 more authors

2021

SIAM Computational Sciences and Engineering. SIAM


Scalable All-pairs Shortest Paths for Huge Graphs on Multi-GPU Clusters

Piyush Sao, Hao Lu, Ramakrishnan Kannan, and 3 more authors

In Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing, 2020

Scalable knowledge graph analytics at 136 petaflop/s

Ramakrishnan Kannan, Piyush Sao, Hao Lu, and 5 more authors

In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, 2020

A supernodal all-pairs shortest path algorithm

Piyush Sao, Ramakrishnan Kannan, Prasun Gera, and 1 more author

In Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

Traversing large graphs on GPUs with unified memory

Prasun Gera, Hyojong Kim, Piyush Sao, and 2 more authors

Proceedings of the VLDB Endowment, 2020

Multifrontal Non-negative Matrix Factorization

Piyush Sao and Ramakrishnan Kannan

In International Conference on Parallel Processing and Applied Mathematics, 2019

Self-stabilizing Connected Components

Piyush Sao, Christian Engelmann, Srinivas Eswar, and 2 more authors

In 2019 IEEE/ACM 9th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS), 2019

A Communication-avoiding 3D Sparse Triangular Solve Algorithm

Piyush Sao, Ramakrishnan Kannan, Xiaoye Li, and 1 more author

In International Conference on Supercomputing, Jun 2019

A communication-avoiding 3D algorithm for sparse LU factorization on heterogeneous systems

Piyush Sao, Xiaoye S Li, and Richard Vuduc

Journal of Parallel and Distributed Computing, Jun 2019

A communication-avoiding 3D LU factorization algorithm for sparse matrices

Piyush Sao, Xiaoye S. Li, and Richard Vuduc

In Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2018

Scalable and Resilient Sparse Linear Solvers

Piyush Sao

Georgia Institute of Technology, Aug 2018

A Self-Correcting Connected Components Algorithm

Piyush Sao, Oded Green, Chirag Jain, and 1 more author

In Proceedings of the ACM Workshop on Fault-Tolerance for HPC at Extreme Scale, Aug 2016

SuperLU Users’ Guide

Xiaoye S Li, James W Demmel, John R Gilbert, and 4 more authors

Aug 2016

A Sparse Direct Solver for Distributed Memory Xeon Phi-accelerated Systems

Piyush Sao, Xing Liu, Richard Vuduc, and 1 more author

In 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Aug 2015

A distributed CPU-GPU sparse direct solver

Piyush Sao, Richard Vuduc, and Xiaoye Sherry Li

In European Conference on Parallel Processing, Aug 2014

A distributed kernel summation framework for general-dimension machine learning

Dongryeol Lee, Piyush Sao, Richard Vuduc, and 1 more author

Statistical Analysis and Data Mining: The ASA Data Science Journal, Aug 2014

Self-stabilizing iterative solvers

Piyush Sao and Richard Vuduc

In Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Aug 2013

Model Order Reduction Techniques for VLSI Circuit Simulation

Piyush Sao

IIT Madras, May 2011