
Publications by category in reverse chronological order. Generated by jekyll-scholar.


  1. PANDORA: A Parallel Dendrogram Construction Algorithm for Single Linkage Clustering on GPU
    Piyush Sao, Andrey Prokopenko, and Damien Lebrun-Grandié
    arXiv preprint arXiv:2401.06089, 2024


  1. FUNNL: Fast Nonlinear Nonnegative Unmixing for Alternate Energy Systems
    Jeffrey A Graves, Thomas F Blum, Piyush Sao, and 2 more authors
    In Knowledge-Guided Machine Learning, 2023
  2. Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters
    Piyush Sao, Yang Liu, Nan Ding, and 2 more authors
    In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2023
  3. Optimizing Communication in 2D Grid-Based MPI Applications at Exascale
    Hao Lu, Piyush Sao, Michael Matheson, and 3 more authors
    In Proceedings of the 30th European MPI Users’ Group Meeting, 2023
  4. Brief Announcement: Communication Optimal Sparse LU Factorization for Planar Matrices
    Piyush Sao and Xiaoye Sherry Li
    In Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures, 2023


  1. A single-tree algorithm to compute the Euclidean minimum spanning tree on GPUs
    Andrey Prokopenko, Piyush Sao, and Damien Lebrun-Grandié
    In Proceedings of the 51st International Conference on Parallel Processing, 2022
  2. Exaflops biomedical knowledge graph analytics
    Ramakrishnan Kannan, Piyush Sao, Hao Lu, and 8 more authors
    In 2022 SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022
  3. Newly Released Capabilities in Distributed-memory SuperLU Sparse Direct Solver
    Xiaoye S Li, Paul Lin, Yang Liu, and 1 more author
    ACM Transactions on Mathematical Software, 2022
  4. Resilience Design Patterns: A Structured Approach to Resilience at Extreme Scale (Version 2.0)
    Christian Engelmann, Rizwan Ashraf, Saurabh Hukerikar, and 2 more authors


  1. Sparse Binary Matrix-Vector Multiplication on Neuromorphic Computers
    Catherine D Schuman, Bill Kay, Prasanna Date, and 3 more authors
    In 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2021


  1. Scalable All-pairs Shortest Paths for Huge Graphs on Multi-GPU Clusters
    Piyush Sao, Hao Lu, Ramakrishnan Kannan, and 3 more authors
    In Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing, 2020
  2. Scalable knowledge graph analytics at 136 petaflop/s
    Ramakrishnan Kannan, Piyush Sao, Hao Lu, and 5 more authors
    In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, 2020
  3. A supernodal all-pairs shortest path algorithm
    Piyush Sao, Ramakrishnan Kannan, Prasun Gera, and 1 more author
    In Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020
  4. Traversing large graphs on GPUs with unified memory
    Prasun Gera, Hyojong Kim, Piyush Sao, and 2 more authors
    Proceedings of the VLDB Endowment, 2020


  1. Multifrontal Non-negative Matrix Factorization
    Piyush Sao and Ramakrishnan Kannan
    In International Conference on Parallel Processing and Applied Mathematics, 2019
  2. Self-stabilizing Connected Components
    Piyush Sao, Christian Engelmann, Srinivas Eswar, and 2 more authors
    In 2019 IEEE/ACM 9th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS), 2019
  3. A Communication-avoiding 3D Sparse Triangular Solve Algorithm
    Piyush Sao, Ramakrishnan Kannan, Xiaoye Li, and 1 more author
    In International Conference on Supercomputing, Jun 2019
  4. A communication-avoiding 3D algorithm for sparse LU factorization on heterogeneous systems
    Piyush Sao, Xiaoye S Li, and Richard Vuduc
    Journal of Parallel and Distributed Computing, Jun 2019


  1. A communication-avoiding 3D LU factorization algorithm for sparse matrices
    Piyush Sao, Xiaoye S. Li, and Richard Vuduc
    In Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2018
  2. Scalable and Resilient Sparse Linear Solvers
    Piyush Sao
    Georgia Institute of Technology, Aug 2018


  1. A Self-Correcting Connected Components Algorithm
    Piyush Sao, Oded Green, Chirag Jain, and 1 more author
    In Proceedings of the ACM Workshop on Fault-Tolerance for HPC at Extreme Scale, Aug 2016
  2. SuperLU Users’ Guide
    Xiaoye S Li, James W Demmel, John R Gilbert, and 4 more authors
    Aug 2016


  1. A Sparse Direct Solver for Distributed Memory Xeon Phi-accelerated Systems
    Piyush Sao, Xing Liu, Richard Vuduc, and 1 more author
    In 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Aug 2015


  1. A distributed CPU-GPU sparse direct solver
    Piyush Sao, Richard Vuduc, and Xiaoye Sherry Li
    In European Conference on Parallel Processing, Aug 2014
  2. A distributed kernel summation framework for general-dimension machine learning
    Dongryeol Lee, Piyush Sao, Richard Vuduc, and 1 more author
    Statistical Analysis and Data Mining: The ASA Data Science Journal, Aug 2014


  1. Self-stabilizing iterative solvers
    Piyush Sao and Richard Vuduc
    In Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Aug 2013


  1. Model Order Reduction Techniques for VLSI Circuit Simulation
    Piyush Sao
    IIT Madras, May 2011