ArborX

Performance-Portable Spatial Search and Clustering

The Spatial Search Problem

Many scientific applications—from cosmology simulations to wind farm modeling—need to answer geometric queries at massive scale: Which particles are within distance r? What are the k nearest neighbors? Traditional spatial data structures don’t map well to GPUs, where irregular memory access patterns kill performance.

ArborX is an open-source C++ library providing performance-portable algorithms for geometric search on modern supercomputing architectures. Developed as part of the Exascale Computing Project (ECP), it uses Kokkos for device-independent parallelism, enabling efficient execution across CPUs and GPUs from multiple vendors using a single codebase.

Architecture

At its core, ArborX uses a linear bounding volume hierarchy (BVH) for low construction cost and high-quality spatial indexing. It supports both spatial queries and nearest-neighbor queries, each requiring fundamentally different tree traversal strategies.

graph TB
    subgraph ArborX["ArborX Library"]
        direction TB

        BVH["<b>Bounding Volume Hierarchy</b><br/>Linear BVH construction<br/>Morton-code sorting"]

        subgraph Queries["Query Types"]
            SQ["Spatial Queries<br/><i>radius-based search</i>"]
            NQ["Nearest-Neighbor<br/><i>k-closest objects</i>"]
        end

        subgraph Algorithms["Algorithms Built on ArborX"]
            DBSCAN["DBSCAN<br/>Clustering"]
            EMST["Euclidean Minimum<br/>Spanning Tree"]
            HDBSCAN["HDBSCAN*<br/>Clustering"]
            PANDORA["PANDORA<br/>Dendrogram<br/>Construction"]
            MLS["Moving Least<br/>Squares Interpolation"]
        end
    end

    BVH --> SQ
    BVH --> NQ
    SQ --> Algorithms
    NQ --> Algorithms

    subgraph Portability["Performance Portability via Kokkos"]
        CPU["Multi-core CPUs<br/>(OpenMP)"]
        NGPU["NVIDIA GPUs<br/>(CUDA)"]
        AGPU["AMD GPUs<br/>(HIP)"]
        IGPU["Intel GPUs<br/>(SYCL)"]
    end

    ArborX --> Portability

    style ArborX fill:#f0f7ff,stroke:#1a73e8,stroke-width:2px
    style Algorithms fill:#e6f4ea,stroke:#137333
    style Portability fill:#fef7e0,stroke:#e37400

My Contributions

Single-Tree Borůvka EMST (ICPP 2022)

Computing the Euclidean Minimum Spanning Tree on GPUs is challenging due to complex branching, data dependencies, and load imbalances. Our single-tree Borůvka algorithm addresses these challenges through careful work distribution and ArborX’s spatial primitives.

PANDORA: Parallel Dendrogram Construction (ICPP 2024)

Single-linkage clustering requires building a dendrogram—a hierarchical tree of cluster merges. PANDORA constructs this structure in parallel on GPUs, leveraging ArborX’s nearest-neighbor queries to achieve scalable hierarchical clustering.

DBSCAN for Cosmology (ExaSky)

The DBSCAN clustering algorithm we developed in ArborX performs 15× better on a single GPU than the multithreaded CPU version on a full Summit node, directly benefiting the ECP ExaSky cosmology project.

Impact

ArborX is published in ACM TOMS (2020) and is an integral component of xSDK and the broader ECP software ecosystem. Applications include:

  • Computational cosmology (halo finding)
  • Multiphysics data transfer
  • Computational mechanics
  • Wind farm simulations

Code: github.com/arborx/ArborX

References