Calissano, A., Pereira, L. F., Lueg, J., Miolane, N.
Abstract
Analyzing non-Euclidean data such as graphs and trees requires specialized mathematical machinery because these data spaces lack some of the rich structure present in Euclidean spaces or smooth Riemannian manifolds. However, such non-Euclidean spaces can often still leverage the structures of Euclidean or manifold spaces through suitable constructions. For example, the space of unlabeled graphs can be realized as a quotient of the space of matrices (with the Frobenius metric) by the permutation group; the Billera–Holmes–Vogtmann (BHV) tree space is a stratified space assembled from Euclidean orthants; and Wald space is an embedding of phylogenetic forests into the space of symmetric positive-definite (SPD) matrices. These spaces are instances of geodesic metric spaces: topological spaces equipped with a distance metric induced by geodesics (shortest paths between any two points in the space). We present a Python package for the analysis of data living in such geodesic metric spaces. We describe the package’s object-oriented structure, which is based on abstract classes for a point, a point set, a geodesic path, and a metric in the sense of geodesic metric space theory. We provide three example implementations and two real-world applications with associated datasets. The package is implemented as a plug-in to the open-source Geomstats Python library, allowing users to seamlessly apply existing geometric and data-analysis tools to strongly non-Euclidean data in a theoretically consistent way. The code is fully unit-tested and documented.
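To make the quotient construction for unlabeled graphs concrete, the following is a minimal sketch that assumes nothing about the package's actual API. The class names `Point`, `Metric`, `Graph`, and `BruteForceGraphMetric` are hypothetical and only mirror the abstractions named in the abstract. The distance is computed by exhaustively searching all node permutations, which is feasible only for very small graphs and is not claimed to be the alignment method used by the authors.

```python
from abc import ABC, abstractmethod
from itertools import permutations

import numpy as np


class Point(ABC):
    """An element of a geodesic metric space (hypothetical base class)."""


class Metric(ABC):
    """Distance and geodesic between two points (hypothetical base class)."""

    @abstractmethod
    def dist(self, a, b):
        """Return the distance between points a and b."""

    @abstractmethod
    def geodesic(self, a, b, t):
        """Return the point at parameter t in [0, 1] on a geodesic from a to b."""


class Graph(Point):
    """An unlabeled graph, stored as one adjacency matrix from its orbit."""

    def __init__(self, adj):
        self.adj = np.asarray(adj, dtype=float)


class BruteForceGraphMetric(Metric):
    """Quotient of the Frobenius metric by node permutations (illustrative)."""

    def _align(self, a, b):
        # Search every relabeling of b and keep the one closest to a.
        n = b.adj.shape[0]
        best_adj, best_dist = None, np.inf
        for perm in permutations(range(n)):
            idx = list(perm)
            candidate = b.adj[np.ix_(idx, idx)]  # same as P B P^T
            d = np.linalg.norm(a.adj - candidate)  # Frobenius norm
            if d < best_dist:
                best_adj, best_dist = candidate, d
        return best_adj, best_dist

    def dist(self, a, b):
        return self._align(a, b)[1]

    def geodesic(self, a, b, t):
        # Straight line between a and the optimally aligned copy of b.
        aligned, _ = self._align(a, b)
        return Graph((1.0 - t) * a.adj + t * aligned)


# Two labelings of the same 3-node path graph.
path = Graph([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
relabeled = Graph([[0, 1, 1], [1, 0, 0], [1, 0, 0]])

metric = BruteForceGraphMetric()
print(metric.dist(path, relabeled))  # 0.0, since the graphs are isomorphic
```

The two matrices differ entry by entry, yet their quotient distance is zero because one is a relabeling of the other. The exhaustive search costs n! matrix comparisons, so a practical implementation would replace `_align` with a graph-matching heuristic.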
Citation
Calissano, A., Pereira, L. F., Lueg, J., & Miolane, N. (2024). On the Implementation of Geodesic Metric Spaces.
BibTeX
@article{calissano2024implementation,
title={On the Implementation of Geodesic Metric Spaces},
author={Calissano, Anna and Pereira, Lu{\'\i}s F and Lueg, Jonas and Miolane, Nina},
journal={Journal of Machine Learning Research},
year={2024}
}
