We show that, contrary to popular assumptions, predictions from machine learning potentials built upon high-dimensional atom-density representations are almost exclusively made in regions of the representation space that lie outside the convex hull defined by the training set points. We then propose a perspective to rationalise the domain of robust extrapolation and accurate prediction of atomistic machine learning potentials in terms of the probability density induced by training points in the representation space.
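To make the convex-hull criterion concrete, the following minimal sketch checks whether a query point lies inside the convex hull of a set of training points by estimating its Euclidean distance to the hull with a Frank-Wolfe iteration. It is an illustration only and is not claimed to be the procedure used in the article: the function name `distance_to_convex_hull`, its parameters, and the toy 2D points are all assumptions, and real atom-density representations have far more dimensions.

```python
import math


def distance_to_convex_hull(points, query, max_iter=10_000, tol=1e-12):
    """Approximate Euclidean distance from `query` to the convex hull of `points`.

    Hypothetical helper for illustration: runs Frank-Wolfe with exact line
    search on f(x) = 0.5 * ||x - query||^2 over the hull of `points`.
    """
    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    x = list(points[0])  # any training point is a feasible starting point
    for _ in range(max_iter):
        grad = [xi - qi for xi, qi in zip(x, query)]  # gradient of f at x
        # Training point that minimises the linearised objective.
        s = min(points, key=lambda p: dot(p, grad))
        direction = [si - xi for si, xi in zip(s, x)]
        gap = -dot(grad, direction)  # Frank-Wolfe gap, bounds f(x) - f(optimum)
        if gap <= tol:
            break
        step = min(1.0, gap / dot(direction, direction))  # exact line search
        x = [xi + step * di for xi, di in zip(x, direction)]
    return math.sqrt(sum((xi - qi) ** 2 for xi, qi in zip(x, query)))


# Toy training set: the corners of the unit square (assumed data).
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
inside = distance_to_convex_hull(train, (0.5, 0.5))   # query inside the hull
outside = distance_to_convex_hull(train, (2.0, 2.0))  # query outside the hull
print(inside, outside)
```

A distance of zero (up to the tolerance) indicates that the query lies inside the hull, while a positive distance indicates that a prediction at that point would count as extrapolation under this criterion. An exact membership test can instead be posed as a linear-programming feasibility problem, which may be preferable when a strict yes/no answer is needed.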
The data contained here can be used to reproduce all results and graphs shown in the article. We also include the trajectory files for the Au13 dataset, which we generated by running molecular dynamics simulations of an Au nanoparticle containing 13 atoms at temperatures of 50 K, 100 K, 200 K, 300 K, and 400 K. Details on the generation of this dataset can be found in the supplementary information file for the article.