By Andy Smith (Northumbria University)
In space physics we often have large, unlabelled datasets in which the phenomenon of interest appears only rarely. Understanding the underlying physics of the system from rare observations is a challenge, and locating complementary, similar observations in large datasets can be prohibitively time-consuming.
In this work we present an automated, self-supervised method that encodes the key information in two-dimensional data into a smaller vector representation (an encoding or embedding). We can then use the distance between these vectors to assess the similarity of the observations.
We demonstrate the potential of this method with two example datasets – spacecraft in situ electron velocity distributions and auroral all-sky images. For both datasets we provided the method with a library of over five thousand images, which were then effectively and automatically summarized by the model.
In the case of the electron distributions, we tested a “seed” image of a rare phenomenon – one corresponding to the region of space near the site of magnetic reconnection. In this region the electron distribution takes a characteristic crescent or arc-like shape [Figure 1, centre]. We can then extract the six closest partners of this image, using the distance between the embedding vectors. The two closest neighbours of the seed image (A and B in Figure 1) are two separate, previously published case study examples known to be close to the site of magnetic reconnection.
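As a rough illustration of the neighbour search described above, the sketch below ranks a library of embedding vectors by their distance to a seed vector and returns the six closest. It is not the code from the publication: the function name `nearest_neighbours`, the use of Euclidean distance, the embedding length of 16 and the randomly generated placeholder vectors are all assumptions made for the example. In practice the vectors would come from the trained model.

```python
import math
import random


def nearest_neighbours(embeddings, seed_index, k=6):
    """Return (index, distance) pairs for the k embeddings closest to the seed.

    Euclidean distance is an assumption made for this sketch; the metric
    used in the published work may differ.
    """
    seed = embeddings[seed_index]
    # Distance from the seed to every other embedding (the seed is skipped
    # so that it is not returned as its own nearest neighbour).
    scored = [
        (math.dist(seed, vector), index)
        for index, vector in enumerate(embeddings)
        if index != seed_index
    ]
    scored.sort()  # smallest distance first
    return [(index, distance) for distance, index in scored[:k]]


# Placeholder library: random vectors standing in for real model embeddings.
random.seed(0)
embeddings = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(500)]

# Hypothetical seed image at index 42; find its six closest partners.
neighbours = nearest_neighbours(embeddings, seed_index=42, k=6)
for index, distance in neighbours:
    print(f"image {index}: distance {distance:.3f}")
```

For a library of a few thousand images a brute-force search like this is fast enough; a much larger library might call for an approximate nearest-neighbour index instead.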
This method promises to be a useful tool in locating interesting phenomena in large datasets, providing an efficient method for moving from case studies to thorough statistical surveys. Code to train an example model is available at: https://github.com/SmithAndy005/SpaceSSL .
See publication for details:
Smith, A. W., Rae, I. J., Stawarz, J. E., Sun, W. J., Bentley, S., & Koul, A. (2024). Automatic encoding of unlabeled two dimensional data enabling similarity searches: Electron diffusion regions and auroral arcs. Journal of Geophysical Research: Space Physics, 129, e2023JA032096. https://doi.org/10.1029/2023JA032096