K. Kokilepersaud, M. Prabhushankar, and G. AlRegib
In this work, we identify representation-based metrics that can characterize why certain seismic sections yield better or worse downstream segmentation performance without relying on access to label information. Representation learning approaches, which operate on intermediate outputs of neural networks before the downstream task, are seeing increased popularity within annotation-scarce domains because they train without explicit access to labeled data. However, a major open research question is which properties of the produced representations correspond to improved downstream performance. Typical approaches acquire a performance metric by fine-tuning a representation space for a specific application, such as semantic segmentation, and then evaluating on a test set that reliably characterizes the generalizability of the neural network. In real-world seismic workflows, however, this assumption of access to a reliable test set fails due to the constraints of the seismic interpretation process. Because interpretation is expensive, very few sections are interpreted, producing only a small training set; the same expense also hinders the development of a reliable test set to assess the model. A more practical approach is to identify general metrics of a model's representation space that correspond with the model's downstream performance. Such metrics give the geophysicist an idea of how the model will perform on different parts of a seismic volume even without access to explicit test set labels.
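The kind of label-free, representation-space metric described above can be computed directly from intermediate feature maps of a trained encoder. As a minimal illustrative sketch (not the specific metric proposed in this work), the snippet below computes the effective rank of per-section embeddings from a frozen encoder; the encoder, input shape, and choice of statistic are all assumptions made for the example.

```python
import torch

@torch.no_grad()
def section_effective_rank(encoder: torch.nn.Module, section: torch.Tensor) -> float:
    """Label-free representation statistic for one seismic section.

    `section` is a (1, C, H, W) tensor of amplitudes; `encoder` is any frozen
    network whose spatial feature map we flatten into a
    (num_positions, feature_dim) embedding matrix. The effective rank
    (exponential of the entropy of the normalized singular-value spectrum)
    is one example of a statistic that can be compared against downstream
    segmentation quality without using any labels.
    """
    encoder.eval()
    feats = encoder(section)                      # (1, D, h, w) feature map
    feats = feats.flatten(2).squeeze(0).T         # (h*w, D) embedding matrix
    s = torch.linalg.svdvals(feats)               # singular values of embeddings
    p = s / s.sum()                               # normalize to a distribution
    entropy = -(p * torch.log(p + 1e-12)).sum()   # Shannon entropy of the spectrum
    return torch.exp(entropy).item()              # effective rank
```

Ranking the sections of a volume by such a statistic and checking how that ranking correlates with segmentation scores on the few labeled sections that do exist is one way to assess whether a candidate representation metric is informative.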

