
The lack of labeling consistency in seismic fault annotations and the high cost of acquiring quality labels are significant obstacles to developing machine learning models for fault segmentation. A common workaround is to train models on large amounts of synthetic data and then fine-tune them on a smaller set of real target data. However, this poses a domain adaptation challenge, because the distributional and geophysical properties of the two data sources generally differ. More broadly, harnessing knowledge from alternative label sources in real seismic settings remains an open challenge.
In this abstract, we analyze how fault segmentation models transfer knowledge across data sources under various training and fine-tuning conditions. To this end, we leverage a natural seismic survey from the Thebe Gas Field, the synthetic seismic dataset used to train the well-known FaultSeg3D method, and CRACKS, a novel crowdsourced seismic dataset built on top of the Netherlands F3 block. CRACKS is further subdivided into three label categories: domain-expert, practitioner, and novice-level labels. We use all possible pairwise combinations of these label sources to pretrain and fine-tune a fault segmentation model, and evaluate the model's performance on CRACKS (expert labels) and on synthetic data.
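The experimental grid described above can be enumerated as ordered (pretrain, fine-tune) pairs. A minimal sketch, assuming distinct pretrain and fine-tune sources; the shorthand names below are illustrative, not official dataset identifiers:

```python
from itertools import permutations

# Shorthand for the five label sources (assumed names, for illustration only).
sources = [
    "synthetic",            # FaultSeg3D training data
    "thebe",                # Thebe Gas Field survey
    "cracks_expert",        # CRACKS domain-expert labels
    "cracks_practitioner",  # CRACKS practitioner labels
    "cracks_novice",        # CRACKS novice labels
]

# Every ordered (pretrain, fine-tune) pair of distinct sources.
pairs = list(permutations(sources, 2))
print(len(pairs))  # 5 * 4 = 20 experimental configurations
```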

We find that fine-tuning on Thebe degrades a model's performance on its original testing setup. A statistical analysis shows that Thebe has a much lower amplitude standard deviation (0.124) than CRACKS (1.149) and the synthetic dataset (1.052): contrast and intensity variations in Thebe are narrow, while CRACKS and the synthetic data span more diverse intensity distributions. Models trained on Thebe therefore learn from a narrower data distribution and struggle on datasets with wider variation, such as CRACKS and the synthetic data. Performance metrics confirm this issue: models pretrained on CRACKS or synthetic data and fine-tuned on Thebe suffer a sharp drop in DICE score and a large increase in Bidirectional Chamfer Distance (BCD) when tested back on CRACKS. The model forgets features needed for fault segmentation in CRACKS, a classic case of catastrophic forgetting. This indicates that domain shift in seismic segmentation is driven by data distribution differences rather than feature learning alone.
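For concreteness, the two evaluation metrics can be sketched as follows for binary fault masks. This is a minimal illustrative implementation, not our exact evaluation code; the brute-force distance computation is only practical for small masks:

```python
import numpy as np

def dice_score(pred, target):
    """DICE coefficient: 2|A∩B| / (|A| + |B|) between two binary fault masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    return 2.0 * inter / denom if denom > 0 else 1.0

def bidirectional_chamfer(pred, target):
    """Mean nearest-neighbor distance between fault pixels, averaged over
    both directions (brute force; a KD-tree is preferable at scale)."""
    p = np.argwhere(pred)    # coordinates of predicted fault pixels
    t = np.argwhere(target)  # coordinates of labeled fault pixels
    d = np.sqrt(((p[:, None, :] - t[None, :, :]) ** 2).sum(-1))
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())
```

A higher DICE and a lower BCD both indicate closer agreement with the reference labels; BCD additionally penalizes predicted faults that are spatially far from any labeled fault.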
Normalization is essential in seismic data processing: it corrects variations in amplitude, signal strength, and noise across datasets. Traditional workflows use gain correction, amplitude scaling, and histogram equalization to make seismic data more consistent. In deep learning, however, dataset-specific intensity differences are often ignored, leaving models to struggle on new datasets. Our findings suggest that proper normalization before training can reduce domain shift. Methods such as global min-max normalization, per-trace standardization, and adaptive histogram equalization could help match dataset distributions and improve model generalization.
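Two of these normalization schemes can be sketched briefly. This is an illustrative sketch assuming a NumPy volume whose last axis holds the time/depth samples of each trace; it is not tied to any specific dataset loader:

```python
import numpy as np

def global_minmax(volume, eps=1e-8):
    """Scale an entire seismic volume to [0, 1] using its global extremes."""
    lo, hi = volume.min(), volume.max()
    return (volume - lo) / (hi - lo + eps)

def per_trace_standardize(volume, eps=1e-8):
    """Zero-mean, unit-variance scaling applied independently to each trace
    (last axis assumed to be the time/depth samples)."""
    mean = volume.mean(axis=-1, keepdims=True)
    std = volume.std(axis=-1, keepdims=True)
    return (volume - mean) / (std + eps)
```

Global min-max preserves relative amplitudes across traces, while per-trace standardization removes trace-level gain differences; which is preferable depends on whether absolute amplitude carries geological information for the task.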
In conclusion, our study shows a strong link between dataset statistics and domain adaptation in seismic segmentation. The poor generalization of models trained or fine-tuned on Thebe suggests that normalization differences, not feature learning alone, drive domain shift. Applying seismic data normalization techniques can reduce domain shift and improve model robustness, leading to better deep learning models for seismic interpretation.