ML4Seismic 2025

Location: Coda, 756 W Peachtree St NW, Atlanta, GA 30308

Date: November 19-21, 2025

Session: Prompting and Attention in Seismic Vision

Session Chair: Prithwijit Chowdhury

Prithwijit Chowdhury, Mohit Prabhushankar, and Ghassan AlRegib, “Active Prompt Querying for Agentic AI in Seismic Tasks”

Abstract. Transformer-based foundation models trained on large corpora now provide strong out-of-the-box solutions for a wide range of tasks across multiple domains. However, their behavior and outputs become increasingly uncertain when the domain is difficult or the task is highly specific, leading to fragile predictions. To address this, prompts have been added as an explicit control channel. Prompts not only let users interact with the system but also guide these billion-parameter models toward more appropriate responses without the need for fine-tuning. In this presentation we (i) explore how prompts interact with the data inside the model's representation space while also informing one another about the task, and (ii) provide a task- and label-agnostic robustness metric grounded in information theory and causality to tackle the common problems of over-prompting and mis-prompting in large vision models. Together, these contributions give users clearer levers to intervene, diagnose, and refine task-specific outputs from the model while preserving the convenience of out-of-the-box use.
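
As a concrete illustration of the robustness-metric idea, the sketch below probes prompt stability without labels by jittering point prompts and measuring how much the model's output distribution shifts. The `model.predict` interface, the symmetric-KL score, and the parameter choices are hypothetical stand-ins for exposition, not the metric presented in the talk.

```python
# Minimal sketch of a label-free prompt-robustness probe (hypothetical API).
# Idea: perturb the prompt set and measure how much the model's output
# shifts; large shifts flag over- or mis-prompting.
import numpy as np

def output_divergence(p, q, eps=1e-8):
    """Symmetric KL divergence between two per-pixel foreground probability maps."""
    p, q = np.clip(p, eps, 1 - eps), np.clip(q, eps, 1 - eps)
    kl_pq = p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))
    kl_qp = q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))
    return float(np.mean(kl_pq + kl_qp))

def prompt_robustness(model, image, prompts, n_trials=20, jitter=3.0, seed=0):
    """Average output shift under small spatial jitter of point prompts.

    `model.predict(image, prompts)` is an assumed interface returning a
    per-pixel probability map for point prompts of shape (N, 2).
    """
    rng = np.random.default_rng(seed)
    base = model.predict(image, prompts)
    shifts = []
    for _ in range(n_trials):
        noisy = prompts + rng.normal(scale=jitter, size=prompts.shape)
        shifts.append(output_divergence(base, model.predict(image, noisy)))
    return float(np.mean(shifts))  # higher = less robust prompting
```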

Mohammad Alotaibi, Mohit Prabhushankar, and Ghassan AlRegib, “Understanding Attention: How Seismic Features Are Attended in Transformers”

Abstract. Transformers are increasingly used in seismic interpretation, but how seismic features attend to each other within attention maps is still not well understood. In this presentation, we analyze how different seismic structures interact through attention. We compare these behaviors with those in natural and medical images to understand differences in locality and globality. By studying which features attend to which, we aim to reveal how Transformers perceive the seismic subsurface and how this understanding can help adapt attention-based models to seismic data, even with limited labeled samples.
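
For readers who want to reproduce this kind of analysis, the following sketch computes mean attention distance, a standard proxy for locality versus globality, from an attention map hooked out of a Transformer block. The `(heads, N, N)` layout and the patch-grid assumption are illustrative.

```python
# Sketch: quantify attention locality vs. globality with mean attention
# distance (average spatial distance between a query patch and the patches
# it attends to, weighted by attention). Assumes a (heads, N, N) attention
# map over an h x w patch grid, e.g. hooked out of a ViT block.
import numpy as np

def mean_attention_distance(attn, grid_h, grid_w):
    """attn: (heads, N, N) row-stochastic attention, N = grid_h * grid_w."""
    ys, xs = np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # (N, 2)
    # pairwise Euclidean distances between patch centers, (N, N)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # expected attended distance per head, averaged over query patches
    return (attn * d[None]).sum(-1).mean(-1)  # (heads,)
```

Low values indicate local, CNN-like heads; high values indicate heads that integrate global context.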


Session: Deploying ML in Seismic: Representation, Domain Shift, Generalization & Inferential Behavior

Session Chair: Chen Zhou

Chen Zhou and Ghassan AlRegib, “Pseudo-labels as Signals: Retaining Informative Variability of Interpretation”

Abstract. Interpretation variability carries information about the data rather than mere noise. However, when modeling interpretation disagreement through latent representations, the representation space can suffer from dimensionality collapse, where only a subset of dimensions encodes meaningful variation. This phenomenon limits the model's capability to capture diverse interpretations, especially underrepresented but critical ones. This talk explores dimension- and information-based regularization to help retain the effective dimensionality of representation spaces, enabling models to better represent multiple plausible interpretations. These insights have cross-domain implications: in seismic interpretation, capturing subtle differences between plausible subsurface interpretations enriches understanding across distinct geophysical contexts.
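
One way to realize dimension-based regularization of this kind is a variance/covariance penalty in the style of VICReg; the sketch below is a minimal example of that family and is not necessarily the regularizer used in the talk.

```python
# Sketch of a dimension-based anti-collapse regularizer (VICReg-style).
# Penalizes dead dimensions (variance term) and redundant, correlated
# dimensions (covariance term) to preserve effective dimensionality.
import torch

def anti_collapse_penalty(z, gamma=1.0, eps=1e-4):
    """z: (batch, dim) latent codes. Returns variance + covariance penalty."""
    z = z - z.mean(dim=0)
    std = torch.sqrt(z.var(dim=0) + eps)
    var_loss = torch.relu(gamma - std).mean()          # keep every dim alive
    cov = (z.T @ z) / (z.shape[0] - 1)                 # (dim, dim)
    off_diag = cov - torch.diag(torch.diag(cov))
    cov_loss = (off_diag ** 2).sum() / z.shape[1]      # decorrelate dims
    return var_loss + cov_loss
```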

Jorge Quesada, Mohit Prabhushankar, and Ghassan AlRegib, “Domain Shift & Model Generalization in Seismic Fault Segmentation”

Abstract. We present a large-scale benchmark of over 200 fault segmentation models trained and evaluated across diverse seismic datasets to investigate how model capacity, training strategy, and data alignment affect generalization under domain shift. Our results show that fine-tuning remains effective when domains are similar but becomes unstable as distributional differences increase, while larger models exhibit greater adaptability than smaller ones. To complement traditional pixel- and distance-based scores, we introduce a set of geometric and topological metrics that capture fault-level characteristics such as orientation consistency, tortuosity, and continuity. Our analysis reveals that models inherit structural biases from the datasets they are fine-tuned on, influencing the geometry of predicted faults beyond what conventional metrics capture. Taken together, the findings provide practical guidance for selecting pretraining and fine-tuning strategies, balancing model size and data similarity, and integrating structural metrics into the evaluation of deep-learning-assisted seismic interpretation workflows.
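
As one example of the fault-level geometry these metrics capture, the sketch below computes tortuosity for a single fault strand; strand extraction (e.g., skeletonizing and tracing the predicted mask) is assumed to happen upstream.

```python
# Sketch of one fault-level geometric metric: tortuosity of a predicted
# fault strand, computed as arc length over endpoint distance
# (1.0 = a perfectly straight fault).
import numpy as np

def tortuosity(points):
    """points: (K, 2) ordered pixel coordinates along one fault strand."""
    points = np.asarray(points, dtype=float)
    arc = np.linalg.norm(np.diff(points, axis=0), axis=1).sum()   # path length
    chord = np.linalg.norm(points[-1] - points[0])                # straight-line span
    return arc / max(chord, 1e-8)
```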

Jorge Quesada, Prithwijit Chowdhury, Mohit Prabhushankar, and Ghassan AlRegib, “Stable Transfers: Domain Adaptation and Memory Preservation in Seismic Deep Learning”

Abstract. Deep-learning-based fault segmentation models often struggle to generalize across seismic datasets, suffering from domain shift and catastrophic forgetting. We study two complementary strategies to improve robustness in transfer settings: domain adaptation and regularization-based forgetting mitigation. Domain adaptation methods are shown to enhance model transferability when domains are strongly mismatched, but may reduce performance when the shift is moderate, underscoring the need for adaptive alignment strategies. Regularization approaches such as Elastic Weight Consolidation help preserve knowledge from prior domains, and are designed to stabilize performance during fine-tuning. Together, these case studies illustrate the trade-offs between adaptation and retention, and emphasize that optimal strategies depend jointly on dataset similarity and the degree of shift. Our analysis provides practical insights into how domain adaptation and forgetting mitigation can be systematically applied to improve the reliability of deep-learning-assisted fault delineation.
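
For reference, a minimal sketch of the Elastic Weight Consolidation penalty mentioned above is given below, assuming a diagonal Fisher estimate and anchor parameters saved after training on the source survey.

```python
# Minimal sketch of Elastic Weight Consolidation (Kirkpatrick et al., 2017):
# a quadratic penalty anchoring parameters that were important on the source
# domain, weighted by a diagonal Fisher information estimate.
import torch

def ewc_penalty(model, fisher, anchor, lam=100.0):
    """fisher/anchor: dicts mapping parameter names to tensors saved after
    source-domain training; lam trades plasticity for retention."""
    loss = 0.0
    for name, p in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (p - anchor[name]) ** 2).sum()
    return lam * loss

# During fine-tuning on the target survey:
# total_loss = task_loss + ewc_penalty(model, fisher, anchor)
```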

Sahil Mithani, Mohit Prabhushankar, and Ghassan AlRegib, “Gradient Features for Post-Training Model Selection and Label-Efficient Fault Detection in Seismic Volumes”

Abstract. In this talk, we propose a trained-model selection technique for zero-shot fault delineation in new seismic surveys. Deploying pretrained fault-segmentation networks is often hindered by label scarcity and data shifts. Fully supervised evaluation or fine-tuning on each new volume is costly and can be counterproductive; for example, fine-tuning on the Thebe field can degrade performance and trigger catastrophic forgetting under domain shift. We introduce a label-free, gradient-based evaluation pipeline that ranks pretrained models by their expected generalization. Without ground-truth labels, we compute a set of gradient-derived metrics from a forward/backward pass on unlabeled seismic slices. These metrics use a confounding target (a shifted, non-informative label) to probe how confidently and stably a model responds when it is not guided by true annotations. The resulting gradient signals serve as proxies for model quality. This enables post-training model selection, choosing the most promising fault detector for a new volume, with zero labels and no fine-tuning. We quantify the effectiveness of our approach across over 240 pretrained models evaluated on synthetic and real seismic datasets.
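
A minimal sketch of the gradient probe's mechanics follows; here a constant, non-informative target stands in for the shifted confounding label, and the single gradient-norm score is one illustrative proxy rather than the full metric set from the talk.

```python
# Sketch of a label-free gradient probe: backpropagate a loss against a
# non-informative target and use gradient-magnitude statistics as a proxy
# for expected generalization.
import torch
import torch.nn.functional as F

def gradient_score(model, slices):
    """slices: unlabeled seismic sections, (B, 1, H, W)."""
    model.zero_grad()
    logits = model(slices)                       # (B, 1, H, W) fault logits
    confound = torch.full_like(logits, 0.5)      # non-informative stand-in target
    F.binary_cross_entropy_with_logits(logits, confound).backward()
    g2 = sum((p.grad ** 2).sum() for p in model.parameters() if p.grad is not None)
    return torch.sqrt(g2).item()                 # one proxy for model stability

# Rank a zoo of pretrained fault detectors on a new, unlabeled volume:
# ranked = sorted(model_zoo, key=lambda m: gradient_score(m, slices))
```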


Session: Representation Dynamics in Seismic ML

Session Chair: Mohit Prabhushankar

Mohit Prabhushankar and Ghassan AlRegib, “AI Robustness Certification in Subsurface Interpretation”

Abstract. Foundation models are billion-parameter, large-scale neural networks that generalize across data domains and tasks. This generalizability allows foundation models to serve as backbones for multifarious downstream applications and processing. Traditionally, model training involves minimizing some predefined empirical risk on a given task. By doing so, task-dependent minimal sufficient statistics (MSS) are extracted from the current data to make the inference. In foundation models, however, the task is unknown during training, so the notion of task-specific MSS at inference becomes degenerate and a fundamental aspect of optimization guarantees is unavailable. Paradoxically, the more a model trains, the less robust its outputs become. In this talk, we show that: (i) transitional information between layers must be preserved, and (ii) task-specific bias-variance tradeoffs must be replaced by dimensionality-mutual information tradeoffs.

William Stevens, Mohit Prabhushankar, and Ghassan AlRegib, “Visualizing Uncertainty in Facies Segmentation by Tracking Epoch-wise Mutual Information”

Abstract. Understanding how neural network representations evolve during training can offer a new lens into learning stability and model uncertainty that is not captured by traditional measures comparing representations with inputs or outputs. This work explores a layer-wise, per-epoch, information-theoretic approach using mutual information between sequential internal representations. By treating learning as a dynamic flow through the representation space, this work uncovers trends such as early-layer stabilization and delayed convergence for deeper layers. These patterns provide a promising baseline for quantifying uncertainty and effective capacity across training. In this talk, we visualize these representation dynamics and showcase model uncertainty. Looking forward, this framework will be applied to prompt-wise tracking in the Segment Anything Model (SAM) to augment uncertainty estimation and over-prompting indication in seismic imaging tasks such as salt dome identification.
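
As a concrete instance of the per-epoch measurement, the sketch below estimates mutual information between 1-D projections of a layer's activations at consecutive epochs using a simple binned estimator; the estimator and projection choice are illustrative assumptions, not necessarily those used in the work.

```python
# Sketch of the epoch-wise probe: estimate mutual information between a
# layer's representation at epoch t and epoch t+1 via discretization.
import numpy as np

def binned_mi(x, y, bins=16):
    """x, y: (n_samples,) 1-D projections of a layer's activations at two
    consecutive epochs (e.g. the top principal component of each)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                    # joint distribution
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)    # marginals
    nz = pxy > 0                                 # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])).sum())
```

High epoch-to-epoch MI signals a stabilized layer; persistently low values flag layers that are still reorganizing, which is the pattern visualized in the talk.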


Session: Collapse, Scale & Representation Challenges in Seismic

Session Chairs: Seulgi Kim & Mohammad Alotaibi

Abdelrahman Musleh, Mohit Prabhushankar, and Ghassan AlRegib, “Information Collapse in Deep Learning and Its Impact on Seismic Interpretation Workflows”

Abstract. Information Collapse in Deep Learning appears in several forms, including Representation Collapse (where feature vectors of all samples map to an identical embedding), Dimensional Collapse (where feature vectors of all samples lie on a low-dimensional manifold), Model Collapse (where the model degrades by recursively training on its own generated or synthetic data, leading to overfitting or suboptimal solutions), and Neural Collapse (where features of a class converge to its class mean, class means are symmetrically arranged, and the classifier for each class aligns with its class mean). Each represents a distinct phenomenon where learned representations exhibit reduced variability or increased alignment, which may either enhance structure or degrade performance depending on context. This talk surveys the literature and presents case studies of Information Collapse in seismic applications. Subsequently, it identifies suitable metrics for detecting and characterizing the occurrence and degree of each collapse type during training. Recognizing when and how these collapses emerge is crucial for understanding their implications for model generalization, interpretability, and stability. Looking forward, the insights will inform strategies for leveraging beneficial collapses while mitigating those that are detrimental to seismic interpretation workflows.
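
Two of the simpler probes such metrics can build on are sketched below: an eigenspectrum count for Dimensional Collapse and a within-/between-class variability ratio in the spirit of Neural Collapse analyses. The threshold and exact definitions are illustrative.

```python
# Sketch of two collapse probes: (i) dimensional collapse via the
# eigenspectrum of the feature covariance, (ii) neural-collapse-like
# clustering via within- vs. between-class variability.
import numpy as np

def spectral_dimension(feats, thresh=0.01):
    """Count covariance eigenvalues above thresh * largest; collapsed
    representations concentrate variance in a few directions."""
    cov = np.cov(feats.T)                        # feats: (n_samples, dim)
    eig = np.sort(np.linalg.eigvalsh(cov))[::-1]
    return int((eig > thresh * eig[0]).sum())

def within_between_ratio(feats, labels):
    """Small values indicate features converging to their class means."""
    mu = feats.mean(axis=0)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        fc = feats[labels == c]
        within += ((fc - fc.mean(axis=0)) ** 2).sum()
        between += len(fc) * ((fc.mean(axis=0) - mu) ** 2).sum()
    return within / max(between, 1e-12)
```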

Seulgi Kim, Kiran Kokilepersaud, Mohit Prabhushankar, and Ghassan AlRegib, “Countering Multi-modal Representation Collapse through Rank-targeted Fusion”

Abstract. Multi-modal fusion methods often suffer from two types of representation collapse: feature collapse, where individual dimensions lose their discriminative power (as measured by eigenspectra), and modality collapse, where one dominant modality overwhelms the other. Applications like human action anticipation that require fusing multifarious sensor data are hindered by both feature and modality collapse. However, existing methods attempt to counter feature collapse and modality collapse separately, because there is no unifying framework that efficiently addresses both in conjunction. In this talk, we posit effective rank as an informative measure that can quantify and counter both representation collapses. We propose the Rank-enhancing Token Fuser, a theoretically grounded fusion framework that selectively blends less informative features from one modality with complementary features from another. We show that our method increases the effective rank of the fused representation. To address modality collapse, we evaluate modality combinations that mutually increase each other's effective rank. The talk presents lessons from fusing information-rich modalities like images and 3D point clouds with 1D measurements. The insights are applicable across domains, including seismic interpretation workflows.

Jorge Quesada and Ghassan AlRegib, “Revealing Scale Dependence in Self-Supervised Fault Segmentation”

Abstract. The high cost of manual fault interpretation in seismic analysis often leads to the label-constrained regime: abundant data but limited annotations. Self-supervised learning (SSL) has emerged as a promising solution in such settings, and while it has been shown to perform well for facies segmentation, it has not translated to fault delineation. In this work, we propose a scale-aware SSL strategy that embeds small-window extraction into the augmentation process, allowing models to “zoom in” on localized structural patterns. Across multiple real seismic datasets, this approach yields up to 13% improvements in segmentation accuracy under label constraints compared to standard SSL and supervised pipelines. By contrast, benefits are far less pronounced for larger-scale features such as facies. These findings highlight the importance of adapting SSL to the inherent scale of geological structures and demonstrate its potential to significantly enhance fault interpretation in real seismic data.
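
A minimal sketch of the scale-aware augmentation idea, using torchvision as a stand-in pipeline: the crop scale is biased toward small windows so the self-supervised views zoom in on fault-scale structure. The specific transforms and parameters are assumptions, not the exact recipe from the work.

```python
# Sketch of scale-aware SSL view generation, assuming seismic sections
# arrive as grayscale PIL images; the crop scale is biased toward small
# windows so views "zoom in" on localized structural patterns.
import torchvision.transforms as T

def scale_aware_views(patch_size=96, small_scale=(0.05, 0.25)):
    """Returns an SSL view transform that samples small windows from a
    seismic section before the remaining augmentations (illustrative)."""
    return T.Compose([
        T.RandomResizedCrop(patch_size, scale=small_scale),  # small-window "zoom"
        T.RandomHorizontalFlip(),
        T.GaussianBlur(kernel_size=5),
        T.ToTensor(),
    ])
```

For facies-scale targets, the same pipeline with a larger `scale` range recovers a standard SSL setup, which matches the scale-dependence the abstract describes.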


Closing Session

Ghassan AlRegib and Mohit Prabhushankar, “From Data-Centric to Human-Centric: Building Robust and Trustworthy Visual Intelligence”

Abstract. The current paradigm in visual intelligence often prioritizes massive datasets and model scaling, yielding systems that are brittle, biased, and lacking in human-level understanding and transparency. This talk advocates a crucial shift toward a human-centric approach. We will explore our ongoing research in visual AI that integrates explainability, robustness, uncertainty, privacy-preserving models, and multi-modal learning. Moving beyond optimizing for performance alone, we aim to build the next generation of visual intelligence systems that are inherently robust, responsible, trustworthy, and aligned with human expectations and real-world needs.


Tutorials

Understanding Attention: Comparing CNN and Transformer Locality and Globalness Across Natural, Seismic, and Medical Domains

Mohammad Alotaibi and Ghassan AlRegib, OLIVES
Tutorial.
In this tutorial, we compare Convolutional Neural Networks (CNNs) and Transformers to better understand how they process image information and balance local and global features. We focus on the concept of attention in Transformers and how it differs from the local receptive fields in CNNs. By analyzing their behaviors across three domains (natural images, seismic data, and medical images), we highlight how attention helps the model capture both fine- and large-scale structures. To gain deeper insight into the attention mechanism, we examine which features are attended to by each region of the image, allowing us to observe how the models balance local detail with global context. This understanding provides a clearer view of how attention operates across different data types and guides the adaptation of these models to new domains like seismic interpretation.

A Hands-on Tutorial on Prompt Interactions and Robustness in Vision Models

Prithwijit Chowdhury and Ghassan AlRegib, OLIVES
Tutorial.
This tutorial treats prompting as an explicit control channel that lets users interact with and steer billion-parameter models without fine-tuning. It examines how prompts engage the model's representation space, including interactions between prompts, and introduces a task- and label-agnostic robustness metric grounded in information theory and causality to detect and mitigate over-prompting and mis-prompting in large vision models.

The Return of Matrix Methods: Effective Rank to Evaluate Supervised, Self-supervised, and Multi-modal Learning

Mohit Prabhushankar, Seulgi Kim, Kiran Kokilepersaud, Jorge Quesada, and Ghassan AlRegib, OLIVES
Tutorial. A fundamental quantification of learning guarantees in machine learning involves bias-variance tradeoffs. However, task-agnostic learning in foundation models does not allow this tradeoff due to the lack of labels. A recent strategy for evaluating learning looks at the matrix properties of the singular value projections of data on trained weights. In this tutorial, we elaborate on the effective rank metric for evaluating supervised, self-supervised, and multi-modal learning. We provide case studies where effective rank showcases: (i) the differences between learning from synthetic data vs. real data (F3 volume), (ii) learning from natural images vs. computed biomedical images, and (iii) learning from multi-modal data when the data modalities have asymmetric information.
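
For hands-on readers, a minimal sketch of effective rank (Roy and Vetterli, 2007), computed as the exponential of the Shannon entropy of the normalized singular value distribution of a feature matrix:

```python
# Sketch of effective rank: higher values mean representation variance is
# spread over more directions (less collapse).
import numpy as np

def effective_rank(feats, eps=1e-12):
    """feats: (n_samples, dim) representation matrix."""
    s = np.linalg.svd(feats - feats.mean(axis=0), compute_uv=False)
    p = s / max(s.sum(), eps)                  # normalized singular values
    entropy = -(p * np.log(p + eps)).sum()     # Shannon entropy of spectrum
    return float(np.exp(entropy))
```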