IEEE IV 2023 Tutorial – Ghassan AlRegib

Title: A Holistic View of Perception in Intelligent Vehicles

Duration: Half day (3 hours)

Presenters: Ghassan AlRegib and Mohit Prabhushankar

(Georgia Institute of Technology)

Tutorial Description

The goal of this tutorial is to introduce and expand on the challenges, and potential solutions, for machine learning-based perception algorithms in intelligent vehicles. These challenges start with the data collection process itself. Given the threat of datasets being repurposed for unintended applications (https://exposing.ai/duke_mtmc/), it is imperative to follow best practices that account for data privacy and fairness. Moreover, as datasets grow in size, the logistics of labeling need to be considered. We introduce and tackle this challenge through an active learning setting in which labelers work in conjunction with models to label an optimal subset of the data. These topics are covered in Part 1 of the tutorial. Part 2 deals with model training itself. We discuss state-of-the-art methods that provide perception solutions for object detection and segmentation. However, these methods are insufficient in safety-critical applications of intelligent vehicles. Specifically, deep learning-based methods suffer from robustness, calibration, and adversarial attack issues that inhibit their deployment in all settings. We discuss training from the point of view of these inferential challenges and review recent potential solutions. Part 3 deals with the deployment of models. Deployed models need to earn the trust of a diverse set of stakeholders, including end users, government regulators, and insurers, among others. We discuss trust issues, specifically for perception, and expand on explainability and behavioral prediction applications that quantify trust. We conclude with existing notions and technologies of safety and provide clues as to how they may expand in the future.
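The active learning setting mentioned in the description, where a model helps decide which samples labelers should annotate next, can be illustrated with a minimal sketch. It assumes entropy-based uncertainty sampling, a common baseline acquisition function that the description itself does not specify; the names `predictive_entropy`, `select_for_labeling`, and the toy `pool` of class probabilities are hypothetical.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy of one predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(pool_probs, budget):
    """Return indices of the `budget` unlabeled samples with the highest entropy.

    Assumption: `pool_probs` holds the current model's softmax outputs for
    every sample in the unlabeled pool.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Toy pool of model predictions over three classes (illustrative values only).
pool = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # fairly uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
]
chosen = select_for_labeling(pool, budget=2)
print(chosen)
```

In a full loop, the selected samples would be sent to labelers, added to the training set, and the model retrained before the next selection round.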

Part I: Challenges in Perception and Autonomy

Objectives:

Summarize the progress of AVs over the years
Discuss the role of perception in AVs and where it fits within the AV workflow
Review well-known failures of AVs in providing safety to drivers and to others
Discuss major technical challenges currently facing AVs
Motivate deep learning as a holistic solution to perception challenges

Part II: Deep Learning for Perception

Objectives:

Discuss myths surrounding deep learning
Brief history of deep learning
Review deep learning models for vision
Deep learning extensions into sensor domain
Transfer Learning and foundation models
Self-supervised learning
Case study: Self-supervised learning for fisheye images

Part III: Existing Deep Learning solutions to Challenges in Perception

Objectives:

Challenging conditions at training
Inference
- Deficiencies at Inference
Overcoming deficiencies at Inference
- Anomaly Detection
- Uncertainty
- Explainability
Case study 1: Robustness to challenging conditions
Case study 2: Aberrant Object Detection

Part IV: Remaining Challenges and Future Directions

Objectives:

Takeaway Messages and Key Insights
Unaddressed Challenges in Perception
- Context Awareness
- Embedded Perception
- V2X Perception
Future Research Directions
- Temporal Processing
- Sensor Processing Architectures
- Sensor research
- Infrastructure + AV Datasets

Tutorial Relevance

While recognition of center-surround objects in the ImageNet database has exceeded human performance, this capability does not transfer to more complicated settings such as driving in urban scenarios. The relevance of each part of the tutorial is described below:

Part 1:
- Robustness under challenging conditions, environments, context and surroundings-awareness are challenges in AV perception
- Deep Learning promises a holistic solution to a number of the above challenges
Part 2:
- Transfer Learning and training at scale are essential for foundation model development
- Self-supervised Learning provides a framework for large scale learning on unannotated data
Part 3:
- It is not always clear if aberrant events and challenges must be incorporated in training
- Instead, model predictions can and should be equipped with diagnostic tools at inference
- These diagnostic tools are anomaly and uncertainty scores for decision making and contextual explainability for post-hoc stakeholders
- Gradients provide the change induced by an aberrant event in the network and can be used to obtain the required prediction diagnosis
Part 4:
- Robustness under challenging conditions, environments, context and surroundings-awareness are challenges in AV perception
  - Deep Learning provides a holistic solution to a number of the above challenges
- Transfer Learning and training at scale help to create foundation models
  - Self-supervised Learning provides a framework for large scale learning on unannotated data
- It is not always clear if aberrant events and challenges must be incorporated in training
  - Instead, model predictions must be equipped with diagnostic tools at inference
  - These diagnostic tools are anomaly and uncertainty scores for decision making and contextual explainability for post-hoc stakeholders
  - Gradients provide the change induced by an aberrant event in the network and can be used to obtain the required prediction diagnosis
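
The idea that gradients capture the change an input would induce in the network, and can therefore serve as a prediction diagnostic, can be sketched on a toy softmax classifier. This is a simplified illustration and not the presenters' published method: it assumes the predicted class is used as a pseudo-label and scores a prediction by the L2 norm of the cross-entropy gradient with respect to the logits. The function names and logit values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gradient_score(logits):
    """L2 norm of d(cross-entropy)/d(logits), using the predicted class as label.

    For softmax with cross-entropy, this gradient is (probs - one_hot(label)),
    so a confident prediction yields a small norm and an ambiguous one a
    larger norm.
    """
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    grad = [p - (1.0 if i == predicted else 0.0) for i, p in enumerate(probs)]
    return math.sqrt(sum(g * g for g in grad))


# Illustrative logits: one clear-cut input and one ambiguous input.
confident = gradient_score([8.0, 0.5, 0.2])
ambiguous = gradient_score([1.1, 1.0, 0.9])
print(confident < ambiguous)
```

In a deep network the same quantity would typically be taken with respect to model weights or intermediate features via backpropagation; the logit-level version here only keeps the example dependency-free.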

Expected Audience:

This tutorial is intended for PhD students, professors, researchers, and engineers working on topics related to intelligent vehicles.

Recent Relevant Publications

G. AlRegib and M. Prabhushankar, “Explanatory Paradigms in Neural Networks: Towards Relevant and Contextual Explanations,” in IEEE Signal Processing Magazine, Special Issue on Explainability in Data Science, Feb. 18 2022. [PDF][Code]
M. Prabhushankar, K. Kokilepersaud*, Y. Logan*, S. Trejo Corona*, G. AlRegib, C. Wykoff, “OLIVES Dataset: Ophthalmic Labels for Investigating Visual Eye Semantics,” in Advances in Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks, New Orleans, LA, Nov. 29 – Dec. 1 2022. [PDF][Code]
M. Prabhushankar and G. AlRegib, “Introspective Learning: A Two-Stage Approach for Inference in Neural Networks,” in Advances in Neural Information Processing Systems (NeurIPS), New Orleans, LA, Nov. 29 – Dec. 1 2022. [PDF][Code]
C. Zhou, G. AlRegib, A. Parchami, and K. Singh, “Learning Trajectory-Conditioned Relations to Predict Pedestrian Crossing Behavior,” in IEEE International Conference on Image Processing (ICIP), Bordeaux, France, Oct. 16-19 2022. [PDF][Code]
R. Benkert, M. Prabhushankar, and G. AlRegib, “Forgetful Active Learning With Switch Events: Efficient Sampling for Out-of-Distribution Data,” in IEEE International Conference on Image Processing (ICIP), Bordeaux, France, Oct. 16-19 2022. [PDF]
Y. Logan, R. Benkert, A. Mustafa, G. Kwon, G. AlRegib, “Patient Aware Active Learning for Fine-Grained OCT Classification,” in IEEE International Conference on Image Processing (ICIP), Bordeaux, France, Oct. 16-19 2022. [PDF][Code]
R. Benkert, M. Prabhushankar, G. AlRegib, A. Parchami, and E. Corona, “Gaussian Switch Sampling: A Second Order Approach to Active Learning,” in IEEE Transactions on Artificial Intelligence (TAI), Feb. 05 2023. [PDF][Code]
G. Kwon, M. Prabhushankar, D. Temel, and G. AlRegib, “Backpropagated Gradient Representations for Anomaly Detection,” in Proceedings of the European Conference on Computer Vision (ECCV), SEC, Glasgow, Aug. 23-28 2020. [PDF][Code][Link]
D. Temel, G. Kwon*, M. Prabhushankar*, and G. AlRegib, “CURE-TSR: Challenging Unreal and Real Environments for Traffic Sign Recognition,” in Advances in Neural Information Processing Systems (NIPS) Workshop on Machine Learning for Intelligent Transportation Systems, Long Beach, CA, Dec. 2017. [PDF][Code]
D. Temel, M-H. Chen, and G. AlRegib, “Traffic Sign Detection Under Challenging Conditions: A Deeper Look Into Performance Variations and Spectral Characteristics,” in IEEE Transactions on Intelligent Transportation Systems, Jul. 2019. [PDF][Code]

Presenters’ contact information and short biography

Ghassan AlRegib (alregib@gatech.edu) is currently the John and Marilu McCarty Chair Professor in the School of Electrical and Computer Engineering at the Georgia Institute of Technology. He received the ECE Outstanding Graduate Teaching Award in 2001, both the CSIP Research and the CSIP Service Awards in 2003, the ECE Outstanding Junior Faculty Member Award in 2008, and the Denning Faculty Award for Global Engagement in 2017. His research group, the Omni Lab for Intelligent Visual Engineering and Science (OLIVES), works on research projects related to machine learning, image and video processing and understanding, seismic interpretation, machine learning for ophthalmology, robustness, large-scale dataset creation, and deployable ML. The group has created more than 11 large-scale datasets. He has participated in several service activities within the IEEE, including serving as the TP co-Chair for ICIP 2020 and GlobalSIP 2014.

Mohit Prabhushankar (mohit.p@gatech.edu) received his Ph.D. degree in electrical engineering from the Georgia Institute of Technology (Georgia Tech), Atlanta, Georgia, 30332, USA, in 2021. He is currently a Postdoctoral Fellow in the School of Electrical and Computer Engineering at the Georgia Institute of Technology, in the Omni Lab for Intelligent Visual Engineering and Science (OLIVES). He works in the fields of image processing, machine learning, active learning, healthcare, and robust and explainable AI. He is the recipient of the Best Paper Award at ICIP 2019 and the Top Viewed Special Session Paper Award at ICIP 2020. He also received the ECE Outstanding Graduate Teaching Award, the CSIP Research Award, and the Roger P Webb ECE Graduate Research Assistant Excellence Award, all in 2022. He has participated in creating five large-scale datasets.