Towards Building Unified Autonomous Vehicle Scene Representation for Physical AV Adversarial Attacks and Visual Robustness Enhancement
This project will develop novel multi-view and multi-modal defences built on a unified framework whose core is a novel AV scene representation, used to generate realistic scenes with diverse conditions and attack vectors.
Lead PI: LIU Yang, NTU
Co-PIs:
1. GUO Qing, CFAR – IHPC, A*STAR
2. ZHANG Tianwei, NTU
3. XIE Xiaofei, SMU
4. ZHANG Hanwang, NTU
5. LYU Chen, NTU
6. DONG Jin Song, NUS
Currently, the visual perception of artificial intelligence (AI) models used in AVs fails in the real world when faced with adversarially designed objects and unexpected environmental conditions. The main reason is that these models are trained on limited, discrete samples that can hardly cover all possible scenarios.
An effective path towards robust visual perception models is to first develop physical attacks and corresponding defence methods, and then iterate. However, existing physical attacks typically consider only one or two factors and cannot simultaneously simulate dynamic entities (e.g., moving cars or pedestrians, street structures) and environmental conditions (e.g., weather and lighting variations). Additionally, most defence methods rely mainly on single-view or single-modal information, neglecting the complementary information available from the multi-view cameras and multi-modal sensors on AVs.
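To make "complementary information" concrete, the sketch below illustrates one hypothetical way cross-view redundancy could support a defence: a physically placed adversarial object that fools one camera rarely produces consistent predictions across all views, so low cross-view agreement can serve as an attack indicator. The function names, agreement measure, and threshold here are illustrative assumptions, not the project's actual method.

```python
import torch
import torch.nn.functional as F

def cross_view_agreement(per_view_logits):
    """Mean pairwise cosine similarity between the class-probability vectors
    predicted for the same object from different cameras."""
    probs = [logits.softmax(-1) for logits in per_view_logits]
    sims = []
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            sims.append(F.cosine_similarity(probs[i], probs[j], dim=-1))
    return torch.stack(sims).mean()

def flag_adversarial(per_view_logits, threshold=0.8):
    """Low agreement across views suggests a view-dependent (possibly
    adversarial) perturbation rather than a genuine object."""
    return cross_view_agreement(per_view_logits) < threshold

# Example: three cameras observe the same object; one view disagrees sharply.
views = [torch.tensor([4.0, 0.1, 0.2]),
         torch.tensor([3.8, 0.0, 0.3]),
         torch.tensor([0.1, 5.0, 0.0])]   # the attacked view
print(flag_adversarial(views))            # True: flagged as suspicious
```

A single-view defence has no access to this disagreement signal, which is precisely the information gap the proposal targets.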
To overcome these limitations, we will develop a novel way to represent the entire scene around an AV (i.e., a unified AV scene representation) based on neural implicit representations, an emerging AI research topic. This will allow us to generate more comprehensive attacks that consider multiple physical factors jointly, and hence to uncover novel and more effective defence methods that leverage the rich information from multi-view cameras and multi-modal sensors.
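For readers unfamiliar with neural implicit representations, the following minimal PyTorch sketch shows the basic machinery in a NeRF-style form: an MLP maps a 3D point and viewing direction to colour and volume density, and a renderer integrates these quantities along camera rays. All names are illustrative assumptions; the project's actual representation would additionally model dynamic entities and environmental conditions.

```python
import torch
import torch.nn as nn

class ImplicitScene(nn.Module):
    """Maps a 3D point and a viewing direction to colour and volume density."""
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density_head = nn.Linear(hidden, 1)     # sigma: opacity of the point
        self.colour_head = nn.Sequential(            # view-dependent RGB
            nn.Linear(hidden + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, view_dir):
        feat = self.backbone(xyz)
        sigma = torch.relu(self.density_head(feat))
        rgb = self.colour_head(torch.cat([feat, view_dir], dim=-1))
        return rgb, sigma

def render_ray(scene, origin, direction, near=0.1, far=10.0, n_samples=64):
    """Classic volume rendering: composite colour along one camera ray."""
    t = torch.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction            # sample points on the ray
    rgb, sigma = scene(pts, direction.expand_as(pts))
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * (t[1] - t[0]))
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(dim=0)       # final pixel colour

# Example: render one pixel from a camera at the origin looking along +z.
pixel = render_ray(ImplicitScene(), torch.zeros(3), torch.tensor([0.0, 0.0, 1.0]))
```

Because the whole pipeline is differentiable, scene properties (textures, object poses, lighting) can be optimised end-to-end, which is what makes such a representation useful for both attack generation and robustness analysis.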
In this proposal, we introduce a new perspective on physical adversarial attacks and defences in the context of AVs. Existing solutions are often ad hoc and focus only on specific perception tasks or models. In contrast, we offer a set of unified methodologies that consider many factors, physical conditions, and attack vectors. We achieve this by developing a unified scene representation for AVs, which enables us to generate different types of physical attacks and robust defences with high efficiency, effectiveness, and flexibility.
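As one hypothetical example of how a differentiable scene representation enables attack generation, the sketch below optimises a printable patch texture so that every rendered view of the scene suppresses a perception model's top class confidence. The renderer interface, patch placement, and loss are assumptions for illustration, not the proposal's actual attack.

```python
import torch
import torch.nn as nn

def optimise_patch(render_views, perception_model, patch_shape=(3, 32, 32),
                   steps=100, lr=1e-2):
    """Optimise one patch texture so that every rendered view of the scene
    lowers the perception model's top class confidence (illustrative only)."""
    patch = torch.rand(patch_shape, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = 0.0
        for view in render_views(patch):             # differentiable renders, all poses
            logits = perception_model(view.unsqueeze(0))
            loss = loss + logits.softmax(-1).max()   # suppress top confidence
        loss.backward()
        opt.step()
        patch.data.clamp_(0.0, 1.0)                  # keep colours physically printable
    return patch.detach()

# Toy usage: a "renderer" that pastes the patch into a fixed background image,
# standing in for a differentiable render of the full scene representation.
background = torch.zeros(3, 64, 64)
def toy_renderer(patch):
    img = background.clone()
    img[:, 16:48, 16:48] = patch                     # place the patch in the scene
    yield img

toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 10))
adversarial_patch = optimise_patch(toy_renderer, toy_model)
```

In the unified framework, the same optimisation loop could in principle target any differentiable scene parameter, e.g., object geometry, weather, or lighting, rather than a patch alone.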
With autonomous driving fast becoming a reality, it is crucial to build robust and reliable perception models and systems. Our scene representation method can help researchers better understand the limitations of current AV system designs and implementations. Furthermore, our attack framework can serve as a comprehensive benchmark for evaluating the performance of perception models. Our proposed defence solutions can be easily integrated into existing AV products, enhancing their robustness and reliability.
Improving the robustness of deep learning models is also a broad topic that applies to various domains and applications beyond AVs. We believe our approaches can be extended to other critical tasks such as face recognition, intrusion detection, and medical diagnosis. We expect the outcomes of this project to contribute to building more trustworthy AI technologies and to Singapore's Smart Nation goals.