PhD Dissertation: Asmaa Loulou, Localization, Trajectory Prediction and Planning for Autonomous Vehicles Using Hybrid Attention Mechanisms and Transformer Neural Networks, Date & Time: 30 June 2026 – 11:00 AM, Place: FENS L027

Localization, Trajectory Prediction and Planning for Autonomous Vehicles Using Hybrid Attention Mechanisms and Transformer Neural Networks

Asmaa Loulou
Mechatronics Engineering, PhD Dissertation, 2026

Thesis Jury

Prof. Mustafa Ünel (Thesis Advisor)

Assoc. Prof. Kemalettin Erbatur

Asst. Prof. Melih Türkseven

Assoc. Prof. Abdurrahman Eray Baran

Assoc. Prof. Ali Fuat Ergenç

Date & Time: 30 June 2026 – 11:00 AM

Place: FENS L027

Keywords: Localization, Trajectory Prediction, Path Planning, Attention Mechanism, Transformer Architecture

Abstract

The safe and efficient deployment of autonomous robotic systems is constrained by computational bottlenecks across localization, prediction, and planning. Traditional deep learning architectures, such as Convolutional and Recurrent Neural Networks, are limited by local receptive fields and strictly sequential processing. To address these limitations, this dissertation introduces hybrid approaches that integrate attention-based modules with Convolutional and Recurrent Neural Networks. This combination allows the models to move beyond restricted local receptive fields and to capture fine-grained local details together with the broader global context of the environment. Three integrated frameworks are proposed. First, for localization, PoseViTNet and RelViTNet use Vision Transformer backbones to encode the global context of the scene and to suppress moving objects and texture-less surfaces. This enables highly accurate multi-scene camera pose estimation without scene-specific retraining. Second, for trajectory prediction, a hybrid spatial-temporal transformer architecture is introduced. By using a structured spatial mask that explicitly encodes neighboring vehicles and open navigable space, the model preserves dimensional consistency and improves long-horizon predictions. Finally, to resolve inefficiencies in multi-dimensional path planning, the HAGRRT* and 3D-HARRT* frameworks are introduced. These models fuse multi-scale convolutional features with attention mechanisms to generate goal-conditioned, obstacle-aware spatial probability priors, which accelerate convergence to optimal paths for sampling-based planners in highly constrained 2D and complex 3D environments. Overall, the proposed attention-guided frameworks improve the robustness, scalability, and computational efficiency of localization, trajectory prediction, and path planning across diverse autonomous navigation scenarios.