Interpretable Representation Learning for Motion Forecasting
Wagner, Royden
Abstract (English):
We address interpretable representation learning for motion forecasting in self-driving cars. Rather than treating transformers as black boxes, we develop methods to interpret and modify learned representations. We introduce self-supervised pre-training with interpretable objectives. Moreover, we probe latent spaces of forecasting models and reveal interpretable features, allowing us to make targeted interventions. Finally, we uncover retrocausal mechanisms, which enable goal-based instructions.
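Below is a minimal sketch of the probing-and-intervention idea mentioned above, not the thesis' actual code: it assumes that hidden states of a forecasting model can be cached as vectors and that some interpretable quantity (here, agent speed) is linearly encoded in them. All names (`latents`, `true_dir`, `delta`) and the synthetic data are hypothetical illustrations.

```python
# Hypothetical sketch: linear probing of latent states and a targeted
# intervention along the probe direction. Synthetic data stands in for
# cached hidden states of a motion forecasting transformer.
import numpy as np

rng = np.random.default_rng(0)

# Assumption: 1024 cached 64-dim latents in which agent speed is
# linearly encoded along an unknown direction, plus noise.
true_dir = rng.normal(size=64)
true_dir /= np.linalg.norm(true_dir)
speed = rng.uniform(0.0, 20.0, size=1024)      # m/s, synthetic labels
latents = speed[:, None] * true_dir + 0.1 * rng.normal(size=(1024, 64))

# Linear probe via least squares: find w such that latents @ w ~ speed.
w, *_ = np.linalg.lstsq(latents, speed, rcond=None)

# Probe quality: correlation between predicted and true speed.
pred = latents @ w
print("probe corr:", np.corrcoef(pred, speed)[0, 1])

# Targeted intervention: shift one latent along the probe direction so
# the encoded speed rises by delta. If the feature is causally used,
# downstream decoding would then forecast faster motion.
delta = 5.0                                    # desired speed change (m/s)
h = latents[0]
h_edit = h + delta * w / np.dot(w, w)          # raises probe output by delta
print("speed before/after:", h @ w, h_edit @ w)
```

The closed-form shift `delta * w / (w·w)` is one common choice for such edits; it changes the probe's readout by exactly `delta` while perturbing the latent as little as possible in the probe's subspace.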