Unlocking Out-of-Distribution Generalization in Transformers via Recursive Latent Space Reasoning Paper • 2510.14095 • Published 6 days ago
Disentangling and Integrating Relational and Sensory Information in Transformer Architectures Paper • 2405.16727 • Published May 26, 2024