Contributed by Kimon Fountoulakis
-
On the Computational Power of Neural Nets. Journal of Computer and System Sciences 1995. paper
H.T. Siegelmann, E.D. Sontag
-
Attention is Turing Complete. Journal of Machine Learning Research 2021. paper
Jorge Pérez, Pablo Barceló, Javier Marinkovic
-
Looped Transformers as Programmable Computers. ICML 2023. paper
Angeliki Giannou, Shashank Rajput, Jy-Yong Sohn, Kangwook Lee, Jason D. Lee, Dimitris Papailiopoulos
-
Exposing Attention Glitches with Flip-Flop Language Modeling. NeurIPS 2023. paper
Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang
-
Transformers Learn Shortcuts to Automata. ICLR 2023. paper
Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang
-
Memory Augmented Large Language Models are Computationally Universal. arXiv 2023. paper
Dale Schuurmans
-
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems. ICLR 2024. paper
Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma
-
Representational Capabilities of Feed-Forward and Sequential Neural Architectures. PhD Thesis 2024. paper
Clayton Hendrick Sanford
-
Transformers, parallel computation, and logarithmic depth. ICML 2024. paper
Clayton Sanford, Daniel Hsu, Matus Telgarsky
-
Understanding Transformer Reasoning Capabilities via Graph Algorithms. NeurIPS 2024. paper
Clayton Sanford, Bahare Fatemi, Ethan Hall, Anton Tsitsulin, Mehran Kazemi, Jonathan Halcrow, Bryan Perozzi, Vahab Mirrokni
-
A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers. NeurIPS 2024 Workshop M3L. paper
William Merrill, Ashish Sabharwal
-
On Limitations of the Transformer Architecture. COLM 2024. paper
Binghui Peng, Srini Narayanan, Christos Papadimitriou
-
Depth-Width tradeoffs in Algorithmic Reasoning of Graph Tasks with Transformers. arXiv 2025. paper
Gilad Yehudai, Clayton Sanford, Maya Bechler-Speicher, Orr Fischer, Ran Gilad-Bachrach, Amir Globerson
-
Positional Attention: Expressivity and Learnability of Algorithmic Computation. arXiv 2025. paper
Artur Back de Luca, George Giapitzakis, Shenghao Yang, Petar Veličković, Kimon Fountoulakis
-
Round and Round We Go! What makes Rotary Positional Encodings useful? ICLR 2025. paper
Federico Barbero, Alex Vitvitskyi, Christos Perivolaropoulos, Razvan Pascanu, Petar Veličković
-
Reasoning with Latent Thoughts: On the Power of Looped Transformers. ICLR 2025. paper
Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li, Sanjiv Kumar, Sashank J. Reddi
-
Provably good solutions to the knapsack problem via neural networks of bounded size. AAAI 2021. paper
Christoph Hertrich, Martin Skutella
-
ReLU Neural Networks of Polynomial Size for Exact Maximum Flow Computation. Integer Programming and Combinatorial Optimization 2023. paper
Christoph Hertrich, Leon Sering
-
Representational Capabilities of Feed-Forward and Sequential Neural Architectures. PhD Thesis 2024. paper
Clayton Hendrick Sanford
-
Graph neural networks extrapolate out-of-distribution for shortest paths. arXiv 2025. paper
Robert R. Nerem, Samantha Chen, Sanjoy Dasgupta, Yusu Wang
-
What graph neural networks cannot learn: depth vs width. ICLR 2020. paper
Andreas Loukas
-
Simulation of Graph Algorithms with Looped Transformers. ICML 2024. paper
Artur Back De Luca, Kimon Fountoulakis
-
Graph Transformers Dream of Electric Flow. ICLR 2025. paper
Xiang Cheng, Lawrence Carin, Suvrit Sra
-
Positional Attention: Expressivity and Learnability of Algorithmic Computation. arXiv 2025. paper
Artur Back de Luca, George Giapitzakis, Shenghao Yang, Petar Veličković, Kimon Fountoulakis
-
Learning to Add, Multiply, and Execute Algorithmic Instructions Exactly with Neural Networks. arXiv 2025. paper
George Giapitzakis, Artur Back de Luca, Kimon Fountoulakis
-
Learning to Execute. arXiv 2015. paper
Wojciech Zaremba, Ilya Sutskever
-
Neural Programmer-Interpreters. arXiv 2015. paper
Scott Reed, Nando de Freitas
-
Neural Programmer: Inducing Latent Programs with Gradient Descent. arXiv 2016. paper
Arvind Neelakantan, Quoc V. Le, Ilya Sutskever
-
Deep Neural Solver for Math Word Problems. arXiv 2017. paper
Yan Wang, Xiaojiang Liu, Shuming Shi
-
Analysing Mathematical Reasoning Abilities of Neural Models. arXiv 2019. paper
David Saxton, Edward Grefenstette, Felix Hill, Pushmeet Kohli
-
Investigating the Limitations of Transformers with Simple Arithmetic Tasks. arXiv 2021. paper
Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
-
A Primer for Neural Arithmetic Logic Modules. arXiv 2022. paper
Bhumika Mistry, Katayoun Farrahi, Jonathon Hare
-
Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets. arXiv 2022. paper
Alethea Power, Yuri Burda, Harri Edwards, Igor Babuschkin, Vedant Misra
-
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv 2023. paper
Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou
-
Implicit Chain of Thought Reasoning via Knowledge Distillation. arXiv 2023. paper
Yuntian Deng, Kiran Prasad, Roland Fernandez, Paul Smolensky, Vishrav Chaudhary, Stuart Shieber
-
Positional Description Matters for Transformers Arithmetic. arXiv 2023. paper
Ruoqi Shen, Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Yuanzhi Li, Yi Zhang
-
Length Generalization in Arithmetic Transformers. arXiv 2023. paper
Samy Jelassi, Stéphane d'Ascoli, Carles Domingo-Enrich, Yuhuai Wu, Yuanzhi Li, François Charton
-
Transformers Can Do Arithmetic with the Right Embeddings. arXiv 2024. paper
Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein
-
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step. arXiv 2024. paper
Yuntian Deng, Yejin Choi, Stuart Shieber
-
Neural Networks and the Chomsky Hierarchy. ICLR 2023. paper
Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, Li Kevin Wenliang, Elliot Catt, Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, Pedro A. Ortega
-
Training Neural Networks as Recognizers of Formal Languages. ICLR 2025. paper
Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell