
Computational Capability and Efficiency of Neural Networks: A Repository of Papers

Contributed by Kimon Fountoulakis

1. Simulation Results
1.1 Recurrent neural networks
1.2 Transformers
1.3 Feedforward neural networks
1.4 Graph neural networks
2. Learning Results
2.1 Transformers
2.2 Feedforward neural networks
3. Empirical
4. Formal Languages

Simulation Results

Recurrent neural networks

  1. On the Computational Power of Neural Nets. Journal of Computer and System Sciences 1995. paper

    H.T. Siegelmann, E.D. Sontag

Transformers

  1. Attention is Turing-Complete. Journal of Machine Learning Research 2021. paper

    Jorge Pérez, Pablo Barceló, Javier Marinkovic

  2. Looped Transformers as Programmable Computers. ICML 2023. paper

    Angeliki Giannou, Shashank Rajput, Jy-Yong Sohn, Kangwook Lee, Jason D. Lee, Dimitris Papailiopoulos

  3. Exposing Attention Glitches with Flip-Flop Language Modeling. NeurIPS 2023. paper

    Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

  4. Transformers Learn Shortcuts to Automata. ICLR 2023. paper

    Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

  5. Memory Augmented Large Language Models are Computationally Universal. arXiv 2023. paper

    Dale Schuurmans

  6. Chain of Thought Empowers Transformers to Solve Inherently Serial Problems. ICLR 2024. paper

    Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma

  7. Representational Capabilities of Feed-Forward and Sequential Neural Architectures. PhD Thesis 2024. paper

    Clayton Hendrick Sanford

  8. Transformers, parallel computation, and logarithmic depth. ICML 2024. paper

    Clayton Sanford, Daniel Hsu, Matus Telgarsky

  9. Understanding Transformer Reasoning Capabilities via Graph Algorithms. NeurIPS 2024. paper

    Clayton Sanford, Bahare Fatemi, Ethan Hall, Anton Tsitsulin, Mehran Kazemi, Jonathan Halcrow, Bryan Perozzi, Vahab Mirrokni

  10. A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers. NeurIPS 2024 Workshop M3L. paper

    William Merrill, Ashish Sabharwal

  11. On Limitations of the Transformer Architecture. COLM 2024. paper

    Binghui Peng, Srini Narayanan, Christos Papadimitriou

  12. Depth-Width tradeoffs in Algorithmic Reasoning of Graph Tasks with Transformers. arXiv 2025. paper

    Gilad Yehudai, Clayton Sanford, Maya Bechler-Speicher, Orr Fischer, Ran Gilad-Bachrach, Amir Globerson

  13. Positional Attention: Expressivity and Learnability of Algorithmic Computation. arXiv 2025. paper

    Artur Back de Luca, George Giapitzakis, Shenghao Yang, Petar Veličković, Kimon Fountoulakis

  14. Round and Round We Go! What makes Rotary Positional Encodings useful? ICLR 2025. paper

    Federico Barbero, Alex Vitvitskyi, Christos Perivolaropoulos, Razvan Pascanu, Petar Veličković

  15. Reasoning with Latent Thoughts: On the Power of Looped Transformers. ICLR 2025. paper

    Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li, Sanjiv Kumar, Sashank J. Reddi

Feedforward neural networks

  1. Provably good solutions to the knapsack problem via neural networks of bounded size. AAAI 2021. paper

    Christoph Hertrich, Martin Skutella

  2. ReLU Neural Networks of Polynomial Size for Exact Maximum Flow Computation. Integer Programming and Combinatorial Optimization 2023. paper

    Christoph Hertrich, Leon Sering

  3. Representational Capabilities of Feed-Forward and Sequential Neural Architectures. PhD Thesis 2024. paper

    Clayton Hendrick Sanford

Graph neural networks

  1. Graph neural networks extrapolate out-of-distribution for shortest paths. arXiv 2025. paper

    Robert R. Nerem, Samantha Chen, Sanjoy Dasgupta, Yusu Wang

  2. What graph neural networks cannot learn: depth vs width. ICLR 2020. paper

    Andreas Loukas

  3. Simulation of Graph Algorithms with Looped Transformers. ICML 2024. paper

    Artur Back de Luca, Kimon Fountoulakis

  4. Graph Transformers Dream of Electric Flow. ICLR 2025. paper

    Xiang Cheng, Lawrence Carin, Suvrit Sra

Learning Results

Transformers

  1. Positional Attention: Expressivity and Learnability of Algorithmic Computation. arXiv 2025. paper

    Artur Back de Luca, George Giapitzakis, Shenghao Yang, Petar Veličković, Kimon Fountoulakis

Feedforward neural networks

  1. Learning to Add, Multiply, and Execute Algorithmic Instructions Exactly with Neural Networks. arXiv 2025. paper

    George Giapitzakis, Artur Back de Luca, Kimon Fountoulakis

Empirical

  1. Learning to Execute. arXiv 2015. paper

    Wojciech Zaremba, Ilya Sutskever

  2. Neural Programmer-Interpreters. arXiv 2015. paper

    Scott Reed, Nando de Freitas

  3. Neural Programmer: Inducing Latent Programs with Gradient Descent. arXiv 2016. paper

    Arvind Neelakantan, Quoc V. Le, Ilya Sutskever

  4. Deep Neural Solver for Math Word Problems. arXiv 2017. paper

    Yan Wang, Xiaojiang Liu, Shuming Shi

  5. Analysing Mathematical Reasoning Abilities of Neural Models. arXiv 2019. paper

    David Saxton, Edward Grefenstette, Felix Hill, Pushmeet Kohli

  6. Investigating the Limitations of Transformers with Simple Arithmetic Tasks. arXiv 2021. paper

    Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin

  7. A Primer for Neural Arithmetic Logic Modules. arXiv 2022. paper

    Bhumika Mistry, Katayoun Farrahi, Jonathon Hare

  8. Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets. arXiv 2022. paper

    Alethea Power, Yuri Burda, Harri Edwards, Igor Babuschkin, Vedant Misra

  9. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv 2023. paper

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou

  10. Implicit Chain of Thought Reasoning via Knowledge Distillation. arXiv 2023. paper

    Yuntian Deng, Kiran Prasad, Roland Fernandez, Paul Smolensky, Vishrav Chaudhary, Stuart Shieber

  11. Positional Description Matters for Transformers Arithmetic. arXiv 2023. paper

    Ruoqi Shen, Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Yuanzhi Li, Yi Zhang

  12. Length Generalization in Arithmetic Transformers. arXiv 2023. paper

    Samy Jelassi, Stéphane d'Ascoli, Carles Domingo-Enrich, Yuhuai Wu, Yuanzhi Li, François Charton

  13. Transformers Can Do Arithmetic with the Right Embeddings. arXiv 2024. paper

    Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein

  14. From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step. arXiv 2024. paper

    Yuntian Deng, Yejin Choi, Stuart Shieber

Formal Languages

  1. Neural Networks and the Chomsky Hierarchy. ICLR 2023. paper

    Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, Li Kevin Wenliang, Elliot Catt, Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, Pedro A. Ortega

  2. Training Neural Networks as Recognizers of Formal Languages. ICLR 2025. paper

    Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell
