
Computational Capability and Efficiency of Neural Networks: A Repository of Papers

Contributed by Kimon Fountoulakis

1. Simulation Results
1.1 Recurrent neural networks
1.2 Transformers
1.3 Feedforward neural networks
1.4 Graph neural networks
2. Learning Results
2.1 Transformers
2.2 Feedforward neural networks
3. Empirical
4. Formal Languages

Simulation Results

Recurrent neural networks

  1. On the Computational Power of Neural Nets. Journal of Computer and System Sciences 1995. paper

    H.T. Siegelmann, E.D. Sontag

Transformers

  1. Attention is Turing-Complete. Journal of Machine Learning Research 2021. paper

    Jorge Pérez, Pablo Barceló, Javier Marinkovic

  2. Looped Transformers as Programmable Computers. ICML 2023. paper

    Angeliki Giannou, Shashank Rajput, Jy-Yong Sohn, Kangwook Lee, Jason D. Lee, Dimitris Papailiopoulos

  3. Exposing Attention Glitches with Flip-Flop Language Modeling. NeurIPS 2023. paper

    Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

  4. Transformers Learn Shortcuts to Automata. ICLR 2023. paper

    Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

  5. Memory Augmented Large Language Models are Computationally Universal. arXiv 2023. paper

    Dale Schuurmans

  6. Chain of Thought Empowers Transformers to Solve Inherently Serial Problems. ICLR 2024. paper

    Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma

  7. Representational Capabilities of Feed-Forward and Sequential Neural Architectures. PhD Thesis 2024. paper

    Clayton Hendrick Sanford

  8. Transformers, parallel computation, and logarithmic depth. ICML 2024. paper

    Clayton Sanford, Daniel Hsu, Matus Telgarsky

  9. Understanding Transformer Reasoning Capabilities via Graph Algorithms. NeurIPS 2024. paper

    Clayton Sanford, Bahare Fatemi, Ethan Hall, Anton Tsitsulin, Mehran Kazemi, Jonathan Halcrow, Bryan Perozzi, Vahab Mirrokni

  10. A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers. NeurIPS 2024 Workshop M3L. paper

    William Merrill, Ashish Sabharwal

  11. On Limitations of the Transformer Architecture. COLM 2024. paper

    Binghui Peng, Srini Narayanan, Christos Papadimitriou

  12. Depth-Width tradeoffs in Algorithmic Reasoning of Graph Tasks with Transformers. arXiv 2025. paper

    Gilad Yehudai, Clayton Sanford, Maya Bechler-Speicher, Orr Fischer, Ran Gilad-Bachrach, Amir Globerson

  13. Positional Attention: Expressivity and Learnability of Algorithmic Computation. arXiv 2025. paper

    Artur Back de Luca, George Giapitzakis, Shenghao Yang, Petar Veličković, Kimon Fountoulakis

  14. Round and Round We Go! What makes Rotary Positional Encodings useful? ICLR 2025. paper

    Federico Barbero, Alex Vitvitskyi, Christos Perivolaropoulos, Razvan Pascanu, Petar Veličković

  15. Reasoning with Latent Thoughts: On the Power of Looped Transformers. ICLR 2025. paper

    Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li, Sanjiv Kumar, Sashank J. Reddi

Feedforward neural networks

  1. Provably good solutions to the knapsack problem via neural networks of bounded size. AAAI 2021. paper

    Christoph Hertrich, Martin Skutella

  2. ReLU Neural Networks of Polynomial Size for Exact Maximum Flow Computation. Integer Programming and Combinatorial Optimization 2023. paper

    Christoph Hertrich, Leon Sering

  3. Representational Capabilities of Feed-Forward and Sequential Neural Architectures. PhD Thesis 2024. paper

    Clayton Hendrick Sanford

Graph neural networks

  1. Graph neural networks extrapolate out-of-distribution for shortest paths. arXiv 2025. paper

    Robert R. Nerem, Samantha Chen, Sanjoy Dasgupta, Yusu Wang

  2. What graph neural networks cannot learn: depth vs width. ICLR 2020. paper

    Andreas Loukas

  3. Simulation of Graph Algorithms with Looped Transformers. ICML 2024. paper

    Artur Back de Luca, Kimon Fountoulakis

  4. Graph Transformers Dream of Electric Flow. ICLR 2025. paper

    Xiang Cheng, Lawrence Carin, Suvrit Sra

Learning Results

Transformers

  1. Positional Attention: Expressivity and Learnability of Algorithmic Computation. arXiv 2025. paper

    Artur Back de Luca, George Giapitzakis, Shenghao Yang, Petar Veličković, Kimon Fountoulakis

Feedforward neural networks

  1. Learning to Add, Multiply, and Execute Algorithmic Instructions Exactly with Neural Networks. arXiv 2025. paper

    George Giapitzakis, Artur Back de Luca, Kimon Fountoulakis

Empirical

  1. Learning to Execute. arXiv 2015. paper

    Wojciech Zaremba, Ilya Sutskever

  2. Neural Programmer-Interpreters. arXiv 2015. paper

    Scott Reed, Nando de Freitas

  3. Neural Programmer: Inducing Latent Programs with Gradient Descent. arXiv 2016. paper

    Arvind Neelakantan, Quoc V. Le, Ilya Sutskever

  4. Deep Neural Solver for Math Word Problems. arXiv 2017. paper

    Yan Wang, Xiaojiang Liu, Shuming Shi

  5. Analysing Mathematical Reasoning Abilities of Neural Models. arXiv 2019. paper

    David Saxton, Edward Grefenstette, Felix Hill, Pushmeet Kohli

  6. Investigating the Limitations of Transformers with Simple Arithmetic Tasks. arXiv 2021. paper

    Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin

  7. A Primer for Neural Arithmetic Logic Modules. arXiv 2022. paper

    Bhumika Mistry, Katayoun Farrahi, Jonathon Hare

  8. Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets. arXiv 2022. paper

    Alethea Power, Yuri Burda, Harri Edwards, Igor Babuschkin, Vedant Misra

  9. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv 2023. paper

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou

  10. Implicit Chain of Thought Reasoning via Knowledge Distillation. arXiv 2023. paper

    Yuntian Deng, Kiran Prasad, Roland Fernandez, Paul Smolensky, Vishrav Chaudhary, Stuart Shieber

  11. Positional Description Matters for Transformers Arithmetic. arXiv 2023. paper

    Ruoqi Shen, Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Yuanzhi Li, Yi Zhang

  12. Length Generalization in Arithmetic Transformers. arXiv 2023. paper

    Samy Jelassi, Stéphane d'Ascoli, Carles Domingo-Enrich, Yuhuai Wu, Yuanzhi Li, François Charton

  13. Transformers Can Do Arithmetic with the Right Embeddings. arXiv 2024. paper

    Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein

  14. From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step. arXiv 2024. paper

    Yuntian Deng, Yejin Choi, Stuart Shieber

Formal Languages

  1. Neural Networks and the Chomsky Hierarchy. ICLR 2023. paper

    Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, Li Kevin Wenliang, Elliot Catt, Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, Pedro A. Ortega

  2. Training Neural Networks as Recognizers of Formal Languages. ICLR 2025. paper

    Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell
