Interested in GSoC 2026 Project 3: Benchmarking and performance optimization #5270
For right now, if you're interested in performance, you should familiarize yourself with ASV and perhaps write a simple benchmark; just choose anything that's not covered already. It's more important to learn the process than to be exhaustive. The MDAnalysis.lib module has some code that is used repeatedly. All coordinate readers/writers are important. Many analysis tools are not covered and are good targets.
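To make the suggestion concrete, here is a minimal sketch of what an ASV benchmark file could look like. ASV discovers classes and `time_*` methods by naming convention, so no `asv` import is needed; `apply_rotation` below is a hypothetical pure-Python stand-in, not an actual MDAnalysis routine.

```python
# Sketch of a minimal ASV benchmark (e.g. benchmarks/bench_example.py).
# ASV times any method whose name starts with time_; setup() runs
# before timing so input construction is excluded from the measurement.

import math


def apply_rotation(coords, angle):
    """Rotate 3D points about the z-axis; stand-in for a library routine."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


class SimpleRotationBench:
    def setup(self):
        # Fixed, deterministic input so successive runs are comparable.
        self.coords = [(float(i), float(i) * 0.5, 1.0) for i in range(1000)]

    def time_apply_rotation(self):
        apply_rotation(self.coords, 0.25)
```

Running `asv run` from the benchmarks directory would then pick this class up automatically alongside the existing suite.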
Hi @orbeckst, I started by adding ASV benchmarks for several performance-critical functions. These benchmarks are parameterized by the number of atoms so that we can observe scaling behavior and detect performance regressions more clearly. Running the benchmarks locally through ASV works correctly and generates results across different Python versions.
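A benchmark parameterized by the number of atoms, of the kind described above, might be sketched as follows. `pairwise_distances` here is a naive pure-Python stand-in rather than an actual MDAnalysis function (the real target would be something like `MDAnalysis.lib.distances.distance_array`); ASV's `params`/`param_names` class attributes drive the scan over system sizes.

```python
# Sketch of a parameterized ASV benchmark: ASV runs the timed method
# once per value in `params`, so scaling with n_atoms shows up directly
# in the results and regressions are easier to localize.

import math
import random


def pairwise_distances(coords):
    """Naive O(n^2) pairwise distances; stand-in for the real routine."""
    out = []
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            dz = coords[i][2] - coords[j][2]
            out.append(math.sqrt(dx * dx + dy * dy + dz * dz))
    return out


class DistancesBench:
    # Each parameter value becomes its own data point in the ASV report.
    params = [100, 1000]
    param_names = ["n_atoms"]

    def setup(self, n_atoms):
        rng = random.Random(42)  # seeded for reproducible inputs
        self.coords = [
            (rng.random(), rng.random(), rng.random()) for _ in range(n_atoms)
        ]

    def time_pairwise_distances(self, n_atoms):
        pairwise_distances(self.coords)
```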
If there are particular areas (modules or functions) where additional benchmark coverage would be especially useful, I would be happy to focus there. Best regards,
Hello,
I am Amarendra, a B.Tech undergraduate at IIT Kharagpur. I have strong experience working in Python, machine learning, and structured software development. I have worked extensively with data-intensive workflows, performance-sensitive code, and modular project design, and I am particularly interested in contributing to MDAnalysis for GSoC 2026 — specifically to Project 3: Benchmarking and Performance Optimization.
I am especially drawn to this project because performance engineering and systematic evaluation of code quality are areas I genuinely enjoy. Through working with data-intensive pipelines and ML systems, I have gained experience with profiling tools, identifying bottlenecks, improving efficiency, and writing maintainable code. I appreciate how structured benchmarking helps ensure long-term scalability and stability in scientific libraries. Over the past semester, I worked on Slack's model optimization using ML for my Bachelor's Thesis, and I have also published a first-author research paper titled “Spiking Neural Network for Cross-Market Portfolio Optimization in Financial Markets: A Neuromorphic Computing Approach.”
From reading the project description and roadmap, I understand that the goal is to expand ASV benchmark coverage across major core functionalities, analyse performance trends and prioritize optimization targets. I find this particularly impactful because improving performance directly benefits the broader scientific community relying on MDAnalysis.
I am currently working on:
• Exploring the MDAnalysis codebase and understanding core modules
• Reviewing the existing ASV benchmarking setup
• Studying profiling tools such as cProfile and line_profiler
• Going through related performance issues and discussions
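As an illustration of the profiling step mentioned above, here is a small self-contained sketch of using `cProfile` together with `pstats` to rank functions by cumulative time. `hot_function` is a made-up example with deliberately quadratic work, not MDAnalysis code.

```python
# Minimal profiling sketch: run a function under cProfile, then use
# pstats to print the top entries sorted by cumulative time.

import cProfile
import io
import pstats


def hot_function(n):
    # Deliberately O(n^2) so it dominates the profile output.
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total


profiler = cProfile.Profile()
profiler.enable()
result = hot_function(300)
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)  # show only the top 5 entries
print(stream.getvalue())
```

For line-by-line timings inside a single hot function, `line_profiler`'s `@profile` decorator with `kernprof -l` is the usual next step after `cProfile` has narrowed down where the time goes.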
I would begin with small benchmark contributions and gradually expand coverage while understanding the library’s performance-critical paths. I would also appreciate guidance on which core areas currently need benchmark coverage most urgently.
Looking forward to engaging with the community and contributing meaningfully.
Best regards,
Amarendra