Interest and Contribute in GSoC 2026 Project 4 : Lazy Trajectory Loading and Indexing #5271
Replies: 3 comments
@orbeckst @yuxuanzhuang @talagayev I would love your guidance on this project!
My quick thoughts:
Try to repeat as little code as possible, so I'd start by trying to put as much common code as possible into XDRBaseReader.
Always at the Reader level. Universe will just pass kwargs through.
Search the issues labeled Format-Gromacs: https://github.com/MDAnalysis/mdanalysis/issues?q=is%3Aissue%20state%3Aopen%20label%3AFormat-Gromacs. None of these may be "good" (that is, "easy"), but we do not require that you have a PR merged; we want to interact with you productively, so just getting deep into a PR is a good start. (Of course, getting it merged is even better, but it's not required, and as you may see, our PRs often have 50+ comments before they get merged because we take code correctness and quality seriously for a scientific code like MDA.)
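For readers following along, the first two points can be illustrated with a toy sketch. This is not MDAnalysis code: `ToyBaseReader`, `ToyXTCReader`, `ToyTRRReader`, `ToyUniverse`, and the `lazy_index` keyword are all hypothetical names invented for the illustration. It only shows the shape of the idea, with the option handled once in a shared base reader and the universe-like object forwarding keyword arguments untouched.

```python
import os


class ToyBaseReader:
    """Stand-in for a shared base class; the option is handled here, once."""

    def __init__(self, filename, lazy_index=False):
        self.filename = filename
        self.lazy_index = lazy_index  # hypothetical keyword, not a real option

    def index_mode(self):
        return "lazy" if self.lazy_index else "eager"


class ToyXTCReader(ToyBaseReader):
    """Format-specific subclass; inherits the option handling unchanged."""


class ToyTRRReader(ToyBaseReader):
    """Second format-specific subclass sharing the same base code."""


READERS = {".xtc": ToyXTCReader, ".trr": ToyTRRReader}


class ToyUniverse:
    """Knows nothing about lazy_index; it only forwards **kwargs."""

    def __init__(self, trajectory, **kwargs):
        ext = os.path.splitext(trajectory)[1].lower()
        self.trajectory = READERS[ext](trajectory, **kwargs)


u = ToyUniverse("md.xtc", lazy_index=True)
print(type(u.trajectory).__name__, u.trajectory.index_mode())
```

Because the keyword lives only in the base reader, adding or changing it does not require touching the universe-like class or either subclass.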
Thank you so much, @orbeckst, for the clear response!
Hi MDAnalysis Community!
I'm Vaishnavi Gajarla, currently pursuing a Master's in Data Analytics Engineering at Northeastern University, Boston (GPA: 3.7).
I am working as a Research Assistant under Professor Nik Bear Brown, where I analyze datasets of 500k+ records using Python and SQL to generate forecasts and statistically validate performance metrics for research-driven decision-making, and where I developed ETL workflows. I also have industry experience optimizing database indexing, improving query speed by 50% during my internship.
I'm really interested in Project 4: Lazy Trajectory Loading and Indexing for GSoC 2026. After reading the information you provided, I looked at issue #3793 and the XDRBaseReader documentation. As I understand it, the core problem is that the XTC/TRR readers currently build a complete offset index the first time a file is opened by scanning the entire file, which can take hours for large trajectories. The proposed solution is lazy indexing: skip building the index during simple forward iteration and build it progressively as frames are read. XTC frames begin with a magic number (1995) in a fixed-size header, which enables seek-based traversal, while TRR frame sizes depend on which data (coordinates, velocities, forces) are stored, so lazy indexing will differ slightly for each format.
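To make the idea concrete for myself, here is a minimal sketch of progressive offset indexing on a made-up frame format. It does not read real XTC or TRR files and does not use MDAnalysis: the header layout (a marker plus a payload byte count), the `MAGIC` value, and the names `write_frames` and `LazyFrameReader` are all assumptions for illustration only.

```python
import io
import struct

MAGIC = 0x1234                 # arbitrary marker for this toy format
HEADER = struct.Struct(">ii")  # big-endian: marker, payload byte count


def write_frames(payloads):
    """Serialize payloads into the toy format and return a file-like object."""
    buf = io.BytesIO()
    for payload in payloads:
        buf.write(HEADER.pack(MAGIC, len(payload)))
        buf.write(payload)
    buf.seek(0)
    return buf


class LazyFrameReader:
    """Discover frame offsets only as far as they are actually needed."""

    def __init__(self, fh):
        self._fh = fh
        self.offsets = []   # offsets[i] = byte position of frame i, once seen
        self._next = 0      # where the first not-yet-indexed frame would start
        self._eof = False

    def _index_one(self):
        """Read one header at self._next and record its offset."""
        self._fh.seek(self._next)
        header = self._fh.read(HEADER.size)
        if len(header) < HEADER.size:
            self._eof = True
            return
        magic, nbytes = HEADER.unpack(header)
        if magic != MAGIC:
            raise ValueError(f"bad marker at byte {self._next}")
        self.offsets.append(self._next)
        self._next += HEADER.size + nbytes  # skip the payload without reading it

    def __getitem__(self, i):
        if i < 0:
            raise IndexError(i)
        while len(self.offsets) <= i and not self._eof:
            self._index_one()
        if i >= len(self.offsets):
            raise IndexError(i)
        self._fh.seek(self.offsets[i])
        _, nbytes = HEADER.unpack(self._fh.read(HEADER.size))
        return self._fh.read(nbytes)

    def __iter__(self):
        i = 0
        while True:
            try:
                yield self[i]
            except IndexError:
                return
            i += 1


reader = LazyFrameReader(write_frames([b"a", b"bcd", b"ef", b"ghij"]))
print(reader[1], reader.offsets)  # only the first two offsets are known so far
```

Opening the file costs nothing here, forward iteration fills in the index as a side effect, and a jump to frame `i` scans headers only up to that frame. A real reader would also need to persist and validate the partial index, which this sketch leaves out.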
My Questions:
I have already installed MDAnalysis 2.10.0 and I'm ready to contribute!
Vaishnavi Gajarla
Northeastern University
https://github.com/1825Vaishnavi