TRex loader #400

Draft
wants to merge 2 commits into
base: main

Conversation


@albiangela albiangela commented Feb 3, 2025

Before submitting a pull request (PR), please read the contributing guide.

Please fill out as much of this template as you can, but if you have any problems or questions, just leave a comment and we will help out :)

Description

Hi!
My name is Angela Albi and I am a postdoc at MPI of Animal Behavior. I just had a chat with @vigji, who introduced me to your repository! I am very excited to start contributing, as our department does a lot of behavioral analysis, in particular of collective behavior.
I work closely with @mooch443, who developed TRex, a tracking software you might have heard of. We are close to releasing a new version, have worked extensively on making it more user-friendly, and are starting to give workshops at some institutions in Germany. I am just now starting to follow your criteria to write a loader function that adds TRex to the list of software you can load data from. In the future, I would also like to talk to you about possibly adding more collective behavior functions, together with @jacobdavidson, a postdoc in Berlin who works on large tracking datasets of honeybees.

As for the loader function, TRex outputs an .npz (or .csv) file with a format like the following:

Keys in the npz file: ['x', 'y']
Shape of array1: (100)
Shape of array2: (100)

Because TRex is often used for multi-animal tracking, I normally list the files inside a folder (using glob), loop through them, append their contents, and convert the result to a pandas DataFrame. This time, I will adapt this logic so that the tracking data can be passed as input to your from_numpy function.
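
A minimal sketch of that folder-based loading, assuming each file holds 1-D x and y arrays of equal length for one individual. The helper name load_xy_per_file and the demo file names are hypothetical, the data is synthetic, and real TRex files may contain more keys than shown here. It keeps the data in numpy arrays rather than a pandas DataFrame:

```python
import glob
import os
import tempfile

import numpy as np


def load_xy_per_file(folder, pattern="*.npz"):
    """Return {file stem: (n_frames, 2) array} for every matching file.

    Hypothetical helper, not part of movement or TRex.
    """
    tracks = {}
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        with np.load(path) as data:
            # Assumption: each file has 1-D "x" and "y" arrays of equal length.
            xy = np.column_stack([data["x"], data["y"]])
        stem = os.path.splitext(os.path.basename(path))[0]
        tracks[stem] = xy
    return tracks


# Synthetic stand-in for an output folder (file names are made up).
demo_dir = tempfile.mkdtemp()
for i in range(3):
    np.savez(
        os.path.join(demo_dir, f"demo_id{i}.npz"),
        x=np.arange(100, dtype=float) + i,
        y=np.arange(100, dtype=float) * 2,
    )

tracks = load_xy_per_file(demo_dir)
print({name: xy.shape for name, xy in tracks.items()})
```

Keying the result by file stem keeps the individual's identity attached to its trajectory, which is useful later when naming individuals in a dataset.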

Would you prefer that I write a more general function that loads multiple files from a folder, which I can then call from the TRex loading function? That might also benefit other formats that store data across multiple files.

In any case, I will test the functions locally now, and I look forward to discussing further how I could contribute these new features to the package.

Best,

Angela

  • Bug fix
  • [x] Addition of a new feature
  • Other

Why is this PR needed?

What does this PR do?

References

Please reference any existing issues/PRs that relate to this PR.

How has this PR been tested?

Please explain how any new code has been tested, and how you have ensured that no existing functionality has changed.

Is this a breaking change?

If this PR breaks any existing functionality, please explain how and why.

Does this PR require an update to the documentation?

If any features have changed or have been added, please explain how the
documentation has been updated.

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality
  • The documentation has been updated to reflect any changes
  • [x] The code has been formatted with pre-commit


sonarqubecloud bot commented Feb 3, 2025


codecov bot commented Feb 3, 2025

Codecov Report

Attention: Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 99.71%. Comparing base (df4b4b1) to head (cdd6ddc).
Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| movement/io/load_poses.py | 50.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #400      +/-   ##
==========================================
- Coverage   99.80%   99.71%   -0.10%     
==========================================
  Files          14       15       +1     
  Lines        1025     1050      +25     
==========================================
+ Hits         1023     1047      +24     
- Misses          2        3       +1     


@niksirbi
Member

niksirbi commented Feb 3, 2025

Hi @albiangela, and welcome to movement!

Thank you for getting in touch and for opening this PR.

Support for TRex

TRex has indeed been on our list of potential formats to support (see issue 308). If we can bring this PR to a successful conclusion, it will be the first "centroid tracking" framework we support. I'm delighted to have you on board, especially since you likely know TRex and its output formats much better than we do!

Adding collective behaviour functions

We’re also very interested in adding functions to support common analyses in the collective behaviour field. That definitely falls within the scope of movement, although we currently lack a contributor with extensive experience in that area. If you and your collaborators are interested in contributing—by opening issues to document specific metrics or analyses, and/or by submitting pull requests—we would be extremely grateful, and we’ll do our best to support you throughout the process.

That said, it makes the most sense to first focus on supporting TRex data in movement and then move on to adding the collective behaviour functions.

How to load TRex data

As you’ve noted, there are currently two ways one could approach loading the TRex output into movement:

  1. Using from_numpy()
    The "quick" approach is to use the from_numpy() function. You (or the end user) would need to reshape the data into (n_frames, n_space, n_keypoints, n_individuals), after which it can be easily converted into a movement dataset. If you go this route, a straightforward way to contribute would be to add an example demonstrating how to convert TRex data into the movement format. We could then host that example on our website.

  2. Writing a dedicated loader
    The more involved (but more durable) approach is to create a dedicated loader function, for example load_poses.from_trex_file(). This function would accept .npz or .csv files as input, validate them, parse them into numpy arrays, and then call from_numpy() to return a movement dataset. From what I see, TRex outputs one file per individual, which suggests it might be useful for this loader to accept a list of file paths (or a directory) so that it can load multi-individual datasets in one go.

    If you decide to pursue this second option, we’ll need to clarify a few details first. For example, according to the TRex docs, the output files contain much more than just centroid positions. Although movement already supports some of these data types (e.g. velocity, acceleration), there’s plenty of additional TRex data we don’t yet accommodate. We’d have to decide what portion to load and what options to present to users.
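
To make the reshaping step concrete, here is a rough sketch of how per-individual centroid tracks could be validated and stacked into the (n_frames, n_space, n_keypoints, n_individuals) layout mentioned above. The function name stack_tracks is hypothetical, the data is synthetic, and it assumes every individual has the same number of frames; how TRex actually represents missing or unequal-length tracks is not handled here.

```python
import numpy as np


def stack_tracks(tracks):
    """Stack {name: (n_frames, 2) array} into (n_frames, 2, 1, n_individuals).

    Hypothetical helper, not part of movement.
    """
    names = list(tracks)
    if not names:
        raise ValueError("no tracks given")
    for name in names:
        if tracks[name].ndim != 2 or tracks[name].shape[1] != 2:
            raise ValueError(f"{name}: expected shape (n_frames, 2)")
    lengths = {tracks[name].shape[0] for name in names}
    if len(lengths) != 1:
        # Assumption: all individuals span the same frames.
        raise ValueError(f"tracks differ in length: {sorted(lengths)}")
    # Individuals go on the last axis: (n_frames, 2, n_individuals).
    stacked = np.stack([tracks[name] for name in names], axis=-1)
    # Add a singleton keypoint axis: one point (the centroid) per animal.
    position = stacked[:, :, np.newaxis, :]
    return position, names


rng = np.random.default_rng(0)
demo = {f"id{i}": rng.random((100, 2)) for i in range(3)}
position, names = stack_tracks(demo)
print(position.shape)  # (100, 2, 1, 3)

# With movement installed, the array could then be handed to from_numpy().
# The keyword names below are from memory and should be checked against the
# movement docs before use:
# from movement.io import load_poses
# ds = load_poses.from_numpy(
#     position_array=position,
#     individual_names=names,
#     keypoint_names=["centroid"],
# )
```

Raising on unequal track lengths, rather than silently padding, keeps the sketch honest about the open question of how frames align across individuals.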

Way forward
My proposal would be to begin with the first option. It’s simpler to tackle, will help you become familiar with our contribution process, and will give us all valuable insights into what can and cannot be converted. After that, we’ll be in a better position to tackle a dedicated TRex loader. Let me know what you think!

PS

Since this PR isn’t yet ready for review (the diff only shows an empty function), I’ve taken the liberty of converting it into a "draft" PR, per our policy.

Depending on how we decide to proceed, feel free to update this PR and mark it "ready to review" later on, or open further PRs as needed. We generally encourage contributors to open draft PRs early—even if the code isn’t quite “camera ready”—so we can give feedback as soon as possible.

If you have any questions at all, don’t hesitate to ask. We’re here to help!

Niko

@niksirbi niksirbi marked this pull request as draft February 3, 2025 14:43
@niksirbi
Member

niksirbi commented Feb 3, 2025

I forgot to mention that @roaldarbol has already implemented a TRex loader in animovement, an R package with very similar scope and data structures to movement. That would probably be a good place to start for implementing a loader/parser.

@roaldarbol

roaldarbol commented Feb 3, 2025

It might be worth following this chat I'm having with @mooch443 about data export formats for TRex v2 - input will be appreciated (at least on my end) as always. :-)

@albiangela out of curiosity, do you have a feel for what the split is of R and Python users across MPI Animal Behaviour/Collective Behaviour for the final analysis of behaviour? Feel free to open a discussion over at animovement to not clutter this PR though. :P

@albiangela
Author

Hi all,
Thanks @niksirbi for helping outline the next steps. I am currently working on preparing an example based on your first suggestion and am happy to proceed with the second option later, using a dedicated TRex loader.

I’m also glad to see a general interest in adding collective behavior functions. I’m interested in collaborating to expand on this, possibly with contributions from some colleagues at our institute.

Happy to be in touch with you, @roaldarbol! I often refer R users to animovement. To answer your question, I am actually not sure but I’d estimate a 50/50 split, though it’s heavily dependent on the department. We can follow up on this!

Angela

@niksirbi
Member

niksirbi commented Feb 3, 2025

Sounds good @albiangela, don't hesitate to ping me whenever you need my input.

@roaldarbol

roaldarbol commented Feb 4, 2025

@albiangela Happy to be in touch too!

> I often refer R users to animovement.

Aaaaarhh, I actually didn't know anyone had noticed the package yet haha! Thanks, I really appreciate it!!! 😄

Good to know, thanks! The exact split doesn't matter, I just wanted to know whether animovement would be filling a niche with you guys - happy to hear that is the case! 😃


3 participants