Replies: 2 comments
-
Hi @asafbenj, thanks for your question. Let me reply inline:
This is used for "aligning" the data across sessions with respect to a label. It seems like you do not want to use this feature, though?
This is possible and can make sense depending on your research question. The only issue might be that there are sharp breaks in the data due to concatenation. If each session is quite long, this might be negligible, though.
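To make the "sharp breaks" point concrete, here is a minimal sketch of the concatenation workaround (the session lengths and feature arrays are made up for illustration). Keeping the session start offsets around makes it easy to locate the seams later, e.g. to check how much of the data sits near a boundary:

```python
import numpy as np

# Hypothetical per-session behavioral feature arrays (time x 36 features).
rng = np.random.default_rng(0)
sessions = [rng.normal(size=(n, 36)) for n in (1000, 1500, 800)]

# Concatenate into one "big session" and record where each session starts,
# so the sharp breaks at the seams can be located later (e.g. for plotting,
# or for excluding sample pairs that straddle a session boundary).
concatenated = np.concatenate(sessions, axis=0)
boundaries = np.cumsum([0] + [len(s) for s in sessions])
```

If each session is long, the fraction of samples near a seam is small, which is exactly why the breaks may be negligible in practice.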
Yes, this can be sensible. At your data size, since this uses CEBRA-Time under the hood rather than CEBRA-Behavior on a quite large dataset, it is expected that this runs much faster.
Happy to discuss more. Could you maybe expand a bit more on the details of your analysis, your analysis goals, etc.? If you are interested in analysing differences over time while obtaining a shared feature space, this approach is perfectly reasonable. I would aim to label the embedding spaces somehow, though --- at this number of points, it is expected that a lot of the space will be covered, and it would make sense to think about ways to visualize the data in an insightful way (or run decoding, etc.). Any additional comments?
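One simple way to label the embedding, sketched below with NumPy (the embedding and session offsets are hypothetical stand-ins): tag every time point with its session of origin, then subsample for plotting, as you already did with every 100th row.

```python
import numpy as np

# Hypothetical embedding of the concatenated data (time x output_dimension)
# plus the session start offsets from the concatenation step.
rng = np.random.default_rng(1)
embedding = rng.normal(size=(3300, 3))
boundaries = np.array([0, 1000, 2500, 3300])

# Label every time point with its session index, then subsample every
# 100th row so a scatter plot stays readable at millions of points.
session_id = np.searchsorted(boundaries, np.arange(len(embedding)),
                             side="right") - 1
sub = slice(None, None, 100)
points, labels = embedding[sub], session_id[sub]
# points/labels can now be passed to e.g. matplotlib's scatter with c=labels.
```

Coloring by mouse or by day instead of by session index works the same way and may be more informative for your question about differences over time.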
-
Regarding the temperature, you might want to consider setting it to even lower values (without the auto-temp mode).
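For intuition on why lower values matter, here is a toy sketch of the temperature's role in an InfoNCE-style contrastive loss (toy similarity values; this is not CEBRA's actual loss implementation): similarities are divided by the temperature before the softmax, so small temperatures make the distribution over negatives much more peaked.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax.
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy similarities: one positive pair vs. two negatives.
sims = np.array([0.9, 0.5, 0.1])

p_high = softmax(sims / 1.0)   # tau = 1.0: moderately peaked
p_low = softmax(sims / 0.1)    # tau = 0.1: sharply peaked
print(f"p(positive) at tau=1.0: {p_high[0]:.3f}")
print(f"p(positive) at tau=0.1: {p_low[0]:.3f}")
```

A sharper distribution penalizes hard negatives more strongly, which can tighten the embedding; too low a temperature, though, can make training unstable, so it is worth scanning a few fixed values.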
-
Hi,
I have a multi-session dataset: multiple mice are recorded, each on multiple days. I'm trying to learn an embedding of the behavior across multiple sessions, but CEBRA seems to require a label for multi-session training. Since the behavioral features are always the same, I figured it might make sense to concatenate the sessions into one big session, although the behavior of the different mice/days differs considerably. This runs smoothly (suspiciously fast, actually, compared to a single-session run), but does it seem sensible? Is there a better workaround? Should I add features encoding some metadata about the sessions? Any other issues I should consider?
The data comes out to ~7M time-points, with 36 features.
Here are the plots for batch_size = 2**12, output_dimension = 35, num_hidden_units = 64, temperature_mode="auto":
and with temperature set to 1:
taking every 100th row of the embedding:
Also, in a previous discussion you said the loss in the auto-temp mode seems weird, and that setting it to 1 would be better. However, if I run a grid search on this, the auto mode would win by far, right? Should I just leave the temperature out of the grid search?
Thanks!