Made writing of data output continuous #4
base: master
Conversation
what's the issue with missing data?
Okay, so to write a csv file we need to know how many columns there should be from the very first row. We cannot add new columns as the analysis progresses; that would require restructuring the whole file. So the first row would have to contain the maximum number of skeletons and joints, which is obviously not always the case. We don't have this problem when writing the csv after the analysis is done: the union of all rows gives us the full set of columns, so the first row can be practically empty and we fill the table later. Furthermore, even if our first row arrives the fullest, we can still end up with data missing. If, for example, our full header looks like this:
Okay, I can kinda get around it for DLC, as it has all the joint names in
Okay, now I understand. Anyway: even if we have no previous bodypart names, the autonaming in calculate_skeleton returns named bodyparts (bp1, bp2, etc.), so we should always get a full set of column names. If it's about empty entries like "x1;y1;;;x3;y3", we should consider using a placeholder. What do you think?
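To illustrate the placeholder idea: if the header is fixed up front from the autonamed bodyparts (bp1, bp2, ...), every incoming row can be padded to the full width, with missing joints written as placeholders. This is only a sketch; the header names, the `pad_row` helper, and the empty-string placeholder are all hypothetical, not part of the actual code.

```python
# Hypothetical fixed header, decided once from the maximum expected
# number of joints (e.g. the bp1, bp2, ... autonames).
HEADER = ["frame", "bp1_x", "bp1_y", "bp2_x", "bp2_y", "bp3_x", "bp3_y"]
PLACEHOLDER = ""  # or "NaN" if downstream tools prefer an explicit marker

def pad_row(frame, joints):
    """Build a full-width row, filling missing joints with a placeholder.

    `joints` maps a bodypart name (e.g. "bp2") to an (x, y) tuple;
    parts absent from the dict come out as placeholders.
    """
    row = [frame]
    for col in HEADER[1:]:
        name, axis = col.rsplit("_", 1)
        point = joints.get(name)
        if point is None:
            row.append(PLACEHOLDER)
        else:
            row.append(point[0] if axis == "x" else point[1])
    return row

# A frame where bp2 was not detected still yields a full-width row:
# pad_row(0, {"bp1": (10, 20), "bp3": (30, 40)})
# -> [0, 10, 20, "", "", 30, 40]
```

With this, an incomplete detection never shifts later columns, so the "x1;y1;;;x3;y3" case stays unambiguous.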
Suggestion: Save data in chunks (what size?) to reduce the open/close hell.
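One possible shape for the chunked approach: buffer rows in memory and append them to the file in batches, so the file is opened once per chunk instead of once per row. This is a hypothetical sketch; the class name and the default chunk size of 100 are illustrative only (the "what size?" question is a tuning knob).

```python
import csv

class ChunkedCsvWriter:
    """Buffer rows in memory and flush them to disk in chunks."""

    def __init__(self, path, header, chunk_size=100, delimiter=";"):
        self.path = path
        self.delimiter = delimiter
        self.chunk_size = chunk_size
        self.buffer = []
        # Write the header once, creating/truncating the file.
        with open(path, "w", newline="") as f:
            csv.writer(f, delimiter=delimiter).writerow(header)

    def add_row(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.chunk_size:
            self.flush()

    def flush(self):
        """Append all buffered rows and clear the buffer."""
        if not self.buffer:
            return
        with open(self.path, "a", newline="") as f:
            csv.writer(f, delimiter=self.delimiter).writerows(self.buffer)
        self.buffer.clear()
```

The trade-off is that up to `chunk_size - 1` rows can be lost if the process dies before a flush, so a final `flush()` on shutdown (or a small chunk size) would be needed.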
Change location of CSV_DELIMITER to advanced settings (keep ";" as default)
Now if
--data-output-enabled
is set, we write the csv file line by line. This is a very crude version that assumes the first row found will be the fullest, and it will produce gibberish on any incomplete row. Knowing the full column layout up front is impossible when going row by row, so we need a way to safeguard against incomplete data, missing joints, and missing animals.
It also repeatedly opens and closes the actual csv file, which is not ideal for performance. Again, when we are dealing with an endless loop,
with open(file)
is pretty much useless. Maybe we need to rethink this idea somehow?
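One way around the open/close-per-row problem is to keep a single file handle open for the entire run and wrap it in a small writer object, so `with` is entered once around the whole loop rather than once per row. A hypothetical sketch (class and method names are invented for illustration):

```python
import csv

class ContinuousCsvWriter:
    """Hold one open file handle for the whole run instead of
    re-entering `with open(...)` on every row."""

    def __init__(self, path, header, delimiter=";"):
        self._file = open(path, "w", newline="")
        self._writer = csv.writer(self._file, delimiter=delimiter)
        self._writer.writerow(header)

    def write_row(self, row):
        self._writer.writerow(row)
        self._file.flush()  # keep data on disk in case the loop is killed

    def close(self):
        self._file.close()

    # Context-manager support: `with` wraps the whole endless loop once.
    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
```

The per-row `flush()` preserves the crash-safety that repeated open/close gave us, while avoiding the open/close syscall cost; dropping it (or flushing every N rows) trades safety for speed.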