Made writing of data output continuous #4
base: master
Conversation
what's the issue with missing data?
Okay, so to write a csv file we need to know how many columns there should be from the very first row. We cannot add new columns as the analysis progresses; that would require restructuring the whole file. So the first row would have to contain the maximum number of skeletons and joints, which is obviously not always the case. We don't have this problem when writing the csv after the analysis is done: the union of all rows gives us the full set of columns, so the first row can be practically empty and we fill the table later. Furthermore, even if our first row arrives the fullest, we can still end up with data missing. If, for example, our full header looks like this:
Okay, I can kinda get around it for DLC, as it has all the joint names in
Okay, now I understand. Anyway: even if we have no previous bodypart names, the autonaming in calculate_skeleton returns named bodyparts (bp1, bp2, etc.), so we should always get a full set of column names. If it's about empty entries like "x1;y1;;;x3;y3", we should consider using a placeholder. What do you think?
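To illustrate the placeholder idea: if the header is fixed up front from the autonamed bodyparts (bp1, bp2, ...), every incoming row can be padded to the full width, with missing joints written as placeholders. This is only a sketch; the header names, the `pad_row` helper, and the empty-string placeholder are all hypothetical, not part of the actual code.

```python
# Hypothetical fixed header, decided once from the maximum expected
# number of joints (e.g. the bp1, bp2, ... autonames).
HEADER = ["frame", "bp1_x", "bp1_y", "bp2_x", "bp2_y", "bp3_x", "bp3_y"]
PLACEHOLDER = ""  # or "NaN" if downstream tools prefer an explicit marker

def pad_row(frame, joints):
    """Build a full-width row, filling missing joints with a placeholder.

    `joints` maps a bodypart name (e.g. "bp2") to an (x, y) tuple;
    parts absent from the dict come out as placeholders.
    """
    row = [frame]
    for col in HEADER[1:]:
        name, axis = col.rsplit("_", 1)
        point = joints.get(name)
        if point is None:
            row.append(PLACEHOLDER)
        else:
            row.append(point[0] if axis == "x" else point[1])
    return row

# A frame where bp2 was not detected still yields a full-width row:
# pad_row(0, {"bp1": (10, 20), "bp3": (30, 40)})
# -> [0, 10, 20, "", "", 30, 40]
```

With this, an incomplete detection never shifts later columns, so the "x1;y1;;;x3;y3" case stays unambiguous.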
Suggestion: Save data in chunks (what size?) to reduce the open/close hell.
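One possible shape for the chunked approach: buffer rows in memory and append them to the file in batches, so the file is opened once per chunk instead of once per row. This is a hypothetical sketch; the class name and the default chunk size of 100 are illustrative only (the "what size?" question is a tuning knob).

```python
import csv

class ChunkedCsvWriter:
    """Buffer rows in memory and flush them to disk in chunks."""

    def __init__(self, path, header, chunk_size=100, delimiter=";"):
        self.path = path
        self.delimiter = delimiter
        self.chunk_size = chunk_size
        self.buffer = []
        # Write the header once, creating/truncating the file.
        with open(path, "w", newline="") as f:
            csv.writer(f, delimiter=delimiter).writerow(header)

    def add_row(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.chunk_size:
            self.flush()

    def flush(self):
        """Append all buffered rows and clear the buffer."""
        if not self.buffer:
            return
        with open(self.path, "a", newline="") as f:
            csv.writer(f, delimiter=self.delimiter).writerows(self.buffer)
        self.buffer.clear()
```

The trade-off is that up to `chunk_size - 1` rows can be lost if the process dies before a flush, so a final `flush()` on shutdown (or a small chunk size) would be needed.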
Change location of CSV_DELIMITER to advanced settings (keep ";" as default)
Now if
--data-output-enabled
is set, we write the csv file line by line. This is a very crude version that assumes the first row found will be the fullest, and it will produce gibberish on any incomplete row. Knowing the full column layout up front is impossible when going row by row, so we need a way to safeguard against incomplete data, missing joints, and missing animals.
It also repeatedly opens and closes the actual csv file, which is not ideal for performance. Again, when we are dealing with an endless loop,
with open(file)
is pretty much useless. Maybe we need to rethink this idea somehow?
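One way around the open/close-per-row problem is to keep a single file handle open for the entire run and wrap it in a small writer object, so `with` is entered once around the whole loop rather than once per row. A hypothetical sketch (class and method names are invented for illustration):

```python
import csv

class ContinuousCsvWriter:
    """Hold one open file handle for the whole run instead of
    re-entering `with open(...)` on every row."""

    def __init__(self, path, header, delimiter=";"):
        self._file = open(path, "w", newline="")
        self._writer = csv.writer(self._file, delimiter=delimiter)
        self._writer.writerow(header)

    def write_row(self, row):
        self._writer.writerow(row)
        self._file.flush()  # keep data on disk in case the loop is killed

    def close(self):
        self._file.close()

    # Context-manager support: `with` wraps the whole endless loop once.
    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
```

The per-row `flush()` preserves the crash-safety that repeated open/close gave us, while avoiding the open/close syscall cost; dropping it (or flushing every N rows) trades safety for speed.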