-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
issue #41 add error layer #45
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this, Juergen! Because this is such a vital part of the code/backbone, I'm scrutinizing it a bit more than normal. I added a few comments to clean up the code for the future.
I also think we need to make a design choice with @maxwellmb and possibly others. Do we want to be able to handle both 2-D and 3-D error formats, or do we require all corgidrp data to have 3-D error formats. I'm in favor of the latter just so we simplify which use cases we need to handle.
corgidrp/data.py
Outdated
|
||
new_err = np.resize(err, tuple(sh)) | ||
#append new error as layer on 3D cube | ||
new_err[sh[0] - 1,:, :] = error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think np.append(self.err, [error], axis=0)
is shorter and makes the code more readable than the several lines above. Also please avoid naming things like err = self.err
and having an input named error
. It makes it difficult to track which variable belongs to which. I would suggest just to use self.err
whenever possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe use input_error
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will look into this. My problem was that np.append only takes arrays with the same dimension, but I try again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the little snippet of code I provided above should work. The [error]
should give the extra 3rd dimension that makes the shapes match.
tests/test_err_dq.py
Outdated
|
||
#test copy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think at this point, it would be useful to break up this giant test into several smaller unit tests: instead of one test_error_dq
, we can have a test_err_dq_creation
, test_copy
, test_add_errors
, test_dataset_err_dq
, test_masked_arr
. It would be cleaner, and also make the tests easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind also comment it up a bit so that its clear what the checks are all meant to be checking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will update that.
corgidrp/data.py
Outdated
#append new error as layer on 3D cube | ||
new_err[sh[0] - 1,:, :] = error | ||
|
||
comb_err = np.sqrt(new_err[0,:,:]**2 + error**2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about this for a bit. There are cases where we're going to divide by something (e.g. a flat field, the K-gain, even the EMGain) with an error on them as well and so when combining things like that we won't be able to manipulate the errors directly like this. How do we want to handle that? Do we need separate functions? How do we track the errors in that case? Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose one way is to have a "Layer_N Type" (or something smaller) header keyword that clarifies it. and if we ever need to re-build the combined_error then we also read the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a very valuable comment. We left that out for the sake of simplicity. I have to think about it. To add a multiplicative error we would need not only the error of the data (e.g. flat) but also the data itself to calculate the error propagation, so we would need more optional parameters. The other way would be to do this directly in the e.g. flatfield step and use this function only for additive errors. We will also have the case where the signal/data is multiplied with a constant (e.g. GAIN, Flux calibration), also the errors should be multiplicated with that constant. What about to have functions for all these cases: add_additive_error, add_multiplicative_error, multiply_error..., just brainstorming. Error propagation is generally a complex thing (JWST pipeline hasn't solve it correctly up to now in my opinion)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like there are possibly a few different things going on:
- If we change the units of the data (e.g., by a scale factor), the errors should scale too. I think the function that does this scaling should obviously also change the units of the errors. If it is just a single scale factor, the error on the scale factor doesn't change the SNR of the data. It merely changes the "photometric calibration" of the units. I think we should keep the error map to track the SNR of the data. And we can use header keywords, or something else to track errors that need to be accounted for in the photometric calibration.
- For the flat field correction, you can imagine each pixel has a different error on the multiplicative factor. However, we can rewrite a multiplicative error as just an error to be added in quadrature (via standard error propagation equations) with no loss in information. Thus, I don't think this requires special handling.
- For something like gain, the error on the gain affects the photon noise calculation. I would say the error on the photon noise error is probably small enough for us to not keep track of, but it certainly could just be an extra slice in the error map.
Overall, it feels like to me we can handle it in our current framework. Additionally, I think these are probably small things that we shouldn't focus on now. I'd rather cross these bridges in the future, after we get the main features of the DRP complete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a good table of error propagation for different functions (the covariances can be neglected) in:
https://en.wikipedia.org/wiki/Propagation_of_uncertainty
I think I just keep the additive function as is for now and we come to more complex things if necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the conversation on this. I agree that we can make it work in the current framework. Users will have to make sure to treat everything correctly as we go forward within the individual functions. Including making sure to multiply/divide the error terms appropriately and add the error terms. With the current framework, we'd have to keep track of that in them primary header's history only, as the error headers aren't accessed via e.g. the update_after_processing_step
function. I think that's probably ok though. @JuergenSchreiber, @JuergenSchreiber any last thoughts on that?
Thanks for reviewing, the question 2 or 3-D, I think now both is possible, but I find it a bit awkward as a users to have a 3D error array on a 2D signal array to initialize it. |
I pushed my corrections of add_error, so please review again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies.. I think I lead you in the wrong direction saying we should only accept a 2-D error array, because I forgot the copy
command uses the same constructor. I think we should actually go in a different direction and make the init() function bigger and accept the err/dq arrays as well (so that the copy command is more natural).
corgidrp/data.py
Outdated
@@ -296,6 +306,8 @@ def copy(self, copy_data=True): | |||
# annoying, but we got to manually update some parameters. Need to keep track of which ones to update | |||
new_img.filename = self.filename | |||
new_img.filedir = self.filedir | |||
new_img.errhd = self.errhd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm ok. I forgot about the fact copy() needs to access the __init__()
function. Perhaps I suggested going in the wrong direction here.
First, I think we should copy() the err/dq headers as well for completeness.
Second, I think given we have to set the err/dq headers anyways, we might as well make the err/dq headers optional arguments into the Image.__init__()
function.
Also, I'm wondering how the copy() code works when you pass in an 3-D error, but __init__
assumes it is 2-D.
Thanks for your comments. I tried to update accordingly, so please check again. err_hdr, dq_hdr is now an optional parameter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making those changes! I think it looks good.
I added a function add_error_term to the Image class, treat 3d error arrays including header update