issue #41 add error layer #45

JuergenSchreiber · 2024-02-19T13:09:02Z

I added a function add_error_term to the Image class, treat 3d error arrays including header update

semaphoreP

Thanks for doing this, Juergen! Because this is such a vital part of the code/backbone, I'm scrutinizing it a bit more than normal. I added a few comments to clean up the code for the future.

I also think we need to make a design choice with @maxwellmb and possibly others. Do we want to be able to handle both 2-D and 3-D error formats, or do we require all corgidrp data to have 3-D error formats. I'm in favor of the latter just so we simplify which use cases we need to handle.

corgidrp/data.py

semaphoreP · 2024-02-20T14:54:07Z

corgidrp/data.py

+
+        new_err = np.resize(err, tuple(sh))
+        #append new error as layer on 3D cube
+        new_err[sh[0] - 1,:, :] = error


I think np.append(self.err, [error], axis=0) is shorter and makes the code more readable than the several lines above. Also please avoid naming things like err = self.err and having an input named error. It makes it difficult to track which variable belongs to which. I would suggest just to use self.err whenever possible.

maybe use input_error?

Ok, I will look into this. My problem was that np.append only takes arrays with the same dimension, but I try again.

I think the little snippet of code I provided above should work. The [error] should give the extra 3rd dimension that makes the shapes match.

semaphoreP · 2024-02-20T14:58:24Z

tests/test_err_dq.py


+    #test copy


I think at this point, it would be useful to break up this giant test into several smaller unit tests: instead of one test_error_dq, we can have a test_err_dq_creation, test_copy, test_add_errors, test_dataset_err_dq, test_masked_arr. It would be cleaner, and also make the tests easier to read.

Would you mind also comment it up a bit so that its clear what the checks are all meant to be checking?

Ok, I will update that.

maxwellmb · 2024-02-21T03:30:09Z

corgidrp/data.py

+        #append new error as layer on 3D cube
+        new_err[sh[0] - 1,:, :] = error
+
+        comb_err = np.sqrt(new_err[0,:,:]**2 + error**2)


I've been thinking about this for a bit. There are cases where we're going to divide by something (e.g. a flat field, the K-gain, even the EMGain) with an error on them as well and so when combining things like that we won't be able to manipulate the errors directly like this. How do we want to handle that? Do we need separate functions? How do we track the errors in that case? Thoughts?

I suppose one way is to have a "Layer_N Type" (or something smaller) header keyword that clarifies it. and if we ever need to re-build the combined_error then we also read the type.

That is a very valuable comment. We left that out for the sake of simplicity. I have to think about it. To add a multiplicative error we would need not only the error of the data (e.g. flat) but also the data itself to calculate the error propagation, so we would need more optional parameters. The other way would be to do this directly in the e.g. flatfield step and use this function only for additive errors. We will also have the case where the signal/data is multiplied with a constant (e.g. GAIN, Flux calibration), also the errors should be multiplicated with that constant. What about to have functions for all these cases: add_additive_error, add_multiplicative_error, multiply_error..., just brainstorming. Error propagation is generally a complex thing (JWST pipeline hasn't solve it correctly up to now in my opinion)

It sounds like there are possibly a few different things going on:

If we change the units of the data (e.g., by a scale factor), the errors should scale too. I think the function that does this scaling should obviously also change the units of the errors. If it is just a single scale factor, the error on the scale factor doesn't change the SNR of the data. It merely changes the "photometric calibration" of the units. I think we should keep the error map to track the SNR of the data. And we can use header keywords, or something else to track errors that need to be accounted for in the photometric calibration.

For the flat field correction, you can imagine each pixel has a different error on the multiplicative factor. However, we can rewrite a multiplicative error as just an error to be added in quadrature (via standard error propagation equations) with no loss in information. Thus, I don't think this requires special handling.

For something like gain, the error on the gain affects the photon noise calculation. I would say the error on the photon noise error is probably small enough for us to not keep track of, but it certainly could just be an extra slice in the error map.

Overall, it feels like to me we can handle it in our current framework. Additionally, I think these are probably small things that we shouldn't focus on now. I'd rather cross these bridges in the future, after we get the main features of the DRP complete.

There is a good table of error propagation for different functions (the covariances can be neglected) in:
https://en.wikipedia.org/wiki/Propagation_of_uncertainty
I think I just keep the additive function as is for now and we come to more complex things if necessary

Thanks for the conversation on this. I agree that we can make it work in the current framework. Users will have to make sure to treat everything correctly as we go forward within the individual functions. Including making sure to multiply/divide the error terms appropriately and add the error terms. With the current framework, we'd have to keep track of that in them primary header's history only, as the error headers aren't accessed via e.g. the update_after_processing_step function. I think that's probably ok though. @JuergenSchreiber, @JuergenSchreiber any last thoughts on that?

corgidrp/data.py

JuergenSchreiber · 2024-02-21T13:50:06Z

Thanks for doing this, Juergen! Because this is such a vital part of the code/backbone, I'm scrutinizing it a bit more than normal. I added a few comments to clean up the code for the future.

I also think we need to make a design choice with @maxwellmb and possibly others. Do we want to be able to handle both 2-D and 3-D error formats, or do we require all corgidrp data to have 3-D error formats. I'm in favor of the latter just so we simplify which use cases we need to handle.

Thanks for reviewing, the question 2 or 3-D, I think now both is possible, but I find it a bit awkward as a users to have a 3D error array on a 2D signal array to initialize it.

JuergenSchreiber · 2024-02-23T14:05:04Z

I pushed my corrections of add_error, so please review again.
I kept add_error_term() only for additive errors for now, I think we can taggle the other sorts of error propagation as needed.
A bit annoying is that I had to keep the possibility of a 3d err input due to the copying functionality. But I could change this and do image.err = new_err in the copy function. What do you think.

semaphoreP

Apologies.. I think I lead you in the wrong direction saying we should only accept a 2-D error array, because I forgot the copy command uses the same constructor. I think we should actually go in a different direction and make the init() function bigger and accept the err/dq arrays as well (so that the copy command is more natural).

semaphoreP · 2024-02-23T22:11:21Z

corgidrp/data.py

@@ -296,6 +306,8 @@ def copy(self, copy_data=True):
        # annoying, but we got to manually update some parameters. Need to keep track of which ones to update
        new_img.filename = self.filename
        new_img.filedir = self.filedir
+        new_img.errhd = self.errhd


Hm ok. I forgot about the fact copy() needs to access the __init__() function. Perhaps I suggested going in the wrong direction here.

First, I think we should copy() the err/dq headers as well for completeness.
Second, I think given we have to set the err/dq headers anyways, we might as well make the err/dq headers optional arguments into the Image.__init__() function.

Also, I'm wondering how the copy() code works when you pass in an 3-D error, but __init__ assumes it is 2-D.

tests/test_err_dq.py

JuergenSchreiber · 2024-02-29T10:17:34Z

Thanks for your comments. I tried to update accordingly, so please check again. err_hdr, dq_hdr is now an optional parameter.

semaphoreP

Thanks for making those changes! I think it looks good.

JuergenSchreiber added 2 commits February 19, 2024 13:57

new branch add_error for issue #41

3e867c4

corrected test

26b052c

JuergenSchreiber assigned JuergenSchreiber and semaphoreP and unassigned semaphoreP and JuergenSchreiber Feb 19, 2024

semaphoreP self-requested a review February 20, 2024 14:37

semaphoreP reviewed Feb 20, 2024

View reviewed changes

maxwellmb reviewed Feb 21, 2024

View reviewed changes

corgidrp/data.py Show resolved Hide resolved

corrections of pull request based on comments

0805508

semaphoreP reviewed Feb 26, 2024

View reviewed changes

update according to recommendations

154bfb9

Merge branch 'main' into add_error

85dac99

semaphoreP approved these changes Mar 4, 2024

View reviewed changes

semaphoreP merged commit d517d26 into main Mar 4, 2024
1 check passed

semaphoreP deleted the add_error branch March 4, 2024 05:11

semaphoreP mentioned this pull request Mar 4, 2024

Update Image Class to track each individual error term #41

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

issue #41 add error layer #45

issue #41 add error layer #45

JuergenSchreiber commented Feb 19, 2024

semaphoreP left a comment

semaphoreP Feb 20, 2024

maxwellmb Feb 21, 2024

JuergenSchreiber Feb 21, 2024

semaphoreP Feb 22, 2024

semaphoreP Feb 20, 2024

maxwellmb Feb 21, 2024

JuergenSchreiber Feb 21, 2024

maxwellmb Feb 21, 2024

maxwellmb Feb 21, 2024

JuergenSchreiber Feb 21, 2024 •

edited

Loading

semaphoreP Feb 22, 2024

JuergenSchreiber Feb 23, 2024

maxwellmb Feb 23, 2024

JuergenSchreiber commented Feb 21, 2024

JuergenSchreiber commented Feb 23, 2024

semaphoreP left a comment

semaphoreP Feb 23, 2024

JuergenSchreiber commented Feb 29, 2024

semaphoreP left a comment

issue #41 add error layer #45

issue #41 add error layer #45

Conversation

JuergenSchreiber commented Feb 19, 2024

semaphoreP left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JuergenSchreiber Feb 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JuergenSchreiber commented Feb 21, 2024

JuergenSchreiber commented Feb 23, 2024

semaphoreP left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JuergenSchreiber commented Feb 29, 2024

semaphoreP left a comment

Choose a reason for hiding this comment

JuergenSchreiber Feb 21, 2024 •

edited

Loading