Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggested changes to JOSS paper #17

Closed
rtmills opened this issue Jun 15, 2023 · 1 comment
Closed

Suggested changes to JOSS paper #17

rtmills opened this issue Jun 15, 2023 · 1 comment

Comments

@rtmills
Copy link

rtmills commented Jun 15, 2023

DPFEHM authors: Thank you for both making this code publicly available and for submitting a description of it to JOSS. This is very nice work! I sincerely apologize for being months behind on reviewing it.

It appears that many concerns that I might have had with the original submission have already been addressed by the other referees and the DPFEHM authors. The only real concerns I have now have to do with the JOSS paper itself (the PDF "paper", I mean). I think that it is find, overall, but it makes a few statements that I think need to be softened or qualified, or that need some additional context or information:

Item 1: The paper begins by naming some subsurface flow and transport codes and then makes the statement that "...these models cannot be efficiently integrated with machine learning frameworks. They cannot because they do not support automatic differentiation...". While it is certainly easiest and ideal to have automatic differentiation for such integration, I disagree that it is not possible to use such models in a machine learning framework; I point to a recent paper by Satish Karra et al. (https://arxiv.org/pdf/2109.03956.pdf) where PFLOTRAN is integrated -- with at least enough efficiency to be practicable -- into a machine learning workflow. The efficiency could be further improved by using a discrete adjoint approach. Although this is not done in the Karra et al. paper, I am fairly confident that such an approach could be implemented without any heroic coding effort in PFLOTRAN by using the TSAdjoint component of PETSc (https://arxiv.org/abs/1912.07696), since PFLOTRAN already has some support for the PETSc TS timesteppers and because Jacobians for some relevant adjoint systems are already coded. Given the need to code up the adjoint Jacobians, though, a nicely working automatic differentiation system (like you have with DPFEHM) seems like the ideal solution. But I suggest revising the language a bit to state that there are difficulties with using machine learning workflows with the aforementioned codes, rather than making the statement that it is impossible.

Item 2: The paper mentions other subsurface flow/transport models, but does not make it clear (for those who are less familiar with subsurface science models) that DPFEHM serves a different purpose and has different target usage than a code like PFLOTRAN or FEHM (or CrunchFlow, or TOUGH2, or several other codes). These codes support a very diverse set of features and process models that are not found in DPFEHM, which aims to support a much leaner set of features, but also opens up several interesting use cases by virtue of its differentiability and its full integration with the Julia ecosystem. It would be helpful to have just a sentence or two to make this clear.

Item 3: The paper says "DPFEHM uses a two-point flux approximation finite volume scheme. This means that an orthogonal grid is required to ensure convergence, similar to other codes such as FEHM and PFLOTRAN Alternative codes such as Amanzi-ATS (Mercer-Smith, 2020), which use a more advanced mimetic finite difference discretization, do not require orthogonal meshes". This is not an incorrect statement, but is somewhat misleading insofar as it only mentions MFD discretizations as ways to deal with non-orthogonal (I would suggest saying "non-K-orthogonal", actually) grids. There are, however, many approaches that have been developed to deal with such meshes, including multi-point flux approximations, least-squares gradient reconstruction, many different mixed finite element approaches, etc. It should be made clear that MFD approaches are not the only way to deal with such grids. I also suggest noting that virtually all commercial reservoir simulator codes use the two-point flux approximation, and that there are some good reasons that they do so, choosing simplicity and low computational expense over more robust discretizations such as MFD that involve higher complexity of implementation and greater computational expense. Indeed, PFLOTRAN used to support MFD, but this support was dropped since the cost of maintaining the MFD code was hard to justify when users found that other approaches (generally just working to get reasonably close to K-orthogonal grids) were satisfactory for accomplishing their science or engineering goals.

Item 4: This really is linked to item 1. In the second to last sentence, the paper states that "For non-differentiable models, the cost is equal to performing a number of simulations that is proportional to the number of parameters". This is not true for codes that can use the discrete adjoint solvers. (Though it is true that implementing support for this requires some developer work.)

I hope that the authors of the paper do not interpret my verbose comments as being overly critical of this work. I think this work is very nice, actually; I just want to ensure that the accompanying paper does not over-generalize or mislead readers.

I note one minor typographical issue:

  • There is a period missing between these sentences: "This means that an orthogonal grid is required to ensure convergence, similar to other codes such as FEHM and PFLOTRAN Alternative codes such as Amanzi-ATS (Mercer-Smith, 2020), which use a more advanced mimetic finite difference discretization, do not require orthogonal meshes"

Refs openjournals/joss-reviews#4560

@omalled
Copy link
Member

omalled commented Jun 17, 2023

Thanks, @rtmills! I appreciate your feedback and help in improving the paper.

Item 1: You are correct that the statement was too strong, and I have softened the text. Instead of saying it "cannot" be done, the revised version says it is challenging for high-dimensional problems where one must implement Jacobians to enable reverse-mode (adjoint-based) differentiation. Based on my understanding of the AdjointNet paper (I have discussed these topics with Satish extensively), it utilizes the (non-adjoint) perturbative approach described at the end of page 4 and the beginning of page 5. All the problems they solve have 1 or 2 parameters, so the adjoint method would provide little or no advantage over the perturbative approach in that context.

Item 2: This is a good point; I clarified this at the end of the second paragraph of the statement of need.

Item 3: Thank you for pointing this out. I am not familiar with all these methods, but it is good to mention more possibilities. The text has been modified to reflect this.

Item 4: You are correct that this was overstated. It has been modified to say that "for models that lack discrete adjoints, which covers most non-differentiable models, the cost is...".

Thanks again for your help!

@omalled omalled closed this as completed Jun 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants