@climbfuji climbfuji commented Dec 8, 2025

Initial configuration of Codee Fortran formatter with examples (src/*.F90) - for discussion

This PR adds an initial .codee-format configuration file for the free Codee Fortran formatter and a temporary script run_codee_tmp.sh to get started with the tool (note that the day-to-day usage will be much simpler than in the script). The PR also demonstrates the effect of the formatting rules for the four Fortran source files in src/.

The goal of this PR is to provide the necessary information to decide on the formatting rules we want to use. Once we have agreed on the format, we can update the src/*.F90 files as needed and merge this PR. The integration with GitHub Actions and the simplified command-line usage will come after that. In a third PR, we will work on a tighter integration of the Codee format config with the capgen Fortran writer to make sure that the auto-generated code also complies with the formatting rules.

The majority of the diffs arise because NEPTUNE (and UFS) use two spaces for indentation, whereas the files in src/ currently use three. We can change the Codee config, but every space we save makes the lines shorter. To see the remaining differences, please select "Hide whitespace" when looking at the files changed.

User interface changes?: No

Working toward #703

Testing: No changes to the tests - GitHub actions CI tests passed
test removed:
unit tests:
system tests:
manual testing:

ColumnLimit: 120
CommentDirectivePrefixes: []
DisabledDirectivePrefixes: []
IndentSize: 2

Collaborator Author

@climbfuji climbfuji Dec 8, 2025

To be discussed

NEPTUNE and UFS use 2 spaces for indentation. The CCPP framework code in src uses 3 at the moment. This particular change is responsible for most of the differences in this PR.

Collaborator

We currently have issues with lines that are too long, so while this will not help a huge amount, it's at least a step in the right direction.
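
As a rough illustration of how much the narrower indent buys, the sketch below counts the columns left for code at a few nesting depths under the 120-column limit from the config excerpt above. The helper name `usable_columns` is made up for this example, and continuation-line indentation is ignored.

```python
COLUMN_LIMIT = 120  # ColumnLimit from the .codee-format excerpt above


def usable_columns(depth: int, indent_size: int, limit: int = COLUMN_LIMIT) -> int:
    """Columns left for code after indenting `depth` nesting levels."""
    return limit - depth * indent_size


for depth in (1, 3, 5):
    three = usable_columns(depth, 3)  # current src/ style
    two = usable_columns(depth, 2)    # NEPTUNE/UFS style
    print(f"depth {depth}: {three} -> {two} columns (+{two - three})")
```

The gain is one column per nesting level, which is consistent with "not a huge amount".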

FirstLineFit: FitIfPossible
BreakBeforeBinaryOperators: true
Casing:
  Identifiers: Lowercase # Preserve

Collaborator Author

NEPTUNE uses "Lowercase"; "Preserve" is the Codee default.

Member

Six of one, half a dozen of the other in the UFS.
My OCD is suggesting that we should adopt a convention and apply it across the UFS, but "Preserve" is fine for now.

Collaborator

I'm fine with whatever, so perhaps lowercase is best to appease NEPTUNE?

Collaborator Author

@climbfuji climbfuji Dec 15, 2025

Lowercase is nice, but whatever works for SIMA and the UFS will do. Within NEPTUNE, we had to make an exception for a nuopc subdirectory, because esmf/nuopc uses CamelCase extensively, and the notoriously long names in esmf code become unreadable unless you add underscores. Long story short, no strong preference here.

Collaborator

Is this just for the Fortran bits of the framework? If so, we can just make sure it is readable in lowercase.

Collaborator Author

Yes, only Fortran.

FixedFormLabelAlignment: Right
ContinuationIndentSize: DoubleIndentSize
DoubleColonSeparator: AddAlways
EndOfLineNormalization: Unix # Autodetect

Collaborator Author

"Unix" is NEPTUNE; "Autodetect" is the Codee default.

Member

I looked at the documentation and don't fully understand this one.
(Maybe you or someone could explain (slowly) to me at our meeting tomorrow?)

Collaborator Author

I need to look this up, but as far as we all are concerned, Unix should be the correct choice.

Collaborator Author

I don't actually know if I've ever seen a file that mixes line endings (the case described in the Codee documentation). With "Autodetect", the formatter applies the first line ending it finds to the entire file. We should definitely use "Unix" here so that all line endings, including Windows line endings in entire files or parts of files, are converted to Unix.
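
To make the difference concrete, here is a small Python sketch of the two behaviors as described in this comment. It is not Codee's implementation: the function names are hypothetical, and the "Autodetect" logic simply follows the description above (the first line ending found is applied everywhere).

```python
import re

LINE_ENDING = re.compile(r"\r\n|\r|\n")


def normalize_unix(text: str) -> str:
    """Convert every CRLF or lone CR line ending to LF."""
    return LINE_ENDING.sub("\n", text)


def normalize_autodetect(text: str) -> str:
    """Apply the first line ending found in the text to every line."""
    match = LINE_ENDING.search(text)
    if match is None:
        return text  # no line endings at all, nothing to normalize
    first = match.group(0)
    return LINE_ENDING.sub(lambda _m: first, text)


# A file whose first line ends in CRLF and whose other lines end in LF.
mixed = "module m\r\nimplicit none\nend module m\n"
print(repr(normalize_unix(mixed)))
print(repr(normalize_autodetect(mixed)))
```

In this sketch a single leading Windows line ending is enough for the "Autodetect" variant to convert the whole file to CRLF, whereas the "Unix" variant always ends up with LF only.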

Collaborator

Unix please!

LeftParenthesisKeyword: OnlyLeading
RightParenthesisExpression: NoLeading
RightParenthesisGeneric: NoLeading
RightParenthesisKeyword: OnlyTrailing # NoLeading

Collaborator Author

"OnlyTrailing" is NEPTUNE; "NoLeading" is the Codee default.

Member

As expected, UFS is all over the place wrt "SpacesAroundOperators".
(I'm super impressed that NEPTUNE is following a standard!)

I do love the idea of adopting this (more granular?) level of code formatting.

Collaborator

No strong feeling but squeezing spaces makes lines shorter.

"src/ccpp_types.F90:free"
)

for entry in "${files[@]}"; do

Collaborator Author

@climbfuji climbfuji Dec 8, 2025

This is overly complicated and not really needed for ccpp-framework. It is just for initial testing and demonstration purposes. This file will be removed either before this PR is merged or, at the latest, once Codee is integrated into GitHub Actions.

Member

Is the idea that we will pass any file modified in the current PR through the formatter as part of the PR?

Collaborator Author

My suggestion is to format all existing Fortran files in the repository as part of the PR. As a next step, implement GitHub Actions. As a third step, integrate with capgen and make sure that the auto-generated code also complies with the formatting rules.

@climbfuji climbfuji self-assigned this Dec 8, 2025
@climbfuji climbfuji marked this pull request as ready for review December 8, 2025 20:23
# requested 2025/12/08 using the Codee Online Form
IndentExceptions:
  ModuleContains: IndentBeforeAndAfter
  Comments: IndentIfAlreadyIndented # Indent

Collaborator Author

NEPTUNE is "Indent", here I am using "IndentIfAlreadyIndented"

Also note the comment above for ModuleContains exceptions and the missing support for TypeContains/FunctionContains etc exceptions.

Member

Why? I'd prefer "Indent" over "IndentIfAlreadyIndented".

[Screenshot 2025-12-10 at 10:49:48 AM] vs. [Screenshot 2025-12-10 at 10:50:43 AM]

Collaborator Author

It's because of the CCPP metadata hooks in the Fortran code that are typically not indented. I think capgen's parser is fine if they get indented, though.

Collaborator

I'm for consistency so I prefer 'Indent'.

Member

@dustinswales dustinswales left a comment

@climbfuji Thanks for incorporating the code formatter here!
I love the idea, we just need to decide on a few details and incorporate this into our workflows.
As a GitHub Action, the formatter could be run either diagnostically (report format violations) or prognostically (fix format violations).
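
The two modes could look roughly like the sketch below. It is only an illustration: `stand_in_format` is a placeholder that strips trailing whitespace, not the Codee formatter, and `run_formatter` is a hypothetical helper name. A CI job in diagnostic mode would fail when the returned list is non-empty; in prognostic mode it would rewrite the files instead.

```python
import tempfile
from pathlib import Path


def stand_in_format(text: str) -> str:
    """Placeholder formatter (NOT Codee): strip trailing whitespace per line."""
    return "".join(line.rstrip() + "\n" for line in text.splitlines())


def run_formatter(paths, fix=False, fmt=stand_in_format):
    """Return the files whose formatted text differs; rewrite them if fix=True."""
    violations = []
    for path in map(Path, paths):
        original = path.read_text()
        formatted = fmt(original)
        if formatted != original:
            violations.append(path.name)
            if fix:
                path.write_text(formatted)
    return violations


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "example.F90"
    src.write_text("module example   \nend module example\n")

    reported = run_formatter([src])          # diagnostic: report only
    unchanged = src.read_text().startswith("module example   ")
    fixed = run_formatter([src], fix=True)   # prognostic: rewrite in place
    clean = run_formatter([src])             # nothing left to report

print(reported, unchanged, fixed, clean)
```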

#!/usr/bin/env bash

files=(
"src/ccpp_constituent_prop_mod.F90:free"

Member

@climbfuji Is the "free" suffix here meant to select the "free formatted" version of Codee?

Collaborator Author

No. I previously had some logic in that script that used this suffix; it's no longer there because it's not needed.

Since the script won't be merged, I didn't bother taking this out.

Comment on lines +95 to +97
Comma: OnlyTrailing
Concat: Both
DoubleColon: Both

Collaborator

Can we remove the enforcement of Comma and DoubleColon whitespace?

We do this kind of thing (lining up intents, function arguments, etc) in SIMA quite often (which we prefer for ease of readability):

subroutine whatever(argument1, arg2, a3)
  integer,         intent(in) :: argument1
  real(kind_phys), intent(in) :: arg2
  integer,        intent(out) :: a3

Collaborator Author

There is a very good reason not to line up intents: if the longest line changes, then every single line has to change, even though there are no code changes. This increases the possibility of merge conflicts and makes changes and pull requests unnecessarily long. This is why the Codee default is a single space, and why we've adopted it in NEPTUNE.
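
The churn is easy to quantify with a small Python sketch. It renders the same declaration list with and without alignment, appends one declaration whose attribute list is longer than the rest, and counts the lines a diff would touch. The `errmsg` argument and the helper names are hypothetical, and the alignment here pads the whole attribute list rather than reproducing any particular house style.

```python
import difflib


def render(decls, align):
    """Format (attributes, name) pairs as Fortran declarations."""
    width = max(len(attrs) for attrs, _ in decls) if align else 0
    return [f"{attrs.ljust(width)} :: {name}" for attrs, name in decls]


def changed_lines(before, after):
    """Count lines a line-based diff would remove plus lines it would add."""
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    return sum(
        (i2 - i1) + (j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


old = [
    ("integer, intent(in)", "argument1"),
    ("real(kind_phys), intent(in)", "arg2"),
    ("integer, intent(out)", "a3"),
]
# Hypothetical new argument with a longer attribute list than any existing one.
new = old + [("character(len=512), intent(out)", "errmsg")]

aligned = changed_lines(render(old, True), render(new, True))
single = changed_lines(render(old, False), render(new, False))
print(f"aligned: {aligned} changed lines, single space: {single}")
```

With these inputs the aligned layout touches all three existing declarations plus the new one, while the single-space layout only adds one line.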

Collaborator

@dustinswales @gold2718 do you have a preference on this?

Collaborator

While I am sympathetic to the argument that the one-space rule minimizes potential merge conflicts and keeps PRs simple (one of the goals of these tools), at the end of the day the code exists for its human users, and they need to be able to read it to understand it. I know that at NCAR, the users (scientists and RSEs) strongly preferred lining up the colons to make it easy to pick out the variables. Who is the customer here?

Collaborator Author

We talked about this today in the meeting; we agreed that aligning the colons doesn't need to be done for the auto-generated code; it's enough to do this for the few handwritten Fortran files in the repository. Thus, we just "preserve" the whitespace on a per-file basis.

And importantly, host models can do whatever they want for the rest of their code ...
