Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STY: manual fixes for newly flagged violations of UP031 #5064

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

neutrinoceros
Copy link
Member

@neutrinoceros neutrinoceros commented Nov 23, 2024

PR Summary

ruff 0.8.0 is just out and this version flags all violations to rule UP031, including the ones it cannot autofix.
In preparation for the auto upgrade, I'm fixing all these by hand.
At the time of opening, I only fixed 62/162 errors, but I'd like to run CI at this point to see if I've already introduced a mistake that the test suite would catch.

useful resources for reviewers:

PR Checklist

  • New features are documented, with docstrings and narrative docs
  • Adds a test for any bugs fixed. Adds tests for new features.

@neutrinoceros neutrinoceros added code style Related to linting tools refactor improve readability, maintainability, modularity labels Nov 23, 2024
@neutrinoceros neutrinoceros force-pushed the sty/UP031_manual_fixing branch from c86c4b0 to 1c6b552 Compare November 23, 2024 12:41
@@ -1497,8 +1495,8 @@ def __init__(self, ds, dataset_type="boxlib_native"):
for key, val in self.warpx_header.data.items():
if key.startswith("species_"):
i = int(key.split("_")[-1])
charge_name = "particle%.1d_charge" % i
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a weird one. As far as I understand %.1d is accepted but meaningless, and simply equivalent to %d.

@@ -320,7 +320,7 @@ def _parse_parameter_file(self):
self.parameters["wspecies"] = wspecies[:n]
self.parameters["lspecies"] = lspecies[:n]
for specie in range(n):
self.particle_types.append("specie%i" % specie)
self.particle_types.append(f"specie{specie}")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe in English, 'species' is actually invariable, but I'm keeping bug-for-bug compatibility in this pure refactor

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the etymology rabbit hole for "specie" is kinda fun. I could be convinced that it's more appropriate to use specie for particle types in a numerical simulation: maybe having a variety of particle types is more similar to minting different types of coins than it is to variations in taxonomical classification. But most likely species was meant here :)

@@ -27,7 +27,7 @@ def write_record(fp, data, endian):

def write_block(fp, data, endian, fmt, block_id):
assert fmt in [1, 2]
block_id = "%-4s" % block_id
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the first (and so far only) time I've ever seen this format specifier.
for reference, let me remind reviewers of:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for this file, refactors are quite involved, so I would like to request a review from @cphyc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(in particular, I introduced some calls to os.path.join where it looked intended, but I'm not completely sure that's the case)

basename = "%s/%%s_%s.out%05i" % (basedir, num, domain_id)
part_file_descriptor = f"{basedir}/part_file_descriptor.txt"
num = ds.basename.split(".")[0].split("_")[1]
basename = os.path.join(ds.directory, f"%s_{num}.out{domain_id:05}")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm being 100% backward compatible here, but I'm not completely sure if the leftover %s is intentional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is - it is formatted later with a % to match files named e.g. output_00123/hydro_XXXXX.outYYYYY, part_XXXXX.outYYYYY, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks !

@@ -450,7 +451,7 @@ def write_to_gdf(self, fn, grid):

## --------- Store Grid Data --------- ##

g0 = data_g.create_group("grid_%010i" % 0)
g0 = data_g.create_group("grid_{0:010}")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one is a little suprising perharps. Effectively this is equivalent to hardcoding '0000000000' or, perharps more clearly '0'*10. What's disturbing is that the first 0 is sort of meaningless: one would get the same result from f"{'':010}"

@neutrinoceros neutrinoceros force-pushed the sty/UP031_manual_fixing branch from 1c6b552 to 1846185 Compare November 23, 2024 14:56
@neutrinoceros neutrinoceros force-pushed the sty/UP031_manual_fixing branch from 1846185 to 0988a0e Compare November 23, 2024 15:02
@neutrinoceros
Copy link
Member Author

Alright. This took me... 3 hours (!?) but this time includes a pass of self-review were I was able to catch a few (hopefully most, or even all) of my mistakes. The failure on macOS is unrelated (see #5065), so this should now be ready for review.

@neutrinoceros neutrinoceros marked this pull request as ready for review November 23, 2024 15:05
Copy link
Contributor

@chrishavlin chrishavlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made it about halfway and need a break :) couple of questions so far.

@@ -335,7 +335,7 @@ def trajectory_from_index(self, index):
"""
mask = np.isin(self.indices, (index,), assume_unique=True)
if not np.any(mask):
print("The particle index %d is not in the list!" % (index))
print(f"The particle index {index} is not in the list!")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know you're going for a 1:1 refactor here... but what about moving this string to the actual IndexError below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, but I'd prefer to do it in a follow up PR. I'm worried about making a breaking change for anyone and unintentionally hiding it behind a giant refactor.

yt/frontends/athena/data_structures.py Outdated Show resolved Hide resolved
yt/frontends/enzo/data_structures.py Outdated Show resolved Hide resolved
@neutrinoceros neutrinoceros marked this pull request as draft November 25, 2024 18:31
Copy link
Member

@cphyc cphyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, you have replaced all occurences of %d and %i (which are equivalent) with f"{whatever}. This isn't 100% equivalent, see for example this example:

a = 10
print("a=%i" % a, f"a={a}")  # a=10 a=10

v = 10.
print("v=%i" % v, f"v={v}")  # v=10 v=10.

I know most of the values being formatted are actual ints, but if not we have diverging implementations.

Note that formatting with f"{x:d}" will raise an exception if x is not of type int...

@@ -2612,7 +2612,7 @@ def _export_ply(
)
else:
v = np.empty(self.vertices.shape[1], dtype=vs[:3])
line = "element face %i\n" % (nv / 3)
line = f"element face {nv/3}\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be explicitly specified as follows?

Suggested change
line = f"element face {nv/3}\n"
line = f"element face {nv/3:.0f}\n"

Or

Suggested change
line = f"element face {nv/3}\n"
line = f"element face {int(nv/3)}\n"

nv/3 may indeed be a float rather than an int.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch. I think your second suggestion is more in line with the original. Thanks !

@@ -253,7 +253,7 @@ def save_as_dataset(self, filename=None, fields=None):
"""

ds = self.data.ds
keyword = "%s_clump_%d" % (str(ds), self.clump_id)
keyword = f"{ds}_clump_{self.clump_id}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, isn't f"{self.clump_id}" != "%d" % self.clump_id if the latter isn't an int?

basename = "%s/%%s_%s.out%05i" % (basedir, num, domain_id)
part_file_descriptor = f"{basedir}/part_file_descriptor.txt"
num = ds.basename.split(".")[0].split("_")[1]
basename = os.path.join(ds.directory, f"%s_{num}.out{domain_id:05}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is - it is formatted later with a % to match files named e.g. output_00123/hydro_XXXXX.outYYYYY, part_XXXXX.outYYYYY, etc.

@neutrinoceros
Copy link
Member Author

neutrinoceros commented Nov 26, 2024

Note that formatting with f"{x:d}" will raise an exception if x is not of type int...

yeah that's why I didn't try to preserve these format specifiers... If you'd like to give a try to re-introducing them, we can probably co-write this PR, but as far as I'm concerned I already spent much longer than I wanted to here so I don't fancy trying it out myself.

Maybe forcing conversion to int in these occurences would be much more robust, but that's still more than I can chew.

@neutrinoceros
Copy link
Member Author

current reviews taken into account, let me undraft now !
I pushed revisions as a fixup commit: I intend to squash this branch ahead of merging but I don't want to make reviewing harder than it needs to in the mean time.

@neutrinoceros neutrinoceros marked this pull request as ready for review November 26, 2024 08:45
raise ValueError(
f"Grid {grid.id}, Child {c.id}, "
f"Grid->EdgeL {grid.LeftEdge[d]:14.7e}, "
f"Children->EdgeL {c.LeftEdge:14.7e}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed an index here:

Suggested change
f"Children->EdgeL {c.LeftEdge:14.7e}"
f"Children->EdgeL {c.LeftEdge[d]:14.7e}"

raise ValueError(
f"Grid {grid.id}, Child {c.id}, "
f"Grid->EdgeR {grid.RightEdge[d]:14.7e}, "
f"Children->EdgeR {c.RightEdge:14.7e}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
f"Children->EdgeR {c.RightEdge:14.7e}"
f"Children->EdgeR {c.RightEdge[d]:14.7e}"

@@ -137,7 +137,7 @@ def humanize_time(secs):
"""
mins, secs = divmod(secs, 60)
hours, mins = divmod(mins, 60)
return "%02d:%02d:%02d" % (hours, mins, secs)
return ":".join(f"{t:02}" for t in (hours, mins, secs))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is one case where i vastly prefer the older string formatting syntax :)

but more importantly, if the incoming secs is a float (which it probably is? hard to say, this function doesn't seem to be used anywhere in yt...), the formatting differs significantly, should cast to float to keep the behavior identical:

Suggested change
return ":".join(f"{t:02}" for t in (hours, mins, secs))
return ":".join(f"{int(t):02}" for t in (hours, mins, secs))

For reference, as written, the new version looks like

>>> humanize_time(1000.1)
'0.0:16.0:40.10000000000002'

vs the old

>>> humanize_time(1000.1)
'00:16:40'

yt/geometry/grid_geometry_handler.py Outdated Show resolved Hide resolved
yt/utilities/sdf.py Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code style Related to linting tools refactor improve readability, maintainability, modularity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants