STY: manual fixes for newly flagged violations of UP031 #5064

neutrinoceros · 2024-11-23T11:31:19Z

PR Summary

ruff 0.8.0 is just out and this version flags all violations of rule UP031, including the ones it cannot autofix.
In preparation for the auto upgrade, I'm fixing all these by hand.
At the time of opening, I only fixed 62/162 errors, but I'd like to run CI at this point to see if I've already introduced a mistake that the test suite would catch.

useful resources for reviewers:

PR Checklist

New features are documented, with docstrings and narrative docs
Adds a test for any bugs fixed. Adds tests for new features.

neutrinoceros · 2024-11-23T14:17:40Z

yt/frontends/amrex/data_structures.py

@@ -1497,8 +1495,8 @@ def __init__(self, ds, dataset_type="boxlib_native"):
        for key, val in self.warpx_header.data.items():
            if key.startswith("species_"):
                i = int(key.split("_")[-1])
-                charge_name = "particle%.1d_charge" % i


this is a weird one. As far as I understand %.1d is accepted but meaningless, and simply equivalent to %d.
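A quick way to check that equivalence, as a sketch: for integer conversions the precision in `%.1d` is a minimum digit count, so a precision of 1 adds nothing over plain `%d`. The sample values below are arbitrary and not taken from the PR.

```python
# For %d the precision is a minimum number of digits; 1 is already the minimum.
for i in (0, 7, 42, -3):
    assert "particle%.1d_charge" % i == "particle%d_charge" % i
    assert "particle%.1d_charge" % i == f"particle{i}_charge"

# A larger precision does change the output, by zero-padding the digits.
padded = "%.3d" % 7
print(padded)  # 007
```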

neutrinoceros · 2024-11-23T14:19:38Z

yt/frontends/art/data_structures.py

@@ -320,7 +320,7 @@ def _parse_parameter_file(self):
            self.parameters["wspecies"] = wspecies[:n]
            self.parameters["lspecies"] = lspecies[:n]
            for specie in range(n):
-                self.particle_types.append("specie%i" % specie)
+                self.particle_types.append(f"specie{specie}")


I believe in English, 'species' is actually invariable, but I'm keeping bug-for-bug compatibility in this pure refactor

the etymology rabbit hole for "specie" is kinda fun. I could be convinced that it's more appropriate to use specie for particle types in a numerical simulation: maybe having a variety of particle types is more similar to minting different types of coins than it is to variations in taxonomical classification. But most likely species was meant here :)

neutrinoceros · 2024-11-23T14:26:25Z

yt/frontends/gadget/testing.py

@@ -27,7 +27,7 @@ def write_record(fp, data, endian):

 def write_block(fp, data, endian, fmt, block_id):
    assert fmt in [1, 2]
-    block_id = "%-4s" % block_id


this is the first (and so far only) time I've ever seen this format specifier.
for reference, let me remind reviewers of:

https://cplusplus.com/reference/cstdio/printf/

https://fstring.help/
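For readers who have not met it either, `%-4s` left-justifies the value and pads it with spaces to a width of 4. A minimal sketch of the f-string counterpart follows; `pad_block_id` is a hypothetical helper name, and the sample ids assume that `block_id` is a `str`.

```python
def pad_block_id(block_id):
    # Hypothetical helper: "<4" left-justifies and pads with spaces to width 4.
    return f"{block_id:<4}"


# Arbitrary sample ids; values longer than 4 characters are not truncated.
for block_id in ("POS", "ID", "MASS", "HEADER"):
    assert "%-4s" % block_id == pad_block_id(block_id)

print(repr(pad_block_id("ID")))  # 'ID  '
```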

neutrinoceros · 2024-11-23T14:31:35Z

yt/frontends/ramses/data_structures.py

for this file, refactors are quite involved, so I would like to request a review from @cphyc

(in particular, I introduced some calls to os.path.join where it looked intended, but I'm not completely sure that's the case)

neutrinoceros · 2024-11-23T14:32:25Z

yt/frontends/ramses/data_structures.py

-        basename = "%s/%%s_%s.out%05i" % (basedir, num, domain_id)
-        part_file_descriptor = f"{basedir}/part_file_descriptor.txt"
+        num = ds.basename.split(".")[0].split("_")[1]
+        basename = os.path.join(ds.directory, f"%s_{num}.out{domain_id:05}")


I'm being 100% backward compatible here, but I'm not completely sure if the leftover %s is intentional.

It is - it is formatted later with a % to match files named e.g. output_00123/hydro_XXXXX.outYYYYY, part_XXXXX.outYYYYY, etc.
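The two-stage formatting described above can be sketched as follows. The directory, `num` and `domain_id` values are made-up stand-ins for what the dataset would provide; only the template shape comes from the diff.

```python
import os

# Hypothetical stand-ins for ds.directory, num and domain_id.
directory = os.path.join("some_dir", "output_00123")
num = "00123"
domain_id = 7

# Stage 1: the f-string fills num and domain_id; "%s" is plain text here.
basename = os.path.join(directory, f"%s_{num}.out{domain_id:05}")

# Stage 2: the leftover placeholder is filled in once per file type.
hydro_file = basename % "hydro"
part_file = basename % "part"
print(os.path.basename(hydro_file))  # hydro_00123.out00007
```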

neutrinoceros · 2024-11-23T14:45:23Z

yt/utilities/grid_data_format/conversion/conversion_athena.py

@@ -450,7 +451,7 @@ def write_to_gdf(self, fn, grid):

        ## --------- Store Grid Data --------- ##

-        g0 = data_g.create_group("grid_%010i" % 0)
+        g0 = data_g.create_group("grid_{0:010}")


this one is a little surprising perhaps. Effectively this is equivalent to hardcoding '0000000000' or, perhaps more clearly, '0'*10. What's disturbing is that the first 0 is sort of meaningless: one would get the same result from f"{'':010}"
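A small sketch of the equivalence discussed here. It also shows that the `f` prefix is what activates the replacement field: in a plain string literal the braces are kept verbatim.

```python
old = "grid_%010i" % 0
new = f"grid_{0:010}"
assert old == new == "grid_" + "0" * 10

# Without the f prefix nothing is interpolated and the braces stay in the name.
plain = "grid_{0:010}"
print(new)    # grid_0000000000
print(plain)  # grid_{0:010}
```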

neutrinoceros · 2024-11-23T15:05:51Z

Alright. This took me... 3 hours (!?) but this time includes a pass of self-review where I was able to catch a few (hopefully most, or even all) of my mistakes. The failure on macOS is unrelated (see #5065), so this should now be ready for review.

chrishavlin

made it about halfway and need a break :) couple of questions so far.

chrishavlin · 2024-11-25T17:02:34Z

yt/data_objects/particle_trajectories.py

@@ -335,7 +335,7 @@ def trajectory_from_index(self, index):
        """
        mask = np.isin(self.indices, (index,), assume_unique=True)
        if not np.any(mask):
-            print("The particle index %d is not in the list!" % (index))
+            print(f"The particle index {index} is not in the list!")


i know you're going for a 1:1 refactor here... but what about moving this string to the actual IndexError below?

Good idea, but I'd prefer to do it in a follow up PR. I'm worried about making a breaking change for anyone and unintentionally hiding it behind a giant refactor.

yt/frontends/athena/data_structures.py

yt/frontends/enzo/data_structures.py

cphyc

In general, you have replaced all occurrences of %d and %i (which are equivalent) with f"{whatever}". This isn't 100% equivalent, as this example shows:

a = 10
print("a=%i" % a, f"a={a}")  # a=10 a=10

v = 10.
print("v=%i" % v, f"v={v}")  # v=10 v=10.0

I know most of the values being formatted are actual ints, but if not we have diverging implementations.

Note that formatting with f"{x:d}" will raise an exception if x is not of type int...
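To make the last point concrete, here is a minimal sketch using a built-in `float`. It shows the exception raised by the `d` presentation type and one way to keep the truncating behaviour of `%i`, namely an explicit `int()` conversion.

```python
v = 10.0

# The "d" presentation type is rejected for built-in floats.
try:
    f"{v:d}"
    raised = False
except ValueError as exc:
    raised = True
    print(f"ValueError: {exc}")

# %i truncates toward zero before formatting; int() does the same explicitly.
assert "v=%i" % v == f"v={int(v)}" == "v=10"
assert "%i" % 2.9 == f"{int(2.9)}" == "2"
```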

cphyc · 2024-11-25T05:37:14Z

yt/data_objects/construction_data_containers.py

@@ -2612,7 +2612,7 @@ def _export_ply(
            )
        else:
            v = np.empty(self.vertices.shape[1], dtype=vs[:3])
-        line = "element face %i\n" % (nv / 3)
+        line = f"element face {nv/3}\n"


Should this be explicitly specified as follows?

Suggested change

line = f"element face {nv/3}\n"

line = f"element face {nv/3:.0f}\n"

Or

Suggested change

line = f"element face {nv/3}\n"

line = f"element face {int(nv/3)}\n"

nv/3 may indeed be a float rather than an int.

nice catch. I think your second suggestion is more in line with the original. Thanks !
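The difference between the two suggestions only shows when the quotient is not a whole number. In this sketch `nv = 8` is a made-up value chosen to expose it: `.0f` rounds to nearest, while `int()` truncates like `%i` did.

```python
nv = 8  # hypothetical vertex count that is not a multiple of 3

original = "element face %i\n" % (nv / 3)    # 2.666... is truncated to 2
rounded = f"element face {nv / 3:.0f}\n"     # .0f rounds to nearest: 3
truncated = f"element face {int(nv / 3)}\n"  # int() truncates like %i: 2

assert original == truncated == "element face 2\n"
assert rounded == "element face 3\n"
```

For a non-negative integer `nv`, `nv // 3` gives the same value as `int(nv / 3)` without going through a float.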

cphyc · 2024-11-25T05:39:26Z

yt/data_objects/level_sets/clump_handling.py

@@ -253,7 +253,7 @@ def save_as_dataset(self, filename=None, fields=None):
        """

        ds = self.data.ds
-        keyword = "%s_clump_%d" % (str(ds), self.clump_id)
+        keyword = f"{ds}_clump_{self.clump_id}"


Same here, isn't f"{self.clump_id}" != "%d" % self.clump_id if the latter isn't an int?

cphyc · 2024-11-26T08:10:30Z

yt/frontends/ramses/data_structures.py

-        basename = "%s/%%s_%s.out%05i" % (basedir, num, domain_id)
-        part_file_descriptor = f"{basedir}/part_file_descriptor.txt"
+        num = ds.basename.split(".")[0].split("_")[1]
+        basename = os.path.join(ds.directory, f"%s_{num}.out{domain_id:05}")


It is - it is formatted later with a % to match files named e.g. output_00123/hydro_XXXXX.outYYYYY, part_XXXXX.outYYYYY, etc.

neutrinoceros · 2024-11-26T08:31:05Z

Note that formatting with f"{x:d}" will raise an exception if x is not of type int...

yeah that's why I didn't try to preserve these format specifiers... If you'd like to give a try to re-introducing them, we can probably co-write this PR, but as far as I'm concerned I already spent much longer than I wanted to here so I don't fancy trying it out myself.

Maybe forcing conversion to int in these occurrences would be much more robust, but that's still more than I can chew.

neutrinoceros · 2024-11-26T08:45:45Z

current reviews taken into account, let me undraft now !
I pushed revisions as a fixup commit: I intend to squash this branch ahead of merging but I don't want to make reviewing harder than it needs to be in the meantime.

chrishavlin · 2024-11-26T15:38:05Z

yt/frontends/gamer/data_structures.py

+                        raise ValueError(
+                            f"Grid {grid.id}, Child {c.id}, "
+                            f"Grid->EdgeL {grid.LeftEdge[d]:14.7e}, "
+                            f"Children->EdgeL {c.LeftEdge:14.7e}"


missed an index here:

Suggested change

f"Children->EdgeL {c.LeftEdge:14.7e}"

f"Children->EdgeL {c.LeftEdge[d]:14.7e}"

chrishavlin · 2024-11-26T15:38:17Z

yt/frontends/gamer/data_structures.py

+                        raise ValueError(
+                            f"Grid {grid.id}, Child {c.id}, "
+                            f"Grid->EdgeR {grid.RightEdge[d]:14.7e}, "
+                            f"Children->EdgeR {c.RightEdge:14.7e}"


Suggested change

f"Children->EdgeR {c.RightEdge:14.7e}"

f"Children->EdgeR {c.RightEdge[d]:14.7e}"
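Why the missing index matters, as a sketch: a float format spec such as `14.7e` applies to a single number, not to a container. The plain list below is only a stand-in for the edge arrays; the assumption that multi-element numpy or yt arrays reject the spec in a similar way is not verified here.

```python
left_edge = [0.0, 0.5, 1.0]  # plain list standing in for an edge array
d = 1

# Formatting a single element works.
ok = f"Children->EdgeL {left_edge[d]:14.7e}"
print(repr(ok))

# Formatting the whole container with a float spec fails.
try:
    f"Children->EdgeL {left_edge:14.7e}"
    failed = False
except TypeError as exc:
    failed = True
    print(f"TypeError: {exc}")
```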

chrishavlin · 2024-11-26T17:00:14Z

yt/funcs.py

@@ -137,7 +137,7 @@ def humanize_time(secs):
    """
    mins, secs = divmod(secs, 60)
    hours, mins = divmod(mins, 60)
-    return "%02d:%02d:%02d" % (hours, mins, secs)
+    return ":".join(f"{t:02}" for t in (hours, mins, secs))


this is one case where i vastly prefer the older string formatting syntax :)

but more importantly, if the incoming secs is a float (which it probably is? hard to say, this function doesn't seem to be used anywhere in yt...), the formatting differs significantly, should cast to int to keep the behavior identical:

Suggested change

return ":".join(f"{t:02}" for t in (hours, mins, secs))

return ":".join(f"{int(t):02}" for t in (hours, mins, secs))

For reference, as written, the new version gives

>>> humanize_time(1000.1)
'0.0:16.0:40.10000000000002'

vs the old

>>> humanize_time(1000.1)
'00:16:40'
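A self-contained sketch comparing the original formatting with the suggested `int()` cast. The names `humanize_time_old` and `humanize_time_new` are hypothetical and exist only to run both variants side by side; the sample inputs are arbitrary.

```python
def humanize_time_old(secs):
    mins, secs = divmod(secs, 60)
    hours, mins = divmod(mins, 60)
    return "%02d:%02d:%02d" % (hours, mins, secs)


def humanize_time_new(secs):
    mins, secs = divmod(secs, 60)
    hours, mins = divmod(mins, 60)
    # int() reproduces the truncation that %d applied to float inputs.
    return ":".join(f"{int(t):02}" for t in (hours, mins, secs))


# Both variants agree on integer and float inputs.
for secs in (0, 59, 1000, 1000.1, 3600, 86399.9):
    assert humanize_time_old(secs) == humanize_time_new(secs)

print(humanize_time_new(1000.1))  # 00:16:40
```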

yt/geometry/grid_geometry_handler.py

yt/utilities/grid_data_format/conversion/conversion_athena.py

yt/utilities/sdf.py

Co-authored-by: Chris Havlin <[email protected]>

neutrinoceros added the "code style" (Related to linting tools) and "refactor" (improve readability, maintainability, modularity) labels Nov 23, 2024

neutrinoceros force-pushed the sty/UP031_manual_fixing branch from c86c4b0 to 1c6b552 Compare November 23, 2024 12:41

neutrinoceros commented Nov 23, 2024

neutrinoceros force-pushed the sty/UP031_manual_fixing branch from 1c6b552 to 1846185 Compare November 23, 2024 14:56

STY: manual fixes for newly flagged violations of UP031

0988a0e

neutrinoceros force-pushed the sty/UP031_manual_fixing branch from 1846185 to 0988a0e Compare November 23, 2024 15:02

neutrinoceros marked this pull request as ready for review November 23, 2024 15:05

chrishavlin reviewed Nov 25, 2024

neutrinoceros marked this pull request as draft November 25, 2024 18:31

cphyc reviewed Nov 26, 2024

fixup! STY: manual fixes for newly flagged violations of UP031

147548d

neutrinoceros marked this pull request as ready for review November 26, 2024 08:45

chrishavlin reviewed Nov 26, 2024

Update yt/geometry/grid_geometry_handler.py

96fe344

Co-authored-by: Chris Havlin <[email protected]>

This was referenced Dec 1, 2024

STY: apply autofixes for RUF031 #5070

Merged

[pre-commit.ci] pre-commit autoupdate #5073

Open
