Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opinionated code formatting #367

Open
yallup opened this issue Mar 4, 2024 · 15 comments
Open

Opinionated code formatting #367

yallup opened this issue Mar 4, 2024 · 15 comments

Comments

@yallup
Copy link
Collaborator

yallup commented Mar 4, 2024

Suggestion that this project would benefit from using black https://github.com/psf/black
as a code formatter

Personally on other projects I have found this to be incredibly useful for homogenising code and removing and formatting questions, I think it does reach its goal of reducing developer strain in this direction almost entirely. Requested as a by product of #363

@AdamOrmondroyd
Copy link
Collaborator

If you'd asked me 18 months ago I would have seconded this, but I'm less keen on automatically formatting these days.

The biggest thing for me is that doing the formatting myself forces me to think about writing clean code, if it's a pain to format nicely, perhaps there's something better I could have done?

There are plenty of other tools besides autoformatting, even in nvim I have pylint warning me about formatting issues, and I recently found NeoColumn.nvim which indicates in orange when a line has reached 80 characters.

Screenshot 2024-03-04 at 10 42 06

One benefit of black specifically is black is what pandas itself uses, which might help with familiarity with the pandas classes which anesthetic inherits. I do just find myself disagreeing with it sometimes; the 79 character limit of flake8 is very handy when I've got lots of files open at once on my laptop, I don't think black enforces this?

@williamjameshandley
Copy link
Collaborator

Would be good to collect all everyone's opinions here @lukashergt @ThomasGesseyJones @htjb @JohannesBuchner @tobyLovick @DilyOng @Stefan-Heimersheim @EthanCarragher before committing one way or another to such a major cosmetic change.

@htjb
Copy link
Collaborator

htjb commented Mar 4, 2024

@yallup @williamjameshandley what does black do differently in comparison to say flake8? Isn't flake8 run as part of the tests?

I agree with @AdamOrmondroyd that thinking about formatting helps me write better code.

@htjb
Copy link
Collaborator

htjb commented Mar 4, 2024

Is black the one that does the horrible

def some_function(
    var1=var1,
    var2=var2,
    var3=var3):

thing? Not a fan of this when we can write

def some_function(var1=var1, var2=var2, var3=var3):

it all on one line in less than 79 characters or break on to a new line at 78 characters

@AdamOrmondroyd
Copy link
Collaborator

AdamOrmondroyd commented Mar 4, 2024

Is black the one that does the horrible

def some_function(
    var1=var1,
    var2=var2,
    var3=var3):

thing? Not a fan of this when we can write

def some_function(var1=var1, var2=var2, var3=var3):

Admittedly it only does it if you'd probably have to do a line break anyway

it all on one line in less than 79 characters or break on to a new line at 78 characters

Off-by-one error (79 characters is fine, 80 isn't) 😉

@appetrosyan
Copy link
Collaborator

I'd recommend https://github.com/astral-sh/ruff

Black isn't ideal and it's slow (fitting for a tool written in Python). Given that Ruff is more configurable and around 10x faster, it's at least something to consider.

@appetrosyan
Copy link
Collaborator

If you'd asked me 18 months ago I would have seconded this, but I'm less keen on automatically formatting these days.

The biggest thing for me is that doing the formatting myself forces me to think about writing clean code, if it's a pain to format nicely, perhaps there's something better I could have done?

Code that is shorter isn't always better, and in fact, often the most readable code runs counter to the age old advice of single-page functions and fewer than 80 columns. But only sometimes.

The reason why the industry is using automatic formatting is very simple. It lets you tune out and write code fast. There's value in that for fast growing projects with multiple external contributors.

@AdamOrmondroyd
Copy link
Collaborator

I'd recommend https://github.com/astral-sh/ruff

Black isn't ideal and it's slow (fitting for a tool written in Python). Given that Ruff is more configurable and around 10x faster, it's at least something to consider.

This looks good (and crucially pip-installable), and if we can configure it to something like the current anesthetic style I'd like to give it a go.

If you'd asked me 18 months ago I would have seconded this, but I'm less keen on automatically formatting these days.

The biggest thing for me is that doing the formatting myself forces me to think about writing clean code, if it's a pain to format nicely, perhaps there's something better I could have done?

Code that is shorter isn't always better, and in fact, often the most readable code runs counter to the age old advice of single-page functions and fewer than 80 columns. But only sometimes.

The reason why the industry is using automatic formatting is very simple. It lets you tune out and write code fast. There's value in that for fast growing projects with multiple external contributors.

Notice I said clean code, not shorter. If I'm struggling to find the perfect spot to put in a line break, self-formatting forces me to consider breaking it into two or more separate statements.

@appetrosyan
Copy link
Collaborator

Notice I said clean code, not shorter. If I'm struggling to find the perfect spot to put in a line break, self-formatting forces me to consider breaking it into two or more separate statements.

I suppose, if the terms are given meaningful names, the code is both longer and cleaner.

@yallup
Copy link
Collaborator Author

yallup commented Mar 5, 2024

My 2c:

The reason why the industry is using automatic formatting is very simple. It lets you tune out and write code fast. There's value in that for fast growing projects with multiple external contributors.

I think this is the most important, if you try a serious project for an extended period of time with enforced formatting it is hard to go back to a more manual workflow.

Ruff looks cool but I would say the following for black:

  • Also pip installable
  • In use in other group projects
  • Relative lack of configurability a plus for non experts
  • Wider adoption in general (it is an official ms extension to vscode, imo a plus again for non experts)
  • Speed is something of a non-issue imo for academic scale projects, we are an order of magnitude off in codebase size from it being a noticeable factor

@htjb
Copy link
Collaborator

htjb commented Mar 5, 2024

I wouldn't necessarily be against adoption of something like Black but worry that it will encourage bad coding practices/poor code (even if well formatted). Also agree with David that speed is not an issue for the kind of projects we work on in academia.

Better to have a group of developers who write well thought out code than a tool that tidies up bad code.

@AdamOrmondroyd
Copy link
Collaborator

Also agreed that speed is not an issue.

The other worry I have is that changing the formatting of the entire codebase will make it more confusing to look back at old commits, having to mentally un/re format to understand the actual diff

@appetrosyan
Copy link
Collaborator

appetrosyan commented Mar 5, 2024

replying to these points.

Relative lack of configurability a plus for non experts

Configuration is almost never user-facing and almost always a single file that these tools read automatically.

In other words, the only people for whom it would be a plus are capable of reading documentation and making decisions. The user facing product is identically simple and identically automatic.

Wider adoption in general (it is an official ms extension to vscode, imo a plus again for non experts)

https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff

Wider adoption can change, because Ruff is far newer than Black.

Also agreed that speed is not an issue.

Speed is something of a non-issue imo for academic scale projects, we are an order of magnitude off in codebase size from it being a noticeable factor

It often means the difference between running a formatter "live" in the sense that code is formatted almost instantly as soon as it's written and becomes invisible, to being an explicit step that somebody has to worry about.

The other worry I have is that changing the formatting of the entire codebase will make it more confusing to look back at old commits, having to mentally un/re format to understand the actual diff

Then the ability to replicate the formatting with Ruff, becomes a big positive, contrary to the "lack of configuration argument".

Additionally, the introduction of a standard formatting tool means that old commits only before the introduction are to be considered old. The whole point is to take out the human element out of formatting and ensure that all code appears the same way.

That's the main reason why projects adopt formatting guidelines early: to not have to think about it.


I should probably note that my participation in this discussion is rather historical in nature; I no longer use anesthetic, and don't plan to contribute to it further. I'm subscribed to the issue tracker prompting a reply, but you are free to ignore the advice.

@Stefan-Heimersheim
Copy link
Collaborator

I'm not involved in anesthetic development for the foreseeable future, happy to give me experiences though.


I've been using various code formatting tools since I started my new job, and found them very nice to have. When I added an extra argument I could just press Save and my VS code extensions would automatically format everything. Everyone is using the same setup, pre-commit hooks etc. and we check formatting in the CI.


  • I think you should definitely not enforce formatted code (CI checks or similar) in a way that requires more effort from new contributors (volunteers, academics, ...).
  • Not having a consistent style makes automatic code formatting tools unusable for anyone working on the project (the tool will always try to reformat the whole code base).

I would recommend to adopt some formatting convention, but only loosely enforce it (when someone makes a PR with wrong formatting just fix it for them before or after merging). This won't get around the git blame issue, but I think that's not too bad.

@lukashergt
Copy link
Collaborator

To better understand the issue, what prompts this? The code not being sufficiently homogeneous or people being annoyed by flake8 complaints after modifications?

I have mostly internalised flake8, so I do not really feel the need for an additional tool. Also, I am not that fond of the idea that the code that is committed might look different from the code that I have typed, making it potentially harder to find where I left off.

How does this affect docstrings? Is this compatible with sphinx, numpy convention, autodoc and its various extensions, etc.?

When talking about speed, how slow are we talking? If this makes my cursor hang for a second every time I type :w that would be annoying.

So, personally, not too keen, but obviously will adapt if the majority wants this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants