Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backslash escaping of invalid haddock triggers seems excessive #1131

Open
ChickenProp opened this issue Sep 3, 2024 · 5 comments
Open

Backslash escaping of invalid haddock triggers seems excessive #1131

ChickenProp opened this issue Sep 3, 2024 · 5 comments
Labels
question Further information is requested

Comments

@ChickenProp
Copy link

Describe the bug

In #837 (following #816), backslashes were added to comments that start with haddock triggers (|^$*), where a haddock trigger isn't allowed.

Um, I confess I don't fully understand why. I hear that haddock and/or GHC sometimes has problems when there's a haddock comment somewhere unexpected, but I haven't been able to run into them myself. Is that still an issue with recent GHCs? (I'm actually only using 9.2.) But in any case, reading the issue, it sounds like that's not actually the problem it was intended to resolve?

But regardless of why, it looks to me like this is done unnecessarily in some cases. In particular, it's done at the start of lines in the middle of a multiline comment, where I believe haddock won't pick them up.

As far as I can tell, the current behavior is:

  • At the top-level, single line comments are left alone, whether they start with haddock triggers or not and whether they're allowed or not. (E.g. a -- | at the end of the file, which I think is forbidden.)
  • Elsewhere, if a haddock trigger isn't allowed, single line comments that start with one are given a backslash.
  • Everywhere, multiline strings that start with a haddock trigger are converted to multiple single-line comments. If a haddock trigger isn't allowed, it's given a backslash and subsequent lines are indented one space extra, but not given backslashes.
  • Everywhere, multiline strings that don't start with a haddock trigger have backslashes added to any internal lines that start with a haddock trigger.

Some of this behavior confuses me, and if possible I'd like to be able to have comments starting with -- * in the middle of a function. (I like bullet points, and I'm not sure if haddock cares about those.) The final point is the one that seems most straightforwardly a bug to me.

To Reproduce

Input:

-- Excessive backslashes in multi line:
{-
*
|
-}

test = do
  -- Maybe excessive in single line:
  -- * (if there's nothing after the * the line is completely dropped)
  line1
  -- Maybe excessive in multi line:
  {- | is this excessive?
  * no excessive here at least
  -}
  line2

Output:

-- Excessive backslashes in multi line:
{-
\*
\|
-}

test = do
  -- Maybe excessive in single line:
  -- \* (if there's nothing after the * the line is completely dropped)
  line1
  -- Maybe excessive in multi line:
  -- \| is this excessive?
  --  * no excessive here at least
  --
  line2

Expected behavior

I think I'd expect this output instead:

-- Excessive backslashes in multi line:
{-
*
|
-}

test = do
  -- Maybe excessive in single line:
  -- * (if there's nothing after the * the line is completely dropped)
  line1
  -- Maybe excessive in multi line:
  -- | is this excessive?
  -- * no excessive here at least
  --
  line2

Environment

Tested on the web version at https://ormolu-live.tweag.io/. Currently Version 0.7.7.0, commit 392b2bc, using ghc-lib-parser 9.10.1.20240511.

@amesgen
Copy link
Member

amesgen commented Sep 3, 2024

Thanks for opening this, the situation around invalid Haddock comments certainly is a bit subtle.

In particular, it's done at the start of lines in the middle of a multiline comment, where I believe haddock won't pick them up.

That's exactly why they are invalid: GHC has the -Winvalid-haddock flag that will warn you about these cases. For example, for the snippet you posted, we get

Test.hs:9:3: warning: [GHC-94458] [-Winvalid-haddock]
    A Haddock comment cannot appear in this position and will be ignored.
  |
9 |   -- * (if there's nothing after the * the line is completely dropped)
  |   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Test.hs:12:3: warning: [GHC-94458] [-Winvalid-haddock]
    A Haddock comment cannot appear in this position and will be ignored.
   |
12 |   {- | is this excessive?
   |   ^^^^^^^^^^^^^^^^^^^^^^^...

That's exactly why these two markers are escaped, Ormolu just reuses the GHC logic here.

Your specific use case with bullet points can be fixed by adding an additional space before the *, see #1018.

@amesgen amesgen added the question Further information is requested label Sep 3, 2024
@ChickenProp
Copy link
Author

I see, thanks.

But "at the start of lines in the middle of a multiline comment" doesn't generate a warning, right? That is,

{-
*
|
-}

gets backslashes added, but there's no warning if they're omitted.

@amesgen
Copy link
Member

amesgen commented Sep 3, 2024

Oh right, I overlooked that. We sometimes escape Haddock-like triggers even if sth is not an invalid Haddock comment:

in x :| ((commentPrefix <>) . escapeHaddockTriggers . T.drop n <$> xs)

Removing this causes test failures, so this was added for a reason, but I would need to dive into the details again to understand why exactly. Maybe this logic could be made less aggressive, such that this snippet is an Ormolu fixpoint.

@ChickenProp
Copy link
Author

Hm. I'm not familiar with the codebase or tests, but when I remove the escapeHaddockTriggers in that line, the only test failures I get are in data/examples/other/invalid-haddock-2.hs and data/examples/other/invalid-haddock-double-trigger.hs. Neither of those has a multiline comment. Could it be that the behavior in multiline comments was just an unintended side effect of something else?

@LightAndLight
Copy link

LightAndLight commented Oct 24, 2024

Two examples where I hit this. Workaround: add leading spaces (#1018 (comment))

Bullets

f =
  {- I'm explaining something about the code.

  * Point 1
  * Point 2
  * Point 3
  -}
  g

goes to

f =
  {- I'm explaining something about the code.

  \* Point 1
  \* Point 2
  \* Point 3
  -}
  g

Carets

f =
  {- I'm explaining something about the code.

  asdfasdf
  ^

  Pay attention to the above 'a'.
  -}
  g

goes to

f =
  {- I'm explaining something about the code.

  asdfasdf
  \^

  Pay attention to the above 'a'.
  -}
  g

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants