Handle bold/strong nested inside italics/em #307

cddude229 · 2024-05-02T02:13:55Z

While trying out this library, my initial input ran up against some edge cases. (It handled everything else perfectly though, which is awesome.) I decided to take a stab at fixing the bugs, which is this PR. I've broken down every test case addition + related fix into its own commit for easy review.

Here's the summary of changes in input/output:

| Input | Output in production | Output after this PR |
| --- | --- | --- |
| `*italics **and bold** end*` | `<p>italics <strong>and bold</strong> end</p>` | `<p><em>italics <strong>and bold</strong> end</em></p>` |
| `*italics **and bold***` | `<p>italics <strong>and bold</strong></p>` | `<p><em>italics <strong>and bold</strong></em></p>` |
| `***bold** and italics*` | `<p><strong>bold</strong> and italics</p>` | `<p><em><strong>bold</strong> and italics</em></p>` |
| `*start **bold** and italics*` | `<p>start <strong>bold</strong> and italics</p>` | `<p><em>start <strong>bold</strong> and italics</em></p>` |
| `*improper **nesting* is** bad` | `<p>improper <strong>nesting is</strong> bad</p>` | `<p><em>improper nesting</em> is bad</p>` |

miekg · 2024-05-02T07:01:11Z

I don't see how the testcase makes the case for this PR?

Don't understand the random i +=2. Comments also didn't change even though the code around it changed very much

kjk · 2024-05-02T14:26:39Z

@miekg It does seem to fix the following:
In current code:
*italics **and bold** end*
=>
*italics and bold end*
See https://tools.arslexis.io/goplayground/#iPKnCL5art6

Per babelmark, most other implementations return:
italics and bold end
see: https://babelmark.github.io/?text=*italics+**and+bold**+end*

The case `*improper **nesting* is** bad` gets closer to what the majority does, but not fully (see https://babelmark.github.io/?text=*improper+**nesting*+is**+bad)

cddude229 · 2024-05-02T19:13:36Z

> I don't see how the testcase makes the case for this PR?
>
> Don't understand the random i +=2. Comments also didn't change even though the code around it changed very much

Apologies - I got lazy at the end of the work day and should have provided more context. I've updated the PR description and split the main commit into two (to better highlight what each change does) in hopes it better communicates the goals.

I also realized another edge case and added a test for that + fixed it. This reverted the optimization case I was talking about as the bug was there instead.

Lemme know if you need more information!

cddude229 · 2024-05-02T19:19:33Z

parser/inline.go

@@ -1201,7 +1201,7 @@ func helperEmphasis(p *Parser, data []byte, c byte) (int, ast.Node) {
 		}

 		if i+1 < len(data) && data[i+1] == c {
-			i++
+			i += 2


`i++` guarantees that the next character checked is `c`. This means `helperFindEmphChar` above is guaranteed to return 0 for `length`, which then short-circuits to the `return 0, nil` case. We choose to skip the next `c` character here by incrementing by 2 instead. (This will expose a separate bug if we're actually in the triple `c` case, which is what the next commit fixes.)
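To make the short-circuit concrete, here is a minimal, self-contained Go sketch. It is not the code in parser/inline.go: `findDelim` and `closingIndex` are hypothetical stand-ins that only model the scan for a lone closing delimiter, assuming the finder reports 0 both for "not found" and for "found at offset 0". The `step` parameter switches between the old `i++` and the new `i += 2`.

```go
package main

import (
	"bytes"
	"fmt"
)

// findDelim is a hypothetical, simplified stand-in for helperFindEmphChar.
// It returns the offset of the first c in data, or 0 when there is none.
// A caller therefore also sees 0 when data starts with c.
func findDelim(data []byte, c byte) int {
	if idx := bytes.IndexByte(data, c); idx > 0 {
		return idx
	}
	return 0
}

// closingIndex scans data (the text after an opening single delimiter) for a
// lone closing delimiter and returns its index, or 0 if the scan gives up.
// step is how far to advance past a doubled delimiter: 1 models `i++`,
// 2 models `i += 2`.
func closingIndex(data []byte, c byte, step int) int {
	i := 0
	for i < len(data) {
		length := findDelim(data[i:], c)
		if length == 0 {
			return 0 // nothing found, or we are sitting on a delimiter
		}
		i += length
		if i+1 < len(data) && data[i+1] == c {
			i += step // doubled delimiter: belongs to bold, not to us
			continue
		}
		if data[i-1] == ' ' {
			return 0 // a delimiter right after a space cannot close (sketch only)
		}
		return i
	}
	return 0
}

func main() {
	nested := []byte("italics **and bold** end*")
	fmt.Println(closingIndex(nested, '*', 1)) // 0: lands on the second '*' and gives up
	fmt.Println(closingIndex(nested, '*', 2)) // 24: reaches the final '*'

	triple := []byte("italics **and bold***")
	fmt.Println(closingIndex(triple, '*', 2)) // 0: the trailing triple still fails here
}
```

With `step` 1 the scan stops on the second `*` of `**` and gives up; with `step` 2 it skips the pair and reaches the final `*`. The input ending in `***` still fails in this sketch, which corresponds to the triple case mentioned above.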

cddude229 · 2024-05-02T19:20:44Z

parser/inline.go

@@ -1192,9 +1192,6 @@ func helperEmphasis(p *Parser, data []byte, c byte) (int, ast.Node) {

 	for i < len(data) {
 		length := helperFindEmphChar(data[i:], c)
-		if length == 0 {


Building on the last commit: in the triple `c` case we would still have short-circuited here, but we actually want to claim this `c` as our own. So we do the rest of the validity checks first, and only then do this check to exit the loop.
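A rough Go sketch of the reordering, under the same kind of simplification: `findDelim` and `closingIndexLateCheck` are hypothetical names, not functions from the parser, and the validity check is reduced to "the delimiter is not preceded by a space". The only point is where the `length == 0` test sits in the loop.

```go
package main

import (
	"bytes"
	"fmt"
)

// findDelim is a hypothetical, simplified stand-in for helperFindEmphChar.
// It returns the offset of the first c in data, or 0 when there is none
// (and also 0 when data starts with c).
func findDelim(data []byte, c byte) int {
	if idx := bytes.IndexByte(data, c); idx > 0 {
		return idx
	}
	return 0
}

// closingIndexLateCheck looks for a lone closing delimiter in data and
// returns its index, or 0 on failure. Unlike an early `length == 0` exit,
// it first checks whether the current position is itself a usable closer.
func closingIndexLateCheck(data []byte, c byte) int {
	i := 0
	for i < len(data) {
		length := findDelim(data[i:], c)
		i += length

		if i+1 < len(data) && data[i+1] == c {
			i += 2 // doubled delimiter: skip both characters
			continue
		}
		// Validity check first: a delimiter at i that does not follow a space.
		if data[i] == c && i > 0 && data[i-1] != ' ' {
			return i
		}
		// Only now give up if the finder made no progress.
		if length == 0 {
			return 0
		}
	}
	return 0
}

func main() {
	fmt.Println(closingIndexLateCheck([]byte("italics **and bold***"), '*'))     // 20
	fmt.Println(closingIndexLateCheck([]byte("italics **and bold** end*"), '*')) // 24
	fmt.Println(closingIndexLateCheck([]byte("no closer here"), '*'))            // 0
}
```

In the `***` input, the scan skips the first two stars of the triple as a pair, lands on the last one with `length == 0`, and still accepts it as the closer because the validity check runs before the exit check.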

cddude229 · 2024-06-03T20:54:48Z

@miekg Just following up on this, as I think I addressed all your concerns. Let me know your thoughts and if you need me to do anything else!

kjk · 2024-07-29T22:01:21Z

Thanks, I pushed it in 77f4768.

I had to resolve a conflict and add a fix for the infinite loop from #311, so I couldn't just merge it.

cddude229 marked this pull request as ready for review May 2, 2024 02:14

cddude229 added 4 commits May 2, 2024 11:57

When bold nested inside italics, skip the bold markers as we shouldn't treat them as the italics marker

a9ad544

This is giving us left-to-right precedence for applying styles (which is why the one existing test case changed)

Check length == 0 after checking the value of the next characters

e161098

This is a feedback loop from the "is next character c?" check. Without this move, we may as well just `return 0, nil` from the "is next character c?" check. This is an explicit edge case of ending with a triple

Skip two symbols if coming from emph3, which will handle the reverse bold and italics ordering case

9e52993

Typo fix

46938c9

cddude229 force-pushed the chris-patch-nested-italics-and-bold branch from b27d897 to 46938c9 Compare May 2, 2024 19:11

Merge branch 'master' into chris-patch-nested-italics-and-bold

eda51b6

kjk added a commit that referenced this pull request Jul 29, 2024

better handle nested bold/italic

77f4768

based on #307

kjk closed this Jul 29, 2024
