MathML 4 extensions for alignment and possible deprecation of <maligngroup/> and <malignmark/> #181

Open
NSoiffer opened this issue Jan 2, 2020 · 37 comments
Labels
compatibility Issues affecting backward compatibility MathML 4 Issues affecting the MathML 4 specification need polyfill Issues requiring implementation changes need specification update Issues requiring specification changes

Comments

@NSoiffer
Contributor

NSoiffer commented Jan 2, 2020

Having just implemented a polyfill for elementary math, that got me thinking about some related ideas:

  1. The most obvious concept related to long division is synthetic division. It is basically the same idea as long division except that you are dividing polynomials. With synthetic division, the columns contain numbers (the coefficients of the polynomial), not just digits. As a refresher, see this page and the example taken from it below:
Polynomial Division Synthetic Division
image image
  2. Synthetic division is a shorthand for long division of polynomials (left example above). Long division of polynomials is basically the same idea as long division of numbers except that instead of digits, you have monomials that need to go into their own column. Doing that automatically requires knowing the variable you want to "sort" on so that each monomial goes into the proper column.
  3. A very similar property is needed when displaying systems of equations: each monomial wants to be in its own column (in this case, the top-level element would not be mlongdiv, but mstack).
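A minimal sketch of the column placement described in the list above, assuming the variable to sort on is already known and each monomial has been reduced to a coefficient and a degree in that variable. The function name `monomialColumns` and the `{ coeff, degree }` shape are hypothetical, chosen only for illustration; nothing like this is defined in the polyfill or the spec.

```javascript
// Hypothetical helper: put each monomial in the column for its degree.
// Degrees with no monomial are left as null so the column stays empty.
function monomialColumns(terms, maxDegree) {
  const columns = new Array(maxDegree + 1).fill(null);
  for (const { coeff, degree } of terms) {
    // Column 0 holds the highest degree, as in written long division.
    columns[maxDegree - degree] = coeff;
  }
  return columns;
}

// x^3 - 12x^2 - 42 has no linear term, so that column stays empty.
const row = monomialColumns(
  [
    { coeff: 1, degree: 3 },
    { coeff: -12, degree: 2 },
    { coeff: -42, degree: 0 },
  ],
  3
);
console.log(JSON.stringify(row)); // [1,-12,null,-42]
```

The explicit empty column is the point: without knowing the sort variable, a renderer cannot tell that the constant term belongs in the fourth column rather than the third.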

There are a few complications such as decimal alignment of the coefficients:

      8.44x + 55  y =  0
      3.1 x -  0.7y = -1.1

Note that alignment requires knowing what characters/operators act as column separators (e.g., + and -, along with = and a few other relational operators). These would be inside of mo elements, so potentially any mo element could be a separator, or maybe an attribute specifies what the separators are (something to think about/discuss).
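To make the separator idea concrete, here is a rough sketch that splits one row of tokens into columns at separator operators. The separator set, the `{ type, text }` token shape, and the rule that a separator at the start of a term is treated as a sign are all assumptions made for this example; the question of which mo elements count as separators is left open above.

```javascript
// Assumed separator set; which operators qualify is an open question.
const SEPARATORS = new Set(["+", "-", "=", "<", ">"]);

// Split a flat row of tokens into cells; each separator gets its own cell.
function splitAtSeparators(tokens, separators = SEPARATORS) {
  const cells = [[]];
  for (const tok of tokens) {
    const current = cells[cells.length - 1];
    const isSeparator = tok.type === "mo" && separators.has(tok.text);
    // A separator at the start of a term (the "-" in "= -1.1") is a sign,
    // so it stays with the term instead of opening a new column.
    if (isSeparator && current.length > 0) {
      cells.push([tok], []);
    } else {
      current.push(tok);
    }
  }
  return cells.map((cell) => cell.map((t) => t.text).join(""));
}

const tok = (type, text) => ({ type, text });
// Second row of the example: 3.1x - 0.7y = -1.1
const cells = splitAtSeparators([
  tok("mn", "3.1"), tok("mi", "x"), tok("mo", "-"),
  tok("mn", "0.7"), tok("mi", "y"), tok("mo", "="),
  tok("mo", "-"), tok("mn", "1.1"),
]);
console.log(cells.join(" | ")); // 3.1x | - | 0.7y | = | -1.1
```

Decimal alignment of the coefficients inside each cell is a separate step that this sketch does not attempt.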

The above example is taken from the MathML 3 spec for maligngroup and malignmark. I think only MathPlayer ever implemented those elements and I suspect that you can count on your fingers the number of times they have been used. It is a very complicated feature to implement and to use. In contrast, I think the above features are an incremental extension to elementary math layout, so implementing them (especially via an extension to the polyfill I wrote) means that support for these features would be universal (assuming I or someone else extended the polyfill). Just as important, using this extension would be simple, as it is a declarative notation that doesn't require modifying the generated layout other than at a high level (wrapping with mstack or mlongdiv). It would be less powerful, though.

I suspect that this proposed extension to elementary math handles the large majority of cases where people play games with tables to achieve alignment, both in MathML and in TeX. @davidcarlisle: do you have any estimate of how many uses of table for alignment in TeX would be covered by this proposed extension? What are some of the cases that are missed by it?

@NSoiffer NSoiffer added compatibility Issues affecting backward compatibility MathML 4 Issues affecting the MathML 4 specification need polyfill Issues requiring implementation changes need resolution Issues needing resolution at MathML Refresh CG meeting need specification update Issues requiring specification changes labels Jan 2, 2020
@fred-wang fred-wang removed the need resolution Issues needing resolution at MathML Refresh CG meeting label Aug 12, 2020
@dginev
Contributor

dginev commented Oct 9, 2020

Hello. I was looking for an appropriate issue to attach a recent piece of news I spotted, and since this issue discusses malignmark, it seems appropriate. There is a recent post about bypassing the sanitization of DOMPurify through an abuse of parsing MathML in HTML, details here:
https://portswigger.net/daily-swig/dompurify-mutation-xss-bypass-achieved-through-mathml-namespace-confusion

Summarized as:

In the MathML namespace, two special elements – mglyph and malignmark – allow the creation of a markup that is “in HTML namespace, but on reparsing it is in MathML namespace, [meaning that] the subsequent style tag [is] parsed differently and leading to XSS,” the researcher explained.

This might be relevant if you're searching for additional reasons for deprecation.

@NSoiffer
Contributor Author

NSoiffer commented Oct 15, 2020 via email

@davidcarlisle
Collaborator

davidcarlisle commented Oct 15, 2020 via email

@NSoiffer
Contributor Author

NSoiffer commented Oct 17, 2020 via email

@davidcarlisle
Collaborator

The schema has been updated to restrict use of malignmark, and to remove the groupalign attribute except in the legacy schema.

w3c/mathml-schema@4e897dc

@NSoiffer
Contributor Author

The group tentatively agreed that these are no longer needed: they aren't implemented in browsers and are rarely generated. The main use is to get alignment right at the character level (e.g., decimal alignment). @davidcarlisle pointed out that there is a Unicode space character (U+2007) that is meant to be the width of a digit and that this can be used as padding to get decimal alignment to work.
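As a rough illustration of the U+2007 padding idea, the sketch below pads a column of unsigned decimal strings so that the decimal points line up. The function name is hypothetical, and the use of U+2008 (PUNCTUATION SPACE) to stand in for a missing decimal point is an assumption added here, not something raised in the discussion. It also assumes a font in which all digits have the same advance width.

```javascript
const FIGURE_SPACE = "\u2007";      // digit-width space mentioned above
const PUNCTUATION_SPACE = "\u2008"; // assumption: stands in for a missing "."

// Pad unsigned decimal strings so their decimal points share one position.
function padForDecimalAlignment(numbers) {
  const parts = numbers.map((n) => {
    const [intPart, fracPart = ""] = n.split(".");
    return { intPart, fracPart };
  });
  const intWidth = Math.max(...parts.map((p) => p.intPart.length));
  const fracWidth = Math.max(...parts.map((p) => p.fracPart.length));
  return parts.map(({ intPart, fracPart }) => {
    const left = FIGURE_SPACE.repeat(intWidth - intPart.length);
    // Reserve room for a decimal point only if some number in the column has one.
    const point = fracWidth === 0 ? "" : fracPart ? "." : PUNCTUATION_SPACE;
    const right = FIGURE_SPACE.repeat(fracWidth - fracPart.length);
    return left + intPart + point + fracPart + right;
  });
}

// The y coefficients from the MathML 3 example: 55 and 0.7.
const padded = padForDecimalAlignment(["55", "0.7"]);
console.log(padded.map((s) => s.length)); // both strings now have 4 characters
```

Each padded string would go into an mn in a left- or right-aligned column. Signs such as a leading minus are not digit-width, so they would need separate handling.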

I pointed out that with intent table properties, the accessibility problem of splitting equations across columns goes away.

This section (which is somewhat simplified from MathML 3) is still quite large, so removing it would be a good simplification to the spec and help align it with core.

@MurrayIII currently uses malignmark in his UnicodeMath implementation. He will investigate whether this is really needed. If not, then we can remove these from the spec.

@MurrayIII

MurrayIII commented Nov 14, 2024

For the equation

image

Word copies the following MathML

<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:maligngroup/><mml:mn>10</mml:mn><mml:malignmark/><mml:mi>x</mml:mi><mml:mi> </mml:mi><mml:mo>+</mml:mo><mml:maligngroup/><mml:mn>3</mml:mn><mml:malignmark/><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:maligngroup/><mml:mn>3</mml:mn><mml:malignmark/><mml:mi>x</mml:mi><mml:mi> </mml:mi><mml:mo>+</mml:mo><mml:maligngroup/><mml:mn>13</mml:mn><mml:malignmark/><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mn>4</mml:mn></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math>

@NSoiffer
Contributor Author

Reformatted the above MathML to be more readable:

<math><mtable>
  <mtr><mtd><mrow>
        <maligngroup/><mn>10</mn><malignmark/><mi>x</mi><mi/><mo>+</mo>
        <maligngroup/><mn>3</mn><malignmark/><mi>y</mi><mo>=</mo><mn>2</mn>
   </mrow></mtd></mtr>
   <mtr><mtd><mrow>
        <maligngroup/><mn>3</mn><malignmark/><mi>x</mi><mi/><mo>+</mo>
        <maligngroup/><mn>13</mn><malignmark/><mi>y</mi><mo>=</mo><mn>4</mn>
   </mrow></mtd></mtr>
</mtable></math>

This makes it easier to see the maligngroup and malignmark elements.

@davidcarlisle
Collaborator

If I change that example to 103 rather than 13, so the expressions have different lengths

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8"/>
    <title>malign</title>
  </head>
  <body>
   <math><mtable>
  <mtr><mtd><mrow>
        <maligngroup/><mn>10</mn><malignmark/><mi>x</mi><mi/><mo>+</mo>
        <maligngroup/><mn>3</mn><malignmark/><mi>y</mi><mo>=</mo><mn>2</mn>
   </mrow></mtd></mtr>
   <mtr><mtd><mrow>
        <maligngroup/><mn>3</mn><malignmark/><mi>x</mi><mi/><mo>+</mo>
        <maligngroup/><mn>103</mn><malignmark/><mi>y</mi><mo>=</mo><mn>4</mn>
   </mrow></mtd></mtr>
</mtable></math>
  </body>
</html>

that renders (in Edge here) as

image

So there is no alignment at all visually (similarly in firefox)

If I add

<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_SVG"></script>

to get mathjax rendering it still doesn't align:

image

@dginev
Contributor

dginev commented Nov 14, 2024

Here is a Nov 2024, Chrome workaround for that example using float: right; and extra mtd elements, for archival purposes:

https://codepen.io/dginev/pen/XWvGEEZ

image

@davidcarlisle
Collaborator

@dginev ooh scary, I can confirm that works (and essentially works the same way in HTML table markup)

So the containing box for a table cell for the purpose of float positioning is the implicit column box?

I tried to navigate mdn or the css specs to find a clear statement on what's supposed to happen if you apply float:... to a table cell, but failed to find anything definitive, did you find something, or did you just find this works?

@dginev
Contributor

dginev commented Nov 15, 2024

or did you just find this works?

Indeed, just finding things that work by analogy with HTML.
I believe I found float:right; in a demonstration of how to "right-align a <div> element with CSS".

@davidcarlisle
Collaborator

Yes, right-aligning a div in its container with float:right is clear enough, but it had never occurred to me that you could apply it to a table cell, or where it would float to if you did. It's clearly legal (as I found documentation that using float implicitly changes the table-cell display property to block) but I couldn't find a clear description of what happens. Nice example in any case.

@davidcarlisle
Collaborator

With Firefox it seems I just need text-align; with Chrome-based browsers float:right is needed, but float:left in left-aligned columns doesn't work at all. So this document renders Murray's example in both.

<!DOCTYPE html>
<html>
  <head>
    <script>
      window.addEventListener("load",function () {
	  document.querySelectorAll("mtable").forEach(m =>
	      m.innerHTML=m.innerHTML
		  .replace(/<maligngroup><\/maligngroup>/g,
			   "</mrow></mtd><mtd style='padding:0pt;text-align:right;float:right'><mrow>")
		  .replace(/<malignmark><\/malignmark>/g,
			   "</mrow></mtd><mtd style='padding:0pt;text-align:left;'><mrow>")
	  )},false);
     </script>
    <meta charset="UTF-8"/>
    <title>malign</title>
  </head>
  <body>
   <math><mtable>
  <mtr><mtd><mrow>
        <maligngroup/><mn>1000</mn><malignmark/><mi>x</mi><mi/><mo>+</mo>
        <maligngroup/><mn>3</mn><malignmark/><mi>y</mi><mo>=</mo><mn>2</mn>
   </mrow></mtd></mtr>
   <mtr><mtd><mrow>
        <maligngroup/><mn>3</mn><malignmark/><mi>x</mi><mi/><mo>+</mo>
        <maligngroup/><mn>103</mn><malignmark/><mi>y</mi><mo>=</mo><mn>4444</mn>
   </mrow></mtd></mtr>
</mtable></math>
  </body>
</html>

firefox

image

edge

image

@NSoiffer
Contributor Author

Can someone with Safari verify @davidcarlisle's solution works on Safari?

If so, it seems like there is a reasonable solution for dealing with Word's output and we can simplify the MathML full spec and get it in closer agreement to core. Given this solution, it seems highly unlikely even this scaled down version of maligngroup/malignmark would ever get added to core.

@FrankMittelbach

Here is the output from firefox (left) and safari on the right:

Screenshot 2024-11-16 at 23 43 57

@davidcarlisle
Collaborator

Here is the output from firefox (left) and safari on the right:
Screenshot 2024-11-16 at 23 43 57

thanks

@davidcarlisle
Collaborator

@NSoiffer so it looks like it basically works in Chrome-based browsers, Firefox and Safari.

@MurrayIII

MurrayIII commented Dec 18, 2024

Small addition: since mtd is mrow-like, the polyfill above also needs to handle an mtd without an mrow enclosing the mtd contents.
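One way to cover both cases is sketched below, working on a simplified node shape rather than on innerHTML strings. The `{ name, children }` objects stand in for `Element.localName` and `Element.children`, and the function name `alignmentGroups` is hypothetical. Following the polyfill above, content after a maligngroup is right-aligned and content after a malignmark is left-aligned.

```javascript
// Assumed node shape: { name: string, children: node[] }.
function alignmentGroups(mtd) {
  // mtd is mrow-like: use the children of a single wrapping mrow if there
  // is one, otherwise the mtd's own children.
  const kids = mtd.children;
  const onlyMrow = kids.length === 1 && kids[0].name === "mrow";
  const items = onlyMrow ? kids[0].children : kids;

  const groups = [];
  let current = null;
  for (const item of items) {
    if (item.name === "maligngroup") {
      current = { align: "right", items: [] };
      groups.push(current);
    } else if (item.name === "malignmark") {
      current = { align: "left", items: [] };
      groups.push(current);
    } else {
      if (current === null) {
        // Content before any marker keeps the cell's default alignment.
        current = { align: "default", items: [] };
        groups.push(current);
      }
      current.items.push(item.name);
    }
  }
  return groups;
}

const el = (name, children = []) => ({ name, children });
// First row of the Word example: 10x + 3y = 2
const rowItems = [
  el("maligngroup"), el("mn"), el("malignmark"), el("mi"), el("mi"), el("mo"),
  el("maligngroup"), el("mn"), el("malignmark"), el("mi"), el("mo"), el("mn"),
];
const bare = alignmentGroups(el("mtd", rowItems));
console.log(bare.map((g) => `${g.align}:${g.items.join(",")}`).join(" | "));
// right:mn | left:mi,mi,mo | right:mn | left:mi,mo,mn
```

Each group would then become its own mtd, so rows with and without the enclosing mrow end up with the same columns.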

@MurrayIII

The float:right displays a column with a single character correctly, but it stacks multiple characters above one another as in

image

Without the float:right, this displays as

image

Perhaps the polyfill should only emit float:right for single-character maligngroups.

@davidcarlisle
Collaborator

I took an action item on the WG call on 2024-12-19 to suggest a specification update to resolve this issue. The suggestion below is not what I had in mind yesterday but having looked at this issue again and the current text I think it is perhaps the most viable option.

@NSoiffer @MurrayIII I think even if we cut down malignmark and maligngroup to what is needed to cover MS Word output, plus whatever makes sense to be allowed in a schema that targets those cases, the alignment spec in chapter 3 will end up being quite complicated.

As the elements are not in Core, doing this won't directly help compatibility of Office-produced documents. As now, and as shown in the comments above, they will require additional javascript and CSS to make the alignments work.

If however we remove them from MathML4 completely, the Office generated HTML+MathML would be flagged as invalid by validators which is not a desirable outcome.

Given that the desired end result is that Office generated output works in current browsers (via a Math WG supplied polyfill) and that that output is considered valid, I think a simpler approach to the specification would be:

  • Declare maligngroup and malignmark as legacy compatibility elements that are valid in (any) mtd and mrow
  • Specify in full that they are valid but have no defined behaviour.
  • Provide a polyfill that handles as much of the MathML3 alignment as reasonable, covering at minimum all output from the Office suite.

This would simplify (remove) almost all the current text while keeping office output valid and improving the actual rendering of these alignments in current browsers by providing code so that the alignment works (whereas it typically does not work currently)

It allows flexibility in the polyfill to adapt to cover any existing cases "in the wild" (not just Office output) without having to formally specify the behaviour to the extent that would be required if the alignment were specified in the full spec.

@polx

polx commented Jan 8, 2025

I just tested the above example on safari and iPhone and it works as in the pictures.

I like the polyfill approach to cover the word cases but we should keep this minimal... while MathJax is not really a polyfill it has been used as such by many to excuse non-implementations of MathML and this is not healthy.

@dginev
Contributor

dginev commented Jan 8, 2025

For Murray's example it looks like one needs to restore display: math; to the floated mtd's children as well as their vertical-align: bottom;. Adding a padding:0; to the floated mtd also looks reasonable.

Here is a codepen:
https://codepen.io/dginev/pen/LEPQNEX

image

(screenshot from Chrome 131.0.6778.204)

@MurrayIII

MurrayIII commented Jan 9, 2025

I fixed the problem in my UnicodeMathML app by including the float:right only when the group contains one element. But this only works correctly when the members of the vertically aligned group all have the same width. I tried various combinations of float right with guarded display, padding and vertical-align, but wasn't able to flush the shorter maligngroups to the right.

@MurrayIII

Firefox displays the equation arrays correctly without needing float:right. So, this appears to be a Chromium bug.

@davidcarlisle
Collaborator

@dginev apart from the fact that it works, do you have any documentable reason why we need to use float for right alignment (and not left)? Even the fact that the notional column acts as the container boundary for floating isn't immediately clear (and why does left alignment work differently?).

Ah I see @MurrayIII basically hinted at same thing, that the float:right looks wrong. I wonder if there is a version we could specify in a deprecation note for malign that did not use float (but we believe should work) even if current actual polyfills need to use float in practice.

@dginev
Contributor

dginev commented Jan 10, 2025

@davidcarlisle Right and not left? I am not sure I understand. One could use float: left; as needed too. Polyfilling is a bit of an art, so I am a little short on documentable reasons.

One attempt is:
"Consistently floating to the left or right edge of a table cell could be used to emulate the appearance of an alignment at multiple marked horizontal locations, while preserving each group's overall width across rows."

@davidcarlisle
Collaborator

@dginev

I am not sure I understand.

I did the testing before Christmas, so maybe I am confused and you should ignore me and I'll look later, but when I looked before, I thought that left alignment worked without using float but right did not. But it's possible I was using Firefox, where Murray suggests the float isn't needed anyway. I need to come back to this before suggesting any spec wording change, but I clearly need to check your examples better first.

@MurrayIII

Since Firefox displays the groups correctly without the floats, I recommend that the maligngroup/malignmark polyfill drop them and that we add an issue for mathml-core to fix Chromium. Interestingly, MathJax doesn't uniformly right-justify mtd right-align text with or without a float, so probably we should add an issue for MathJax as well.

@davidcarlisle
Collaborator

The float:right displays a column with a single character correctly, but it stacks multiple characters above one another as in

image

Without the float:right, this displays as

image

Perhaps the polyfill should only emit float:right for single-character maligngroups.

@MurrayIII sorry for the late reply, but do you have a MathML source for that example that wraps badly?

Perhaps we can prevent that by adding some css white space properties.

I feel a bit hesitant to go with the asymmetric float:right but no float:left, but I can't get anything else to work in Chrome/Edge etc.
(It's not needed in Firefox, where text-align:right works as expected.) It would be good to have a few more examples.

@dginev
Contributor

dginev commented Feb 13, 2025

@davidcarlisle This was captured by the second example in the codepen from my adjacent comment

https://codepen.io/dginev/pen/LEPQNEX

@MurrayIII

In the codepen, the '𝑥≥0' should be aligned right. Adding 'float:right' bunches the characters to the right. These problems occur in Chromium, but not in Firefox or Safari. If the first case includes the leading 'if ', the cases have the same widths and are aligned correctly. But omitting the 'if ' illustrates the bug. Here's a snip of the result in the codepen:

Image

@davidcarlisle
Collaborator

@MurrayIII thanks, yes, the problem is that float forces the object to have display property block, which then allows it to wrap to be as narrow as possible. When asking earlier I had missed @dginev's codepen above, which adds

mtd.guard > * {
  color:red;
  display: math inline;
  vertical-align: bottom;
}

(actually inline, but I think inline math is more correct, and either works). This corrects for the block property on the children and you get

Image

with the second one (with some red debugging colour here) being your example with wrapping stopped.

So it's adding more CSS to compensate for the bad effects of float:right when it isn't clear why float is needed at all, but it does have the good property of working, rather than not working.

@MurrayIII

I changed my codepen to use Deyan's class="right guard". Looks almost correct, but the cases aren't centered vertically on the math axis.
Image
Doesn't work as well in Firefox (although the minus sign is on the math axis)
Image

@MurrayIII

MurrayIII commented Feb 13, 2025

In Safari, the codepen output is
Image
In MathJax it looks like

Image
In Microsoft Word it looks like (if maligngroup and malignmark are used)

Image

@davidcarlisle
Collaborator

@MurrayIII thanks, I suspect in the end we'll need to add this via a JavaScript polyfill rather than direct CSS, in which case we'll be able to tweak things per browser and perhaps not use float at all for Firefox and Safari.
In any case we don't need to specify the code in the spec, just hint that code is possible.
As malignmark isn't implemented in mathml-core, we'll need to provide something in any case, whatever the spec says. So I think the experiments here (thanks @dginev) show enough is possible that I can write in the spec that it is possible to align mtable columns via CSS, and still sleep at night with a clear conscience.
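A small sketch of what per-browser tweaking might look like, assuming the polyfill already knows whether the current engine needs the float workaround. The function name `cellStyle` and the `engineNeedsFloat` flag are hypothetical; how that flag would be determined is deliberately left out. The style values are the ones used in the experiments above.

```javascript
// Build the inline style for one generated mtd.
// align: "left" or "right"; engineNeedsFloat: true where text-align alone
// was reported not to right-align the cell (Chromium in this thread).
function cellStyle(align, engineNeedsFloat) {
  const parts = ["padding:0", `text-align:${align}`];
  // Only right-aligned cells get the float; left-aligned cells were
  // reported to work with text-align alone.
  if (align === "right" && engineNeedsFloat) {
    parts.push("float:right");
  }
  return parts.join(";");
}

console.log(cellStyle("right", true));  // padding:0;text-align:right;float:right
console.log(cellStyle("right", false)); // padding:0;text-align:right
```

Floated cells would still need the extra display and vertical-align rules for their children, as in the `mtd.guard > *` rule quoted earlier.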

@MurrayIII

Changing the vertical alignments slightly in the codepen, we get

Image
That looks good! Firefox centers columns and Safari isn't perfect, but at least Chromium works well.
