Intent for large operators #482

Open
NSoiffer opened this issue Jan 4, 2024 · 15 comments
Labels
intent Issues involving the proposed "intent" attr

Comments

@NSoiffer
Contributor

NSoiffer commented Jan 4, 2024

There are about 10 large operators that probably make sense to put into core (maybe integral, double integral, triple integral, contour integral, surface integral, volume integral, sum, product, coproduct, union, intersection). Whatever is decided for core should likely be extended to the open list for the other large operators (e.g., ⊍).

These are all very similar in structure in that intent potentially goes on msub/munder with one argument (typically specifying a domain for the "index") and msubsup/munderover with two arguments ("... from xxx to yyy"). Or they go on some containing mrow with an additional argument (e.g., "... from xxx to yyy of zzz"). If they go on one of the scripting elements, then there is no need for intents for indefinite integration or sums that don't have limits. If they go on an mrow, then maybe it makes sense to have an intent for them although Neil felt the speech needs no intent because there is no other sensible speech for "integral", "sum", etc.

In the Dec 21 meeting, no one stood up for the "dx" being part of the argument for integral as it would be spoken "dx" wherever it was and didn't need help from an intent.

In the meeting, Neil felt that listing these all out both uses up a lot of space (and hence appears complicated) and, more importantly, obscures their similarity, making it harder on both generators and consumers of the spec. His suggestion is to create another list between the "Core Concept Default Fixity properties" and the "Core Concept Templates". Others were not enthusiastic about that idea.

This issue provides a place to discuss the pros and cons of how intents for large operators should be handled.

@dginev added the intent label Jan 4, 2024
@dginev
Contributor

dginev commented Jan 5, 2024

I'm generally skeptical that custom names offer a significant advantage (e.g. reduced list length) over consistently following a uniform naming convention. Uniform naming keeps the learning curve as small as possible, and aids adoption.

To an adopter, a new large-operator entry raises the question of why we don't have prefix-operator, postfix-operator or infix-operator. To me this looks like the same kind of pattern that is addressed by the :prefix, :infix, etc. properties. I would have imagined those fond of fixity properties would have added yet another fixity construct, as in :indexed-operator or simply :indexed, and documented the list of (10?) Core large operators as being in that "default fixity".

In the absence of some consistent rule for choosing argument order, we'd need to document each case separately, which is why I had previously raised #478 .

@NSoiffer
Contributor Author

NSoiffer commented Jan 5, 2024

I'm not advocating adding a "large-operator" concept name -- I'm merely advocating for an organizational arrangement of the names that groups the large operators together to avoid a lot of repetition. My goal is to reduce the size and apparent complexity of the spec.

I do think we need to add a few more "fixity" options, but that's not what this thread is about. This thread is about where the intent for large operators should be placed, how many arguments the intent takes, and how we should describe them.

@davidcarlisle
Collaborator

I'll make a fork with an experimental rendering of the condensed list

@davidcarlisle
Collaborator

davidcarlisle commented Nov 16, 2024

At the last WG meeting I took an action item to look again at this.

Unicode 16 (and MathClass-15) have 66 characters classified as largeop (mathclass="L" in unicode.xml), of which 17 have a Unicode name containing N-ARY.

The full list is at the end of this post.

This shows several categories of concept/common character that could potentially be compressed:

  • Integrals
  • Radicals
  • Bigops where the same concept is used with :infix on a binary character,
    so the concept intersection may be used with U+22C2 ⋂ or as :infix with U+2229 ∩
  • As above, but where (for whatever reason) the infix concept is considered separate,
    so sum U+2211 ∑ and plus U+002B +

We could have one entry in the current style for each of these groups, and then in each case list the concept names and default characters for the other entries in the group.

But split this way there are not really many in each group, and I wonder whether the indirection really helps or whether it would be simpler to list each of them separately in the main list. That list would not be very long, as several of these characters probably do not correspond to any concept that we would have in the core list.


mathclass L 66

U00606 ARABIC-INDIC CUBE ROOT 
U00607 ARABIC-INDIC FOURTH ROOT 
U02140 DOUBLE-STRUCK N-ARY SUMMATION 
U0220F N-ARY PRODUCT 
U02210 N-ARY COPRODUCT 
U02211 N-ARY SUMMATION 
U0221A SQUARE ROOT 
U0221B CUBE ROOT 
U0221C FOURTH ROOT 
U0222B INTEGRAL 
U0222C DOUBLE INTEGRAL 
U0222D TRIPLE INTEGRAL 
U0222E CONTOUR INTEGRAL 
U0222F SURFACE INTEGRAL 
U02230 VOLUME INTEGRAL 
U02231 CLOCKWISE INTEGRAL 
U02232 CLOCKWISE CONTOUR INTEGRAL 
U02233 ANTICLOCKWISE CONTOUR INTEGRAL 
U022C0 N-ARY LOGICAL AND 
U022C1 N-ARY LOGICAL OR 
U022C2 N-ARY INTERSECTION 
U022C3 N-ARY UNION 
U027CC LONG DIVISION 
U027D5 LEFT OUTER JOIN 
U027D6 RIGHT OUTER JOIN 
U027D7 FULL OUTER JOIN 
U027D8 LARGE UP TACK 
U027D9 LARGE DOWN TACK 
U029F8 BIG SOLIDUS 
U029F9 BIG REVERSE SOLIDUS 
U02A00 N-ARY CIRCLED DOT OPERATOR 
U02A01 N-ARY CIRCLED PLUS OPERATOR 
U02A02 N-ARY CIRCLED TIMES OPERATOR 
U02A03 N-ARY UNION OPERATOR WITH DOT 
U02A04 N-ARY UNION OPERATOR WITH PLUS 
U02A05 N-ARY SQUARE INTERSECTION OPERATOR 
U02A06 N-ARY SQUARE UNION OPERATOR 
U02A07 TWO LOGICAL AND OPERATOR 
U02A08 TWO LOGICAL OR OPERATOR 
U02A09 N-ARY TIMES OPERATOR 
U02A0A MODULO TWO SUM 
U02A0B SUMMATION WITH INTEGRAL 
U02A0C QUADRUPLE INTEGRAL OPERATOR 
U02A0D FINITE PART INTEGRAL 
U02A0E INTEGRAL WITH DOUBLE STROKE 
U02A0F INTEGRAL AVERAGE WITH SLASH 
U02A10 CIRCULATION FUNCTION 
U02A11 ANTICLOCKWISE INTEGRATION 
U02A12 LINE INTEGRATION WITH RECTANGULAR PATH AROUND POLE 
U02A13 LINE INTEGRATION WITH SEMICIRCULAR PATH AROUND POLE 
U02A14 LINE INTEGRATION NOT INCLUDING THE POLE 
U02A15 INTEGRAL AROUND A POINT OPERATOR 
U02A16 QUATERNION INTEGRAL OPERATOR 
U02A17 INTEGRAL WITH LEFTWARDS ARROW WITH HOOK 
U02A18 INTEGRAL WITH TIMES SIGN 
U02A19 INTEGRAL WITH INTERSECTION 
U02A1A INTEGRAL WITH UNION 
U02A1B INTEGRAL WITH OVERBAR 
U02A1C INTEGRAL WITH UNDERBAR 
U02A1D JOIN 
U02A1E LARGE LEFT TRIANGLE OPERATOR 
U02A1F Z NOTATION SCHEMA COMPOSITION 
U02A20 Z NOTATION SCHEMA PIPING 
U02A21 Z NOTATION SCHEMA PROJECTION 
U02AFC LARGE TRIPLE VERTICAL BAR OPERATOR 
U02AFF N-ARY WHITE VERTICAL BAR 

n-ary 17


U02140 DOUBLE-STRUCK N-ARY SUMMATION 
U0220F N-ARY PRODUCT 
U02210 N-ARY COPRODUCT 
U02211 N-ARY SUMMATION 
U022C0 N-ARY LOGICAL AND 
U022C1 N-ARY LOGICAL OR 
U022C2 N-ARY INTERSECTION 
U022C3 N-ARY UNION 
U02A00 N-ARY CIRCLED DOT OPERATOR 
U02A01 N-ARY CIRCLED PLUS OPERATOR 
U02A02 N-ARY CIRCLED TIMES OPERATOR 
U02A03 N-ARY UNION OPERATOR WITH DOT 
U02A04 N-ARY UNION OPERATOR WITH PLUS 
U02A05 N-ARY SQUARE INTERSECTION OPERATOR 
U02A06 N-ARY SQUARE UNION OPERATOR 
U02A09 N-ARY TIMES OPERATOR 
U02AFF N-ARY WHITE VERTICAL BAR 
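
The N-ARY subset above can be roughly reproduced with Python's standard `unicodedata` module. This is only a sketch under stated assumptions: `unicodedata` reflects whichever Unicode version the running Python ships with (not necessarily Unicode 16), and it has no access to the `mathclass` data in unicode.xml, so the filter is on character names alone and may not match the 17 entries exactly. The helper name `n_ary_characters` and its default range are made up for illustration.

```python
import unicodedata


def n_ary_characters(start=0x2000, stop=0x2C00):
    """Return (codepoint, name) pairs whose Unicode name contains 'N-ARY'.

    The default range is an assumption: it covers the blocks in which the
    characters listed above live.
    """
    found = []
    for cp in range(start, stop):
        # name() returns the supplied default "" for unnamed code points
        name = unicodedata.name(chr(cp), "")
        if "N-ARY" in name:
            found.append((cp, name))
    return found


if __name__ == "__main__":
    for cp, name in n_ary_characters():
        # Same "U0220F NAME" layout as the list above
        print(f"U{cp:05X} {name}")
```

Reproducing the full 66-entry list would need the `mathclass` attribute from unicode.xml itself, which this sketch does not read.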

@NSoiffer
Contributor Author

The lists would actually be smaller than those shown, because they would be split across the core and open lists. So that might argue against having a special category that consolidates the lists.

However, as mentioned in the initial comment, each operator has three variants: unadorned, adorned with just a subscript/underscript, adorned with two scripts. All of those need to be listed. So that makes the lists 3 times larger than the number of characters, or maybe 5 times larger if we break out msub/munder, etc.

On top of that, we still (I think) need to decide whether the core concept for the intent goes on the adorned large operator or on the mrow for the entire concept, or both. If both, that's two times more listings on top of the other multipliers. That's a lot of spec space for essentially identical prose. Based on a philosophy that we've agreed on over the years but that I don't think is written down, the intent should be as low as possible in the MathML tree. So I'm in favor of it only being shown on a potentially adorned script.

Because of this multiplicative effect, I'm in favor of a condensed list. If we agree on the 11 large operators mentioned in the first comment as what goes into core, that's potentially one extended listing versus 33 or maybe 55 individual listings. That's a lot of space savings. For the open list (assuming we add all or most of the large operators to that list), the space savings is huge.
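
The multiplication behind those figures, as a quick sketch. The counts come from this comment (11 candidate operators; 3 variants by arity, or 5 if msub/munder and msubsup/munderover are listed separately; doubled if the intent may also go on the mrow). The variable names are only illustrative.

```python
# Counts taken from the comment above; the names are illustrative only.
core_operators = 11      # integral ... intersection, as listed in the first comment
variants_by_arity = 3    # unadorned, one script, two scripts
variants_by_element = 5  # unadorned, msub, munder, msubsup, munderover
placements = 2           # intent on the scripted operator and on the enclosing mrow

listings_by_arity = core_operators * variants_by_arity
listings_by_element = core_operators * variants_by_element

print(listings_by_arity)                 # 33
print(listings_by_element)               # 55
print(listings_by_element * placements)  # 110 if both placements are listed
```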

@davidcarlisle
Collaborator

OK I'll experiment on my fork, see what it looks like...

@dginev
Contributor

dginev commented Nov 16, 2024

Practical suggestion: for very related cases, such as some integral signs, develop one concept fully and for the others just add 1 row with:

expressions follow the structure for the 'integral' concept to the Comments column.

the space savings is huge.

Is this an organizational question for the HTML concept list pages? There are standard approaches to manage length, for example pagination (e.g. max 100 concepts per page) and sub-pages (e.g. the different arity rows could be subpages linked from the outer page that has 1 row per concept).

If the Open list grows as much as it should, these techniques may become necessary on the frontend side. There are also js frameworks capable of navigating extremely large lists (100,000 rows with 22 columns in that example).

At least for the Open side of the question, it would be better to prepare for healthy growth, rather than try to constrain the space with custom conventions.

The condensed version of the table @NSoiffer is suggesting is starting to read like a math grammar to me. I hope that remains out of scope for the list pages, as it changes their character and makes it harder to contribute new concepts, as we no longer have a uniform organization.

Btw, one design philosophy we have written down is the Guidelines for core list curation. Note item 4.

@davidcarlisle
Collaborator

davidcarlisle commented Nov 16, 2024

Practical suggestion: for very related cases, such as some integral signs, develop one concept fully and for the others just add 1 row with:

yes some possibly slightly more formalised version of that is the plan (I think)

Currently the issue is, I think, mostly about re-organising the yaml input (to make it possibly easier for implementations to deal with similar concepts with shared code)

However you are correct the html display may also become an issue

Is this an organizational question for the HTML concept list pages? There are standard approaches to manage length, for example pagination (e.g. max 100 concepts per page

Yes the current display is very minimalist. On the other hand pagination is possibly less needed than it was, eg in previous iterations we always had the mathml spec split by chapter as the whole thing was too big to load in practice, but these days loading the whole spec isn't really an issue at all.

But jekyll does have some built-in pagination features we could invoke without having to change the build too much if that does prove to be an issue in the open list (I can't see it being needed for the core list)

@davidcarlisle
Collaborator

@NSoiffer no PR yet but made a start

https://davidcarlisle.github.io/mathml-docs/intent-core-concepts/#default-large-operator-concepts

source diff

w3c/mathml-docs@main...davidcarlisle:mathml-docs:main

Currently it pulls the sum template out of the main list (so it appears twice, but we could hide the second). I'm not sure whether the yaml is easier to understand if this is on its own at the start, or if it is in its "correct" mathematical section as now.

not sure yet how best to resolve conflicts with infix ops eg

https://davidcarlisle.github.io/mathml-docs/intent-core-concepts/#union

is currently double defined; the one that works actually goes to the default infix fixity list, not the largeop.

We could have the default infix line for plus / + and then for each character say whether it has an infix version or just the large-op, not sure....

@davidcarlisle
Collaborator

davidcarlisle commented Nov 17, 2024

the existing comments in the sum core entry seemed to indicate that the intent should or could go on the munderover but in the examples I added to the fork, I only managed to place it on the mrow.

you could use the

0-arity form on the mo

<mo intent="sum">∑</mo>

sum

1-arity form for implicit sum
<mrow intent="sum($f)"><mo>∑</mo><mrow arg="f"><mi>f</mi><mo>(</mo><mi>x</mi>...

sum of f of x

2-arity form for sum over a range
<mrow intent="sum($r,$f)"><munder><mo>∑</mo><mi arg="r">R</mi></munder><mrow arg="f"><mi>f</mi><mo>(</mo><mi>x</mi>...

sum over R of f of ..

3-arity form for sum between limits
<mrow intent="sum($a,$b,$f)"><munderover><mo>∑</mo><mi arg="a">a</mi><mi arg="b">b</mi></munderover><mrow arg="f"><mi>f</mi><mo>(</mo><mi>x</mi>...

sum from a to b of f of ..

But that doesn't really leave any good way to mark up the summation without the summand expression

Perhaps this?

<munderover intent="sum($a,$b,_)"><mo>∑</mo><mi arg="a">a</mi><mi arg="b">b</mi></munderover>

sum from a to b

or in principle we could give a different interpretation for the arguments in the 2-arity form with a different property so

<munderover intent="sum:prefix($a,$b)"><mo>∑</mo><mi arg="a">a</mi><mi arg="b">b</mi></munderover>

sum from a to b
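
For comparison, here is how a consumer might turn the mrow-level forms above into speech. This is only a sketch: the function `speak_sum` is hypothetical, its arguments are assumed to be already-spoken text, and the wording simply mirrors the sample speech in this comment rather than any AT's actual output.

```python
def speak_sum(*args):
    """Speech for sum(...) where the last argument, if any, is the summand.

    Follows the argument orders sketched above:
    sum(), sum(f), sum(r, f), sum(a, b, f).
    """
    if len(args) == 0:
        return "sum"
    if len(args) == 1:
        (summand,) = args
        return f"sum of {summand}"
    if len(args) == 2:
        domain, summand = args
        return f"sum over {domain} of {summand}"
    if len(args) == 3:
        lower, upper, summand = args
        return f"sum from {lower} to {upper} of {summand}"
    raise ValueError(f"unexpected number of arguments: {len(args)}")


print(speak_sum())                    # sum
print(speak_sum("f of x"))            # sum of f of x
print(speak_sum("R", "f of x"))       # sum over R of f of x
print(speak_sum("a", "b", "f of x"))  # sum from a to b of f of x
```

In this sketch the two-argument case is where the ambiguity shows up: a bare `sum($a,$b)` on a munderover would come out as "sum over a of b", which is why a placeholder or a property is needed to mark the form without a summand.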

@dginev
Contributor

dginev commented Nov 19, 2024

One additional markup variant that has been discussed in general and fits with David's sum examples is using a "higher-order" application. There the summation is first attached to its indexing signature. And then, a level up, is applied to an argument.

In the process of writing my example I also noticed that summations typically have an explicit indexing variable, which is also used in the argument being summed over. I wondered whether that is motivation to reuse the same intent concept index. Maybe not, but here is some prototype markup for what I am wondering about:

$$ \sum_{i=a}^b f(x_i) $$

<mrow intent="$sum_op($arg)">
  <munderover arg="sum_op" intent="index($op, in($index_var, interval($from,$to)))">
    <mo intent="sum" arg="op">∑</mo>
    <mrow>
      <mi arg="index_var">i</mi>
      <mo>=</mo>
      <mi arg="from">a</mi>
    </mrow>
    <mi arg="to">b</mi>
  </munderover>
  <mrow arg="arg">
    <mi>f</mi><mo>(</mo>
      <msub intent="index(x,$index_var)">
        <mi>x</mi>
        <mi arg="index_var">i</mi>
      </msub>
    <mo>)</mo>
  </mrow>
</mrow>

Some of that was discussed in #454.

This is one of the cases where one can really feel that intent has a compact syntax. There are various extensions that jump to mind to tidy up and clearly mark the different kinds of arguments.

The way I wrote the intent expressions above backs out of any assumptions about "sum" being a known operator, and ought to be usable even without custom conventions. But it gets verbose and functional, much more so than a convention-based sum(i,a,b,$arg) or such.

Here are two additional examples, showing how the markup changes when i is left "to the reader", and when there is no variable at all.

$$ \sum_{i} f(x_i) $$

<mrow intent="$sum_op($arg)">
  <munder arg="sum_op" intent="index($op, $index_var)">
    <mo intent="sum" arg="op">∑</mo>
    <mi arg="index_var">i</mi>
  </munder>
  <mrow arg="arg">
    <mi>f</mi><mo>(</mo>
      <msub intent="index(x,$index_var)">
        <mi>x</mi>
        <mi arg="index_var">i</mi>
      </msub>
    <mo>)</mo>
  </mrow>
</mrow>

$$ \sum f(x_i) $$

<mrow intent="$sum_op($arg)">
  <mo intent="sum" arg="sum_op">∑</mo>
  <mrow arg="arg">
    <mi>f</mi><mo>(</mo>
      <msub intent="index(x,$index_var)">
        <mi>x</mi>
        <mi arg="index_var">i</mi>
      </msub>
    <mo>)</mo>
  </mrow>
</mrow>

To summarize: A sum may be indexed or bare, and its indexing variable may be constrained by a range or bare. And each of those cases may benefit from specialized speech.
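
One practical consequence of this higher-order style is that every `$name` in an intent has to resolve to an `arg` on some descendant. Below is a minimal sketch of such a check using Python's standard `xml.etree.ElementTree`. The function name `unresolved_references` is hypothetical, the `\$(\w+)` pattern is a simplification of the reference syntax, and the lookup accepts any descendant carrying that `arg`, which may be looser than the resolution rules in the spec.

```python
import re
import xml.etree.ElementTree as ET

# The \sum_{i} f(x_i) example from above, with balanced parentheses.
MATHML = """
<mrow intent="$sum_op($arg)">
  <munder arg="sum_op" intent="index($op, $index_var)">
    <mo intent="sum" arg="op">∑</mo>
    <mi arg="index_var">i</mi>
  </munder>
  <mrow arg="arg">
    <mi>f</mi><mo>(</mo>
      <msub intent="index(x,$index_var)">
        <mi>x</mi>
        <mi arg="index_var">i</mi>
      </msub>
    <mo>)</mo>
  </mrow>
</mrow>
"""

# Simplified pattern for $name references inside an intent value.
REFERENCE = re.compile(r"\$(\w+)")


def unresolved_references(root):
    """Return (element tag, name) pairs for $name references with no matching arg."""
    problems = []
    for element in root.iter():
        intent = element.get("intent")
        if intent is None:
            continue
        # Simplification: accept any proper descendant carrying arg="name".
        available = {
            d.get("arg") for d in element.iter() if d is not element and d.get("arg")
        }
        for name in REFERENCE.findall(intent):
            if name not in available:
                problems.append((element.tag, name))
    return problems


root = ET.fromstring(MATHML)
print(unresolved_references(root))  # [] for the example above
```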

@NSoiffer
Contributor Author

@davidcarlisle : what you did is not bad. I started to reply with a suggestion and then realized that this is not needed at all: we have :largeop. With this property, there is no need for a sum, etc., concept name. You just need to mark the munderover with intent=":largeop" and hopefully good speech is generated by the AT. This will work for any of the 66 characters you found... plus any others that are (rightly or wrongly) tagged with this property.

Doing this is a much simpler approach than the other suggestions in this issue. And the only changes that need to be made to the core concept list are to remove sum, product, and integral and any other large op that we have in the list (from the distant past).

@dginev
Contributor

dginev commented Nov 21, 2024

Indeed, one choice here is deciding whether we want "hopefully" (Core list) or "certainly" (Open list) for large operators.

@davidcarlisle
Collaborator

@NSoiffer oh :largeop had completely gone out of my mind, yes that changes things...

@NSoiffer
Contributor Author

Indeed, one choice here is deciding whether we want "hopefully" (Core list) or "certainly" (Open list) for large operators.

Apologies. I'm missing your point other than my use of the word "hopefully" (AT SHOULD use the core list, but it doesn't have to, hence "hopefully"). I don't understand '"certainly" (Open list) for large operators'.


3 participants