IntentName view of literals #459

dginev · 2023-04-05T21:34:19Z

This PR is a follow-up on the discussion in #457 , and also attempts at consolidating a range of perspectives that I have held for some time, but haven't succeeded in contributing to the spec text (yet?)

The grammar is quite useful for summarizing such changesets I think, so I am adding it excerpted here:

intent             := self-property-list | expression
self-property-list := property+
expression         := S ( atom property* | application ) S
atom               := concept | literal | number | reference
application        := expression '(' arguments ')'
arguments          := expression ( ',' expression )*

number             := '-'? \d+ ( '.' \d+ )?
reference          := '$' NCName
concept            := IntentName
literal            := '_' IntentName?
property           := S ':' IntentName
IntentName         := L ('-'? L)*
S                  := [ \t\n\r]*

It can also be viewed at the gh-pages of dginev/mathml, though I suspect that preview is quite temporary.
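To make the atom productions concrete, here is a minimal Python sketch of a classifier for single tokens. It is an illustration only, not part of the PR: the function names are hypothetical, `L` is approximated by `str.isalpha()`, and NCName is replaced by a simplified ASCII pattern.

```python
import re

# number := '-'? \d+ ( '.' \d+ )?
NUMBER = re.compile(r"-?[0-9]+(?:\.[0-9]+)?")
# Simplified stand-in for XML NCName (assumption: ASCII only).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_intent_name(s):
    """IntentName := L ('-'? L)*, with L approximated by str.isalpha()."""
    if not s or not s[0].isalpha() or not s[-1].isalpha():
        return False
    prev_dash = False
    for ch in s:
        if ch == "-":
            if prev_dash:  # two dashes in a row are not allowed
                return False
            prev_dash = True
        elif ch.isalpha():
            prev_dash = False
        else:
            return False
    return True


def classify_atom(token):
    """Return 'number', 'reference', 'literal', 'concept', or None."""
    if NUMBER.fullmatch(token):
        return "number"
    if token.startswith("$"):
        return "reference" if NCNAME.fullmatch(token[1:]) else None
    if token.startswith("_"):
        rest = token[1:]
        # literal := '_' IntentName?  (a bare '_' is allowed)
        return "literal" if rest == "" or is_intent_name(rest) else None
    return "concept" if is_intent_name(token) else None
```

Under these assumptions `_by` and a bare `_` classify as literals, `rabbit-polynomial` as a concept, and any name containing a digit is rejected by the letters-only rule.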

Itemized list of proposals:

Introduce an IntentName that is a dash-separated sequence of letters, and use it for all name-like categories. Keep NCName for id-like categories.
Separate concept from literal where only entries starting with _ are considered grammatically "literal".
- Unknown concepts are still to be spoken as if they were literal, but AT can assume their intention was to provide a name, rather than a raw textual override. So _by should always be a literal in its English use, while _factorial should only ever be a literal if it is trying to override the factorial name to always use that string.
~~The top-level intent is either provided, an expression, or instead it is left implied while carrying a list of properties.~~ - independently added via table and script properties #462
I've grouped all variants in the grammar that do not have inner syntax into an atom category. They happen to be the only expressions allowed to carry properties for now (based on discussions so far)

As to the substance of #457 - I have de-emphasized the role of "known properties" in the text, and re-emphasized the grammatical structure for application. I added the clarification for the _ function head to the application rule, which seems to be a good place to encounter it.

I still find it hard to write these PRs, and my workflow is a bit clunky, so there may be markup errors here, or awkward phrasing. Language feedback is very welcome - I can improve the text as needed.

I also have a hat which likes to criticize grammars for getting more verbose - and this PR indeed does that. A lot of that comes from trying to get clarity on the types of data we are working with (e.g. fully separating concepts from literals) while also living with the realities imposed by basing this all on NCName. Not sure if this is optimal, but I think I like it more than what's currently in the spec.

davidcarlisle · 2023-04-05T23:19:13Z

In the discussions around switching to "template" variants, my feeling was the consensus was that _(_xxx,... was accepted as a "consequence of the grammar" so not something the spec would advertise as a feature, more than the examples already there. This goes in rather the opposite direction.

I think separating concept and literal in the grammar is problematic as there are no testable differences in behaviour as the concept list is open. I don't think we should be encouraging over-use of _ prefixed names.

Specifically on letters, why restrict to just letters? Currently valid examples such as intent='l2-norm($x)' seem natural to me.
The actual definition of letter used, https://www.w3.org/TR/REC-xml/#NT-Letter, is essentially Unicode 2. That proved problematic and XML dropped that definition; that reference is prefixed by the text

Because of changes to productions [4] and [5], the productions in this Appendix are now orphaned and not used anymore in determining name characters. This Appendix may be removed in a future edition of this specification;

As discussed in one of the issues, we could use something like categories L Mn for "letter" without restricting to Unicode 2.
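A rough Python sketch of what a category-based check could look like, using the standard `unicodedata` module. The helper names and the idea of passing the allowed categories as a parameter are assumptions for illustration; the sketch also does not constrain which category may start a name.

```python
import unicodedata


def char_allowed(ch, categories=("L", "Mn")):
    """True if the character's Unicode general category is allowed.

    "L" covers Lu, Ll, Lt, Lm and Lo; "Mn" is non-spacing marks.
    """
    cat = unicodedata.category(ch)
    return any(cat.startswith(c) for c in categories)


def name_allowed(name, categories=("L", "Mn")):
    """Dash-separated, non-empty runs of allowed characters."""
    parts = name.split("-")
    return all(
        part and all(char_allowed(c, categories) for c in part)
        for part in parts
    )
```

With the default categories `l2-norm` is rejected because `2` has category `Nd`; adding `"Nd"` to the allowed tuple would admit it.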

Moving the description of _(.... to be read "without additional connectives" leaves open how that interacts with properties, notably _:function . I think the current version where the reading styles are associated to property names, and _( is defaulted to :silent gives a more uniform description, and matches how we can similarly describe reading styles for tables in terms of properties.

But other parts seem good

intent             := implied | expression

I'm not sure about the name implied, but that's a minor point. Restructuring the property-only intents to be a separate named clause improves the description of them in the text, as you have a named thing to describe. Also, not allowing these as function heads is probably for the best, unless someone has a pressing need for :property(thing)

NSoiffer · 2023-04-06T06:24:43Z

I'm still skeptical of having a special case meaning for an empty head. I'd rather see the production

literal            := '_' IntentName

That makes the template a little more awkward to generate since you have to come up with a name. You can still get the template with something like _this_has_no_meaning:silent(...).

Other than that, I like the rewrite. Unlike David, I think separating out literals makes it clearer that they are different.

Not part of the rewrite, but something to change: "5.1.5.1: Tables". With properties, I think we have something to say. Or maybe nothing needs to be said since there is potentially a solution to the problem discussed in that section.

davidcarlisle · 2023-04-06T07:38:28Z

@NSoiffer

Unlike David, I think separating out literals makes it clearer that they are different.

My issue is that it makes them seem different when they are the same.

We haven't finalised what is in core, but your suggested list in google sheets has 21 entries.

So of the infinitely many names matching concept all but 21 are "unknown" and operationally identical to literal so it will lead to endless confusion and debate on whether it should be intent="rabbit" or intent="_rabbit" with the answer being, it makes no difference but the form without _ looks nicer.

Probably the final core list will be bigger than 21, but it will be a small finite number and not include rabbit

dginev · 2023-04-06T13:51:31Z

endless confusion and debate on whether it should be intent="rabbit" or intent="_rabbit"

I think there is a simple rule of thumb - "does this string denote a concept or a textual override?"

In arXiv:2209.06099 the author may have wanted to speak their rabbit polynomials directly, as in (from Theorem 1.1):

<msub intent="_rabbit($arg)">
  <mi>R</mi>
  <mi arg="arg">d</mi>
</msub>

But in a system that had pre-defined speech for rabbit polynomials (a hypothetical MathRabbit AT), they would have leveraged the concept-based AT support and marked it as:

<msub intent="rabbit-polynomial($arg)">
  <mi>R</mi>
  <mi arg="arg">d</mi>
</msub>

Similarly, a primary school teacher who wants to tone down the speech for their use of rabbit emojis could reach for

<mi intent="_bunny">🐇</mi>

but would have no reason to use a concept here otherwise, as simple variables tend to be self-voicing.

brucemiller · 2023-04-06T14:01:40Z

I would have thought that the arXiv author would have been thinking of "rabbit" (or "rabbit-polynomial") as a concept rather than "just speech". Moreover, that MathRabbit AT may or may not exist at the time of authoring, but the intent markup should be valid (or "sensible") even for other AT.

davidcarlisle · 2023-04-06T14:05:03Z

I see no reason to use _ in any of those examples

<msub intent="rabbit($arg)">
  <mi>R</mi>
  <mi arg="arg">d</mi>
</msub>

or

<msub>
  <mi  intent="rabbit">R</mi>
  <mi arg="arg">d</mi>
</msub>

make perfect sense. The paper has rabbit in the title, hard to argue it's not a concept.

The distinction is so arbitrary, subjective and time-dependent, I don't think we should separate them in the grammar.

davidfarmer · 2023-04-06T14:14:34Z

I understand the principle, but it still seems odd to me to distinguish between `intent="multiplication"` and `intent="_times"`. Another issue for me (correct me if I am wrong) is that the leading underscore says "pronounce it this way", while without the underscore it says "pronounce it this way unless you have additional information about this concept". So, why would I add an underscore when all that does is stop AT from possibly doing a better job?

davidcarlisle · 2023-04-06T15:40:23Z

according to the version here properties have no effect on unknown concepts.

f:postfix($x) would make "f of x"

To get a reading "x f" you would need a syntactic literal head,

_f:postfix($x) would make "x f"

Is that intentional?

If not, more or less every reference to literal needs to be changed to say "literal or unknown concept name".

In the current spec version, this isn't an issue as a "literal" is an "unknown concept name" by definition.
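The difference can be sketched as a toy reader in Python. Everything here is hypothetical: the function name, the `props_on_unknown` switch, and the exact word order chosen for each property are assumptions made only to show the two interpretations side by side.

```python
def read_application(head, prop, args, known=(), props_on_unknown=True):
    """Toy speech for head:prop(args).

    props_on_unknown=False models the reading where properties are
    ignored for concept names the AT does not know.
    """
    is_literal = head.startswith("_")
    word = head.lstrip("_").replace("-", " ")
    honoured = is_literal or head in known or props_on_unknown
    style = prop if (prop and honoured) else "function"
    if style == "postfix":
        return f"{' '.join(args)} {word}"
    if style == "prefix":
        return f"{word} {' '.join(args)}"
    if style == "infix":
        return f" {word} ".join(args)
    if style == "silent":
        return " ".join(args)
    # default functional reading
    return f"{word} of {', '.join(args)}"
```

With `props_on_unknown=False`, `f:postfix($x)` falls back to "f of x" while `_f:postfix($x)` gives "x f"; with the default setting both heads give "x f", which is the behaviour when a literal is simply an unknown concept name.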

dginev · 2023-04-10T13:49:41Z

To answer some of the points that were raised:

So, why would I add an underscore when all that does is stop AT from possibly doing a better job?

If the state of AT at time of publication can do a good job vocalizing via a given Intent concept, that is certainly the right approach, I agree.

There are two different situations when an underscore is helpful:

Working with an AT which does a bad job on any given rabbit concept and has no outlook for quick updates. The author / generator can still achieve a good outcome on a short notice by rewriting the annotation to an underscore form.
- mixfix notations in higher mathematics are the usual example here: there are many such notations, used rarely when compared to K12 notations, for which it would be easier for an author to directly specify the speech, than for them to reach out to some (or is it all ?) AT developer(s), in order to register the notation.
Transformation to Content forms is aided by the clear separation of literal text from concepts.
- Consider dimension:silent(8, _by, 8) where extraction is possible and _(8, _by, 8) where extraction is clearly not possible. Contrast that to dimension:silent(8, by, 8) and _(8, by, 8) where extraction looks possible in both cases, but is very slippery to do right - as it isn't immediately clear if by should be carried along as a <ci>by</ci> or <csymbol>by</csymbol> or ignored.
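The extraction argument can be illustrated with a small Python sketch over an already-parsed head and argument list. The function name and the string output format are hypothetical; it only shows the rule "drop underscore arguments, give up on an underscore head".

```python
def extract_content(head, args):
    """Return a functional form, or None when the head is a literal.

    Underscore-prefixed arguments are treated as raw text and dropped.
    """
    base = head.split(":", 1)[0]  # strip any property such as :silent
    if base.startswith("_"):
        return None  # pure speech override, nothing to extract
    kept = [a for a in args if not a.startswith("_")]
    return f"{base}({', '.join(kept)})"
```

For `dimension:silent(8, _by, 8)` this yields `dimension(8, 8)`, for `_(8, _by, 8)` it yields nothing, and for `dimension:silent(8, by, 8)` it keeps `by` as a third argument, which is the slippery case: the tool cannot tell whether `by` was meant as content.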

odd to me to distinguish between intent="multiplication" and intent="_times".

For multiplication vs _times, this is closer to the aliasing feature/consideration see issue. The focus for the underscore feature is distinguishing between intent="times" and intent="_times" in this regard, where the latter forces the exact string.

I would have thought that the arXiv author would have been thinking of "rabbit" (or "rabbit-polynomial") as a concept rather than "just speech".

That is perfectly fine, then the concept use is appropriate. Since aliasing is unresolved, it's hard to know what MathRabbit will or will not do with the "rabbit" concept without trying out a concrete implementation. If it didn't do what was expected, and showed no desire to issue patches, the underscore mechanism gives the author an out. Maybe it's better to refocus on the primary school variation, where the speech override is the only aim:

<mi intent="_bunny">🐇</mi>

according to the version here properties have no effect on unknown concepts.

When I was proof-reading, I thought my changes relegated the SHOULD to MAY for unknown concepts, but I would have to double-check. The text I was looking at was:

"In the case of a [=concept=] name, the property MAY be used in choosing the alternatives supported by the AT."

I am quite novice on the subtleties of spec use for MAY and SHOULD, but it seemed to me that if MAY was good enough for the known concepts, it should be just as adequate for the unknown ones.

Thanks for the mention of using Unicode L, I'll incorporate that. Not sure which link to use as a primary reference to it however. Ch4 of Unicode 15 defines character properties, but doesn't discuss L in particular.

davidcarlisle · 2023-04-10T14:04:34Z

by is a type of multiplication an alias for dimensional product, I can't see any advantage for using underscore at all here

by:infix(8,8)

gives the correct reading and natural functional form.

dginev · 2023-04-10T14:33:17Z

by is a type of multiplication an alias for dimensional product

We part ways here. By is first and foremost an English preposition or adverb (dictionary). The use in 8-by-8 is abbreviated from "8 rows followed by 8 columns" (where one could argue for a different verb than "follow", it is the verb for reading out the written notation).

A good article illustrating this is here:

The dimensions or order of a matrix gives the number of rows followed by the number of columns in a matrix. The order of a matrix with 3 rows and 2 columns is 3 × 2 or 3 by 2. [...]
C is a matrix of order 2 × 4 (read as ‘2 by 4’)

One can imagine "multiply x by y", "divide x by y", "represent x by y" ... where the prepositional nature of the word is also clear.

The principle is even clearer if we reach for conjunction words - and and or may be used in their formal logic denotations, but they may also be used as raw text for narration, completely disconnected from their formal senses. In _(_half-open-interval,_between,$x,_and,$y) the _and is not the same as logic1.and.

brucemiller · 2023-04-10T14:52:25Z

The usage mentioned in the Dictionary you referenced is closest to "By and measurements and amounts", where the meaning is more about combining components of a vector space. As in "I'll whack you with a two by four". I personally don't see it as being an abbreviation of anything (or I can't quite guess what). It is also a different usage than "multiply two by four". Restricting concepts to only be "Formal" concepts, seems a very painful route. OTOH, using such obviously overloaded (and thus ambiguous) terms as (probably unknown) concepts, rather than literals, does also seem prone to eventual collisions.

dginev · 2023-04-10T17:16:33Z

Restricting concepts to only be "Formal" concepts, seems a very painful route.

It's more the converse - I want a vehicle that is never a concept, and is always raw text. Well-meaning practitioners may differ on what they view as a concept, so the author should always be the ultimate arbiter for their own materials. And the group can rely upon encyclopedic resources for example values that may easily reach consensus (as their presence in such resource is already proof of social consensus).

I personally don't see it as being an abbreviation of anything (or I can't quite guess what).

I cited a primary source for "followed by" above. Here is another use in compass arithmetic wiki page, where it is possibly closer to "bisecting by":

The sixteen quarter-winds are the direction points obtained by bisecting the angles between the points on the 16-wind compass rose.
[...]
As a mnemonic (memory device), minds familiar encode the meaning of "X by Y" as "one small measure" from X towards Y". It can be noted such measure is 11+1⁄4°. So, for example, "northeast by east" means "one quarter of the gap from NE towards E".
[...]
The quarter winds are expressed with an Italian phrase, "Quarto di X verso Y" (one quarter from X towards Y), or "X al Y" (X to Y) or "X per Y" (X by Y). There are no irregularities to trip over; the closest principal wind always comes first, the more distant one second, for example: north-by-east is "Quarto di Tramontana verso Greco"; and northeast-by-north is "Quarto di Greco verso Tramontana".

To me the natural way to represent such English use of by is to mark it as text:

<mi mathvariant="normal" intent="_(northeast,_by,east)">NEbE</mi>

and once the question is raised how to formalize, establish the concept at play (here it seems to be "quarter-wind") and mark it:

<mi mathvariant="normal" intent="quarter-wind:silent(northeast,_by,east)">NEbE</mi>

Maybe it is painful, but it is "the pain of formalization reserved for formalization" rather than "the pain of formalization extended into accessibility". And certainly less painful than getting whacked with a two by four :>

Edit: I should mention, once we also have a SailorCAT system that is capable of dealing with nautical units, and general vernacular, this may get trimmed down to the simplest (and originally intended) functional form:

<mi mathvariant="normal" intent="quarter-wind(northeast,east)">NEbE</mi>

davidcarlisle · 2023-04-10T17:55:53Z

you can force a literal interpretation on the current draft using a leading underscore you do not need this pr for that.

on "by" of course there are multiple unrelated uses of that word, but that doesn't imply you should force it to be text with no implied semantics. root might be a radical or it might refer to roots of equations or possible other uses. but that has not stopped us using root() as a function name.

dginev · 2023-04-10T19:19:56Z

you can force a literal interpretation on the current draft using a leading underscore you do not need this pr for that.

Yes, the PR changes the framing of that feature, it does not introduce it. I will be cleaning it up in the next couple of days to be compatible with the current state of the spec. We're likely to discuss it again next Thursday (April 20th). I doubt much will be resolved before then.

of course there are multiple unrelated uses of that word, but that doesn't imply you should force it to be text with no implied semantics

It will be the "lack of AT support in cases deemed important to remediate" that will lead to forcing text, a very practical motivator. If AT does a good job, and/or the author is satisfied with the functional notation outcome, there won't be any need for overrides.

root might be a radical or it might refer to roots of equations or possible other uses. but that has not stopped us using root() as a function name.

I had brought an example to the group some time back where one could no longer use "root" directly, which was to emphasize a conceptual nuance. In order to vocalize the principal-square-root, especially in pedagogical materials, one would reach to the differently named - but very much related - concept. Leveraging simple words for Core makes sense for the sake of convenience, such as plus, times, root, power, instead of the more conceptually correct (but longer), addition, multiplication, radical-expression, exponentiation. But using that as a basis for any claims for the Open terrain isn't really sound. If it so happened that the preferred preposition for the principal square root happened to be "over" rather than "of", there should be no issue with AT receiving the annotation:

<msqrt intent="_(principal-square-root,_over,$x)">
  <mi arg="x">x</mi>
</msqrt>

and delivering the desired narration.

davidcarlisle · 2023-04-10T20:42:11Z

<msqrt intent="_(principal-square-root,_over,$x)">

In the event I really wanted to force that wording I can't see ever wanting to encode a one argument function as a silently named function of three spurious arguments.

<msqrt intent="principal-square-root_over:prefix($x)">

is a far more reasonable way to express this. But I'd probably use

<msqrt intent="principal-square-root($x)">

"over" seems a strange word to use (although in other examples using :prefix to prevent "of" is natural.)

dginev · 2023-04-10T21:02:04Z

In the event I really wanted to force that wording I can't see ever wanting to encode a one argument function as a silently named function of three spurious arguments.

See, this is why this PR is needed. Your perspective on underscore is as if it was a symbol from a Content Dictionary. Instead, it is just a pragmatic means to an end that interoperates with the functional syntax. If the syntax is completely unpalatable we still have the option to reach for something different, such as square brackets:

<msqrt intent="[principal-square-root, _over, $x]">

Btw, principal-square-root_over:prefix is something that I think is so artificial that I wouldn't write it if I had to. I'd rather introduce empty mrow wrappers than be caught generating that :)

davidcarlisle · 2023-04-10T21:10:09Z

No sorry, this PR is a move in the wrong direction. As seen with _by it introduces a completely spurious choice for every function introduced: whether to make it grammatically a concept or a literal. In that case I would say clearly by is preferable, but if for whatever reason you prefer _by or BY or by_ that's OK. It is not helpful to force a grammatical distinction here.

NSoiffer · 2023-04-12T06:46:39Z

In reading through the discussion, I think I have a third point of view. Part of this is that although I'm the AT advocate and not the content person, I'd still like to make the two needs as compatible as possible. So I'd rather see us encouraging intent='some-meaningful-name:silent(_forced_speech, $x, _more_forced_speech)' rather than _(_forced_speech, $x, _more_forced_speech). That at least gives AT a chance to do something else if it knows anything about the needs of the user and some-meaningful-name along with giving any content or search program some hope.

One thought about the difference between literals and concept names that I have not seen mentioned is internationalization. With a repository of open names, then Deyan's idea of downloading them before an AT release and doing a translation is a possibility (I'd really like to see a column that uses the name in a phrase to improve the chance that auto translation does a reasonable job, but that's not this issue).

My feeling is that literals never get translated because they wouldn't be listed in a repository of open names. Does anyone else think there is a difference between literals and concept names wrt to translation?

davidcarlisle · 2023-04-12T08:19:54Z

@NSoiffer

Does anyone else think there is a difference between literals and concept names wrt to translation?

As currently worded in the spec that's necessarily true as names known to the system are concepts, names not known are literals. As the list of names known to the system is system specific, and, as you say, possibly dynamic, there should not be a syntactic distinction and forcing the author to choose.

I would say more or less any name usable on its own or as a function head or "real" argument of a function could be known by some system so should be in the same syntactic category as concept.

If we want a syntactic "literal" for connective words, we could make them share a grammatical category with comma

if you had something like

application        := expression '(' arguments? S literal? S')'
arguments          := S literal? expression ( (',' | literal) expression )*

then you could replace commas by words

closed-interval:prefix(_from $a _to $b _inclusive)

and recover a semantic expression by dropping initial and final literals and replacing intermediate ones by commas.

This would also allow dropping spurious comma separated arguments from _(

_( _free $r _algebra_over $x )

still looks horrible but better than the version with commas, although

free-r-algebra:prefix(_over $x)

would be preferable, and easier to extract a semantically meaningful expression.
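A minimal Python sketch of that recovery step, assuming flat (non-nested) arguments and the hypothetical whitespace-separated syntax above. The function name is made up for illustration, and for flat arguments "drop outer literals and turn inner ones into commas" reduces to dropping every literal and joining the rest with commas.

```python
import re


def recover_semantic(intent):
    """Drop '_' connective words and any property; keep real arguments."""
    m = re.fullmatch(r"\s*([^\s(]+)\((.*)\)\s*", intent)
    if m is None:
        raise ValueError(f"not a flat application: {intent!r}")
    head, body = m.group(1), m.group(2)
    tokens = re.findall(r"[^\s,]+", body)  # split on spaces and commas
    args = [t for t in tokens if not t.startswith("_")]
    return f"{head.split(':')[0]}({','.join(args)})"
```

Applied to `closed-interval:prefix(_from $a _to $b _inclusive)` this gives `closed-interval($a,$b)`. Nested applications would need a real parser rather than a regular expression.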

dginev · 2023-04-12T11:24:39Z

@NSoiffer

Part of this is that although I'm the AT advocate and not the content person, I'd still like to make the two needs as compatible as possible.

Same here. But I don't believe in forcing the issue of creating Content forms, because the symbols we receive (even from group members) are often too artificial to be useful.

Notice that even in David's last example he added a symbol that is too close to language with free-r-algebra. Unless there is discipline for consistent use of free-algebra:silent() we'll end up with free-r-algebra, free-z-algebra, free-c-algebra, ...

I have taken a view closer to "progressive enhancement", which I tried to illustrate recently with my quarter-wind example here. Remediators focused on speech can solve the harder issues they have quickly with _(), using the natural speech patterns they already know, and return to Content remediation after/later when that becomes a goal.

What I expect to see in practice if the _() feature is removed is a zoo of hacks such as a:silent(), z:silent(), silent:silent(), pieces:silent(), ... where uninterested remediators will simply add any string to move forward with the task they have at hand. Ending up with symbols of that quality, or indeed with symbols hiding language as in principal-square-root_over:prefix.

That will not really improve on _ but worse - will make it hard to predict which forms were intended as Content, and which weren't.

The underscore allows well-meaning remediators to state "this is really just a text override", and avoid that confusion.

Does anyone else think there is a difference between literals and concept names wrt to translation?

The state of the art in translation currently uses neural language models, and they can be made to work with either setup, as long as there is a prior stage that fully serializes an intent expression in a textual form from the source language. ("free R algebra on X" already works in Google Translate for the Bulgarian translation, but Bing translate makes a mistake, translating "free" as in "cost-free algebra" rather than "libre algebra")

For workflows that do not want to rely on neural models, dictionary lookup on the fixed parts of speech is possible (e.g. prepositions, determiners, conjunctions, common adverbs, numeral words). Mapping _over to _над in Bulgarian for example will lead to a possibly bumpy but understandable baseline translation. Likely not ideal, but not useless.
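As a sketch of the dictionary-lookup baseline in Python: the table below is a tiny hand-made example (only the connective words mentioned in this thread), and the function name is hypothetical. Literals without an entry, and all non-literal arguments, pass through unchanged.

```python
# Illustrative English -> Bulgarian table for connective words only.
BG = {"over": "над", "of": "от", "times": "по"}


def translate_literals(args, table):
    """Translate '_word' arguments via table lookup; keep the rest."""
    out = []
    for a in args:
        if a.startswith("_") and a[1:] in table:
            out.append("_" + table[a[1:]])
        else:
            out.append(a)
    return out
```

For example, `translate_literals(["principal-square-root", "_over", "$x"], BG)` rewrites only the middle argument to `_над`; the concept name still needs whatever translation the AT provides for it.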

I think again here there is a question of timescale - if someone wanted a translation to their language today, they would use a translation engine that exists today. If we wanted a perfect symbolic translation of free-algebra($x,$y) we would need to wait for some (all?) AT vendors to implement that. Realistically, I suspect we'll see both approaches used in practice.

davidcarlisle · 2023-04-12T11:32:11Z

Notice that even in David's last example he added a symbol that is too close to language with free-r-algebra. Unless there is discipline for consistent use of free-algebra:silent() we'll end up with free-r-algebra, free-z-algebra, free-c-algebra, ...

Yes as I say I would use free-algebra($r,$x) but using those named forms would be preferable to just generating strings with _( and spurious arguments.

brucemiller · 2023-04-12T12:36:40Z

although I'm the AT advocate and not the content person, I'd still like to make the two needs as compatible as possible.

Compatible, yes, but not (I think) from the point of view of *translating* presentation+intent to content. That will be perceived as a replacement for Content MathML; and would be a very poor one unless designed to be a *complete* replacement from the start, which is way out of scope. OTOH, where the semantic slant of Concepts helps accessibility, that's a good thing! And if a MathML generator aspires to create both intent and Content MathML, it ought not to have to work with two completely different views of "semantics" and collections of dictionaries and such. So, compatibility is good!

My feeling is that literals never get translated because they wouldn't be listed in a repository of open names. Does anyone else think there is a difference between literals and concept names wrt to translation?

Although you might implement it that way, personally, I wouldn't make that assumption at the spec level; it seems to force a particular style of implementation. But then the logistics of translation are definitely something we need to come to grips with. I see two basic strategies: If dictionaries (core, open) have translation information, then presumably the AT ends up with a sequence of translated phrases and random literals. I think you're suggesting just keeping the literals as-is, which will work sometimes and be awkward others. But the AT can attempt to translate the literals in isolation, with some external tool, and paste them back in; it might be not quite grammatical however. The other approach is to generate the complete phrase in (say) English and then pass that to an external translator. This, as @dginev suggests, is probably the first, easiest, approach. This would likely be more grammatical, but probably not mathematics appropriate.

brucemiller · 2023-04-20T13:45:47Z

I'm still finding myself with mixed feelings here. I agree with @davidcarlisle that this PR goes too far in giving prominence to literals and seeming to encourage forcing specific speech. My understanding was that this was discouraged by the AT folks. OTOH, without some part of this PR, we're leaving the notion of "literal" a bit too vague with people implicitly landing on quite different interpretations.

I think that at least we should be clearer about distinguishing "known" concepts (found in Core or Open) from unknown ones, and be clear that unknown concepts are treated like literals --- but are not literals.

davidcarlisle · 2023-04-20T16:05:42Z

@brucemiller

I think that at least we should be clearer about distinguishing "known" concepts (found in Core or Open) from unknown ones

I don't think that there should be any syntactic difference. "open" (and even possibly "core") are not (in current proposals) machine readable, but rather simply web accessible lists where implementers can record the names for which they implement rules. "open" in particular is very time dependent. What matters at run time is neither of those lists but rather the system specific list implemented by the consuming system. This is unknown to the document author so the author should not have to distinguish known concepts from unknown ones. I can not see any cases where intent="foo" should be treated as a different grammatical category to intent="_foo" We already say that the latter won't be in the lists so will be a literal.

brucemiller · 2023-04-20T17:00:01Z

distinguish known concepts from unknown ones. I can not see any cases where intent="foo" should be treated as a different grammatical category to intent="_foo" We already say that the latter won't be in the lists so will be a literal.

IF "foo" eventually showed up in an Open Dictionary, with some behavioral information (translations or something?) and the AT used that open dictionary, the "foo" would be treated differently than "_foo". I'm not suggesting as major a change as Deyan's PR. I'm just suggesting that we be careful with the language and that: In the case where "foo" was NOT found in a dictionary, it would be "treated as a literal", but NOT that it would BE a literal.
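The distinction can be written down as a small Python sketch. The function name, the three category labels, and the dictionary contents are hypothetical; the point is only that an unknown name and a literal can produce the same words today while staying in different categories.

```python
def resolve(name, dictionary):
    """Return (category, speech) for a single name.

    dictionary maps known concept names to the speech an AT would use.
    """
    if name.startswith("_"):
        # literal: always the author's exact words, never looked up
        return ("literal", name[1:].replace("-", " "))
    if name in dictionary:
        return ("known-concept", dictionary[name])
    # unknown concept: spoken like a literal, but still a concept
    return ("unknown-concept", name.replace("-", " "))
```

With an empty dictionary, `foo` and `_foo` are both spoken as "foo". Once `foo` appears in a dictionary the AT consults, only the reading of `foo` changes.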

dginev · 2023-04-24T15:10:00Z

Bookkeeping another clarifying example on the lines of by vs _by, this time using the _of preposition.

I was delighted to catch in this lecture recording the same Core concept (multiplication) spoken with different words in rapid succession, transcribed:

Theta is the same as one half of g i j times g i j dot. Which you can also think of as one half times the trace of this matrix Q.

"one-half of" vs "one-half times", both written with $\frac{1}{2}$ on the board.

I suspect the key thing to notice here is that each language pattern is sensible because we are talking about distinct but isomorphic mathematical operations. Multiplying x by a scalar 0.5 OR having a "one-half" function that halves x, are just different abstractions over the same operation. And hence - different possible speech.

To tie the example with the discussion here, I very much agree that authors should not be encouraged to micromanage connector words. What I am trying to illustrate is that language overrides can have a clear boundary dividing them from intent Concepts. And that AT itself may want to use different connector words based on context.

It wouldn't be an error if intent="times(0.5, x)" sometimes vocalizes as "one half of x" and sometimes vocalizes as "one half times x", among the many other options ("zero point five times x",...).

So if an author wanted to force a specific reading using _of, they could use intent="_(_half,_of, $x)". I think this is useful exactly to discourage adopters from picking intent="one-half(x)" when the author wants to hear "of" and intent="times(0.5,x)" when the author wants to hear "times". If we can contain the uses of Intent which aim at micro-managing the final speech solely within the "underscore territory", we get some additional clarity/guarantees.
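To see how the three intent strings in this comment fall out under the grammar excerpt in the PR description, here is a minimal recursive-descent sketch in Python. Assumptions: `L` is taken to be an ASCII letter, NCName is approximated by a simple pattern, and the top-level self-property-list alternative is omitted. The function names and the dictionary-based tree shape are invented for illustration.

```python
import re

NAME = r"[A-Za-z]+(?:-[A-Za-z]+)*"  # IntentName := L ('-'? L)*, with L assumed ASCII
TOKEN = re.compile(
    rf"\s*(?:(?P<number>-?\d+(?:\.\d+)?)"
    rf"|(?P<reference>\$[A-Za-z_][\w.-]*)"  # rough stand-in for '$' NCName
    rf"|(?P<literal>_(?:{NAME})?)"
    rf"|(?P<concept>{NAME})"
    rf"|(?P<property>:{NAME})"
    rf"|(?P<punct>[(),]))\s*"
)


def tokenize(text):
    """Split an intent string into (kind, value) pairs."""
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at offset {pos}: {text[pos:]!r}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def parse_expression(tokens, i):
    """expression := atom property* | application (left recursion unrolled)."""
    if i >= len(tokens) or tokens[i][0] in ("punct", "property"):
        raise ValueError("expected an atom")
    kind, value = tokens[i]
    node = {"type": kind, "value": value, "properties": []}
    i += 1
    while i < len(tokens) and tokens[i][0] == "property":
        node["properties"].append(tokens[i][1][1:])  # drop the leading ':'
        i += 1
    # Each following '(' ... ')' wraps what came before as the head.
    while i < len(tokens) and tokens[i] == ("punct", "("):
        i += 1
        args = []
        while True:
            arg, i = parse_expression(tokens, i)
            args.append(arg)
            if i < len(tokens) and tokens[i] == ("punct", ","):
                i += 1
                continue
            break
        if i >= len(tokens) or tokens[i] != ("punct", ")"):
            raise ValueError("expected ')'")
        i += 1
        node = {"type": "application", "head": node, "arguments": args}
    return node, i


def parse(text):
    tokens = tokenize(text)
    node, i = parse_expression(tokens, 0)
    if i != len(tokens):
        raise ValueError("unexpected trailing input")
    return node


for intent in ("_(_half,_of, $x)", "one-half(x)", "times(0.5,x)"):
    print(intent, "->", parse(intent))
```

Under these assumptions, the override form is an application whose head is the bare literal `_` and whose first two arguments are literals, while the other two forms have a concept as head.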

Aside: This also reminded me that the "half"-based reading isn't common in Bulgarian. We tend to use "one second" (sic), "one third" etc ("една втора", "една трета") and then we can also naturally alternate between the "times" ("по") and "of" ("от") prepositions. Curiously, if one wanted to insist on the "half" ("половина") word in Bulgarian, it sounds very unnatural to use the "times" ("по") connector word after, where we would commonly say "half of" ("половина от").

Apologies for the long comment, mostly wanted to add the example somewhere.

davidcarlisle · 2023-04-24T15:18:27Z

So if an author wanted to force a specific reading using _of, they could use intent="_(_half,_of, $x)". I think this is useful exactly to discourage adopters from picking intent="one-half(x)" when the author wants to hear "of" and intent="times(0.5,x)" when the author wants to hear "times". If we can contain the uses of Intent which aim at micro-managing the final speech solely within the "underscore territory", we get some additional clarity/guarantees.

Authors should be strongly discouraged from doing this at all, but if they really must, it is far preferable to use one-half($x), with a semantically meaningful half function, than the essentially meaningless construct _(_half,_of, $x) that more or less accidentally produces meaningful speech.

dginev · 2023-04-24T15:31:18Z

Seen differently: when we are reaching for a "speech override", using a semantically void construct is exactly the correct design, because it clearly indicates that the override was intended and not accidental (= relying on some temporary implementation state of AT that may change with the next major version of the software).

davidcarlisle · 2023-04-24T15:46:05Z

intent doesn't have a speech override feature; it's just that if you (ab)use a silent function head and the fact that spurious arguments are spoken literally, then you can in fact force any speech. Even then, half:silent(_half_of, $x) would be preferable to _(_half_of,$x), as it gives a simpler way to reconstruct some semantics than the algorithm in the GitHub pages doc for lifting semantics from _(
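The mechanism described here, a silent head followed by arguments that are read out literally, can be sketched as follows. This is a rough illustration under stated assumptions: the bare literal `_` is taken to be silent as a function head, arguments are read left to right, and the `refs` mapping (standing in for the speech of the referenced child elements) and the function name are hypothetical.

```python
def speak_silent_head(head: str, args: list[str], refs: dict[str, str]) -> str:
    """Read an application whose head is the bare literal `_` (assumed silent)."""
    if head != "_":
        raise ValueError("this sketch only covers the silent head `_`")
    words = []
    for arg in args:
        if arg.startswith("$"):
            # Reference: substitute the speech of the referenced element.
            words.append(refs[arg[1:]])
        elif arg.startswith("_"):
            # Literal: spoken as written, minus the underscore.
            words.append(arg[1:].replace("-", " "))
        else:
            # Anything else is read as its own name in this sketch.
            words.append(arg.replace("-", " "))
    return " ".join(word for word in words if word)


print(speak_silent_head("_", ["_half", "_of", "$x"], {"x": "x"}))  # prints "half of x"
```

Nothing in the input tells the function that a multiplication or a halving is meant; the speech comes out right only because the arguments happen to form an English phrase.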

brucemiller · 2023-04-24T16:33:57Z

The more it gets promoted as a thing to do, the more it feels like a bug rather than a feature.

dginev · 2023-05-04T17:30:45Z

Resolved in light of #466 , as discussed in the meeting on May 4th 2023.

dginev added 2 commits April 19, 2023 14:16

IntentName view of literals and related rewording

4d1b62f

empty underscore rule

6a69c6c

dginev force-pushed the intent-underscore-review branch from d9f3981 to 6a69c6c Compare April 19, 2023 18:19

dginev changed the title ~~IntentName view of literals~~ [WIP] IntentName view of literals Apr 20, 2023

brucemiller mentioned this pull request May 3, 2023

Distinguish literal from unknown concept #466

Merged

dginev changed the title ~~[WIP] IntentName view of literals~~ IntentName view of literals May 4, 2023

dginev closed this May 4, 2023
