Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please remove onError parameter from format/formatToParts #58

Open
FrankYFTang opened this issue May 2, 2024 · 19 comments
Open

Please remove onError parameter from format/formatToParts #58

FrankYFTang opened this issue May 2, 2024 · 19 comments

Comments

@FrankYFTang
Copy link
Contributor

Currently, the format/formatToParts function
1.3.3 Intl.MessageFormat.prototype.format ( [ values [ , onError ] ] )
and
1.3.4 Intl.MessageFormat.prototype.formatToParts ( [ values [ , onError ] ] )

has a new onError parameters and introduce a new requirement for format function in ECMA402.

https://tc39.es/proposal-intl-messageformat/#sec-intl.messageformat.prototype.format

I am very concern about this and I do not think this functionality is useful. Could we remove this functionality from this proposal and just let the format/formatToParts throw ?

How would this functionality be useful in real life programming? How would that user code looks like? Why this functionality (handling the missing data) cannot be done in the user land?

@FrankYFTang
Copy link
Contributor Author

Currently, it is used in HandleMessageFormatError(ctx, error).
1.5.5 ResolveExpression ( ctx, expression )
in https://tc39.es/proposal-intl-messageformat/#sec-resolveexpression

in 1.5.5.1 ResolveFunction ( ctx, source, resArg, annotation )
https://tc39.es/proposal-intl-messageformat/#sec-resolvefunction

1.5.6 FormatMarkupPart ( ctx, markup )
https://tc39.es/proposal-intl-messageformat/#sec-formatmarkuppart

1.5.10 ResolveValue ( ctx, value )
https://tc39.es/proposal-intl-messageformat/#sec-resolvevalue

1.3.3 Intl.MessageFormat.prototype.format ( [ values [ , onError ] ] )
https://tc39.es/proposal-intl-messageformat/#sec-intl.messageformat.prototype.format

1.3.4 Intl.MessageFormat.prototype.formatToParts ( [ values [ , onError ] ] )
https://tc39.es/proposal-intl-messageformat/#sec-intl.messageformat.prototype.formatToParts

1.5.4.1 SelectPattern ( ctx )
https://tc39.es/proposal-intl-messageformat/#sec-selectpattern

Why don't we just throw some kind of Error in these cases instead?

@eemeli
Copy link
Member

eemeli commented May 4, 2024

The immediate reason for the onError argument is that it's required by MF2:

In all cases, when encountering a runtime error, a message formatter MUST provide some representation of the message. An informative error or errors MUST also be separately provided.

At a slightly deeper level, the reason for this approach is to ensure that localization will always provide some result to a user, rather than just failing. MessageFormat is different from the other Intl formatters in that it relies not only on testable CLDR data, but also on user-provided data, i.e. the message. For locales beyond the source locale used by the original developer, that data is the result of translation work, and translation may introduce bugs that can cause a runtime error. Sure, validation can catch many such bugs, but no system is perfect.

In other words, an error thrown by e.g. NumberFormat or DateTimeFormat is actionable by a developer, and rarely seen in production use, but an error thrown by MessageFormat is often not visible to a developer or even caught by CI systems, and much more likely to end up happening at production runtime when it's not actionable.

Throwing from MessageFormat would effectively introduce an expectation that all of its users have a full test suite that they run for all locales when any of its contents change. I do not think this is a reasonable expectation, as I'm not sure that anything like that is currently done by anyone.

@FrankYFTang
Copy link
Contributor Author

FrankYFTang commented May 9, 2024

I do not understand about this, you argue MessageFormat is different because " is often not visible to a developer or even caught by CI systems, and much more likely to end up happening at production runtime when it's not actionable."

If that is truely not actionable, then why do you need an onError parameter at all? In that case, the developer cannot take any action anyway so the onError could only be a noop, right?

Currently the HandleMessageFormatError will pass completionRecord.[[Value]] to onError if one is provided. So from my understand, there are two kind of conditions

Case A: The error is not actionable
Case B: The error is actionable

Therefore, for Case A, the completionRecord.[[Value]] is not needed to pass to onError and an onError is not need to be registered, because it is not actionable even if the onError receive the completionRecord.[[Value]] and can do nothing (since it is not actionable, mean no action can be taken to make thing better)

Now, for Case B, I think that is the part that the onError and completionRecord.[[Value]] is useful, right?

so in that case, your current design requires

fmt = new Intl.MessageFormat(...);
let myError = function(errorData) { /* do something */ };
let str = fmt.format(obj, myError);
/* do something with str */

Why it cannot be

fmt = new Intl.MessageFormat(...);
let myError = function(errorData) { /* do something */ };
try {
   str = fmt.format(obj);
} catch(e) {
   myError(e.errorData) 
}

I understand, sometime MessageFormat.format() may encounter error internally and need to still return a string but with informaiton about what were fallbacked during the format() operation. I wonder, could that be done via

fmt = new Intl.MessageFormat(...);
let myLogFallbackInfo = function(fallbackReasons) { /* do something */ };
let result = fmt.formatWithFallbackInfo(obj);
let str = result.formatted;
myLogFallbackInfo(result.fallbackReasons);

@sffc
The difference is w/ the onError handler, you can only see the partial information about the fallback one at a time, during the format but the formatWithFallbackInfo approach (or. we can use a different name) allow the myLogFallbackInfo to see all the fallback information at once (it could be an array of all the error data of 5 fallbacks during the format () call)

@sffc
Copy link

sffc commented May 10, 2024

I don't have an official position yet, but it seems that Intl.MessageFormat.prototype.format could throw an exception of type Intl.MessageFormatError which could have a field such as fallbackMessage that contains the message with fallback placeholders as required by the specification.

@eemeli
Copy link
Member

eemeli commented May 13, 2024

Replying to @FrankYFTang:

I do not understand about this, you argue MessageFormat is different because " is often not visible to a developer or even caught by CI systems, and much more likely to end up happening at production runtime when it's not actionable."

If that is truely not actionable, then why do you need an onError parameter at all? In that case, the developer cannot take any action anyway so the onError could only be a noop, right?

Continuing a bit from my previous comment, the two main use cases that the onError callback is designed to handle are:

  1. Providing a way to report errors encountered in production to developers, while still providing the user with a fallback formatted result.
  2. Throwing rather than warning about errors, e.g. when they are encountered during testing.

So while a thrown error isn't necessarily actionable to the user, reporting it to a developer could make it actionable by allowing them to fix it later.

Now, for Case B, I think that is the part that the onError and completionRecord.[[Value]] is useful, right?

In case it's not clear in the spec, in all the cases where onError is called, the completionRecord.[[Value]] is an Error, and its value is never used by anything within Intl.MessageFormat after onError is called with it as an argument.

So onError is not able to change anything about the fallback formatting of the message or its placeholders.

I understand, sometime MessageFormat.format() may encounter error internally and need to still return a string but with informaiton about what were fallbacked during the format() operation. I wonder, could that be done via

fmt = new Intl.MessageFormat(...);
let myLogFallbackInfo = function(fallbackReasons) { /* do something */ };
let result = fmt.formatWithFallbackInfo(obj);
let str = result.formatted;
myLogFallbackInfo(result.fallbackReasons);

In theory, that approach could work. An additional method formatToPartsWithFallbackInfo() would also be required, and we would need to decide what to do with errors when the plain format() and formatToParts() are called. So a user would need to pick between four different "format" methods. An additional object would also be created when formatting every message.

I don't think this would be better or clearer for users.

The difference is w/ the onError handler, you can only see the partial information about the fallback one at a time, during the format but the formatWithFallbackInfo approach (or. we can use a different name) allow the myLogFallbackInfo to see all the fallback information at once (it could be an array of all the error data of 5 fallbacks during the format () call)

Note that additional information like the message source data or an identifier for the message would also probably need to be included when reporting/logging errors.


Replying to @sffc

I don't have an official position yet, but it seems that Intl.MessageFormat.prototype.format could throw an exception of type Intl.MessageFormatError which could have a field such as fallbackMessage that contains the message with fallback placeholders as required by the specification.

In theory, this approach could also work. We might need two different errors, to cover errors encountered during both format() and formatToParts(), as their fallback messages have different shapes. This approach would also require error objects to carry complex user-defined values within them, which would be novel for the spec, and might be quite challenging to account for. As it's possible for multiple errors to be encountered while formatting a message, this would also swallow the errors beyond the first (while still formatting their placeholders with fallback values), or require the thrown error to include something like an additionalErrors array.

@eemeli
Copy link
Member

eemeli commented May 13, 2024

This topic was also covered during the 2023-10-25 incubator call where @sffc also participated, and I recall @ljharb having some thoughts on attaching user-defined data to errors, though that may have been limited to values that are later read by spec internals.

@ryzokuken
Copy link
Member

Given the context provided by @eemeli above, I too strongly feel that the onError pattern is the best one for us. Specifically, I strongly disagree with the idea put forward in #58 (comment) and I can't comprehend why stretching the problem space to the two extremes is helpful.

If we go the "always throw" route then the obvious downside in my opinion would be that due to the nature of MessageFormat as @eemeli mentioned, a developer would practically speaking be forced to wrap all usage of Intl.MessageFormat in try-catch blocks, significantly affecting the ergonomics of the proposal: I believe this is absolutely the wrong pattern in JavaScript, where try-catch is supposed to be used lightly. Secondly, by itself this approach would only give the user the first error and fallback so in the best case scenario we'd probably want to go through the whole message "collecting" errors and then throw a sort of AggregateError with the full fallback at which point we're back to square one.

If we go the "never throw" route it will be really difficult for developers to be able to debug or report errors. Furthermore it'll be a waste of effort since the parser and formatter already knew about the errors which instead of getting reported got discarded and then the developer had to put in substantial effort to go through the result and find those out again, at times losing context that would be vital to be able to debug it.

Because of both of these outcomes, I don't understand the benefit of trying to create a dichotomy here, since we need to come to a single solution that caters to the most common use-cases. This is addressed really well by the onError pattern which is not only popular in JavaScript but by far the most appropriate way to do non-blocking error handling as you may notice from these examples.

@ljharb
Copy link
Member

ljharb commented May 13, 2024

What are the pros/cons of an alternative where instead of returning a bare string, we always return an object with the bare string as a "value" property (or something), and then that object gives us a place to put an errors array?

@eemeli
Copy link
Member

eemeli commented May 14, 2024

What are the pros/cons of an alternative where instead of returning a bare string, we always return an object with the bare string as a "value" property (or something), and then that object gives us a place to put an errors array?

The main cost is that we would need to create that object for every formatting call, and that the result of a format() call wouldn't be a string, which is the case for all other Intl formatters.

So it would add a cost to every successful formatting call in order to deal with exceptional cases, and make this API different from all the other ones. To some extent it'd be a bit of a return to the kind of API this proposal had before #22, where the "formatted" output still needs a bit of processing to be used, albeit of course less than with the resolveMessage() approach.

@sffc
Copy link

sffc commented May 14, 2024

ICU APIs have been moving in the direction of returning an upgraded type from .format that can be converted to a string but also has extra metadata. We use .formatToString when the function directly returns a string.

I'd be okay with .format returning an upgraded type, maybe even one with a valueOf that returns a string, or introducing a new function like formatFull that returns the upgraded object.

@sffc
Copy link

sffc commented May 23, 2024

TG2 discussion: https://github.com/tc39/ecma402/blob/main/meetings/notes-2024-05-23.md#please-remove-onerror-parameter-from-formatformattoparts

We want to seek feedback from TG1 about error handling code style.

@sffc
Copy link

sffc commented May 31, 2024

I added this to the agenda: tc39/agendas#1608

Some additional context: this had previously been discussed in part in an Incubator Call, but that call saw fairly low attendance and there are clearly more questions that need to be answered. I'll put together some slides based on the options outlined in last week's TG2 discussion and share them ahead of plenary.

@jasonwilliams
Copy link
Member

We used some i18n APIs and are hoping to move over to messageFormat when its ready.
For us we cared about graceful degradation and not failing anything outside of the formatting itself if there's an issue and preventing users from doing what they need to do.

So we've always favoured fallback text showing with a mechanism to report issues to be resolved later on. For us the onError callback is more useful in that regard. It would be a unfortunate if we need to wrap all format calls (which are a lot) with the same try/catch logic.

I don't have an official position yet, but it seems that Intl.MessageFormat.prototype.format could throw an exception of type Intl.MessageFormatError which could have a field such as fallbackMessage that contains the message with fallback placeholders as required by the specification.

This would work but feels like it would lead to more boilerplate per mf.format() call than one would like. We would still need a try/catch, then in the catch (Intl.MessageFormatError) {} block grab the fallback text and return that.

There's some interesting ideas around objects with the fallbacks as keys, but returning the message string is also more consistent with Intl.numberFormat().format(), Intl.durationFormat().format(), Intl.DateTimeFormat().format() etc.. Breaking this consistency would be weird.

@sffc
Copy link

sffc commented Jun 10, 2024

@sffc
Copy link

sffc commented Jun 24, 2024

Options from this thread:

(1) onError callback
(2) Always throw errors in format function
(3) Expressive return value
(4) Separate functions: .format and .formatWithFallback

Some additional options that were suggested at TG1 plenary:

(5) Configure error handling behavior in the constructor (@patrick-soquet)
(6) Return error array in output parameter (@acutmore)

Conclusions from plenary:

  • Most delegates are negative on option 2
  • Many delegates prefer option 1 except there were some concerns raised about reentrancy
  • Option 6 has most of the benefits without the drawbacks of the other options and should be investigated more

@acutmore
Copy link

  1. Return error array in output parameter (@acutmore)

To expand on the suggestion. Inspired by Acorn's onComment and onToken options docs. .format could take an optional array as part of it's config and would push errors into it, i.e. it's an out param.

const errors = [];
const myString = format(data, {
  errors
});
if (errors.length) {
  // ...
}

When an array literal is passed we can be sure that there is no re-entrancy. Making control flow easy to see.

If no array is passed then format knows errors are not being handled and can still print a warning to the console.

The return value is "simply" a string

@sffc
Copy link

sffc commented Jun 26, 2024

Thought on option 6: If the errors get added during formatting, it would involve calling Array.prototype.push, which would involve re-entrancy. The re-entrancy could be avoided by internally collecting the errors into an array and setting the array into the output object at the end.

const outputMetadata = {};
const myString = format(data, outputMetadata);
if (outputMetadata.errors.length) {  // or `.errors?.length` ?
  // ...
}

@sffc
Copy link

sffc commented Jun 26, 2024

We should also discuss exactly how the property gets set. I realized that if [[Set]] is used, it could still be reentrant if a custom setter is on the output object. Maybe we should instead use [[DefineOwnProperty]]. @gibson042 may have opinions on this.

@acutmore
Copy link

Yeah, I don't think it would actually call outArray.push, but instead [set] them.

There could be re-entrance if the array is a proxy or has setters but if the code is passing an array literal then it can be sure there is no userland code that will be triggered. So it's analysable at the call site.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants