clarify documentation for the ariaLabel channel #797

Closed
kentr opened this issue Mar 5, 2022 · 12 comments
Labels
documentation Improvements or additions to docs

Comments

@kentr
Contributor

kentr commented Mar 5, 2022

Since the README says "All marks support the following optional channels:", I'm assuming that ariaLabel is supposed to work for rect marks.

However, IME adding ariaLabel causes the marks to not render at all. There aren't even <rect> elements in the source anymore that I can see.

I've added an explicit z channel as recommended in the README.

ariaDescription works as expected.

There are no errors in the dev console.

Is this a bug, or am I doing something incorrectly?

To repeat

  1. Create a rect plot and confirm that the marks plot as expected.
  2. Add ariaLabel as an option. The marks disappear.

Example: https://observablehq.com/@kentr/rect-mark-observable-plot

@Fil
Contributor

Fil commented Mar 5, 2022

ariaLabel: "foo" defines a channel that uses the foo property of each datum. If this is undefined, the mark is filtered out. You can use () => "foo" instead.
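The behavior described in this comment can be sketched in plain JavaScript. This is not Plot's code: the helper names `resolveChannel` and `keepDefined` are hypothetical, and the sketch assumes only what the comment states, namely that a string names a property of each datum and that instances with an undefined value are dropped.

```javascript
// Hypothetical helper: turn a channel specification into one value per datum.
// A string is read as a property (column) name; a function is called per datum.
function resolveChannel(data, spec) {
  const accessor = typeof spec === "string" ? (d) => d[spec] : spec;
  return Array.from(data, accessor);
}

// Hypothetical helper: indices of the instances whose channel value is present.
function keepDefined(values) {
  const kept = [];
  values.forEach((value, index) => {
    if (value != null) kept.push(index);
  });
  return kept;
}

const data = [{ x: 1 }, { x: 2 }, { x: 3 }];

// "foo" names a column that does not exist, so every instance is filtered out.
const asColumn = keepDefined(resolveChannel(data, "foo"));
console.log(asColumn.length); // 0

// A function returning the literal text yields a value for every datum.
const asFunction = keepDefined(resolveChannel(data, () => "foo"));
console.log(asFunction.length); // 3
```

With the string form nothing is left to render, which matches the missing <rect> elements reported above.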

@Fil Fil closed this as completed Mar 5, 2022
@kentr
Contributor Author

kentr commented Mar 5, 2022

@Fil

Ah, thank you. I'm not sure I understand the difference between a channel and an option.

May I suggest adding title, href, and ariaLabel to this sentence?:

"The fill, fillOpacity, stroke, strokeWidth, strokeOpacity, and opacity options can be specified as either channels or constants."

@Fil Fil changed the title from "ariaLabel breaks rect mark" to "clarify documentation for the ariaLabel channel" Mar 5, 2022
@Fil Fil reopened this Mar 5, 2022
@Fil
Contributor

Fil commented Mar 5, 2022

Yes we must find a way to clarify the documentation. Currently it says

Missing and invalid data are handled specifically for each mark type and channel. In most cases, if the provided channel value for a given datum is null, undefined, or (strictly) NaN, the mark will implicitly filter the datum and not generate a corresponding output.
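As a rough illustration of the quoted rule, the predicate below treats null, undefined, and NaN as missing. Reading "(strictly) NaN" as `Number.isNaN` is an assumption of this sketch, and the name `isDefined` is hypothetical.

```javascript
// Hypothetical predicate: a value counts as present unless it is null,
// undefined, or the number NaN. Number.isNaN does not coerce its argument,
// so a string such as "abc" is not treated as NaN.
function isDefined(value) {
  return value != null && !Number.isNaN(value);
}

const values = [3, null, undefined, NaN, "abc", 0];
const kept = values.filter(isDefined);
console.log(kept); // [3, "abc", 0]
```

Note that 0 and non-numeric strings survive; only the three missing-value cases are dropped.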

@Fil Fil added the documentation Improvements or additions to docs label Mar 5, 2022
@kentr
Contributor Author

kentr commented Mar 5, 2022

Part of the confusion is that options like fill will accept constants that are either static values (such as CSS values) or column names, and are listed under both "options" and "optional channels".

But other options can only be channels (like ariaLabel) and don't accept static values.

Speaking for myself, it would help if there is a single table of options, with columns along these lines:

  • Name
  • Description
  • Required - a check mark indicating that the option is required, perhaps with an asterisk denoting that it depends on the mark type.
  • Static - a check mark indicating that the option accepts a static value.
  • Channel - a check mark indicating that the option can also be supplied as a channel, along with link to a verbose description of a channel.

@mbostock
Member

mbostock commented Mar 5, 2022

This would also be covered by a warning that detects when the majority (or all) of a mark’s instances were filtered because the channel values were undefined. #593 #755

Part of the confusion is that options like fill will accept constants that are either static values (such as CSS values) or column names, and are listed under both "options" and "optional channels". But other options can only be channels (like ariaLabel) and don't accept static values.

Ambiguity is an inherent risk when trying to make a concise API. 🙂 We’re only able to support string constants for fill because there is a formal way to disambiguate constant values (e.g., valid CSS colors) from column names. There isn’t such a formalism for ariaLabel because it is for human consumption, i.e., it is any text. So we use the most generic interpretation, which is a column name. In other words, fill is the exception here, not ariaLabel, because we know that fill is a color. These exceptions are specifically called out in the README:

When the fill or stroke is specified as a function or array, it is interpreted as a channel; when the fill or stroke is specified as a string, it is interpreted as a constant if a valid CSS color and otherwise it is interpreted as a column name for a channel.
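The README rule quoted above can be sketched as a small classifier. Everything here is hypothetical: `classifyFill` and `looksLikeCssColor` are illustrative names, and the color test is a tiny stand-in (a handful of keywords plus 3- or 6-digit hex), not a real CSS color parser and not Plot's implementation.

```javascript
// Stand-in for a real CSS color test; an assumption made for this sketch only.
const knownColors = new Set(["red", "green", "blue", "black", "white", "currentcolor", "none"]);
function looksLikeCssColor(value) {
  const v = value.trim().toLowerCase();
  return knownColors.has(v) || /^#([0-9a-f]{3}|[0-9a-f]{6})$/.test(v);
}

// Hypothetical helper: classify a fill or stroke specification.
function classifyFill(spec) {
  // Functions and arrays are always channels.
  if (typeof spec === "function" || Array.isArray(spec)) return { kind: "channel" };
  // Strings are constants only when they are recognizably a color;
  // otherwise they are taken to be column names for a channel.
  if (typeof spec === "string") {
    return looksLikeCssColor(spec)
      ? { kind: "constant", value: spec }
      : { kind: "channel", column: spec };
  }
  // Anything else is not covered by the quoted sentence.
  return { kind: "other" };
}

console.log(classifyFill("red").kind); // "constant"
console.log(classifyFill("temperature").kind); // "channel"
console.log(classifyFill((d) => d.temperature).kind); // "channel"
```

No comparable test exists for ariaLabel, since any text is a valid label, which is why a string there is always read as a column name.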

(I originally objected to having string shorthand for column names since I anticipated this sort of ambiguity and "foo" is not much shorter than the equivalent d => d.foo. But string column names have proven quite popular, and I do see them now as somehow conceptually simpler. And they have the added bonus that you can sometimes promote a column name to an axis or legend label.)

We could try to introduce a heuristic for detecting column names, but that’s also fraught because there is no strong convention. For example I considered trying to inspect the associated data, but Plot wants to be flexible in how data is represented (e.g. for columnar data) so that’s not an option.

Another challenge with ariaLabel is whether you intend the label to apply to each mark instance (i.e. each svg:rect element), or to all the mark’s instances (i.e. the parent svg:g element, as the ariaDescription does). Presumably you intended the latter, but we envisioned ariaLabel for making the data associated with the visual representation (i.e. the meaning of the mark) accessible to screen readers, which necessitates it being a channel (a per-instance property). We could instead interpret ariaLabel: string as a constant property (applying to all instances), but then this option would behave differently from all other options, which probably isn’t desirable.

@mbostock
Member

mbostock commented Mar 5, 2022

May I suggest adding title, href, and ariaLabel to this sentence?:

"The fill, fillOpacity, stroke, strokeWidth, strokeOpacity, and opacity options can be specified as either channels or constants."

The title, href, and ariaLabel can’t be specified as constants. They can only be specified as channels. That’s why they are not listed in this sentence.

Edit: I added a new sentence explaining this. Thanks for the suggestion!

@mbostock
Member

mbostock commented Mar 5, 2022

I'm not sure I understand the difference between a channel and an option.

Mark options are named properties of the object you pass as the second argument to the mark constructor, such as the fill in Plot.dot(data, {fill: "red"}). Some mark options can be specified as either a channel or a constant. When the option indicates that all instances of the mark should be the same (as here with the red dots), it is referred to as a constant. On the other hand when it may vary across instances of the mark, as when encoding data, as in Plot.dot(data, {fill: "temperature"}), then it is referred to as a channel.

This is described in the README here:

Options that are shared by all of a mark’s generated shapes are known as constants, while options that vary with the mark’s data are known as channels. Channels are typically bound to scales and encode abstract values, such as time or temperature, as visual values, such as position or color. (Channels can also be used to order ordinal domains; see sort options.)

@kentr
Contributor Author

kentr commented Mar 5, 2022

Some mark options can be specified as either a channel or a constant. When the option indicates that all instances of the mark should be the same (as here with the red dots), it is referred to as a constant.

I saw that after I RTFM more carefully.

Have you considered using the labels "static" vs "dynamic"? In other contexts, I would consider a string to be a constant, whether it contains a column name, a CSS value, or arbitrary text.

If I understand correctly, fill: () => "red" yields the same result as fill: "red". So the former would fit the Plot definition of a "constant" because it doesn't vary.

OTOH, since fill: "red" yields the same result as fill: () => "red", the former could also be considered a shorthand channel definition in the way that fill: "column_name" is shorthand for fill: (d) => d.column_name.

We could try to introduce a heuristic for detecting column names

Instead of that, what about looking at the datum to see if there is a column named "whatever", and treating it as a string if not?

@kentr
Contributor Author

kentr commented Mar 5, 2022

Mark options are named properties of the object you pass as the second argument to the mark constructor, such as the fill in Plot.dot(data, {fill: "red"}). Some mark options can be specified as either a channel or a constant.

Options that are shared by all of a mark’s generated shapes are known as constants, while options that vary with the mark’s data are known as channels.

Would you agree that the named properties which are passed as the second argument are shared by all of the mark's generated shapes (regardless of whether the option is specified as a constant or as a channel)?

If so, I suggest slightly altering the README text to:

Options whose derived values are shared by all of a mark’s generated shapes are known as constants, while options whose derived values vary with the mark’s data are known as channels.

@mbostock
Member

mbostock commented Mar 5, 2022

If I understand correctly, fill: () => "red" yields the same result as fill: "red". So the former would fit the Plot definition of a "constant" because it doesn't vary.

It will typically look the same, but the semantics, the code path, and the generated DOM are different between these two specifications.

In the fill: () => "red" case, the fill is a channel that is bound to the color scale. Plot will choose the identity scale by default because all the channel values are valid CSS colors. If you said instead fill: () => "foo" then Plot would choose an ordinal color scale with the default categorical tableau10 color scheme (giving a blue). (This was also the behavior prior to 0.4.0.) You can set the scale type or scheme using scale options, meaning that fill: () => "red" can produce a non-red color depending on how the color scale is configured. (Typically channels are used to encode abstract data visually, rather than to provide literal visual values; Plot is a visualization API, not a graphics API, after all.)

Whereas with fill: "red" the fill is a constant and does not instantiate a channel, and hence the fill is not associated with the color scale and not affected by the scale options; it is always a literal color value.

In the channel case, a fill channel is present in mark.channels, mark.fill is undefined, and the fill attribute is populated on each svg:circle element in applyChannelStyles:

if (F) applyAttr(selection, "fill", i => F[i]);

Whereas in the constant case, no fill channel is present in mark.channels, but there is a mark.fill, and the fill attribute is populated on the parent svg:g element in applyIndirectStyles:

applyAttr(selection, "fill", mark.fill);

You can see this reflected in the generated SVG.
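The difference in generated markup can be imitated with a toy renderer. This is a deliberately simplified sketch, not Plot's rendering code: `renderDots` is a hypothetical function, it ignores scales entirely (so channel values are written out verbatim, as with an identity scale), and it treats only functions as channels.

```javascript
// Hypothetical renderer: returns SVG markup for a group of circles.
function renderDots(data, { fill }) {
  const isChannel = typeof fill === "function";
  // Channel case: one fill value per instance, analogous to F[i] above.
  const F = isChannel ? Array.from(data, fill) : undefined;
  // Constant case: a single fill attribute on the parent group.
  const groupFill = isChannel ? "" : ` fill="${fill}"`;
  const circles = data.map((d, i) => {
    const ownFill = F ? ` fill="${F[i]}"` : "";
    return `<circle cx="${d.x}" cy="${d.y}" r="3"${ownFill}/>`;
  });
  return `<g${groupFill}>${circles.join("")}</g>`;
}

const points = [{ x: 10, y: 20 }, { x: 30, y: 40 }];

const constantMarkup = renderDots(points, { fill: "red" });
console.log(constantMarkup);
// <g fill="red"><circle cx="10" cy="20" r="3"/><circle cx="30" cy="40" r="3"/></g>

const channelMarkup = renderDots(points, { fill: () => "red" });
console.log(channelMarkup);
// <g><circle cx="10" cy="20" r="3" fill="red"/><circle cx="30" cy="40" r="3" fill="red"/></g>
```

Both outputs draw red circles, but the attribute sits on the group in one case and on every circle in the other.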

Instead of that, what about looking at the datum to see if there is a column named "whatever", and treating it as a string if not?

That is what I was referring to here:

For example I considered trying to inspect the associated data, but Plot wants to be flexible in how data is represented (e.g. for columnar data) so that’s not an option.

The point being, if we want to support data being things like Apache Arrow tables, or other columnar representations, you can’t look at the datum associated with a particular instance because it might not exist (i.e., an object representing the row is never instantiated). We could try to support column inspection, but it would need to handle all the variety of ways data can be specified, and I would like to avoid opening that particular box. Also, I think it’s nice that you can look at a Plot specification and say that fill: "red" is always literally red regardless of the data, rather than magically changing to a channel if the dataset happens to include a column named “red”.
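A small sketch of why inspecting "the datum" breaks down for columnar data. The layout below (one array per column, plus an array-like object that only carries a length) is an assumption chosen for illustration and does not reproduce any particular library's table format.

```javascript
// Columnar data: one array per column, and no per-row objects.
const columns = { temperature: [12, 18, 25], red: [1, 2, 3] };

// An array-like stand-in for the rows; it has a length but no elements.
const index = { length: columns.temperature.length };

// Array.from passes (element, i) to the callback; the element is undefined
// here because no row object was ever created.
const rowsSeen = Array.from(index, (d) => d);
console.log(rowsSeen); // [undefined, undefined, undefined]

// Channel values can still be produced by position...
const temperature = Array.from(index, (_, i) => columns.temperature[i]);
console.log(temperature); // [12, 18, 25]

// ...but "does the datum have a property named red?" has nothing to inspect.
const hasRedProperty = rowsSeen.some((d) => d != null && "red" in d);
console.log(hasRedProperty); // false, even though a "red" column exists
```

A datum-based heuristic would therefore misclassify "red" here, while the fixed rule (a valid CSS color is a constant) gives the same answer for every data representation.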

Options whose derived values are shared by all of a mark’s generated shapes are known as constants, while options whose derived values vary with the mark’s data are known as channels.

Yeah, I think that’s an improvement, although I’m not 100% sure about the word “derived”… and also you can have channels that have the same value for every instance (the fill: () => "red" case above). I’ll think about this some more. I appreciate all the feedback and perspective you are sharing, @kentr!

mbostock added a commit that referenced this issue Mar 5, 2022
@kentr
Contributor Author

kentr commented Mar 6, 2022

It will typically look the same, but the semantics, the code path, and the generated DOM are different between these two specifications.

Ah, ok. Maybe that was a bad example.

Maybe y: 1 and y: () => 1 are a better example, but maybe I also missed this part of the README :).

I didn't compare DOMs. I understand that the code path and efficiency would probably be different just due to the complexities in accepting a function as input and calling a function for each datum. I think I understand the semantic difference, at least what is implied by specifying a computation or formula.

To my naive eye I'd consider them functionally equivalent because I'd expect them both to result in all marks using the hard-coded constant 1 as the value for y.

There are probably nuances of data science / data visualization that I'm missing.

although I’m not 100% sure about the word “derived”

There are probably better words. Maybe "resulting values"?

I was thinking of it in terms of "base values" (the input data) and "derived values" (values that are computed from the base data, or the values that result from the mapping that is performed).

@mbostock
Member

mbostock commented Mar 6, 2022

Maybe y: 1 and y: () => 1 are a better example, but maybe I also missed this part of the README :).

Yes, these are identical because the former is promoted to the latter here:

: type === "number" || value instanceof Date || type === "boolean" ? array.from(data, constant(value))
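The promotion shown in the line above can be tried out in a self-contained form. This is a simplified sketch modeled on that excerpt, not Plot's actual function: the names `channelValues`, `constant`, and `field` are illustrative, and the fallback branch (treating anything else as an iterable of values) is an assumption.

```javascript
// Helpers assumed for this sketch.
const constant = (x) => () => x;
const field = (name) => (d) => d[name];

// Hypothetical resolver: produce one channel value per datum.
function channelValues(data, value) {
  const type = typeof value;
  return type === "string" ? Array.from(data, field(value))
    : type === "function" ? Array.from(data, value)
    // Numbers, dates, and booleans are promoted to a constant function.
    : type === "number" || value instanceof Date || type === "boolean" ? Array.from(data, constant(value))
    // Assumed fallback: the value is already an iterable of channel values.
    : Array.from(value);
}

const rows = [{ y: 5 }, { y: 6 }, { y: 7 }];

const fromNumber = channelValues(rows, 1);
const fromFunction = channelValues(rows, () => 1);
console.log(fromNumber); // [1, 1, 1]
console.log(fromFunction); // [1, 1, 1]
console.log(channelValues(rows, "y")); // [5, 6, 7]
```

Under this sketch y: 1 and y: () => 1 resolve to the same array, unlike the fill case discussed earlier, where a string constant bypasses the channel path.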
