Add a "full series" option to the stack transforms #351

Closed
Fil opened this issue May 1, 2021 · 8 comments · Fixed by #792
Assignees
Labels
enhancement New feature or request

Comments

@Fil
Contributor

Fil commented May 1, 2021

See discussion at #325 and #348 (comment)

@Fil Fil self-assigned this May 1, 2021
@Fil Fil added the enhancement New feature or request label May 3, 2021
Fil added a commit that referenced this issue Aug 11, 2021
… by number of cylinders

The bins are sorted by decreasing r, so that they are all visible.

The example would benefit from stackR (#197).

It could also benefit from a strategy to create missing values for the line, so that it's broken when there are no data. However, it won't work with an approach such as "return empty bins" (#495), because returning empty bins will not create the *z* values for each and every category, which would be necessary if we wanted to create broken lines. This shows that a generic foolproof solution to #351 will require much more than #495 (and #489 and #491 are not better in that regard).
This was referenced Aug 11, 2021
mbostock added a commit that referenced this issue Aug 11, 2021
* This example plot computes the median of cars' economy (mpg), grouped by number of cylinders

* Update test/plots/cars-mpg.js

Co-authored-by: Mike Bostock <[email protected]>

* Update test/plots/cars-mpg.js

Co-authored-by: Mike Bostock <[email protected]>

* zero, not filter

* group, not bin

* remove console.log

* stroke, not fill

Co-authored-by: Mike Bostock <[email protected]>
@mbostock
Member

mbostock commented Aug 12, 2021

Datadog calls this “default zero” interpolation: https://docs.datadoghq.com/dashboards/functions/interpolation/#default-zero

I wonder to what degree this is specific to time series. I can certainly imagine cases where it’s not specific to time series, but when it is, it seems like the bin transform with filter: null is an option for fixing the missing data. Edit: Okay, the example histogram you made is pretty convincing that we shouldn’t think of this as only a time-series problem. (Also in a related irony, this Cloud Costs notebook demonstrates the problem, but has another problem of time being represented as ordinal strings.)
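
As a plain-JavaScript illustration of the empty-bin question (a sketch, not Plot's implementation), the snippet below counts values into fixed bins and optionally keeps the empty ones, which is roughly the behavior this thread attributes to filter: null. The helper name binCounts, its keepEmpty option, and the sample values are all hypothetical.

```javascript
// Hypothetical helper (not part of Plot): count values into fixed bins.
// thresholds is an ascending array [t0, t1, ..., tn] defining n bins [t_i, t_{i+1}).
function binCounts(values, thresholds, {keepEmpty = false} = {}) {
  const bins = [];
  for (let i = 0; i < thresholds.length - 1; ++i) {
    bins.push({x1: thresholds[i], x2: thresholds[i + 1], count: 0});
  }
  for (const v of values) {
    // Find the bin whose half-open interval [x1, x2) contains v.
    const bin = bins.find((b) => v >= b.x1 && v < b.x2);
    if (bin) ++bin.count;
  }
  // Dropping empty bins leaves a gap that a line or area mark would bridge.
  return keepEmpty ? bins : bins.filter((b) => b.count > 0);
}

const masses = [4100, 4300, 5100, 5200]; // made-up values for illustration
const thresholds = [4000, 4500, 5000, 5500];

const sparse = binCounts(masses, thresholds);
const full = binCounts(masses, thresholds, {keepEmpty: true});
console.log(sparse.map((b) => b.count)); // counts of non-empty bins only
console.log(full.map((b) => b.count)); // includes a zero for the empty bin
```

With the empty bin dropped, a line or area drawn through the remaining bins connects the two neighbors directly; with it kept, the mark has an explicit zero to pass through.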

@Fil
Contributor Author

Fil commented Aug 12, 2021

It's worse than this: using the empty bin approach is necessary for a continuous ("binnable") domain, but far from sufficient—as soon as you have z or facets, you need a point (real or fake) for each element of the domain times each of the series.
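
To make the "each element of the domain times each of the series" requirement concrete, here is a minimal sketch in plain JavaScript. It is not a Plot API: the helper name fillSeries, its options, and the sample rows are hypothetical, and it assumes a numeric x so that the domain can be sorted.

```javascript
// Hypothetical helper (not part of Plot): make sure every series has a row
// for every x value, inserting zero-valued rows where the data has none.
function fillSeries(rows, {x = "x", z = "z", y = "y", fill = 0} = {}) {
  // Assumes numeric x values; an ordinal domain would need its own ordering.
  const xs = [...new Set(rows.map((d) => d[x]))].sort((a, b) => a - b);
  const zs = [...new Set(rows.map((d) => d[z]))];
  // Index existing rows by their (series, x) pair.
  const seen = new Map(rows.map((d) => [JSON.stringify([d[z], d[x]]), d]));
  const out = [];
  for (const zi of zs) {
    for (const xi of xs) {
      const key = JSON.stringify([zi, xi]);
      // Reuse the real row if there is one, otherwise add a fake zero row.
      out.push(seen.get(key) ?? {[x]: xi, [z]: zi, [y]: fill});
    }
  }
  return out;
}

const counts = [
  {x: 1, z: "a", y: 3},
  {x: 2, z: "a", y: 5},
  {x: 3, z: "a", y: 2},
  {x: 1, z: "b", y: 4},
  {x: 3, z: "b", y: 1} // series "b" has no row at x = 2
];

const filled = fillSeries(counts);
console.log(filled.length); // 3 x values times 2 series
console.log(filled.find((d) => d.z === "b" && d.x === 2));
```

Note that this only fills x values that appear in at least one series. A position that is empty in every series never enters the domain here, which is why the empty-bin approach is still needed on top of it for a continuous domain.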

@mbostock
Member

But the bin domain is the same across all groups and facets, so as long as you have at least one data point in a given group, you’ll get all the bins?

@Fil
Contributor Author

Fil commented Aug 12, 2021

In https://observablehq.com/d/f6a7975f2ad4519a there is just one empty bin (4,750 in the chinstrap facet), which should be mapped to 0 for data fidelity. If we push up the number of bins to 200, we start to see that issue creeping in everywhere: outlined in red in the image below, all the areas should drop to zero since there is no data point in this position.

[Image: Screenshot 2021-08-12 at 22 39 00]

(I'm not sure that we can find a generic way to do both operations; maybe imputing missing values is something that should be left to the data-wrangling section?)

@mbostock
Member

But that example doesn’t use filter: null on the bin transform, right?

@Fil
Contributor Author

Fil commented Aug 13, 2021

the Plot.seriesX transform is possibly superseded by #499 and #500

@mbostock
Member

I think this is probably now a duplicate of #597. (And perhaps #513 in the case of ordinal data.) At any rate they’re all closely related.

@Fil
Contributor Author

Fil commented Mar 1, 2022

closing since this example is solved with filter: null.

@Fil Fil closed this as completed Mar 1, 2022