sort and filter for bin and group #495
This works well in the athletesHeightWeightBinStroke example: rather than setting the strokeOpacity to zero or one, we can filter the elements we don’t want to show.
I see two limitations of this filter option vis à vis the sort option you propose in #334:
I think both of these are possible and would love assistance.
I like this: it solves the "availability" example which triggered the question, and makes athletesHeightWeightBinStroke better. (Note that the "data availability" test plot works equally well in both the "broken line" mode and the "0 value" mode for empty bins; I kept the broken line version.) Regarding remark 1: in what situation would we generate empty groups?
I tend to think that it would be desirable in all these cases to be able to generate empty groups, as we would want to solve #351. Remark 2 seems a bit similar to #472 (comment); but it makes me think, what if we did something totally different and passed the actual bin to a transform option?
The transform option feels like new semantics (and could arguably be called a map option); if we’re going to support the sort option #334 then the filter option here feels more simpatico. And we can make filter and sort more similar if we allow an input channel to the filter reducer, and extend the filter option to the group transform (though as I said, for group, setting filter: null won’t have any effect because you can’t generate empty groups, but you could still use it to suppress other groups).
* This example plot computes the median of cars' economy (mpg), grouped by number of cylinders. The bins are sorted by decreasing r, so that they are all visible. The example would benefit from stackR (#197). It could also benefit from a strategy to create missing values for the line, so that it's broken when there are no data. However, it won't work with an approach such as "return empty bins" (#495), because returning empty bins will not create the *z* values for each and every category, which would be necessary if we wanted to create broken lines. This shows that a generic foolproof solution to #351 will require much more than #495 (and #489 and #491 are not better in that regard).
* Update test/plots/cars-mpg.js Co-authored-by: Mike Bostock <[email protected]>
* Update test/plots/cars-mpg.js Co-authored-by: Mike Bostock <[email protected]>
* zero, not filter
* group, not bin
* remove console.log
* stroke, not fill Co-authored-by: Mike Bostock <[email protected]>
* uses the filter option of the bin transform
* uses an explicit null if empty, or sum, accessor to create a broken line
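As an illustration of the "null if empty, or sum" accessor mentioned in the commit above, here is a minimal plain-JavaScript sketch. It does not use Plot; `sumOrNull` and `lineSegments` are hypothetical names introduced only for this example, and the bins are assumed to be plain arrays of numbers. The idea is that a null value marks a gap, so a line drawn through the reduced values breaks there instead of dipping to zero.

```javascript
// Hypothetical reducer (not part of Plot): null for an empty bin, otherwise the sum.
function sumOrNull(bin) {
  return bin.length === 0 ? null : bin.reduce((a, b) => a + b, 0);
}

// Hypothetical helper: split a series into contiguous runs of defined values,
// which is how a "broken line" would be drawn.
function lineSegments(ys) {
  const segments = [];
  let current = [];
  ys.forEach((y, i) => {
    if (y == null) {
      // A null ends the current run, if any.
      if (current.length) segments.push(current);
      current = [];
    } else {
      current.push([i, y]);
    }
  });
  if (current.length) segments.push(current);
  return segments;
}

const bins = [[3, 4], [], [5]]; // the middle bin is empty
const ys = bins.map(sumOrNull); // [7, null, 5]
const segments = lineSegments(ys); // two runs: [[0, 7]] and [[2, 5]]
```

With a plain sum reducer the middle value would be 0 and the line would stay connected; returning null is what produces the break.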
The only thing that this is now missing is supporting the sort option being specified as a comparator rather than reducer. I think that’s possible: you’d compute the bins or groups for the sort’s input channel (or the data), and then you’d pass those arrays to the comparator during sorting. But it would require some extra finagling to get this to work, and it doesn’t seem urgent since the common case will be things like sorting by count or some other accessor. So, I’ll punt! Forward!
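To make the reducer-versus-comparator distinction concrete, here is a rough plain-JavaScript sketch. It is not Plot's implementation; `groupBy`, `sortByReducer`, and `sortByComparator` are hypothetical helpers, and the `cars` data is made up for the example. A reducer maps each group to a single sortable value, whereas a comparator would receive the two groups' arrays directly.

```javascript
// Hypothetical helper: group data by a key, preserving first-seen order.
function groupBy(data, key) {
  const groups = new Map();
  for (const d of data) {
    const k = key(d);
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(d);
  }
  return Array.from(groups, ([key, items]) => ({key, items}));
}

// Sort as a reducer: compute one value per group, then sort ascending by it.
function sortByReducer(groups, reduce) {
  return groups
    .map((group) => ({group, value: reduce(group.items)}))
    .sort((a, b) => a.value - b.value)
    .map(({group}) => group);
}

// Sort as a comparator: pass the groups' arrays to the comparator.
function sortByComparator(groups, compare) {
  return groups.slice().sort((a, b) => compare(a.items, b.items));
}

// Made-up data: three 4-cylinder cars, two 8-cylinder, one 6-cylinder.
const cars = [{cyl: 4}, {cyl: 8}, {cyl: 4}, {cyl: 6}, {cyl: 4}, {cyl: 8}];
const groups = groupBy(cars, (d) => d.cyl);

// Ascending by count: keys 6, 8, 4.
const byCount = sortByReducer(groups, (items) => items.length).map((g) => g.key);

// Descending by count, expressed as a comparator on the arrays: keys 4, 8, 6.
const byComparator = sortByComparator(groups, (a, b) => b.length - a.length).map((g) => g.key);
```

The reducer form covers the common case (sorting by count or another accessor); the comparator form needs the grouped arrays to be available at sort time, which is the extra work described above.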
I was thinking we could use the sfCovidDeaths example because that shows the necessity of returning empty bins.
Yes. But can we use both? This example is a bit contrived, but it shows the filter as a function that receives the bins.
I prefer less contrived if possible but I’ll take a look.
This introduces a filter output for the bin transform which defaults to count such that by default only non-empty bins are returned. By setting the filter to null you can return all bins; by setting it to a function you can apply whatever test you like on the bins before any other output channels are evaluated. Like the other reducers, this is specified on outputs rather than options.
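The three filter behaviors described above can be sketched in plain JavaScript. This is an illustration of the semantics only, not Plot's code: `binWithFilter` is a hypothetical helper, the thresholds are assumed to be sorted, and bins are assumed to be half-open intervals [x0, x1).

```javascript
// Hypothetical helper (not Plot's implementation). The filter defaults to a
// non-empty test, null keeps every bin, and a function applies a custom test.
function binWithFilter(
  values,
  thresholds,
  {filter = (bin) => bin.length > 0, reduce = (bin) => bin.length} = {}
) {
  // Create one (possibly empty) bin per interval [x0, x1).
  const bins = [];
  for (let i = 0; i < thresholds.length - 1; ++i) {
    const bin = [];
    bin.x0 = thresholds[i];
    bin.x1 = thresholds[i + 1];
    bins.push(bin);
  }
  // Assign each value to the bin whose interval contains it.
  for (const v of values) {
    const bin = bins.find((b) => v >= b.x0 && v < b.x1);
    if (bin) bin.push(v);
  }
  // Apply the filter first, so the reducer never runs for dropped bins.
  const output = [];
  for (const bin of bins) {
    if (filter !== null && !filter(bin)) continue;
    output.push({x0: bin.x0, x1: bin.x1, value: reduce(bin)});
  }
  return output;
}

const values = [1, 2, 2, 7];
const thresholds = [0, 2, 4, 6, 8];

const nonEmpty = binWithFilter(values, thresholds); // 3 bins; [4, 6) is dropped
const allBins = binWithFilter(values, thresholds, {filter: null}); // 4 bins, one with value 0
const atLeastTwo = binWithFilter(values, thresholds, {filter: (bin) => bin.length >= 2}); // only [2, 4)
```

Because the filter runs before the reducer, dropped bins cost nothing beyond the test itself, which is the speed advantage over filtering after binning.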
I’m not sure this is strictly better than #489. It does allow more fine-grained control over which bins to return, and it is faster than applying the filter transform after binning (since it skips evaluating the other channels if the bin will be dropped). However, it is likely (a tiny bit?) slower than #489 since we cannot special-case the test for empty bins. And it is less explicit than a dedicated empty option.
Supersedes #491.
Supersedes #489.