Allow overriding of CSR accumulators with an IndexLike
#181
I did a code read, and this looks great to me. Thanks for cleaning up some of the crufty code!
Please fix the formatting.
34306ab to 1d71cc1
IndexLike
e808e51 to 4b18ba8
@@ -26,7 +26,7 @@ def get_indexer(
         """Something compatible with Pandas' Index.get_indexer method."""


-IndexFactory = Callable[[npt.NDArray[np.int64]], "IndexLike"]
+IndexFactory = Callable[[npt.NDArray[np.int64], Optional[Any]], "IndexLike"]
The reason we have this take the NDArray as its only parameter is so that we can use pandas.Index itself as an index factory. With the second parameter added, pandas.Index no longer matches the IndexFactory type, which is why the format check is now failing.
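For illustration, here is a minimal sketch of what "pandas.Index as an index factory" means. The `IndexLike` protocol and `IndexFactory` alias below are simplified stand-ins written for this example (the real definitions live in somacore and may differ), and `joinids` is made-up data:

```python
from typing import Callable, Protocol

import numpy as np
import numpy.typing as npt
import pandas as pd


class IndexLike(Protocol):
    """Simplified stand-in: anything with a Pandas-style get_indexer."""

    def get_indexer(self, target: npt.NDArray[np.int64]) -> npt.NDArray[np.intp]:
        ...


# Single-parameter factory type, as on the "-" side of the diff above.
IndexFactory = Callable[[npt.NDArray[np.int64]], IndexLike]

# pandas.Index can be called with just an array, so it can serve as the factory.
index_factory: IndexFactory = pd.Index

joinids = np.array([10, 20, 30], dtype=np.int64)  # hypothetical join IDs
index = index_factory(joinids)

# get_indexer maps each requested ID to its position, or -1 if it is absent.
positions = index.get_indexer(np.array([30, 10, 99], dtype=np.int64))
print(positions.tolist())
```

Any other callable that takes the array alone and returns an object with `get_indexer` would fit the same alias, which is what keeps the default and a custom implementation interchangeable.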
4b18ba8 to 1d71cc1
@thetorpedodog, can you please merge this? My C++ reindexer depends on it.
I would like to wait until we get approval on the TileDB-SOMA side of the house. I am pretty confident that this is going to be it, but I don’t want to cut a release prematurely. It will be very quick: after merging this, we can immediately make the tag and everything will be ready in a matter of minutes. (There will be no rush to release TileDB-SOMA; this is fully compatible with existing TileDB-SOMA.)
To support a custom indexer in TileDB-SOMA, this introduces a basic abstraction that is used to build the Indexer used by the CSR Accumulator. This can then be easily overridden by an implementation's custom subclass to swap that out without having to duplicate substantial parts of the code. The naming is, admittedly, not ideal; this is in part due to the naming of the things we're trying to abstract over (the Pandas `Index` type which has the `get_indexer` method).
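A rough sketch of the override pattern described above, under stated assumptions: `Accumulator`, `_make_index`, `DictIndex`, and `SortedIndex` are hypothetical names invented for this example and are not somacore's actual classes. The default here is a dictionary lookup standing in for `pandas.Index`, and the custom index is a NumPy binary search standing in for an implementation-specific reindexer.

```python
import numpy as np


class DictIndex:
    """Hypothetical default index: dictionary lookup (stand-in for pandas.Index)."""

    def __init__(self, keys):
        self._pos = {int(k): i for i, k in enumerate(keys)}

    def get_indexer(self, target):
        return np.array([self._pos.get(int(t), -1) for t in target], dtype=np.intp)


class SortedIndex:
    """Hypothetical custom index: binary search over sorted keys."""

    def __init__(self, keys):
        keys = np.asarray(keys, dtype=np.int64)
        self._order = np.argsort(keys, kind="stable")
        self._sorted = keys[self._order]

    def get_indexer(self, target):
        target = np.asarray(target, dtype=np.int64)
        if self._sorted.size == 0:
            return np.full(target.shape, -1, dtype=np.intp)
        pos = np.clip(np.searchsorted(self._sorted, target), 0, self._sorted.size - 1)
        found = self._sorted[pos] == target
        # Translate sorted positions back to original positions; -1 if missing.
        return np.where(found, self._order[pos], -1).astype(np.intp)


class Accumulator:
    """Hypothetical accumulator that builds its index through one overridable hook."""

    def __init__(self, joinids):
        self._index = self._make_index(joinids)

    def _make_index(self, keys):
        return DictIndex(keys)

    def row_positions(self, joinids):
        return self._index.get_indexer(joinids)


class CustomAccumulator(Accumulator):
    """Subclass that swaps the index without duplicating any other logic."""

    def _make_index(self, keys):
        return SortedIndex(keys)


keys = np.array([30, 10, 20], dtype=np.int64)
query = np.array([30, 10, 99], dtype=np.int64)
default_result = Accumulator(keys).row_positions(query)
custom_result = CustomAccumulator(keys).row_positions(query)
```

Both accumulators return the same positions for the same query; only the hook that constructs the index differs, which is the point of the abstraction.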
1d71cc1 to b47bbc8
We’re good to go!
To support a custom indexer in TileDB-SOMA, this introduces a basic abstraction that is used to build the Indexer used by the CSR Accumulator. This can then be easily overridden by an implementation's custom subclass to swap that out without having to duplicate substantial parts of the code.
The naming is, admittedly, not ideal; this is in part due to the naming of the things we're trying to abstract over (the Pandas `Index` type, which has the `get_indexer` method).
This is the somacore counterpart of single-cell-data/TileDB-SOMA#1728. It is entirely compatible with current code, though; once we’re positive that it fulfills the needs of the tiledb-soma reindexer, we can merge it and release it, and tiledbsoma installations will continue to work fine.