cell_cycle returns poor scores on perfect data input #351
Comments
Hi Scott, see scib/scib/metrics/cell_cycle.py, line 71 in 2fe05c7.
@LuckyMD I think we decided on per-batch computation of the PCA. Is this still the behaviour we want? If so, it might make sense to recompute the PCA by default, or to remove the reuse of existing principal components altogether, to avoid confusion.
Oh! That makes sense. It's a bit of a funny result then that the "perfect embedding" here is one in which the batches are embedded with PCA separately and then smashed together, while an embedding that keeps the raw data as-is performs relatively poorly... but at least from the openproblems perspective there is a simple solution here.
Hah... yes! This still makes sense I think. We don't want the PCA to capture batch differences in the CC score. I forgot this completely! Thanks so much for highlighting this @mumichae!
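For anyone landing here later, a toy sketch of why a PCA computed across all batches can score differently from a per-batch PCA. This is not scib's implementation: `top_pcs`, `pcr`, and `conservation` are hypothetical helpers, the data are synthetic, the score form `1 - |after - before| / before` is an assumption about the metric, and `n_comps = 1` is deliberately extreme so the effect is easy to see.

```python
import numpy as np

def top_pcs(X, n_comps):
    """Project centered X onto its top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_comps].T

def pcr(scores, covariate):
    """Variance-weighted R^2 of each component regressed on one covariate."""
    var = scores.var(axis=0, ddof=1)
    r2 = np.array([np.corrcoef(scores[:, i], covariate)[0, 1] ** 2
                   for i in range(scores.shape[1])])
    return float((var * r2).sum() / var.sum())

def conservation(before, after):
    """Assumed score form: 1 - |after - before| / before, floored at 0."""
    return max(0.0, 1.0 - abs(after - before) / before)

# Synthetic data: two batches, one gene driven by a cell-cycle-like covariate.
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 10
batch = np.repeat([0, 1], n_cells)
cc = rng.normal(size=2 * n_cells)        # stand-in for a cell-cycle score
X = rng.normal(size=(2 * n_cells, n_genes))
X[:, 0] += 3.0 * cc                      # gene 0 tracks the cell cycle
X[batch == 1, 1:6] += 10.0               # strong batch shift on genes 1-5

n_comps = 1
global_emb = top_pcs(X, n_comps)         # PCA on all cells together

mask = batch == 0
before = pcr(top_pcs(X[mask], n_comps), cc[mask])   # per-batch PCA
after_global = pcr(global_emb[mask], cc[mask])      # global PCA, one batch
score_global = conservation(before, after_global)
score_self = conservation(before, before)           # same embedding both sides
```

In this toy setup the leading global component is spent on the batch shift, so within a batch it carries little cell-cycle signal and `score_global` comes out low, while comparing the per-batch PCA against itself gives exactly 1. With enough components retained the gap shrinks, so the effect depends on truncation.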
If we pass PCA on the unintegrated data, we should get a perfect score. We don't.
Related: openproblems-bio/openproblems#706