[WIP] Number of PCA Components to Keep #113
My last commit added a notebook Main Takeaways:
Hopefully these two notebooks provide some useful quantitative information about how hyperparameter selection affects performance (AUROC) across a range of query scenarios. There's a lot more that could be done, but I think we are at a point where we can make some decisions about how we want to modify the main notebook. Memory usage should be a part of these decisions as well (see #88). Here are some of the decisions that need to be made:
My recommendation:
Ah I didn't realize we weren't setting
Once we settle on all these numbers, we should enforce this on the frontend.
I think this makes sense. You could also make it very simple, like three sample size ranges that map to three different numbers of PCA components. And I support how you took the min of the positives and negatives and used that as the relevant number. @rdvelazquez can you open a pull request to incorporate the changes you think should go into You can also merge this PR when it's no longer in progress.
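For illustration, the "three sample size ranges" idea could look roughly like the sketch below. The tier boundaries and component counts (`EXAMPLE_TIERS`, `EXAMPLE_MAX_COMPONENTS`) and the function names are hypothetical placeholders, not numbers or code from this PR; the only parts taken from the discussion are the three-range mapping and using the smaller of the positive and negative counts as the relevant size.

```python
from typing import Sequence, Tuple

# Hypothetical (exclusive upper bound, n_components) tiers.
# These values are placeholders for illustration, not the numbers settled on in this PR.
EXAMPLE_TIERS: Tuple[Tuple[int, int], ...] = ((50, 10), (200, 30))
EXAMPLE_MAX_COMPONENTS = 100  # hypothetical value for the largest range


def limiting_class_size(labels: Sequence[int]) -> int:
    """Return the smaller of the positive (1) and negative (0) sample counts."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    return min(positives, negatives)


def choose_n_components(
    labels: Sequence[int],
    tiers: Tuple[Tuple[int, int], ...] = EXAMPLE_TIERS,
    default: int = EXAMPLE_MAX_COMPONENTS,
) -> int:
    """Map the limiting class size onto one of three PCA component counts."""
    size = limiting_class_size(labels)
    for upper_bound, n_components in tiers:
        if size < upper_bound:
            return n_components
    return default


if __name__ == "__main__":
    # 20 positives and 500 negatives: the limiting size is 20, so the first tier applies.
    print(choose_n_components([1] * 20 + [0] * 500))
```

Keeping the tiers in a single tuple would make it easy to swap in whatever ranges the notebooks end up supporting, and the same table could be mirrored on the frontend if the limits are enforced there.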
Agreed. I will try to help out with that as I have time and if it is within my limited front-end abilities.
Will do!
I think @patrick-miller may have been looking at this PR so I'll wait to see if he has any comments.
This is a very nice study. I agree with most of your takeaways. For I would've expected a larger positive correlation between optimal
I revised the
@dhimmel and @patrick-miller Thanks for looking at this and providing comments! I'll squash merge this now. If there's anything else we want to look at we can always just open a new PR and revise these notebooks.
First step in addressing #106. This is still a work in progress but I thought I'd at least check in and post what I'm working on. Any and all input is very welcome and appreciated.
The notebook is a little long because it's basically my working notes, but I think there is enough documentation in the notebook that it's fairly self-explanatory.
My takeaways thus far:
To Do: