
Enable federated XGBoost using bootstrap aggregation in Task Runner #1151

Merged — 42 commits merged into securefederatedai:develop on Nov 20, 2024

Conversation


@kta-intel kta-intel commented Nov 15, 2024

This PR enables TaskRunner-based federated XGBoost using bootstrap aggregation.

Specifically this PR:

  • creates an xgb_higgs task runner workspace to train on the HIGGS dataset [ref] with all required code (i.e. src/taskrunner.py, src/dataloader.py, plan/*.yaml, etc.)
  • adds a tasks_xgb.yaml to enable the new FedBaggingXGBoost aggregation when running XGBoost training workloads
  • adds a delta_updates parameter to the Aggregator to allow bypassing delta updating (computing weight deltas makes sense for deep learning models, whose size stays relatively consistent; it makes less sense for tree-based algorithms, which add trees over time)
    • delta_updates defaults to true to preserve normal behavior; the XGBoost task runner explicitly sets it to false to bypass delta updating
  • introduces a new loader_xgb.py as the backend / superclass of src/dataloader.py
  • introduces a new runner_xgb.py as the backend / superclass of src/taskrunner.py
  • introduces a new federated bootstrap aggregation algorithm for XGBoost in aggregation_functions.fed_bagging, which bags the latest trees into a global model, consistent with currently accepted federated XGBoost algorithms in the industry
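As a rough illustration of the bagging idea above (not the actual fed_bagging implementation, which operates on serialized XGBoost boosters), each round the newest trees from every collaborator are appended to the global ensemble. Models here are represented abstractly as plain lists of trees:

```python
# Hypothetical sketch of bootstrap aggregation for federated XGBoost:
# each round, only the trees grown that round by each collaborator are
# bagged into the shared global model.

def fed_bagging_round(global_trees, collaborator_updates, trees_per_round=1):
    """Append each collaborator's latest trees to the global ensemble."""
    for local_trees in collaborator_updates:
        # Only the newest trees are bagged, not the full local model.
        latest = local_trees[-trees_per_round:]
        global_trees = global_trees + latest
    return global_trees

# Example: two collaborators, each growing one new tree per round.
global_model = []
round_1 = [["c1_t1"], ["c2_t1"]]
global_model = fed_bagging_round(global_model, round_1)
round_2 = [["c1_t1", "c1_t2"], ["c2_t1", "c2_t2"]]
global_model = fed_bagging_round(global_model, round_2)
print(global_model)  # ['c1_t1', 'c2_t1', 'c1_t2', 'c2_t2']
```

This also illustrates why delta updates are bypassed for XGBoost: the global model grows each round rather than keeping a fixed-size set of weights to diff.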

Signed-off-by: kta-intel <[email protected]>
This reverts commit d3937ef.
@kta-intel kta-intel changed the title [WIP] Enable federated XGBoost using bootstrap aggregation in Task Runner Enable federated XGBoost using bootstrap aggregation in Task Runner Nov 15, 2024
@kta-intel kta-intel marked this pull request as ready for review November 15, 2024 22:14
Collaborator:
General question: Have you used specific formatters for yaml files?

@kta-intel (Collaborator, Author):

I copied over the YAML from other workspaces as a template, then ran bash shell/format.sh in the whole repo. Is there something additional that you recommend?

Review threads (resolved) on:

  • openfl-workspace/xgb_higgs/.workspace
  • openfl-workspace/xgb_higgs/plan/cols.yaml
  • openfl-workspace/xgb_higgs/plan/defaults
  • openfl-workspace/xgb_higgs/plan/data.yaml
  • openfl-workspace/xgb_higgs/src/dataloader.py
  • openfl-workspace/xgb_higgs/src/setup_data.py
  • openfl/component/aggregator/aggregator.py
  • openfl/federated/task/runner_xgb.py
@teoparvanov (Collaborator) left a comment:
Awesome work, thanks @kta-intel! I have a couple of questions and comments, but overall the PR looks in excellent shape for such a sizeable new feature.

PS: is there an easy way to add a CI job that covers at least the "happy path" of an XGBoost-based federation?

Review threads (resolved) on:

  • openfl-workspace/xgb_higgs/src/setup_data.py
  • openfl/federated/data/loader_xgb.py
  • openfl/federated/task/runner_xgb.py
  • openfl/interface/aggregation_functions/fed_bagging.py
@kta-intel (Collaborator, Author):

PS: is there an easy way to add a CI job that covers at least the "happy path" of an XGBoost-based federation?

Thanks for the review Teo! I'm not sure what you mean by this, would this just be a toy sanity check CI?

@teoparvanov (Collaborator) commented Nov 19, 2024

PS: is there an easy way to add a CI job that covers at least the "happy path" of an XGBoost-based federation?

Thanks for the review Teo! I'm not sure what you mean by this, would this just be a toy sanity check CI?

Yes, I mean a very basic CI job that does an E2E run of the XGBoost workspace. Like here, but using xgb_higgs as the template, and with a reduced test matrix (e.g. just Ubuntu + Python 3.10). However, if the effort for this turns out to be significant, you can also consider doing it in a separate PR.

Other than that, the PR is ready to be merged IMO. Thanks, @kta-intel !

@kta-intel (Collaborator, Author):

Yes, I mean a very basic CI job that does an E2E run of the XGBoost workspace. Like here, but using xgb_higgs as the template, and with a reduced test matrix (e.g. just Ubuntu + Python 3.10). However, if the effort for this turns out to be significant, you can also consider doing it in a separate PR.

I see! This is a good idea, but let's save it for a separate PR. Thanks for the review and suggestion(s) @teoparvanov!

@teoparvanov teoparvanov merged commit 3c983ef into securefederatedai:develop Nov 20, 2024
29 checks passed
Labels: none. Projects: none. 5 participants.