1.1 (#85)

* added correct tensorflow version 2.3, added test set split to mnist example, added intel version print function

* removed hardcoded val/test value and bumped tf ver to 2.3.1

* intel-tensorflow 2.3.1 ~> 2.3.0

* clear cell output

* Fix torchvision MNIST download (#26)

* Fix torchvision MNIST download

* flake8 fix

* flake8 fix

* Fix 503 Service Unavailable error for MNIST (#30)

* Fix 503 Service Unavailable error for MNIST

* Add issue link

* Docker base modifications for distributing

* Fixed wrong filepath to the signed collaborator certificate in docs (#32)

* Add tensorflow compatibility to Python native API (#33)

* Update setup.py (#34)

Small update to allow for latest crypto library.

* Use kroki to create svg (#36)

* Fix Task Runner assignments in Python API (#28)

* Fix Task Runner assignments in Python API

* Skip model and Data Loader rebuilding

* Reset layer names for each network

* Fixed aggregator FQDN for jupyter notebook (#42)

* Fixes for MNIST data handling for PyTorch & TF

* Fix layer name mismatching in TensorFlow models (#43)

* Fix layer name mismatching in TensorFlow models

* Remove Keras session clearing in TF native test

* Fix KerasTaskRunner constructor

* Fix linter

* Remove useless mnist import

* Remove useless six from setup.py

* Update README.md

* Contributor License Agreement (#49)

* Adding CLA

* CLA workflow

* CLA workflow

* Update cla.yml

* Move CLA.md to root directory

Co-authored-by: Alexey Gruzdev <[email protected]>

* Creating file for storing CLA Signatures

* @grib0ed0v has signed the CLA from Pull Request #50

* @walteriviera has signed the CLA from Pull Request #35

* Keras NLP Template

* removed artifacts due to git-merge

* removed artifacts from DataLoader

* Adding 100k foot figure and documentation on OpenFL workflow (#52)

* OpenFL workflow figure

Adding a graphic on the overall OpenFL workflow.

* Update running_the_federation.rst

* Update running_the_federation.rst

* Update running_the_federation.rst

* Update running_the_federation.rst

* Update running_the_federation.rst

* Add files via upload

* Add files via upload

* Add files via upload

* Update running_the_federation.rst

Center the text below figure

* Added centralization to all figures in docs

Co-authored-by: Alexey Gruzdev <[email protected]>

* Update README.md (#55)

* Fix flake8 errors (#57)

* Update fed_unet_runner.py

* Interactive API beta (#61)

* introduced interactive API component

* moved examples to the openfl-tutorials

* docs for the interactive api
initialized

* docs update

* Added docs, cleaned notebooks

* More comments added

* More comments

* More docstrings

* dropped commented code

* Docs update

* Serializer class names fix

* Added comments to tutorials

* Fixes for docs

* Fixed logging in notebook

* Translated docs to rst

* docs link fix

* rst code-block fix

* more fixes for code blocks

* more fixes for docs

* more rst fixes

Co-authored-by: Alexey Gruzdev <[email protected]>

* Add FedProx algorithm implementation (#53) (a proximal-term sketch follows this group of commits)

* Add notebook for FedProx method

* Add a description

* Rename notebook

* Add PyTorch workspace template for FedProx method

* Extract train_epoch method in PyTorchTaskRunner

* Erase local dependency

* Extract train_epoch method

* Add FedProx implementation for Keras

* Add FedProx to Keras

* Update FedProx Keras tutorial

* Update PyTorch implementation

* Add workspace for Keras

* Update PyTorch workspace

* Fix linter

* Update PyTorch histology template

* Update PyTorch MNIST workspace

* Fix linter

* Fix linter

* Fix linter

* Fix num_batches keyword argument in Keras

* Allow to specify num_batches in Keras

* Update PyTorch Python API test

* Move Metric type to utilities.types

* Include FedProx optimizers in utilities package

* Fix imports

* Update __init__.py

* Remove workspaces with FedProx

Co-authored-by: Alexey Gruzdev <[email protected]>
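
The FedProx commits above add a proximal term to each collaborator's local training objective. A minimal PyTorch sketch of that term (illustrative only, not OpenFL's actual TaskRunner code; `mu`, `model`, and `global_params` are assumed names):

```python
import torch


def fedprox_loss(task_loss, model, global_params, mu=0.01):
    """Add the FedProx proximal term (mu / 2) * ||w - w_global||^2 to a task loss.

    `global_params` is a snapshot of the aggregated model's parameters taken at
    the start of the round, in the same order as model.parameters().
    """
    proximal = 0.0
    for w, w_global in zip(model.parameters(), global_params):
        proximal = proximal + torch.sum((w - w_global.detach()) ** 2)
    return task_loss + 0.5 * mu * proximal
```

Setting `mu` to zero recovers the plain local objective used by FedAvg.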

* Added troubleshooting page for common questions. (#67)

* Enable compression options (#63)

* Fixes to delta calculation and lossy compression flow. Added new workspace that enables lossy compression by default

* WIP

* Fixed bug in sparsity calculation. Working now for single collaborator with 90% sparsity

* Cleaning up code and adding documentation

* Removing comments

* Fixing flake8 linting

* Fix F541

* Fixed collaborator unit tests

* Store additional rounds for tf_cnn_histology

Co-authored-by: Alexey Gruzdev <[email protected]>

* Remove unused variable (#74)

* Fix loss functions in PyTorch notebooks (#76)

* Fix loss functions in PyTorch notebooks

* Remove hard-coded typing in train_batches

* Allow to specify custom aggregation function (#62) (an aggregation sketch follows this group of commits)

* Allow to specify custom aggregation function

* Fix linter

* Fix unit tests

* Update notebook

* Add geometric median aggregation function

* Fix linter

* Introduce aggregation function interface

* Fix linter errors

* Fix unit tests

* Clear notebook output

* Update aggregation function interface

* Revert YAML reading changes

* Fix unit tests

* Update get_plan method in native API

* Fix linter

* Add documentation article for aggregation functions

* Extend docs

* Fix unit tests

* Add threshold averaging and smoothing

* Fix unit tests

* Update docstrings

* Update docs example

* Update interface function arguments

* Fix clipping

* Fix TensorDB unit test

* Drop delta records in clipping example

* Fix pip command in notebook

* Update docs

* Resolve plan after overrides

* Remove deltas for exponential averaging

* Fix exponential averaging

* Update notebook

* Fix smoothing arguments

* Fix threshold averaging arguments

* Change warning frequency for threshold averaging
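
The aggregation commits above make the server-side combination of collaborator updates pluggable: weighted averaging stays the default, and alternatives such as the geometric median are added. A framework-agnostic NumPy sketch of those two rules (the underlying math only, not the AggregationFunction interface these commits introduce):

```python
import numpy as np


def weighted_average(tensors, weights):
    """FedAvg-style aggregation: weighted mean of per-collaborator tensors."""
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()
    return sum(w * t for w, t in zip(weights, tensors))


def geometric_median(tensors, n_iter=100, eps=1e-8):
    """Robust aggregation via Weiszfeld's algorithm on flattened tensors."""
    points = np.stack([np.asarray(t, dtype=np.float64).ravel() for t in tensors])
    median = points.mean(axis=0)
    for _ in range(n_iter):
        dists = np.linalg.norm(points - median, axis=1)
        inv = 1.0 / np.maximum(dists, eps)  # guard against zero distance
        new_median = (inv[:, None] * points).sum(axis=0) / inv.sum()
        if np.linalg.norm(new_median - median) < eps:
            median = new_median
            break
        median = new_median
    return median.reshape(np.shape(tensors[0]))
```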

* Show error message if BraTS data is not found (#78)

* @sarthakpati has signed the CLA from Pull Request #80

* Added citation (#80)

* ignore updated

* azure set up added with twine auto-upload of pip packages (wheel and sdist)

* pytest added

* revert changes

* added citation

* Updated citation link

* Remove azure-pipelines.yaml from PR

Co-authored-by: Alexey Gruzdev <[email protected]>

* Remove usage of urllib (#81)

* Custom aggregation functions in CLI (#79)

* Allow to specify custom aggregation function in CLI

* Fix threshold averaging

* Remove TensorDB predefined aggregation functions

* Fix unit tests

* Always use FedAvg for metric tensors

* Use singleton default averaging instance

* Create Singleton type in utilities package

* Use metaclass instead of multiple inheritance

* Update plan.py

Co-authored-by: Alexey Gruzdev <[email protected]>

* Update __version__.py

* Update setup.py

Co-authored-by: Kyle Shannon <[email protected]>
Co-authored-by: Alexey Gruzdev <[email protected]>
Co-authored-by: Ilya Trushkin <[email protected]>
Co-authored-by: Dzhakhongir Olegov <[email protected]>
Co-authored-by: Tony Reina <[email protected]>
Co-authored-by: maradionov <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: walteriviera <[email protected]>
Co-authored-by: walteriviera <[email protected]>
Co-authored-by: Olga Perepelkina <[email protected]>
Co-authored-by: igor-davidyuk <[email protected]>
Co-authored-by: Sarthak Pati <[email protected]>
13 people authored May 19, 2021
1 parent 3645f8d commit 1f5d6e7
Showing 118 changed files with 15,229 additions and 687 deletions.
36 changes: 36 additions & 0 deletions .github/workflows/cla.yml
@@ -0,0 +1,36 @@
name: "CLA Assistant"
on:
issue_comment:
types: [created]
pull_request_target:
types: [opened,closed,synchronize]

jobs:
CLAssistant:
runs-on: ubuntu-latest
steps:
- name: "CLA Assistant"
if: (github.event.comment.body == 'recheck' || github.event.comment.body == 'I have read the CLA Document and I hereby sign the CLA') || github.event_name == 'pull_request_target'
# Beta Release
uses: cla-assistant/[email protected]
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# the below token should have repo scope and must be manually added by you in the repository's secret
PERSONAL_ACCESS_TOKEN : ${{ secrets.PERSONAL_ACCESS_TOKEN }}
with:
path-to-signatures: '.signatures/cla.json'
path-to-document: 'https://github.com/intel/openfl/blob/develop/CLA.md' # e.g. a CLA or a DCO document
# branch should not be protected
branch: 'develop'
allowlist: alexey-gruzdev,psfoley,msheller,brandon-edwards,tonyreina,itrushkin,aleksandr-mokrov,igor-davidyuk,operepel,maradionov,mansishr,dmitryagapov,openfl-helper,bot*

#below are the optional inputs - If the optional inputs are not given, then default values will be taken
#remote-organization-name: enter the remote organization name where the signatures should be stored (Default is storing the signatures in the same repository)
#remote-repository-name: enter the remote repository name where the signatures should be stored (Default is storing the signatures in the same repository)
#create-file-commit-message: 'For example: Creating file for storing CLA Signatures'
#signed-commit-message: 'For example: $contributorName has signed the CLA in #$pullRequestNo'
#custom-notsigned-prcomment: 'pull request comment with Introductory message to ask new contributors to sign'
#custom-pr-sign-comment: 'The signature to be committed in order to sign the CLA'
#custom-allsigned-prcomment: 'pull request comment when all contributors has signed, defaults to **CLA Assistant Lite bot** All Contributors have signed the CLA.'
#lock-pullrequest-aftermerge: false - if you don't want this bot to automatically lock the pull request after merging (default - true)
#use-dco-flag: true - If you are using DCO instead of CLA
3 changes: 2 additions & 1 deletion .gitignore
@@ -3,4 +3,5 @@ __pycache__
/build
/dist
.vscode
.ipynb_checkpoints
.ipynb_checkpoints
venv/*
36 changes: 36 additions & 0 deletions .signatures/cla.json
@@ -0,0 +1,36 @@
{
  "signedContributors": [
    {
      "name": "grib0ed0v",
      "id": 10911871,
      "comment_id": 811036981,
      "created_at": "2021-03-31T12:41:17Z",
      "repoId": 329117231,
      "pullRequestNo": 50
    },
    {
      "name": "walteriviera",
      "id": 78015934,
      "comment_id": 811258002,
      "created_at": "2021-03-31T17:06:50Z",
      "repoId": 329117231,
      "pullRequestNo": 35
    },
    {
      "name": "walteriviera",
      "id": 78015934,
      "comment_id": 811349094,
      "created_at": "2021-03-31T19:00:38Z",
      "repoId": 329117231,
      "pullRequestNo": 35
    },
    {
      "name": "sarthakpati",
      "id": 11719673,
      "comment_id": 841339208,
      "created_at": "2021-05-14T16:06:18Z",
      "repoId": 329117231,
      "pullRequestNo": 80
    }
  ]
}
16 changes: 16 additions & 0 deletions CLA.md
@@ -0,0 +1,16 @@
# Open Federated Learning (OpenFL) Contributor License Agreement

In order to clarify the intellectual property license granted with Contributions from any person or entity, Intel Corporation ("Intel") must have a Contributor License Agreement ("CLA") on file that has been signed by each Contributor, indicating agreement to the license terms below. This license is for your protection as a Contributor as well as the protection of Intel; it does not change your rights to use your own Contributions for any other purpose.
You accept and agree to the following terms and conditions for Your present and future Contributions submitted to Intel. Except for the license granted herein to Intel and recipients of software distributed by Intel, You reserve all right, title, and interest in and to Your Contributions.

1. Definitions.
"You" (or "Your") shall mean the copyright owner or legal entity authorized by the copyright owner that is making this Agreement with Intel. For legal entities, the entity making a Contribution and all other entities that control, are controlled by, or are under common control with that entity are considered to be a single Contributor. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.
"Contribution" shall mean any original work of authorship, including any modifications or additions to an existing work, that is intentionally submitted by You to Intel for inclusion in, or documentation of, any of the products owned or managed by Intel (the "Work"). For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to Intel or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, Intel for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by You as "Not a Contribution."
2. Grant of Copyright License. Subject to the terms and conditions of this Agreement, You hereby grant to Intel and to recipients of software distributed by Intel a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, sublicense, and distribute Your Contributions and such derivative works.
3. Grant of Patent License. Subject to the terms and conditions of this Agreement, You hereby grant to Intel and to recipients of software distributed by Intel a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by You that are necessarily infringed by Your Contribution(s) alone or by combination of Your Contribution(s) with the Work to which such Contribution(s) was submitted. If any entity institutes patent litigation against You or any other entity (including a cross-claim or counterclaim in a lawsuit) alleging that your Contribution, or the Work to which you have contributed, constitutes direct or contributory patent infringement, then any patent licenses granted to that entity under this Agreement for that Contribution or Work shall terminate as of the date such litigation is filed.
4. You represent that you are legally entitled to grant the above license. If your employer(s) has rights to intellectual property that you create that includes your Contributions, you represent that you have received permission to make Contributions on behalf of that employer, that your employer has waived such rights for your Contributions to Intel, or that your employer has executed a separate Corporate CLA with Intel.
5. You represent that each of Your Contributions is Your original creation (see section 7 for submissions on behalf of others). You represent that Your Contribution submissions include complete details of any third-party license or other restriction (including, but not limited to, related patents and trademarks) of which you are personally aware and which are associated with any part of Your Contributions.
6. You are not expected to provide support for Your Contributions, except to the extent You desire to provide support. You may provide support for free, for a fee, or not at all. Unless required by applicable law or agreed to in writing, You provide Your Contributions on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE.
7. Should You wish to submit work that is not Your original creation, You may submit it to Intel separately from any Contribution, identifying the complete details of its source and of any license or other restriction (including, but not limited to, related patents, trademarks, and license agreements) of which you are personally aware, and conspicuously marking the work as "Submitted on behalf of a third-party: [named here]".
8. You agree to notify Intel of any facts or circumstances of which you become aware that would make these representations inaccurate in any respect.

28 changes: 26 additions & 2 deletions README.md
@@ -5,8 +5,10 @@
[![Jenkins](https://img.shields.io/jenkins/build?jobUrl=http%3A%2F%2F213.221.44.203%2Fjob%2FFederated-Learning%2Fjob%2Fnightly%2F)](http://213.221.44.203/job/Federated-Learning/job/nightly/)
[![Documentation Status](https://readthedocs.org/projects/openfl/badge/?version=latest)](https://openfl.readthedocs.io/en/latest/?badge=latest)
[![PyPI version](https://img.shields.io/pypi/v/openfl)](https://pypi.org/project/openfl/)
[<img src="https://img.shields.io/badge/[email protected]?logo=slack">](https://join.slack.com/t/openfl/shared_invite/zt-lo1djtw4-DWAQE_wgfp1N_o2RUMqq9Q)
[<img src="https://img.shields.io/badge/[email protected]?logo=slack">](https://join.slack.com/t/openfl/shared_invite/zt-ovzbohvn-T5fApk05~YS_iZhjJ5yaTw)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Citation](https://img.shields.io/badge/cite-citation-blue)](https://arxiv.org/abs/2105.06413)


[Federated learning](https://en.wikipedia.org/wiki/Federated_learning) is a distributed machine learning approach that
enables organizations to collaborate on machine learning projects
@@ -29,6 +31,13 @@ Intel Internet of Things Group.

![Federated Learning](https://raw.githubusercontent.com/intel/openfl/master/docs/images/diagram_fl.png)

## Getting started

Check out our [online documentation](https://openfl.readthedocs.io/en/latest/index.html) to launch your first federation. The quickest way to test OpenFL is through our [Jupyter Notebook tutorials](https://openfl.readthedocs.io/en/latest/running_the_federation.notebook.html).

For more questions, please consider joining our [Slack channel](https://openfl.slack.com).


## Requirements

- OS: Tested on Ubuntu Linux 16.04 and 18.04.
@@ -43,13 +52,28 @@ By contributing to the project, you agree to the license and copyright terms therein
and release your contribution under these terms.

## Resources:
* Docs: https://openfl.readthedocs.io/en/latest/index.html
* Docs and Tutorials: https://openfl.readthedocs.io/en/latest/index.html
* Issue tracking: https://github.com/intel/openfl/issues
* [Slack channel](https://openfl.slack.com)

## Citation

```
@misc{reina2021openfl,
      title={OpenFL: An open-source framework for Federated Learning},
      author={G Anthony Reina and Alexey Gruzdev and Patrick Foley and Olga Perepelkina and Mansi Sharma and Igor Davidyuk and Ilya Trushkin and Maksim Radionov and Aleksandr Mokrov and Dmitry Agapov and Jason Martin and Brandon Edwards and Micah J. Sheller and Sarthak Pati and Prakash Narayana Moorthy and Shih-han Wang and Prashant Shah and Spyridon Bakas},
      year={2021},
      eprint={2105.06413},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```

## Support
Please report questions, issues and suggestions using:

* [GitHub* Issues](https://github.com/intel/openfl/issues)
* [Slack channel](https://openfl.slack.com)

### Relation to OpenFederatedLearning and the Federated Tumor Segmentation (FeTS) Initiative

2 changes: 2 additions & 0 deletions docs/advanced_topics.rst
@@ -11,5 +11,7 @@ Advanced Topics
   :maxdepth: 4

   multiple_plans
   compression_settings
   overriding_agg_fn


27 changes: 27 additions & 0 deletions docs/compression_settings.rst
@@ -0,0 +1,27 @@
.. # Copyright (C) 2021 Intel Corporation
.. # Licensed subject to the terms of the separately executed evaluation license agreement between Intel Corporation and you.
.. _compression_settings:

********************
Compression Settings
********************

Federated Learning can enable tens to thousands of participants to work together on the same model, but this scale increases communication cost, and large models exacerbate the problem. For this reason, compression is a core capability of |productName|, and the framework supports several lossless and lossy compression pipelines out of the box. Model weights are typically not robust to information loss, so no compression is applied by default to the weights sent in either direction; the deltas between the model weights for each round, however, are inherently more sparse and better suited to lossy compression. |productName| currently supports the following compression pipelines:

* ``NoCompressionPipeline``: the default option applied to model weights.
* ``RandomShiftPipeline``: a **lossless** pipeline that randomly shifts the weights during transport.
* ``STCPipeline``: a **lossy** pipeline of three transforms: a *Sparsity Transform* (p_sparsity=0.1), which by default retains only the (p*100)% of weights with the largest absolute values; a *Ternary Transform*, which discretizes the sparse array into three buckets; and a *GZIP Transform*.
* ``SKCPipeline``: a **lossy** pipeline of three transforms: a *Sparsity Transform* (p=0.1), which by default retains only the (p*100)% of weights with the largest absolute values; a *KMeans Transform* (k=6), which clusters the sparse array into *k* centroids; and a *GZIP Transform*.
* ``KCPipeline``: a **lossy** pipeline of two transforms: a *KMeans Transform* (k=6), which clusters the original weight array into *k* centroids, and a *GZIP Transform*.

We provide an example template, **keras_cnn_with_compression**, that uses the *KCPipeline* with 6 centroids for KMeans. To see how experiments perform with more or fewer centroids, modify the *n_clusters* parameter in the template's plan.yaml:

.. code-block:: console

   compression_pipeline :
     defaults : plan/defaults/compression_pipeline.yaml
     template : openfl.pipelines.KCPipeline
     settings :
       n_clusters : 6
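
As a usage sketch (the workspace prefix below is illustrative; the template name and the n_clusters setting come from the paragraph above), such a workspace would be created with the OpenFL CLI and the plan edited in place:

```
$ fx workspace create --prefix ./keras_cnn_compressed --template keras_cnn_with_compression
$ cd ./keras_cnn_compressed
$ ${EDITOR} plan/plan.yaml   # adjust n_clusters under compression_pipeline/settings
```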
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -37,7 +37,7 @@
"sphinx-prompt",
'sphinx_substitution_extensions',
"sphinx.ext.ifconfig",
"sphinxcontrib.mermaid"
"sphinxcontrib.kroki"
]

# -- Project information -----------------------------------------------------
Binary file added docs/images/openfl_flow.png
3 changes: 2 additions & 1 deletion docs/index.rst
@@ -28,13 +28,14 @@ can use any deep learning frameworks, such as `Tensorflow <https://www.tensorflo
|productName| is developed by Intel Labs and Intel Internet of Things Group.

.. toctree::
   :maxdepth: 4
   :maxdepth: 2
   :caption: Contents:

   manual
   openfl
   models
   data
   troubleshooting


Indices and tables
2 changes: 1 addition & 1 deletion docs/install.docker.rst
@@ -45,7 +45,7 @@ The current design is based on three assumptions:
:alt: Docker design
:scale: 70%

Docker design
.. centered:: Docker design


Build the docker image
2 changes: 1 addition & 1 deletion docs/install.initial.rst
@@ -37,4 +37,4 @@ On every node in the federation you will need to install the |productName| package
.. figure:: images/fx_help.png
:scale: 70 %

fx command
.. centered:: fx command
3 changes: 0 additions & 3 deletions docs/mermaid/CSR_signing.md → docs/mermaid/CSR_signing.mmd
@@ -1,4 +1,3 @@
```mermaid
sequenceDiagram
Title: Collaborator Certificate Signing Flow
participant A as Alice
@@ -16,5 +15,3 @@ Title: Collaborator Certificate Signing Flow
B->>BG: Bob runs script to sign .csr,<br/> confirming the hash as input,<br/> creating the .crt file
B->>A: Bob sends the .crt file back to Alice
A->>AC: Alice copies the signed certificate (.crt)<br/>to her collaborator node.<br/>She now has a signed certificate.
```
