Updates to deployment guides #3994

Merged
merged 24 commits into from
Oct 7, 2023
Changes from 1 commit
Commits
db159e1
Updates to deployment guides
davidmirror-ops Aug 28, 2023
4063745
Update multicluster docs round 2
davidmirror-ops Sep 19, 2023
d36e9dc
Updates instructions from last run
davidmirror-ops Sep 28, 2023
08692a7
Add instructions to add clusters
davidmirror-ops Sep 29, 2023
be2abfd
Fix typos
davidmirror-ops Sep 29, 2023
4c51dc2
Fix JSON indentation in example
davidmirror-ops Sep 29, 2023
70737b7
Fix JSON indentation in example 2nd try
davidmirror-ops Sep 29, 2023
bb25ab6
Fix JSON missing blank line
davidmirror-ops Sep 29, 2023
cf77de9
Fix JSON missing blank line 3rd try
davidmirror-ops Sep 29, 2023
56cfed8
Fix JSON missing blank line 4th try
davidmirror-ops Sep 29, 2023
828359b
Fix JSON syntax
davidmirror-ops Sep 29, 2023
c7f7e3e
Fix JSON syntax 6th try
davidmirror-ops Sep 29, 2023
18f0fc8
Remove JSON block
davidmirror-ops Oct 2, 2023
e5cea21
Fix error in line 57
davidmirror-ops Oct 3, 2023
e9b685b
Fix spelling
davidmirror-ops Oct 3, 2023
b257695
Apply feedback from review
davidmirror-ops Oct 3, 2023
ee5c001
Fix hyperlink
davidmirror-ops Oct 3, 2023
7db1b7f
Fix blank space
davidmirror-ops Oct 3, 2023
0e2d871
Incorporate review
davidmirror-ops Oct 3, 2023
9199f41
Incorporate 2nd round of review
davidmirror-ops Oct 4, 2023
db07bab
Instructions using 2 IAM Roles
davidmirror-ops Oct 4, 2023
83d8c35
Incorporate 3rd round of feedback
davidmirror-ops Oct 5, 2023
e784efd
Add instructions to enable controlplane wf execution
davidmirror-ops Oct 5, 2023
fb0bb64
Incorporate 4th round of reviews
davidmirror-ops Oct 5, 2023
2 changes: 1 addition & 1 deletion charts/flyte-binary/eks-production.yaml
Expand Up @@ -132,7 +132,7 @@ ingress:
nginx.ingress.kubernetes.io/app-root: /console
grpcAnnotations:
nginx.ingress.kubernetes.io/backend-protocol: GRPC
host: development.uniondemo.run
host: development.uniondemo.run # change for the URL you'll use to connect to Flyte
Contributor
Can we make this an inexistent url? Maybe rename this to '<your.flyte.url>' ?

rbac:
extraRules:
- apiGroups:
15 changes: 4 additions & 11 deletions rsts/deployment/deployment/cloud_production.rst
Expand Up @@ -28,18 +28,18 @@ To turn on ingress, update your ``values.yaml`` file to include the following bl
.. literalinclude:: ../../../charts/flyte-binary/eks-production.yaml
:caption: charts/flyte-binary/eks-production.yaml
:language: yaml
:lines: 123-131
:lines: 127-135

.. note::

This currently assumes that you have nginx ingress. We'll be updating these
in the near future to use the ALB ingress controller instead.
This section assumes that you're using the NGINX Ingress controller. Instructions and annotations for the ALB controller
are covered in the `Flyte The Hard Way <https://github.com/davidmirror-ops/flyte-the-hard-way/blob/main/docs/06-intro-to-ingress.md#setting-up-amazons-load-balancer-alb-ingress-controller>`__ tutorial.
Contributor
FYI these were the final annotations that made things work correctly for me

{
  "alb.ingress.kubernetes.io/certificate-arn": helm.values.userSettings.certificateArn
  "alb.ingress.kubernetes.io/group.name":      "flyte"
  "alb.ingress.kubernetes.io/listen-ports":    '[{"HTTPS":443}]'
  "alb.ingress.kubernetes.io/scheme":          "internet-facing"
  "alb.ingress.kubernetes.io/ssl-redirect":    "443"
  "alb.ingress.kubernetes.io/target-type":     "ip"
  "kubernetes.io/ingress.class":               "alb"
  "alb.ingress.kubernetes.io/inbound-cidrs":   "xxx.xxx.xxx.xxx,xxx.xxx.xxx.xxx"
}


***************
Authentication
***************

Authentication comes with Flyte in the form of OAuth 2. Please see the
Authentication comes with Flyte in the form of OAuth 2.0. Please see the
`authentication guide <deployment-configuration-auth-setup>`__ for instructions.

.. note::
Expand All @@ -60,10 +60,3 @@ compatibility being maintained, for the most part.

If you're using the :ref:`multi-cluster <deployment-deployment-multicluster>`
deployment model for Flyte, components should be upgraded together.

.. note::

Expect to see minor version releases roughly 4-6 times a year - we aim to
release monthly, or whenever there is a large enough set of features to
warrant a release. Expect to see patch releases at more regular intervals,
especially for flytekit, the Python SDK.
8 changes: 8 additions & 0 deletions rsts/deployment/deployment/cloud_simple.rst
Expand Up @@ -115,6 +115,14 @@ hello world example:
cd flytesnacks/cookbook
pyflyte run --remote core/flyte_basics/hello_world.py my_wf

***********************************
Flyte in on-premises infrastructure
***********************************

Sometimes, it's also helpful to be able to set up a Flyte environment in an on-premises Kubernetes environment or even on a laptop for testing and development purposes.
Check out `this community-maintained tutorial <https://github.com/davidmirror-ops/flyte-the-hard-way/blob/main/docs/on-premises/001-configure-local-k8s.md>`__ to learn how to set up the required dependencies and deploy the ``flyte-binary`` chart to a local Kubernetes cluster.


*************
What's Next?
*************
32 changes: 6 additions & 26 deletions rsts/deployment/deployment/index.rst
Expand Up @@ -49,29 +49,6 @@ deployment comes with a containerized `Minio <https://min.io/>`__, which offers
- **GCP**: `GCS <https://cloud.google.com/storage/>`__
- **Azure**: `Azure Blob Storage <https://azure.microsoft.com/en-us/products/storage/blobs>`__


Cluster Configuration
Contributor
Why are we removing this section?

Contributor Author
I'm just not sure how it fits here. Additionally is not clear what specific K8s resources it refers to (besides namespaces for projects, which is only an example). So this relationship between Flyte objects and K8s resources is worth documenting, but I'm not sure it fits in this sort of pre-requisites section.

=====================

Flyte configures K8s clusters to work with it. For example, as your Flyte userbase evolves, adding new projects is as
simple as registering them through the command line:

.. prompt:: bash $

flytectl create project \
--id my-flyte-project \
--name "My Flyte Project" \
--description "My first project onboarding onto Flyte"

Once you invoke this command, this project should immediately show up in the Flyte console after refreshing.

Flyte runs at a configurable cadence that ensures that all Kubernetes resources necessary for the new project are
created and new workflows can successfully be registered and executed within it.

.. note::

For more information, see :std:ref:`flytectl <flytectl:flytectl_create_project>`.

************************
Flyte Deployment Paths
************************
Expand Down Expand Up @@ -108,7 +85,7 @@ There are three different paths for deploying a Flyte cluster:
This option is appropriate if all your compute can `fit on one EKS cluster <https://docs.aws.amazon.com/eks/latest/userguide/service-quotas.html>`__ .
As of this writing, a single Flyte cluster can handle more than 13,000 nodes.

Whatever path you choose, note that ``FlytePropeller`` itself can be sharded as well, though typically it's not required.
Regardless of using single or multiple Kubernetes clusters for Flyte, note that ``FlytePropeller`` -tha main data plane component- can be sharded as well, if scale demands require it.
Contributor
typo: tha -> the


Helm
====
Expand Down Expand Up @@ -156,10 +133,13 @@ Deployment Tips and Tricks

Due to the many choices and constraints that you may face in your organization, the specific steps for deploying Flyte
can vary significantly. For example, which cloud platform to use is typically a big fork in the road for many, and there
are many choices to make in terms of ingresses, auth providers, and versions of different dependent libraries that
are many choices to make in terms of Ingress controllers, auth providers, and versions of different dependent libraries that
may interact with other parts of your stack.

In addition to searching and posting on the `Flyte Slack community <https://flyte-org.slack.com/archives/C01P3B761A6>`__,
Considering the above, we recommend checking out the `"Flyte The Hard Way" <https://github.com/davidmirror-ops/flyte-the-hard-way/tree/main#flyte-the-hard-way>`__ set of community-maintained tutorials that can guide you through the process of preparing the infrastructure and
Contributor
In general I would avoid pointing to an external guide, give also that the guide you are referring to is mostly referring to the single binary deployment, which might be confusing. I think all documentation should be in one place

Contributor Author
Agree. A second iteration of updates to these guides should be pointed to extend the reach and make it more actionable. Also expanding the tutorial in the https://github.com/unionai-oss/deploy-flyte repo to cover flyte-core

Contributor
what if we forked your guide, @davidmirror-ops , and moved it to flyteorg?

Contributor Author
@eapolinario I was thinking of something similar. Moving forward, the idea is that TF is the preferred approach but a manual guide is always a good resource

deploying Flyte.

In addition to searching and posting on the `#flyte-deployment Slack channel <https://flyte-org.slack.com/archives/C01P3B761A6>`__,
we have a `Github Discussion <https://github.com/flyteorg/flyte/discussions/categories/deployment-tips-tricks>`__
section dedicated to deploying Flyte. Feel free to submit any hints you've found helpful as a discussion, ask questions,
or simply document what worked or what didn't work for you.
63 changes: 31 additions & 32 deletions rsts/deployment/deployment/multicluster.rst
Expand Up @@ -8,8 +8,8 @@ Multiple K8s Cluster Deployment

.. note::

The multicluster deployment described in this doc assumes you have deployed
the ``flyte`` Helm chart, which runs the individual Flyte services separately.
The multicluster deployment described in this section assumes you have deployed
the ``flyte-core`` Helm chart, which runs the individual Flyte services separately.
This is needed because in a multicluster setup, the execution engine is
deployed to multiple K8s clusters. This will not work with the ``flyte-binary``
Helm chart, since that chart deploys all Flyte services as one single binary.
Expand All @@ -24,23 +24,22 @@ Scaling Beyond Kubernetes
execution. The data plane fulfills these workflows by launching pods in
Kubernetes.

At very large companies, total compute needs could exceed the limits of a single
At large organizations, total compute needs could exceed the limits of a single
Kubernetes cluster.

To address this, you can deploy the data plane to multiple Kubernetes clusters.
The control plane (FlyteAdmin) can be configured to load-balance workflows across
these individual data planes, protecting you from failure in a single Kubernetes
cluster increasing scalability.
cluster, thus increasing scalability.

To achieve this, first, you have to create additional Kubernetes clusters.
For now, let's assume you have three Kubernetes clusters and that you can access
To achieve this, first you have to create additional Kubernetes clusters.

This guide assumes that you have three Kubernetes clusters and that you can access
them all with ``kubectl``.

Let's call these clusters ``cluster1``, ``cluster2``, and ``cluster3``.
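
Before deploying anything, it can help to confirm that each cluster is actually reachable. The following is a minimal sketch, assuming every cluster has a kubeconfig context named after it (``cluster1``, ``cluster2``, ``cluster3``). The ``check_clusters`` helper and the ``KUBECTL`` override are hypothetical names introduced here for illustration; they are not part of Flyte or kubectl.

```shell
#!/usr/bin/env bash
# Sketch: verify that every data plane cluster answers a basic API request.
# Assumption: kubeconfig context names match the cluster names used in this guide.
check_clusters() {
  local kubectl_bin="${KUBECTL:-kubectl}"  # overridable, e.g. for a dry run
  local ctx
  for ctx in "$@"; do
    # Fails if the context does not exist or the API server is unreachable.
    "$kubectl_bin" --context "$ctx" get nodes || return 1
  done
}

# Usage (requires access to the clusters):
#   check_clusters cluster1 cluster2 cluster3
```

Setting ``KUBECTL=echo`` prints the commands instead of running them, which is a cheap way to review what would be executed.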

Next, deploy *only* the data planes to these clusters. To do this, remove the
data plane components from the ``flyte`` overlay, and create a new overlay
containing *only* the data plane resources.
Next, deploy *only* the data planes to these clusters. To do this, use the `values-dataplane.yaml <https://github.com/flyteorg/flyte/blob/master/charts/flyte-core/values-dataplane.yaml>`__ provided with the Helm chart.

Data Plane Deployment
*********************
Expand All @@ -61,16 +60,16 @@ Install Flyte data plane Helm chart

.. code-block::

helm upgrade flyte -n flyte flyteorg/flyte-core values.yaml \
helm upgrade -n flyte -f values.yaml \
-f values-eks.yaml \
-f values-dataplane.yaml \
--create-namespace flyte --install
--create-namespace flyte flyteorg/flyte-core --install

.. tabbed:: GCP

.. code-block::

helm upgrade flyte -n flyte flyteorg/flyte-core values.yaml \
helm upgrade flyte -n flyte flyteorg/flyte-core -f values.yaml \
-f values-gcp.yaml \
-f values-dataplane.yaml \
--create-namespace --install
Expand All @@ -83,24 +82,24 @@ Some Flyte deployments may choose to run the control plane separate from the dat
plane. FlyteAdmin is designed to create Kubernetes resources in one or more
Flyte data plane clusters. For the admin to access remote clusters, it needs
credentials to each cluster.
Flyte makes use of Kubernetess Service Accounts to enable every data plane cluster to perform
Contributor
That is not the case. Flyte makes use of Kubernetes (you have a type kubernetess) Service Accounts to enable the control plane to issue authenticated requests to each data plane Kubernetes API server

Contributor Author
Noted, you're right.

authenticated requests to the K8s API Server.
The default behaviour is that ``FlyteAdmin`` creates a `ServiceAccount <https://github.com/flyteorg/flyte/blob/master/charts/flyte-core/templates/admin/rbac.yaml#L4>`_
in each data plane cluster.
In order to verify requests, the API Server expects a `signed bearer token <https://kubernetes.io/docs/reference/access-authn-authz/authentication/#service-account-tokens>`__
attached to the Service Account.

In Kubernetes, scoped service credentials are created by configuring a "Role"
resource in a Kubernetes cluster. When you attach the role to a "ServiceAccount",
Kubernetes generates a bearer token that permits access. Hence, create a
FlyteAdmin `ServiceAccount <https://github.com/flyteorg/flyte/blob/master/charts/flyte-core/templates/admin/rbac.yaml#L4>`_
in each data plane cluster to generate these tokens.

.. warning::

**Never delete a ServiceAccount 🛑**

When you first create the FlyteAdmin ``ServiceAccount`` in a new cluster, a
bearer token is generated and will continue to allow access unless the
"ServiceAccount" is deleted.
.. note::
In Kubernetes 1.24 and above, the bearer token has to be generated manually for a Service Account, using the following command:

To feed the credentials to FlyteAdmin, you must retrieve them from your new data plane cluster and upload them to admin (for example, within Lyft, `Confidant <https://github.com/lyft/confidant>`__ is used).
.. prompt:: bash $

kubectl create token <service-account-name> -n <namespace>
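
Tokens issued by ``kubectl create token`` are time-bound, so it may be worth requesting an explicit lifetime. The sketch below wraps the command above and optionally passes kubectl's ``--duration`` flag; the API server may cap the requested value. The ``request_token`` helper and the ``KUBECTL`` override are hypothetical names used only for illustration, and the service account and namespace remain placeholders to be filled in for your deployment.

```shell
#!/usr/bin/env bash
# Sketch: request a bearer token for a service account, optionally with a lifetime.
# Arguments: <service-account-name> <namespace> [duration, e.g. 24h]
request_token() {
  local sa="$1" ns="$2" duration="${3:-}"
  local kubectl_bin="${KUBECTL:-kubectl}"  # overridable, e.g. for a dry run
  if [ -n "$duration" ]; then
    "$kubectl_bin" create token "$sa" -n "$ns" --duration="$duration"
  else
    # Without --duration, the server chooses the token lifetime.
    "$kubectl_bin" create token "$sa" -n "$ns"
  fi
}

# Usage (run against the data plane cluster):
#   request_token <service-account-name> <namespace> 24h
```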
Contributor @gdabisias, Aug 29, 2023
I'd expand on what service-account-name should be here.
If I understand correctly this is a separate service account, created in the data plane, which will be assume by the flyteadmin service account in the control plane to generate all the required resources in the data plane, right?


To feed the credentials to FlyteAdmin, you must retrieve them from your new data plane cluster and upload them to ``FlyteAdmin``.
Contributor
I would not use the word "upload" which might be misleading. I'd say that you need update your control plane configuration with the new secrets


The credentials have two parts ("ca cert" and "bearer token"). Find the generated secret via:
The credentials have two parts (``ca cert`` and ``bearer token``). Find the generated secret via:

.. prompt:: bash $

Expand Down Expand Up @@ -133,12 +132,12 @@ file named ``secrets.yaml`` that looks like:
namespace: flyte
type: Opaque
data:
cluster_1_token: {{ cluster 1 token here }}
cluster_1_cacert: {{ cluster 1 cacert here }}
cluster_2_token: {{ cluster 2 token here }}
cluster_2_cacert: {{ cluster 2 cacert here }}
cluster_3_token: {{ cluster 3 token here }}
cluster_3_cacert: {{ cluster 3 cacert here }}
cluster_1_token: "cluster-1-token-here"
Contributor
Are we assuming that we have 3 dataplanes or 2? Since in the first case we only need 2 entries. I'd also specify how to change the names of the clusters

cluster_1_cacert: "cluster-1-cacert-here"
cluster_2_token: "cluster-2-token-here"
cluster_2_cacert: "cluster-2-cacert-here"
cluster_3_token: "cluster-3-token-here"
cluster_3_cacert: "cluster-3-cacert-here"
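
Values under the ``data`` field of a Kubernetes Secret must be base64-encoded. The following is a minimal sketch for encoding a raw value, such as a token printed by ``kubectl create token``. The ``encode_secret_value`` helper is a hypothetical name introduced here. Values read from an existing Secret's ``.data`` field are already base64-encoded and should not be encoded a second time.

```shell
#!/usr/bin/env bash
# Sketch: base64-encode a raw value for the `data` field of a Kubernetes Secret.
encode_secret_value() {
  # printf avoids a trailing newline in the input; tr removes any line wraps
  # that some base64 implementations add to long output.
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Usage:
#   encode_secret_value "<raw token or CA certificate>"
```

The resulting string is what would replace a placeholder such as ``cluster-1-token-here``.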

Create cluster credentials secret in the control plane cluster.

16 changes: 4 additions & 12 deletions rsts/deployment/deployment/sandbox.rst
Expand Up @@ -6,11 +6,11 @@ Sandbox Deployment

.. tags:: Kubernetes, Infrastructure, Basic

A sandbox deployment of Flyte is bundles together portable versions of Flyte's
A sandbox deployment of Flyte bundles together portable versions of Flyte's
dependencies such as a relational database and durable object store.

For the blob store requirements, Flyte Sandbox uses `Minio <https://min.io/>`__,
which offers an S3 compatible interface, and for Postgres, we use the stock
which offers an S3 compatible interface, and for Postgres, it uses the stock
Postgres Docker image and Helm chart.

.. important::
Expand Down Expand Up @@ -41,7 +41,7 @@ Requirements
- Install `docker <https://docs.docker.com/engine/install/>`__ or any other OCI-compatible tool, like Podman or LXD.
- Install `flytectl <https://github.com/flyteorg/flytectl>`__, the official CLI for Flyte.

While Flyte can run any OCI-compatible task image, using the default Kubernetes container runtime (cri-o), the Flyte
While Flyte can run any OCI-compatible task image using the default Kubernetes container runtime (cri-o), the Flyte
Contributor
Isn't the default k8s container runtime containerd?

core maintainers typically use Docker. Note that the ``flytectl demo`` command does rely on Docker APIs, but as this
demo environment is just one self-contained image, you can also run the image directly using another run time.

Expand Down Expand Up @@ -79,12 +79,4 @@ who wish to dig deeper into the storage layer.
📂 The Minio API is hosted on localhost:30002. Use http://localhost:30080/minio/login for Minio console

Now that you have the sandbox cluster running, you can now go to the :ref:`User Guide <cookbook:userguide>` or
:ref:`Tutorials <cookbook:tutorials>` to run tasks and workflows written in ``flytekit``, the Python SDK for Flyte.

**************************
Flyte Sandbox on the Cloud
**************************

Sometimes it's also helpful to be able to install a sandboxed environment on a cloud provider. That is, you have access
to an EKS or GKE cluster, but provisioning a separate database or blob storage bucket is harder because of a lack of
infrastructure support. Instructions for how to do this will be forthcoming.
:ref:`Tutorials <cookbook:tutorials>` to run tasks and workflows written in ``flytekit``, the Python SDK for Flyte.