Phasing out use of n1 machines in favor of n2 machines on GKE #2923

Closed
14 of 19 tasks
consideRatio opened this issue Aug 4, 2023 · 1 comment
Labels
tech:cloud-infra Optimization of cloud infra to reduce costs etc.

Comments

consideRatio commented Aug 4, 2023

After some research, I've concluded that n2 machines seem like an almost pure win over n1:

  • n2-highmem is more cost-effective per unit of RAM
  • n2-highmem is more cost-effective per unit of CPU performance
  • n2-highmem has a 1:8 CPU:RAM ratio, while n1 stands out with its unusual ratio of 1:6.5
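
To make the ratio point concrete, here is a small sketch comparing the two 16-vCPU highmem shapes (104 GB for n1-highmem-16, 128 GB for n2-highmem-16). The 8 GB per-user memory request is a hypothetical value chosen for illustration, and the calculation ignores the memory Kubernetes and system pods reserve on each node, so treat the pod counts as upper bounds rather than what a real cluster would schedule.

```python
# Shapes of the 16-vCPU highmem machine types on Google Cloud.
MACHINES = {
    "n1-highmem-16": {"vcpus": 16, "memory_gb": 104},
    "n2-highmem-16": {"vcpus": 16, "memory_gb": 128},
}


def memory_per_vcpu(machine_type: str) -> float:
    """GB of RAM per vCPU for a machine type listed in MACHINES."""
    spec = MACHINES[machine_type]
    return spec["memory_gb"] / spec["vcpus"]


def users_per_node(machine_type: str, memory_request_gb: float) -> int:
    """Upper bound on user pods that fit by memory alone.

    Ignores memory reserved by Kubernetes and system pods (an assumption
    made to keep the arithmetic simple).
    """
    return int(MACHINES[machine_type]["memory_gb"] // memory_request_gb)


if __name__ == "__main__":
    request_gb = 8  # hypothetical per-user memory request
    for name in MACHINES:
        print(name, memory_per_vcpu(name), users_per_node(name, request_gb))
```

With a memory request that is a power of two, the 1:8 ratio divides evenly, while the 1:6.5 ratio leaves part of the node's memory unused.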

Proposal

  • Switch anything we are touching anyway from n1 to n2
    Exception: we need to retain n1 nodes for GPU machines, as they may only come as n1 variants
  • Update templates so new clusters are bootstrapped with n2 machines by default
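
The rule above, including the GPU exception, could be sketched roughly as follows. The function name and the `has_gpu` flag are hypothetical and not taken from any existing tooling. The sketch also assumes that swapping the family prefix yields a valid n2 machine type, which should be checked by hand since the two families do not offer exactly the same sizes.

```python
from typing import Optional


def proposed_machine_type(machine_type: str, has_gpu: bool) -> Optional[str]:
    """Return the n2 type to switch to, or None if the pool stays as it is.

    Hypothetical helper: GPU node pools and non-n1 pools are left alone.
    """
    family, _, shape = machine_type.partition("-")
    if family != "n1" or has_gpu:
        return None
    # Assumption: the same shape name exists in the n2 family.
    return f"n2-{shape}"


if __name__ == "__main__":
    print(proposed_machine_type("n1-highmem-4", has_gpu=False))
    print(proposed_machine_type("n1-highmem-4", has_gpu=True))
    print(proposed_machine_type("n2-highmem-16", has_gpu=False))
```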

Current use of n1 machines

  • 2i2c-uk core nodes
  • 2i2c-uk user nodes
  • cloudbank core nodes
  • cloudbank user nodes
  • cloudbank dask nodes (*use only one node pool with n2-highmem-16)
  • m2lines core nodes
  • m2lines user nodes
  • m2lines dask nodes (*use only one node pool with n2-highmem-16)
  • meom-ige core nodes
  • meom-ige user nodes
  • meom-ige dask nodes (*use only one node pool with n2-highmem-16)
  • pangeo-hubs core nodes
  • pangeo-hubs user nodes
  • pangeo-hubs dask nodes (*use only one node pool with n2-highmem-16)
  • two-eye-two-see core nodes
  • two-eye-two-see user nodes
  • two-eye-two-see dask nodes (*use only one node pool with n2-highmem-16)
  • two-eye-two-see neurohackademy nodes

*In #2687 I'm proposing we standardize on n2-highmem-16 nodes for dask workers.

Related

The following issues relate to this smaller and more tightly scoped issue.

@consideRatio

Not many changes remain, so I'm closing this in favor of a cloud-provider-agnostic issue, as I want to convey the same strategy for all cloud providers.

@github-project-automation github-project-automation bot moved this from Needs Shaping / Refinement to Complete in DEPRECATED Engineering and Product Backlog Oct 29, 2023
@consideRatio consideRatio self-assigned this Oct 29, 2023
@consideRatio consideRatio moved this to Done 🎉 in Sprint Board Oct 31, 2023
@consideRatio consideRatio removed the status in Sprint Board Oct 31, 2023
@damianavila damianavila moved this to Done 🎉 in Sprint Board Oct 31, 2023