Create dedicated cloud environment for each of the universities participating in the Climate Risk Research Challenge #346

Open
3 tasks
HeatherAck opened this issue Aug 18, 2023 · 22 comments

Comments

@HeatherAck
Contributor

HeatherAck commented Aug 18, 2023

As part of the Sustainable Africa Initiative and Climate Risk Research Challenge in Nigeria, we are providing Nigerian university participants with cloud compute resources. We have procured additional AWS credits ($25K) for this purpose. Contact [email protected] for credit info.

  • Please set up each team with a private repo; set this up by 24-Aug, for use between 1-Sep and 31-Dec
    Universities involved:
    • University of Ilorin
    • Kwara State University
    • University of Ibadan
    • University of Lagos
    • Federal University of Technology, Port Harcourt
    • Abubakar Tafawa Balewa University
    • Bauchi State University of Science and Technology
    • Bayero University, Kano
    • Kano State University of Science and Technology
    • Federal University of Technology, Owerri

  • Estimate 20 teams across all universities with a maximum of 10 team members per team (avg. 5); each team member will likely use a Jupyter notebook. ~32 GB/8-core box; 10 GB per participant; initial setup by 24-Aug; final setup by 28-Aug

  • GitHub – private repo for each team with team members given sole access under https://github.com/SustainableAfrica/ClimateRiskChallenge/tree/main/Nigeria; initial setup by 24-Aug; final setup by 28-Aug.
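
As a rough sanity check on the sizing above, the figures can be turned into headcount, storage, and budget numbers. This is only a sketch: it assumes "10G per participant" means 10 GB of storage, that the 1-Sep to 31-Dec window is in 2023, and that the $25K in credits is split evenly across teams. The function and constant names are illustrative.

```python
from datetime import date

# Figures taken from the issue text; the year 2023 is an assumption.
TEAMS = 20
AVG_MEMBERS_PER_TEAM = 5
MAX_MEMBERS_PER_TEAM = 10
STORAGE_GB_PER_PARTICIPANT = 10  # assumes "10G" means 10 GB of storage
CREDITS_USD = 25_000
START, END = date(2023, 9, 1), date(2023, 12, 31)


def estimate(members_per_team: int) -> dict:
    """Return headcount, storage, and budget figures for one team size."""
    participants = TEAMS * members_per_team
    days = (END - START).days + 1  # count both the first and the last day
    return {
        "participants": participants,
        "storage_gb": participants * STORAGE_GB_PER_PARTICIPANT,
        "days": days,
        "usd_per_team": CREDITS_USD / TEAMS,
        "usd_per_team_per_day": round(CREDITS_USD / TEAMS / days, 2),
    }


if __name__ == "__main__":
    for label, size in (("average", AVG_MEMBERS_PER_TEAM),
                        ("maximum", MAX_MEMBERS_PER_TEAM)):
        print(label, estimate(size))
```

With the average team size this gives about 100 participants and roughly 1 TB of storage; the even split of credits is a simplification, since shared cluster costs are not per team.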

@HeatherAck
Contributor Author

HeatherAck commented Aug 21, 2023

Heather to provide access for all students.

  • Put cluster in EU due to location in Africa
  • Determine whether credits can be added to the existing account
  • Determine whether an AWS account can be created under the OS-Climate org
  • Create and configure an OpenShift cluster
  • Vault, test new version of ODH
  • No CI/CD - manual (ArgoCD only if faster)
  • Fixed resource model, manual monitoring
  • API definition needs to be created to connect OpenShift (Heather to meet with Mikhail)
  • Enable Notebook Controller via ODH
  • Leverage DEX & GitHub for Access
  • Create OpenShift Group
  • Each team to have its own namespace
  • Use secrets external to vault
  • Standard Jupyter Notebook (if special libraries are needed, Universities to self manage - must be open source libraries) to point to private repo
  • Limit notebooks to prevent using more than $25K credits (equal distribution of resources)
  • Data requirements, storage limits - geospatial
  • Set up storage with max limit. Tool to upload and manage data.
  • Each team will get its own namespace - with given set of resources - audit usage on a daily basis.
  • Heather to establish a form for cloud resource requests from each team.
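
The "own namespace with a given set of resources" item could be expressed as a Namespace plus a ResourceQuota per team. The sketch below builds both as plain dictionaries and prints them as JSON, which `oc apply -f` accepts alongside YAML. The quota values and the `team_manifests` helper are hypothetical; the thread does not state the actual limits.

```python
import json


def team_manifests(team: str, cpu: str, memory: str, storage: str) -> list[dict]:
    """Build a Namespace and a ResourceQuota capping what one team can request."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": team},
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{team}-quota", "namespace": team},
        "spec": {
            "hard": {
                "requests.cpu": cpu,          # total CPU requested by all pods
                "requests.memory": memory,    # total memory requested by all pods
                "requests.storage": storage,  # total storage across all PVCs
            }
        },
    }
    return [namespace, quota]


if __name__ == "__main__":
    # Illustrative values only: one 8-core/32 GB box and 5 x 10 GB of storage.
    items = team_manifests("challenge-team1", "8", "32Gi", "50Gi")
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

A fixed quota per namespace fits the "fixed resource model, manual monitoring" approach: teams cannot exceed their share, and the daily audit only needs to compare usage against known caps.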

@HeatherAck
Contributor Author

@ryanaslett will create a new AWS account today, 22-Aug (a sub-account can't be created under OS-C). @ryanaslett to set up admin console access and share it with Mikhail. @redmikhail will establish the cluster once the account is shared with him. After that, the steps are: set up GitHub; create the authenticator; create the application and application keys; connect OpenShift to the GitHub authenticator for github.com/sustainableafrica (bind to a team, then team members will be authenticated against OpenShift). Heather to create the group of users with private repo(s); an OpenShift group/namespace will still need to be created separately [GitOps level].

@HeatherAck
Contributor Author

HeatherAck commented Aug 23, 2023

will use AWS account that Brendan established.

@HeatherAck
Contributor Author

HeatherAck commented Aug 24, 2023

Confirm that the South Africa region supports M6 Intel, C4, G4DN, and P3 instances (price ~15% more). Decision: no GPUs for Nigeria's challenge.

  • review AWS list of instances and request quota increase - MM
  • enable Cape Town region - RA
  • add credit (pending given billing access) - HA
  • create subdomain apps.sustainableafrica.org - BR
  • build openshift cluster - MM
  • configure authentication - RA
  • update billing policy to include Heather, Ryan and Mikhail - BR
  • connect to data through os-climate

@HeatherAck
Contributor Author

HeatherAck commented Aug 25, 2023

  • @redmikhail requested quotas - can only request 2 at a time and needs 4 or 5 more. South Africa does not have as many resources available; will request over the weekend and follow up (EC2 instances limited to 5 at a time). Will not proceed with standing up the cluster in South Africa until all quotas are approved, in case there are limits (special request issued). Backup plan is to request quotas in Ireland. Will set up OpenShift - MM
  • Domain DNS - subdomain has not been set up (Brendan submitted this last night - will make sure it is set up today). It is already in Route53. - BR
  • MM created a subdomain in the interim: sai.apps.os-climate.org
  • Definition of API with GitHub - MM
  • Authentication set up through GitHub - MM 25-Aug
  • Provision remaining resources - ArgoCD / repo under SustainableAfrica (similar to OS-C). Storage of manifests - MM 25-Aug
  • Create groups that reflect teams in GitHub (Ryan will create a test university team and admin team) - RA
  • Provision new ODH (stable version will be used); https://opendatahub.io/ - Jey
  • YAML file will control who has access to each namespace (pull request or commit directly) - https://github.com/os-climate/ops-argocd/blob/main/cluster-scope/base/user.openshift.io/groups/cluster-admins/group.yaml
  • Need to create namespaces and JupyterHub

@ryanaslett

I've created the teams hierarchy:
image

When we're ready to test, I have a testing GitHub account assigned to The University of Maximegalon so we can verify end-to-end access.

@HeatherAck
Contributor Author

HeatherAck commented Aug 28, 2023

New OpenShift cluster established: https://console-openshift-console.apps.challenge-cl1.sai.apps.os-climate.org
Next step: set up authentication

@HeatherAck
Contributor Author

Next steps by Thursday:
Brendan to establish the subdomain in route53
Received an error on authentication.
Install ArgoCD and ensure it is running
Provision groups
Install ODH manually
Mikhail to give access to LF team members

Post Thursday: each team will need an independent namespace - must confirm that Jupyter Notebooks can be launched
Goal by Monday/Tuesday: Jupyter Notebooks able to run models (role-play student activities)

@HeatherAck
Contributor Author

Re-created the cluster, and authentication is now working for OpenShift.
https://console-openshift-console.apps.ch-cl1.apps.sustainableafricainitiative.org/

Installation of certificate manager in progress. Will be done today.

ArgoCD installation and configuration is next step by @redmikhail
https://console-openshift-console.apps.osc-cl3.eqcq.p1.openshiftapps.com/k8s/ns/openshift-gitops/operators.coreos.com~v1alpha1~ClusterServiceVersion/openshift-gitops-operator.v1.8.3/argoproj.io~v1alpha1~ArgoCD

ODH deployment - manually - preserve artifacts via ArgoCD. @rynofinn will use latest ODH version

@HeatherAck
Contributor Author

HeatherAck commented Aug 31, 2023

  • certificates in place and managed by cert mgr (note NA will experience latency as cluster is based in South Africa)
  • will request additional quotas in Ireland just in case latency is bad for all
  • ArgoCD installed but will need to be configured. can create projects for all teams and assign different groups, manage individually.
  • ODH manually deployed.
  • Adding authentication mechanism (flat-file-based user provider) and will test user experience. If it works, PRR can run the model.
  • create separate set of nodes for running notebooks (more cpu/memory)
  • if notebooks run for more than X hours, stop the server (calculations will also stop)
  • will also check to see how effective auto scaler will be in scaling up/down resources.
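
The "stop after X hours" rule can be sketched as a small selection function. This only illustrates the decision logic: how start times are obtained and how a notebook server is actually stopped on the cluster is not specified in the thread, and the function and notebook names here are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def notebooks_to_stop(started_at: dict[str, datetime],
                      max_hours: float,
                      now: datetime) -> list[str]:
    """Return the names of notebooks that have run longer than max_hours."""
    limit = timedelta(hours=max_hours)
    return sorted(name for name, start in started_at.items() if now - start > limit)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    running = {
        "team1-notebook": now - timedelta(hours=9),
        "team2-notebook": now - timedelta(hours=2),
    }
    # With an 8-hour limit only the first notebook is selected.
    print(notebooks_to_stop(running, max_hours=8, now=now))
```

Since stopping the server also stops any running calculation, participants would need to know the value of X in advance so they can checkpoint long model runs.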

@HeatherAck
Contributor Author

HeatherAck commented Sep 1, 2023

authentication mechanism and user testing done. will finish machine sets and create separate nodes today for notebook use and get ODH set up. will focus on ArgoCD over long weekend (minimal namespaces). have a session with PRR team.

Open question - working with S3 buckets - how can we avoid sharing credentials via Open Data Hub?
LF team (@rynofinn and anton) to look at 1password - https://github.com/1Password/1password-teams-open-source. https://1password.com/developers

Heather to provide list of teams by university.

@HeatherAck
Contributor Author

HeatherAck commented Sep 5, 2023

Machine set / nodes for notebook use in place; ODH setup complete. ArgoCD in progress; created data connector to S3 (will be pre-configured - should be exposed as environment variables, TBC). Thursday will demo how to run a notebook to PRR team.

Heather to get GitHub IDs from attendees / team lead and will provide manual access.

Credentials will be shared via 1Password - if required.
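
If the S3 data connector is exposed as environment variables, a notebook could read its settings without any credentials appearing in the team repo. The sketch below assumes the conventional `AWS_*` variable names and a Cape Town (`af-south-1`) default region; the thread marks the exact mechanism as TBC, so both are assumptions, as is the `load_s3_settings` helper.

```python
import os
from dataclasses import dataclass

# Assumed variable names; the thread does not confirm what the connector exposes.
REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_S3_BUCKET")


@dataclass(frozen=True)
class S3Settings:
    access_key_id: str
    secret_access_key: str
    bucket: str
    endpoint: str
    region: str


def load_s3_settings(env=None) -> S3Settings:
    """Read S3 settings from the environment, failing early if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing S3 settings: " + ", ".join(missing))
    return S3Settings(
        access_key_id=env["AWS_ACCESS_KEY_ID"],
        secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
        bucket=env["AWS_S3_BUCKET"],
        endpoint=env.get("AWS_S3_ENDPOINT", ""),
        region=env.get("AWS_DEFAULT_REGION", "af-south-1"),  # assumed default
    )


if __name__ == "__main__":
    try:
        settings = load_s3_settings()
        print("bucket:", settings.bucket, "region:", settings.region)
    except RuntimeError as exc:
        print(exc)
```

The resulting settings can then be handed to whichever S3 client the team uses; that part is left out here to keep the example free of third-party dependencies.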

@HeatherAck
Contributor Author

odh test needs to be rebuilt - work still in progress on ArgoCD

@HeatherAck
Contributor Author

Heather to add GitHub IDs and send to Mikhail

@HeatherAck
Contributor Author

  • ids added, pending acceptance of them
  • odh admin permissions under review
  • recreated groups with naming convention and clean-up on odh
  • using test ids to confirm access at student level
  • confirming use of S3 buckets
  • will create 5 teams to start
  • will validate python usage within notebooks
  • reconfigure sizing options available in drop down (S, M only)
  • ArgoCD work still pending; once set up then Anton/Ryan can help by adding artifacts to ArgoCD.
  • on-boarding students can take place
  • will present all work done on Friday 8-Sep during stand-up

@HeatherAck
Contributor Author

23 teams are registered for the Challenge. I've assigned them identifiers of 1001 through 1023. https://docs.google.com/spreadsheets/d/1fANvoBsFLvWwGDBLNV-bgNPUY23b7iOf/edit?usp=sharing&ouid=111309013911865667965&rtpof=true&sd=true

@HeatherAck
Contributor Author

creating S3 buckets and policies, adding projects (trainers and teams 1, 2, 3, 4). Test python script to access S3 bucket. Starting tomorrow will focus on ArgoCD activities. ODH work to complete today. Goal to be done with environment by Thursday.

@HeatherAck
Contributor Author

S3 and policies set up. have users add them to groups. 5 groups/data science projects plus 1 for trainers. created the secrets. Need to test python

Heather to set up google doc to map team #s to name.
UNILORIN TEAM A

  1. Jimfaa: GitHub ID: 144264109 (Team lead)
  2. OlayinkaOmidokun: GitHub ID: 139630228
  3. Mukhtar Abdulquadir: GitHub ID: 67704493
  4. Damilola Ajibola10: ID: 144265967
  5. Ohiowere David: 117910673
  6. StellaOladele: GitHub ID: 144287656
  7. Sulyman Abdussamad: GitHub ID: 85704303
  8. Isiaq Abdulmajeed Opeyemi: ID: 101581826
  9. Olofintoye-Jedidah: GitHub ID: 144281535
  10. Taofeek Maijindadi: GitHub ID: 111041334

@HeatherAck
Contributor Author

ID: 144264109 , GitHub username: jimfaa
ID: 139630228 , GitHub username: OlayinkaOmidokun
ID: 67704493 , GitHub username: mobolajiolowo
ID: 144265967 , GitHub username: olawoyevitoria
ID: 117910673 , GitHub username: GreatOhio
ID: 144287656 , GitHub username: OladeleStella
ID: 85704303 , GitHub username: Absamdy
ID: 101581826 , GitHub username: Abdulmajeed001
ID: 144281535 , GitHub username: Olofintoye-Jedidah
ID: 111041334 , GitHub username: Maijindadi
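
One way a numeric GitHub ID can be resolved to a username is the GitHub REST API's lookup by account ID. How the list above was actually produced is not stated in the thread, so this is only a sketch of one option; the helper names are hypothetical, and unauthenticated requests are subject to rate limits.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def user_url(account_id: int) -> str:
    """URL of the REST endpoint that returns a user by numeric account ID."""
    return f"{API_ROOT}/user/{account_id}"


def parse_user(payload: bytes) -> tuple[int, str]:
    """Extract (id, login) from the JSON body returned by the endpoint."""
    data = json.loads(payload)
    return data["id"], data["login"]


def lookup(account_id: int) -> tuple[int, str]:
    """Fetch one user over the network and return (id, login)."""
    request = urllib.request.Request(
        user_url(account_id),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_user(response.read())


if __name__ == "__main__":
    for account_id in (144264109, 139630228):
        ident, login = lookup(account_id)
        print(f"ID: {ident} , GitHub username: {login}")
```

Resolving IDs this way avoids relying on display names, which participants typed inconsistently when submitting their details.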

@HeatherAck
Contributor Author

HeatherAck commented Sep 12, 2023

challenge-team1 created, users invited, and tied to repo: https://github.com/SustainableAfrica/UNILORIN-TEAM-A

challenge-team2 created, users invited, and tied to repo: https://github.com/SustainableAfrica/UNILORIN-TEAM-B

challenge-team3 created, users invited, and tied to repo: https://github.com/SustainableAfrica/APEX

challenge-team4 created, users invited, and tied to repo: https://github.com/SustainableAfrica/UNILORIN-TEAM-C

@HeatherAck
Contributor Author

Discovered one issue when using test users: they could create a notebook outside their namespace; this has been disabled.
Will finish setting up the users and focus on ArgoCD activities.
Need to add privileges to team admin to give them resource access.
Heather to follow up with Flo/Noah/Joe on creating a notebook/model to verify it works.
Need to have a README file and document what you do (e.g., logging in, getting user IDs, URL to connect, setting up workbench, etc.) (ODH documentation expansion)

@HeatherAck
Contributor Author

HeatherAck commented Sep 14, 2023

Mikhail has provisioned 4 student groups manually. ArgoCD set up in progress.

To add more groups/users:
User console: https://console-openshift-console.apps.ch-cl1.apps.sustainableafricainitiative.org to create groups and add users via YAML
Also need to create a data science project in ODH and attach the group to the project.
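
The group step can be sketched as an OpenShift `Group` manifest plus a `RoleBinding` that gives the group access to the team's project namespace. The manifests are built as dictionaries and printed as JSON, which the console's import dialog and `oc apply -f` both accept. Using the built-in `edit` cluster role is an assumption, as are the helper names; the thread does not say which role the teams receive, and the data science project itself is still created in ODH.

```python
import json


def group_manifest(name: str, users: list[str]) -> dict:
    """OpenShift Group listing the usernames that belong to one team."""
    return {
        "apiVersion": "user.openshift.io/v1",
        "kind": "Group",
        "metadata": {"name": name},
        "users": list(users),
    }


def group_binding(group: str, namespace: str, role: str = "edit") -> dict:
    """RoleBinding granting the group a cluster role inside one namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{group}-{role}", "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": role,  # assumed role; adjust to the access teams should have
        },
        "subjects": [
            {"apiGroup": "rbac.authorization.k8s.io", "kind": "Group", "name": group}
        ],
    }


if __name__ == "__main__":
    items = [
        group_manifest("challenge-team1", ["jimfaa", "OlayinkaOmidokun"]),
        group_binding("challenge-team1", "challenge-team1"),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

Keeping these manifests in a repo would also make it straightforward to hand them to ArgoCD once that setup is finished.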
