Skip to content

This repository stored the source code for downloading the result of ferderated learning.

Notifications You must be signed in to change notification settings

lesserror/Federated_Download

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Federated_Download

This repository stored the source code for downloading the result of ferderated learning.

系列文章-第6篇:这个系列已完结,如对您有帮助,求点赞收藏评论。 读者寄语:再小的帆,也能远航!

  1. 【k8s完整实战教程0】前言
  2. 【k8s完整实战教程1】源码管理-Coding
  3. 【k8s完整实战教程2】腾讯云搭建k8s托管集群
  4. 【k8s完整实战教程3】k8s集群部署kubesphere
  5. 【k8s完整实战教程4】使用kubesphere部署项目到k8s
  6. 【k8s完整实战教程5】网络服务配置(nodeport/loadbalancer/ingress)
  7. 【k8s完整实战教程6】完整实践-部署一个federated_download项目

1 Coding代码仓库开发源码

在这里插入图片描述

2 本地测试

2.1 git拉取到本地仓库

17211@hqc MINGW64 ~
$ cd /d/research/git_repository/

17211@hqc MINGW64 /d/research/git_repository (master)
$ ls
Federated/  federated_with_flask/  hello_world/

17211@hqc MINGW64 /d/research/git_repository (master)
$ git clone https://e.coding.net/hqc12/hqc/federated_download.git
Cloning into 'federated_download'...
remote: Enumerating objects: 31, done.
remote: Counting objects: 100% (31/31), done.
remote: Compressing objects: 100% (22/22), done.
remote: Total 31 (delta 5), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (31/31), 12.44 KiB | 849.00 KiB/s, done.
Resolving deltas: 100% (5/5), done.

17211@hqc MINGW64 /d/research/git_repository (master)
$ ls
Federated/  federated_download/  federated_with_flask/  hello_world/

17211@hqc MINGW64 /d/research/git_repository (master)
$ cd federated_download/

17211@hqc MINGW64 /d/research/git_repository/federated_download (master)
$ ls
Dockerfile  README.md  main.py  requirement.txt  static/  templates/

2.2 本地运行测试

打开pycharm,运行测试 在这里插入图片描述

出错了,查找资料好像说是conda环境中存在两个该.dll文件造成的,运行在docker中应该不会出现这个问题。 为了进行本地测试,还是在本地解决一下,代码加上:

import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"

再次运行测试: 在这里插入图片描述

2.3 访问

两个网址都可以进行访问 点击即可下载 在这里插入图片描述

成功!!!

2.4 有个小问题

好像没法指定下载到设置的downloads文件夹中,就是直接浏览器下载。因此这个文件夹可以不要了。

3 制作成容器镜像

3.1 创建制品仓库

在这里插入图片描述

直接使用现有仓库

3.2 构建计划

采用之前的构建计划,更换一下代码仓库即可 另外,为提升二次构建镜像的速度,设置缓存目录 在这里插入图片描述

缓存目录设置参考:https://coding.net/help/docs/ci/configuration/cache.html 由于使用的是pip下载依赖,因此设置对应的缓存目录

3.3 立即构建

构建时 Dockerfile中COPY源码进容器这一步骤 出现问题,多方尝试之后,下面这样是ok的。

FROM python:3.7
WORKDIR . # 指定容器工作目录
COPY Dockerfile main.py requirement.txt ./ 
COPY /templates/download.html ./templates/download.html
RUN pip install -r requirement.txt
EXPOSE 5000
RUN /bin/bash -c 'echo init ok'
CMD ["python", "main.py"]

注意:

  1. 目标路径必须是一个目录,并以/ 结尾
  2. 不能直接复制文件夹,它默认会只复制文件夹中的文件,破坏文件结构 构建成功: 在这里插入图片描述

3.4 镜像仓库查看

成功! 在这里插入图片描述

4 搭建k8s集群

按之前的记录搭建即可

5 docker镜像测试

5.1 本地拉取镜像

# 登录
ubuntu@VM-1-15-ubuntu:~$ sudo docker login -u federated_project-1667453994942 -p 55a845e185c9bda4ba21fd2227a7538b55717b00 hqc12-docker.pkg.coding.net
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /home/ubuntu/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
# 拉取
ubuntu@VM-1-15-ubuntu:~$ sudo docker pull hqc12-docker.pkg.coding.net/hqc/federated_project/federated-download:v0.0.1
v0.0.1: Pulling from hqc/federated_project/federated-download
17c9e6141fdb: Pull complete 
de4a4c6caea8: Pull complete 
4edced8587e6: Pull complete 
a7969cffbf46: Pull complete 
74fbfde6af91: Pull complete 
16fe51aed899: Pull complete 
a418194ab798: Pull complete 
e1b9101d5fa4: Pull complete 
c0b070a4672c: Pull complete 
7094a060c489: Pull complete 
6927575c3e2a: Pull complete 
be9ca32391c3: Pull complete 
aa5d393447e7: Pull complete 
Digest: sha256:4a2cb2995c0a9e10e2bef840a39e96843c043d50256c70d95bd7fc2f2a0362fe
Status: Downloaded newer image for hqc12-docker.pkg.coding.net/hqc/federated_project/federated-download:v0.0.1
hqc12-docker.pkg.coding.net/hqc/federated_project/federated-download:v0.0.1

5.2 本地运行镜像

# 查看当前镜像
ubuntu@VM-1-15-ubuntu:~$ sudo docker images
REPOSITORY                                                             TAG                 IMAGE ID            CREATED             SIZE
hqc12-docker.pkg.coding.net/hqc/federated_project/federated-download   v0.0.1              135908d3f14b        50 minutes ago      3.32GB
...

# 打新的标签
ubuntu@VM-1-15-ubuntu:~$ sudo docker tag 135908d3f14b federated-download:v0.0.1
...

# 删除之前的镜像
ubuntu@VM-1-15-ubuntu:~$ sudo docker rmi hqc12-docker.pkg.coding.net/hqc/federated_project/federated-download:v0.0.1
Untagged: hqc12-docker.pkg.coding.net/hqc/federated_project/federated-download:v0.0.1
Untagged: hqc12-docker.pkg.coding.net/hqc/federated_project/federated-download@sha256:4a2cb2995c0a9e10e2bef840a39e96843c043d50256c70d95bd7fc2f2a0362fe

# 再次查看
ubuntu@VM-1-15-ubuntu:~$ sudo docker images
REPOSITORY                                                   TAG                 IMAGE ID            CREATED             SIZE
federated-download                                           v0.0.1              135908d3f14b        51 minutes ago      3.32GB
...

5.3 本地直接运行测试镜像

# 运行
ubuntu@VM-1-15-ubuntu:~$ sudo docker run federated-download:v0.0.1
2022-11-03 05:56:57.531568: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-11-03 05:56:57.531673: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2022-11-03 05:56:57.531708: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ac30a6317b98): /proc/driver/nvidia/version does not exist
2022-11-03 05:56:57.532244: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-11-03 05:56:57.559485: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2494085000 Hz
2022-11-03 05:56:57.559964: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fb718000b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-11-03 05:56:57.560008: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 1s 0us/step
ubuntu@VM-1-15-ubuntu:~$

没法持续运行。

5.4 创建deployment部署上k8s测试

# 创建deployment的yaml文件
ubuntu@VM-1-15-ubuntu:~$ vim federated-dp.yaml
# 创建service的yaml文件
ubuntu@VM-1-15-ubuntu:~$ vim federated-svc.yaml

ubuntu@VM-1-15-ubuntu:~$ ls
federated-dp.yaml  federated-svc.yaml
# 部署deployment
ubuntu@VM-1-15-ubuntu:~$ kubectl apply -f federated-dp.yaml 
deployment.apps/federated-deployment created
# 查看,暂时正常
ubuntu@VM-1-15-ubuntu:~$ kubectl get all
NAME                                        READY   STATUS    RESTARTS   AGE
pod/federated-deployment-54f5db67fd-g8p9j   1/1     Running   0          6s
pod/federated-deployment-54f5db67fd-tdwdl   1/1     Running   0          6s

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/federated-deployment   2/2     2            2           6s

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/federated-deployment-54f5db67fd   2         2         2       6s
# 部署service
ubuntu@VM-1-15-ubuntu:~$ kubectl apply -f federated-svc.yaml 
service/federated-service created

# 查看,暂时也正常
ubuntu@VM-1-15-ubuntu:~$ kubectl get all
NAME                                        READY   STATUS    RESTARTS   AGE
pod/federated-deployment-54f5db67fd-g8p9j   1/1     Running   0          35s
pod/federated-deployment-54f5db67fd-tdwdl   1/1     Running   0          35s

NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
service/federated-service   NodePort    172.16.253.172   <none>        80:30000/TCP   3s
service/kubernetes          ClusterIP   172.16.252.1     <none>        443/TCP        37m

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/federated-deployment   2/2     2            2           36s

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/federated-deployment-54f5db67fd   2         2         2       36s
# 过一会儿查看,出错
ubuntu@VM-1-15-ubuntu:~$ kubectl get all
NAME                                        READY   STATUS             RESTARTS   AGE
pod/federated-deployment-54f5db67fd-2hn6s   0/1     Evicted            0          33s
pod/federated-deployment-54f5db67fd-ddxts   0/1     Evicted            0          33s
pod/federated-deployment-54f5db67fd-dn9dn   0/1     Evicted            0          33s
pod/federated-deployment-54f5db67fd-g8p9j   0/1     Evicted            0          113s
pod/federated-deployment-54f5db67fd-jkrfn   0/1     Evicted            0          33s
pod/federated-deployment-54f5db67fd-jwfqp   0/1     Evicted            0          33s
pod/federated-deployment-54f5db67fd-mjxk8   0/1     Evicted            0          33s
pod/federated-deployment-54f5db67fd-mm8b6   0/1     Evicted            0          32s
pod/federated-deployment-54f5db67fd-tdwdl   1/1     Running            0          113s
pod/federated-deployment-54f5db67fd-wh6qk   0/1     ImagePullBackOff   0          32s
pod/federated-deployment-54f5db67fd-xdktv   0/1     Evicted            0          33s

NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
service/federated-service   NodePort    172.16.253.172   <none>        80:30000/TCP   80s
service/kubernetes          ClusterIP   172.16.252.1     <none>        443/TCP        39m

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/federated-deployment   1/2     2            1           113s

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/federated-deployment-54f5db67fd   2         2         1       113s

失败

6 搭建kubesphere

6.1 安装

1 安装
kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.3.0/kubesphere-installer.yaml
2 下载配置文件
wget https://github.com/kubesphere/ks-installer/releases/download/v3.3.0/cluster-configuration.yaml
3 更改配置文件
vim cluster-configuration.yaml 
4 部署配置文件
kubectl apply -f cluster-configuration.yaml
5 查看部署过程
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f

6.2 查看

在这里插入图片描述

安装完成,虽然报错但可以正常登录(是因为服务器资源不足的原因,过一会会变好)

6.3 登录

在这里插入图片描述

7 用kubesphere部署

7.1 关联coding制品仓库

  1. 获取coding制品仓库的配置令牌
  2. kubesphere中创建秘钥 配置-保密字典-创建 在这里插入图片描述

7.2 创建工作负载-部署(deployment)

应用负载-工作负载-创建 在这里插入图片描述

成功!

8 创建服务来访问应用(NodePort方式)

8.1 kubesphere创建service

kubesphere-应用负载-服务-创建 在这里插入图片描述

8.2 访问

在这里插入图片描述

失败!

9 更改源码

9.1 更改源码依赖的版本

python改为3.9版本 在这里插入图片描述

9.2 构建计划制作镜像推送到制品仓库

在这里插入图片描述

10 kubesphere重新创建dp和svc

在这里插入图片描述 在这里插入图片描述

11 访问

在这里插入图片描述

可以访问,并且可以正常下载。

至此,实践成功!

源码开源,若对您有所帮助,请不要吝啬您的star,开放交流学习才能进步!

About

This repository stored the source code for downloading the result of ferderated learning.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published