Error when deploying FedVision #32

Open
houjibofa2050 opened this issue Dec 15, 2022 · 1 comment

houjibofa2050 commented Dec 15, 2022

I ran the command fedvision-deploy deploy deploy --config standalone_template.yaml
and got the following error:
Traceback (most recent call last):
  File "/root/fedvision/fedvision/bin/fedvision-deploy", line 8, in <module>
    sys.exit(app())
  File "/root/fedvision/fedvision/lib/python3.6/site-packages/fedvision_deploy_toolkit/_deploy.py", line 44, in deploy
    _maybe_create_python_venv(machine)
  File "/root/fedvision/fedvision/lib/python3.6/site-packages/fedvision_deploy_toolkit/_deploy.py", line 69, in _maybe_create_python_venv
    raise RuntimeError(f"python executable {machine['python']} not valid")
KeyError: 'python'

The standalone_template.yaml file:
`
machines:
  - name: machine1
    ip: 127.0.0.1
    ssh_string: 127.0.0.1:22
    base_dir: /data/projects/fedvision
    python_for_venv_create: python3 # use to create venv, python3.7+ required

# coordinator start/stop only if machine provided
coordinator:
  name: coordinator1
  machine: machine1
  port: 10000

clusters:
  - name: cluster1
    manager:
      machine: machine1
      port: 10001
    workers:
      - name: worker1
        machine: machine1
        ports: 12000-12099
        max_tasks: 10

masters:
  - name: master1
    machine: machine1
    submit_port: 10002
    coordinator: coordinator1
    cluster: cluster1

  - name: master2
    machine: machine1
    submit_port: 10003
    coordinator: coordinator1
    cluster: cluster1

  - name: master3
    machine: machine1
    submit_port: 10004
    coordinator: coordinator1
    cluster: cluster1

  - name: master4
    machine: machine1
    submit_port: 10005
    coordinator: coordinator1
    cluster: cluster1
`

@jaysontree

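When it starts, it works like this: it SSHes into each machine, then verifies the Python version -> creates a virtual environment -> copies the code -> installs the dependencies inside the virtual environment.
You get this error because the very first step, the Python version check, failed. Check whether python3 can be executed on the machine you are deploying to and whether its version meets the requirement.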
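
To make that check concrete, here is a minimal sketch that could be run on the target machine, as the same user the deploy tool logs in as. It is not the toolkit's own check. The function names `parse_version` and `check_python` are made up for illustration, and the 3.7 minimum is taken only from the comment in the template ("python3.7+ required").

```python
import re
import shutil
import subprocess

# Minimum version taken from the template comment "python3.7+ required".
MIN_VERSION = (3, 7)


def parse_version(output):
    """Extract (major, minor) from text such as 'Python 3.8.10'."""
    match = re.search(r"Python (\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unrecognized version output: {output!r}")
    return int(match.group(1)), int(match.group(2))


def check_python(executable="python3", minimum=MIN_VERSION):
    """Return (ok, message) describing whether `executable` is usable."""
    path = shutil.which(executable)
    if path is None:
        return False, f"{executable} was not found on PATH"
    # Some old interpreters print the version to stderr, so merge both streams.
    proc = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    if proc.returncode != 0:
        return False, f"{path} --version exited with code {proc.returncode}"
    try:
        version = parse_version(proc.stdout)
    except ValueError as exc:
        return False, str(exc)
    found = f"{version[0]}.{version[1]}"
    if version < minimum:
        return False, f"{path} is {found}, below the required {minimum[0]}.{minimum[1]}"
    return True, f"{path} is {found}, which meets the requirement"


if __name__ == "__main__":
    ok, message = check_python("python3")
    print("OK:" if ok else "FAILED:", message)
```

If this reports a missing or too-old interpreter, pointing `python_for_venv_create` at a suitable interpreter would be the natural thing to try. Note that the paths in the traceback show the deploy tool itself running under python3.6; whether the target's `python3` is that same interpreter is not stated in the report.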

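The error message itself says the config has no 'python' key. That is a bug in the author's code; normally this branch is never reached. It has nothing to do with the actual problem.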
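
The masking effect can be reproduced in isolation. In the sketch below, only the `raise` line is copied from the traceback; the `machine` dict mirrors the template entry, and the function names and the "safe" variant are hypothetical illustrations, not the toolkit's code or an official fix.

```python
# Mirrors the machine entry from standalone_template.yaml: note there is
# a "python_for_venv_create" key but no "python" key.
machine = {
    "name": "machine1",
    "ip": "127.0.0.1",
    "python_for_venv_create": "python3",
}


def report_invalid_python(machine):
    # Same statement as in the traceback. The f-string is evaluated first,
    # so the missing key raises KeyError before RuntimeError is ever built.
    raise RuntimeError(f"python executable {machine['python']} not valid")


def report_invalid_python_safe(machine):
    # Hypothetical variant: read the key that the template actually defines.
    executable = machine.get("python_for_venv_create", "<unset>")
    raise RuntimeError(f"python executable {executable} not valid")


def error_name(func):
    """Call func(machine) and return the name of the exception it raises."""
    try:
        func(machine)
    except Exception as exc:
        return type(exc).__name__
    return None


print(error_name(report_invalid_python))       # KeyError
print(error_name(report_invalid_python_safe))  # RuntimeError
```

So the `KeyError: 'python'` only hides the intended "python executable ... not valid" message; the underlying failure is still the interpreter check.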
