Add support for PyTorch #15

Open
nikita-kotsehub opened this issue Nov 2, 2022 · 4 comments
Assignees
Labels
enhancement New feature or request

Comments

@nikita-kotsehub
Collaborator

  • Adding support for PyTorch would widen the user base
@nikita-kotsehub nikita-kotsehub added the enhancement New feature or request label Nov 2, 2022
@nikita-kotsehub nikita-kotsehub self-assigned this Nov 2, 2022
@nikita-kotsehub
Collaborator Author

started working on it in #23

@nikita-kotsehub
Collaborator Author

added in #25

@nikita-kotsehub
Collaborator Author

The current PyTorch example (flox/examples/quickstart_pytorch/pytorch_funcx.py in #23) supports CIFAR10 and should, in principle, support every other dataset in torchvision.datasets. In practice it fails on other datasets, because of either (1) the Net model class defined in the example or (2) the training and data-processing methods defined in flox/model_trainers/PyTorchTrainer. I'd appreciate it if someone with PyTorch experience could look into this and make the PyTorch trainer dynamic, so that it works with any dataset.

@nathaniel-hudson
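
One possible approach to point (1) is to stop hard-coding the input channels and class count in the model. The sketch below is a minimal illustration under stated assumptions, not the Net class from the example: `FlexibleNet` and `build_model_for` are hypothetical names, and the code assumes image batches shaped (N, C, H, W) or (N, H, W). Adaptive pooling is used so that the linear layers do not depend on image resolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FlexibleNet(nn.Module):
    """Small CNN with configurable input channels and class count.

    Hypothetical replacement for a fixed-shape model; not part of FLoX.
    """

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 6, kernel_size=5, padding=2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(2, 2)
        # Adaptive pooling always yields a 4x4 feature map, so fc1's input
        # size is the same for 28x28 and 32x32 images alike.
        self.adapt = nn.AdaptiveAvgPool2d((4, 4))
        self.fc1 = nn.Linear(16 * 4 * 4, 64)
        self.fc2 = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(self.adapt(x), 1)
        return self.fc2(F.relu(self.fc1(x)))


def build_model_for(sample_batch: torch.Tensor, num_classes: int) -> FlexibleNet:
    """Infer the channel count from one batch of images."""
    if sample_batch.dim() == 3:
        # Grayscale batches without a channel axis: (N, H, W) -> (N, 1, H, W).
        sample_batch = sample_batch.unsqueeze(1)
    return FlexibleNet(in_channels=sample_batch.shape[1], num_classes=num_classes)


# Usage with random stand-ins for CIFAR10-like and MNIST-like batches.
cifar_like = torch.randn(8, 3, 32, 32)
mnist_like = torch.randn(8, 1, 28, 28)
cifar_logits = build_model_for(cifar_like, num_classes=10)(cifar_like)
mnist_logits = build_model_for(mnist_like, num_classes=10)(mnist_like)
```

This only addresses the model side. Whether the trainer's data processing also assumes CIFAR10 shapes would need to be checked in PyTorchTrainer itself.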

@vinaBira

vinaBira commented Jul 7, 2023

I tried running the quickstart_pytorch.py file from the tutorial. It does not work and fails with this error:
  File "quickstart_pytorch.py", line 134, in <module>
    main()
  File "quickstart_pytorch.py", line 131, in main
    flox_controller.run_federated_learning()
  File "/home/cloudlabgpu1/FLoX/flox/controllers/MainController.py", line 565, in run_federated_learning
    results = self.on_model_receive(
  File "/home/cloudlabgpu1/FLoX/flox/controllers/MainController.py", line 426, in on_model_receive
    res = task_data.future.result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/cloudlabgpu1/FLoX/flox/clients/MainClient.py", line 119, in run_round
    fit_results = self.on_model_fit(model_trainer, config, processed_training_data)
  File "/home/cloudlabgpu1/FLoX/flox/clients/PyTorchClient.py", line 60, in on_model_fit
    model_weights = model_trainer.fit(training_data, config)
  File "/home/cloudlabgpu1/FLoX/flox/model_trainers/PyTorchTrainer.py", line 41, in fit
    outputs = self.model(images)
  File "/home/cloudlabgpu1/.venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "quickstart_pytorch.py", line 79, in forward
    x = self.pool(F.relu(self.conv1(x)))
  File "/home/cloudlabgpu1/.venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/cloudlabgpu1/.venv/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 457, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/cloudlabgpu1/.venv/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 453, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

@nikita-kotsehub Is this the same problem that this issue describes?
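
The RuntimeError at the end of the traceback says the input batch is on the GPU (torch.cuda.FloatTensor) while the model weights are still on the CPU (torch.FloatTensor). A common remedy is to move the model and each batch to the same device before the forward pass. The sketch below shows that pattern in isolation. `fit_one_batch` is a hypothetical helper, not the actual body of PyTorchTrainer.fit, and the optimizer and loss are assumptions chosen only to make the example complete.

```python
import torch
import torch.nn as nn


def fit_one_batch(model: nn.Module, images: torch.Tensor, labels: torch.Tensor) -> float:
    """Run one training step with model and data on the same device."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # nn.Module.to moves the parameters in place and returns the module.
    model = model.to(device)
    # Tensor.to returns a new tensor, so the result must be reassigned.
    images, labels = images.to(device), labels.to(device)

    # Assumed optimizer and loss; the real trainer may use different ones.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    criterion = nn.CrossEntropyLoss()

    optimizer.zero_grad()
    outputs = model(images)  # no device mismatch: both sides live on `device`
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    return loss.item()


# Usage with a tiny stand-in model and a random batch.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 2 * 2, 3))
toy_images = torch.randn(4, 3, 2, 2)
toy_labels = torch.randint(0, 3, (4,))
toy_loss = fit_one_batch(toy_model, toy_images, toy_labels)
```

If the trainer already moves batches to CUDA, adding the matching `model.to(device)` call before training would likely be enough, but that has to be confirmed against the PyTorchTrainer source.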
