Taking too much time in training #42

Open · CS-savvy opened this issue Jul 25, 2019 · 3 comments

Comments
CS-savvy commented Jul 25, 2019

The model is taking about 2 hours per epoch on 2300 images with a batch size of 1, but you mention that the page detection model was trained on 1600 images for 30 epochs in only about 4 hours.

Can someone tell me the reason for this difference?

solivr (Member) commented Jul 26, 2019

Can you give your GPU specs?

CS-savvy (Author) commented Jul 27, 2019

Thanks for replying, @solivr.
I am training the dhSegment model on an Azure VM with NVIDIA driver 390.116, a Tesla K80 GPU (11 GB), 6 vCPUs and 56 GB of RAM.

https://www.techpowerup.com/gpu-specs/tesla-k80.c2616

GPU: (screenshot: gpu_config)
CPU: (screenshot: cpu_config)
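
A quick way to rule out CPU-only execution (a common cause of multi-hour epochs on a K80) is to check which devices TensorFlow actually sees before launching training. This is a minimal sketch using plain TensorFlow 1.x device-listing calls, not part of dhSegment itself:

```python
# Sanity check (plain TensorFlow 1.x, not dhSegment-specific):
# if no GPU device is listed here, training silently falls back to the CPU,
# which would easily explain ~2 h per epoch on this dataset.
import tensorflow as tf
from tensorflow.python.client import device_lib

print("GPU available:", tf.test.is_gpu_available())
for device in device_lib.list_local_devices():
    print(device.name, device.device_type, device.physical_device_desc)
```

Watching `nvidia-smi` while an epoch runs gives the same answer: if GPU utilization stays near 0%, the bottleneck is not the K80.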

@jalvathi

@solivr @CS-savvy, are there any updates here? My instances are crashing with out-of-memory errors even though I am using 4 GPUs with 16 GB each. Can anyone suggest an improvement?
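
For the out-of-memory crashes, one generic TensorFlow 1.x sketch (not dhSegment's documented API; passing the session config through an estimator `RunConfig` is an assumption about how the training script is wired) makes GPU memory usage observable instead of pre-allocated; the actual fix is usually a smaller batch size or smaller resized input images:

```python
# Sketch only: generic TensorFlow 1.x settings, not dhSegment's own API.
# allow_growth stops TF from grabbing all 16 GB per GPU up front, so
# nvidia-smi shows real usage and the point of failure becomes visible.
import tensorflow as tf

session_config = tf.ConfigProto()
session_config.gpu_options.allow_growth = True

# Assumption: the training script builds a tf.estimator.Estimator and lets
# you pass a RunConfig; if so, the session config can be threaded through it.
run_config = tf.estimator.RunConfig(session_config=session_config)
```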
