Dataset get item format #19
Hey there, to be honest, I never really used/tested the `simple_mode`/`read_from_folder` path myself. Regarding the Cityscapes images, I used the training split of the standard 5000 images. Hope this helps! |
Hey there, Thanks for the response. I can definitely fix the sorting issue; the only thing I'm hesitant about is how it should look in the end. Would this be the expected format? |
Hey, yes this would be the format I would also expect. The second index in the key gives the frame number, so the key ('color', 0, -1) should contain all images for which you have a preceding and succeeding frame (2.jpg, 3.jpg, 4.jpg). The corresponding preceding and succeeding frames should be in the other keys (-1 = preceding, 1 = succeeding). |
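To make the key convention above concrete, here is a minimal sketch of how a single sample could be indexed, assuming tuple keys of the form ('modality', frame_offset, scale); the `dataset` variable and the exact key layout are assumptions based on this thread, not necessarily this repository's exact API.

```python
# Hedged sketch of indexing one sample, assuming ('modality', frame_offset, scale) keys.
sample = dataset[2]                     # hypothetically the sample centred on 3.jpg
current    = sample[('color', 0, -1)]   # 3.jpg, the frame the depth is predicted for
preceding  = sample[('color', -1, -1)]  # 2.jpg, one frame earlier
succeeding = sample[('color', 1, -1)]   # 4.jpg, one frame later
```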
Hey there, Thanks for the help. I have a question regarding learning the depth. I have a different segmentation task with only two classes to segment. When I first tried cross-entropy with some class weighting, it refused to learn at all since my classes are unbalanced; after switching to focal loss with a high focal parameter it became better. However, I still can't overfit the depth (10 epochs). Below is the result I get from the inference script. What are the possible reasons for such behavior? The depth loss during training stays at around 0.11. I've also checked that the images [-1, 0, 1] are passed correctly to the loss computation function. |
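For reference, a minimal sketch of the kind of focal loss described above (cross-entropy down-weighted for easy pixels via a focal parameter gamma); the function name and signature are illustrative, not taken from this repository.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, target, gamma=2.0):
    """Per-pixel focal loss sketch: cross-entropy scaled by (1 - p_t)^gamma.

    logits: (N, C, H, W) raw network outputs, target: (N, H, W) class indices.
    gamma is the focal parameter; larger values focus the loss more strongly
    on hard / rare-class pixels.
    """
    ce = F.cross_entropy(logits, target, reduction='none')  # (N, H, W)
    p_t = torch.exp(-ce)                                     # probability of the true class
    return ((1.0 - p_t) ** gamma * ce).mean()
```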
Hey there, on some datasets I could observe the same behaviour: the depth training tends to be rather unstable in the beginning, resulting in exactly what you describe, i.e. the output is just constant and the loss remains unchanged. The combined training of two networks (depth + pose) using just a single loss tends to be sensitive to the choice of the initial images, though I could not find a pattern here yet. If you train the depth without the segmentation part, does it converge then? A first solution could then be to use network weights pretrained on KITTI and see if the training/overfitting converges. My guess would be that if the initial output is already closer to a depth map, the training should be more stable. Another possibility would be to use some kind of supervision for the pose network (which would also solve the scale ambiguity), if that is an option in your case. |
Hey there, Tried it without the seg loss (just set its weight to zero). Also tried starting from the checkpoint while excluding the segmentation blocks (we have 2 classes for segmentation instead of 20). Got the following result after overfitting for 10 epochs. After overfitting even more, for 50 epochs, the quality of the depth map improved, while the segmentation became partially "killed". However, the default network output has a smoother depth. |
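A hedged sketch of the kind of partial checkpoint loading mentioned above (starting from pretrained weights while excluding the segmentation blocks); the checkpoint file name, the `build_model` constructor and the 'segmentation' key filter are placeholders for whatever your code actually uses.

```python
import torch

# Placeholder for your combined depth + segmentation network.
model = build_model(num_seg_classes=2)  # hypothetical constructor

# Load a checkpoint trained with 20 segmentation classes, drop the segmentation
# head whose shapes no longer match, and load the rest non-strictly.
checkpoint = torch.load('pretrained_kitti.pth', map_location='cpu')
filtered = {k: v for k, v in checkpoint.items() if 'segmentation' not in k}
missing, unexpected = model.load_state_dict(filtered, strict=False)
print('freshly initialised keys:', missing)  # should only list the seg-head weights
```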
Hey there, Regarding 1., you could also try training at a lower resolution, depending on what output resolution you need in the end. If 416/128 is enough for you, then this might work. Also, the multi-task training is sometimes sensitive to the weighting of the two tasks, so you could also check whether varying this weighting gives you better results. Regarding 2., I was indeed thinking of supervising the egomotion with some kind of ground truth. If you have the full ground truth you could use that, or, if not, the velocity and time stamps of the images can be used to constrain the translation between two images as in https://arxiv.org/pdf/1905.02693.pdf. This might lead to a little more overfitting, but it could also stabilize the depth/egomotion training in the beginning through this additional constraint. |
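A minimal sketch of the velocity-based translation constraint referenced above (arXiv:1905.02693): the norm of the predicted translation between two frames is pushed towards the distance actually travelled. The function name and tensor shapes are assumptions for illustration only.

```python
import torch

def velocity_loss(pred_translation, speed, dt):
    """Constrain the predicted camera translation with measured velocity.

    pred_translation: (N, 3) translation output of the pose network
    speed: (N,) measured vehicle speed, dt: (N,) time between the two frames
    """
    travelled = speed * dt  # distance actually covered between the frames
    return torch.abs(pred_translation.norm(dim=-1) - travelled).mean()
```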
Hey there,
Thanks for your work!
I have a question regarding the dataset format. I'm currently using my own dataset for both depth and segmentation and decided to implement it via `simple_mode` with the `read_from_folder` function. Is there any way to know what the expected format of a sample is with the default parameters? I created a test directory containing 'color', 'depth' and 'segmentation' subfolders, plus the description file. All images are named from 0000 to 0010.
For the training dataset, I get the following 'data_files':
I've read the docs for the data loader repo, but shouldn't (0, -1, 1) be something like (0003, 0004, 0005)?
And another question: which part of Cityscapes was used? There are several options available for download on the website.