
Question about joint training #10

Open

Sanster opened this issue Nov 10, 2022 · 1 comment

Sanster commented Nov 10, 2022

I'm trying to implement joint training of PaperEdge. Although PaperEdge's supplementary material mentions that joint training did not get further improvement, in my opinion end-to-end training is more practical from an application point of view.

I would appreciate it if the author could share more about joint training (e.g., advantages, disadvantages, changes to the training process).

One disadvantage I can think of (I'm still implementing code to verify the idea) is: early in joint training, netG has not yet converged and its prediction (dg) is incorrect, so netL may completely fail to converge:

# joint forward pass (F is torch.nn.functional)
x, y = doc3d_aug(im, fm, bm, bg)
dg = netG(x)                            # netG is in training mode
dg = warpUtil.global_post_warp(dg, 64)
gs = F.interpolate(dg, 256, mode='bilinear', align_corners=True)
xg = F.grid_sample(x, gs.permute(0, 2, 3, 1), align_corners=True)  # globally unwarped input for netL
xd = netL(xg)
loss, _ = spvLoss.lloss(xd, y, dg)      # netL loss; in joint training its gradient also reaches netG
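
One mitigation I'm considering for this (just a hedged sketch reusing the names from the snippet above; loader, optimizer, warmup_steps and enet_loss are my own placeholders, with enet_loss standing in for whatever stage-1 supervision netG normally gets) is to warm up netG on its own loss before enabling the joint path:

import torch.nn.functional as F

warmup_steps = 10000  # placeholder value, would need tuning

for step, (im, fm, bm, bg) in enumerate(loader):
    x, y = doc3d_aug(im, fm, bm, bg)
    dg = netG(x)
    dg = warpUtil.global_post_warp(dg, 64)

    if step < warmup_steps:
        # train netG alone until dg is roughly sensible
        loss = enet_loss(dg, y)          # placeholder for the netG-only loss
    else:
        # joint path: netL's loss now also backpropagates into netG
        gs = F.interpolate(dg, 256, mode='bilinear', align_corners=True)
        xg = F.grid_sample(x, gs.permute(0, 2, 3, 1), align_corners=True)
        xd = netL(xg)
        loss, _ = spvLoss.lloss(xd, y, dg)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()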
wkema (Collaborator) commented Nov 11, 2022

Good question. I did not thoroughly investigate joint training, as it violates the basic modeling of the proposed method.

One reason I used a two-step method here is to divide a challenging task (estimating the full deformation field) into two simpler tasks (predicting the boundary deformation and predicting the texture deformation). This is a divide-and-conquer methodology (though not recursive).

The case you mentioned is one possible reason. Think about another scenario: ENet already estimates the edge deformation perfectly while TNet is still under training. If we train ENet and TNet jointly, the loss from TNet will propagate to ENet and force ENet to update its parameters. In the current setting it does not make sense to penalize ENet for errors in TNet.
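
For what it's worth, one way to express that separation inside a single training loop (a minimal sketch reusing the names from your snippet; the detach() call is an addition for illustration, not the repo's code) is to stop the gradient between the two networks, so netL (TNet) still consumes netG's (ENet's) output but its loss can no longer update netG:

dg = netG(x)
dg = warpUtil.global_post_warp(dg, 64)
dg_sg = dg.detach()  # stop-gradient: TNet's loss will not reach netG

gs = F.interpolate(dg_sg, 256, mode='bilinear', align_corners=True)
xg = F.grid_sample(x, gs.permute(0, 2, 3, 1), align_corners=True)
xd = netL(xg)
loss_t, _ = spvLoss.lloss(xd, y, dg_sg)  # updates netL only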

I agree you might get better results with joint training after some tweaks... but you would need to investigate what is really happening.

One of the reasons I did not get an obvious improvement is these lines:
https://github.com/cvlab-stonybrook/PaperEdge/blob/main/networks/paperedge.py#L219-L222
That is, the deformation field boundaries won't propagate gradients back to ENet (though ENet still gets some gradient from the interpolated field). If you want to investigate further, you may want to disable this part.
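
The mechanism is easy to reproduce in isolation. The toy snippet below only illustrates why detached boundary values receive no gradient while the interior still does; it is not the code behind the link above:

import torch

d = torch.randn(1, 2, 64, 64, requires_grad=True)  # stand-in for the deformation field

db = d.clone()
db[..., 0, :]  = d[..., 0, :].detach()   # boundary rows/columns taken out of the graph
db[..., -1, :] = d[..., -1, :].detach()
db[..., :, 0]  = d[..., :, 0].detach()
db[..., :, -1] = d[..., :, -1].detach()

db.sum().backward()
print(d.grad[..., 0, :].abs().sum())          # 0: the boundary gets no gradient
print(d.grad[..., 1:-1, 1:-1].abs().sum())    # > 0: the interior still does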
