
Question about joint training #10

Open

Sanster opened this issue Nov 10, 2022 · 1 comment

Sanster commented Nov 10, 2022

I'm trying to implement joint training of PaperEdge. Although PaperEdge's supplementary material mentions that joint training did not get further improvement, in my opinion end-to-end training is more practical from an application point of view.

I would appreciate it if the author could share more about joint training (e.g., advantages, disadvantages, changes to the training process).

One disadvantage I can think of (I'm still implementing code to verify the idea) is: early in joint training, netG has not yet converged and its prediction (dg) is incorrect, so netL may completely fail to converge:

# joint forward pass (F is torch.nn.functional)
x, y = doc3d_aug(im, fm, bm, bg)
dg = netG(x)                            # netG is in training mode
dg = warpUtil.global_post_warp(dg, 64)
gs = F.interpolate(dg, 256, mode='bilinear', align_corners=True)
xg = F.grid_sample(x, gs.permute(0, 2, 3, 1), align_corners=True)  # globally unwarped input for netL
xd = netL(xg)
loss, _ = spvLoss.lloss(xd, y, dg)      # netL loss; in joint training its gradient also reaches netG
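
One mitigation I'm considering for this (just a hedged sketch reusing the names from the snippet above; loader, optimizer, warmup_steps and enet_loss are my own placeholders, with enet_loss standing in for whatever stage-1 supervision netG normally gets) is to warm up netG on its own loss before enabling the joint path:

import torch.nn.functional as F

warmup_steps = 10000  # placeholder value, would need tuning

for step, (im, fm, bm, bg) in enumerate(loader):
    x, y = doc3d_aug(im, fm, bm, bg)
    dg = netG(x)
    dg = warpUtil.global_post_warp(dg, 64)

    if step < warmup_steps:
        # train netG alone until dg is roughly sensible
        loss = enet_loss(dg, y)          # placeholder for the netG-only loss
    else:
        # joint path: netL's loss now also backpropagates into netG
        gs = F.interpolate(dg, 256, mode='bilinear', align_corners=True)
        xg = F.grid_sample(x, gs.permute(0, 2, 3, 1), align_corners=True)
        xd = netL(xg)
        loss, _ = spvLoss.lloss(xd, y, dg)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()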
wkema (Collaborator) commented Nov 11, 2022

Good question. I did not thoroughly investigate joint training, as it violates the basic modeling of the proposed method.

One reason I used a two-step method here is to divide a challenging task (estimating the full deformation field) into two simpler tasks (predicting the boundary deformation and predicting the texture deformation). This is a divide-and-conquer methodology (though not recursive).

The case you mentioned is one possible reason. Think about another scenario: ENet already estimates the edge deformation perfectly while TNet is still under training. If we train ENet and TNet jointly, the loss from TNet will propagate to ENet and force ENet to update its parameters. In the current setting it does not make sense to penalize ENet for errors in TNet.
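
For what it's worth, one way to express that separation inside a single training loop (a minimal sketch reusing the names from your snippet; the detach() call is an addition for illustration, not the repo's code) is to stop the gradient between the two networks, so netL (TNet) still consumes netG's (ENet's) output but its loss can no longer update netG:

dg = netG(x)
dg = warpUtil.global_post_warp(dg, 64)
dg_sg = dg.detach()  # stop-gradient: TNet's loss will not reach netG

gs = F.interpolate(dg_sg, 256, mode='bilinear', align_corners=True)
xg = F.grid_sample(x, gs.permute(0, 2, 3, 1), align_corners=True)
xd = netL(xg)
loss_t, _ = spvLoss.lloss(xd, y, dg_sg)  # updates netL only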

I agree you might get better results with joint training after some tweaks... but you would need to investigate what is really happening.

One of the reasons I did not get an obvious improvement is these lines:
https://github.com/cvlab-stonybrook/PaperEdge/blob/main/networks/paperedge.py#L219-L222
That is, the deformation field boundaries won't propagate gradients back to ENet (though ENet still gets some gradient from the interpolated field). If you want to investigate further, you may want to disable this part.
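
The mechanism is easy to reproduce in isolation. The toy snippet below only illustrates why detached boundary values receive no gradient while the interior still does; it is not the code behind the link above:

import torch

d = torch.randn(1, 2, 64, 64, requires_grad=True)  # stand-in for the deformation field

db = d.clone()
db[..., 0, :]  = d[..., 0, :].detach()   # boundary rows/columns taken out of the graph
db[..., -1, :] = d[..., -1, :].detach()
db[..., :, 0]  = d[..., :, 0].detach()
db[..., :, -1] = d[..., :, -1].detach()

db.sum().backward()
print(d.grad[..., 0, :].abs().sum())          # 0: the boundary gets no gradient
print(d.grad[..., 1:-1, 1:-1].abs().sum())    # > 0: the interior still does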
