I am trying to reproduce the model described in the paper DocUNet: Document Image Unwarping via A Stacked U-Net, i.e. stacking two U-Nets to yield one final prediction. The paper mentions that:

The deconvolution features of the first U-Net and the intermediate prediction y1 are concatenated together as the input of the second U-Net.

What does it mean to concatenate the deconvolution features with the prediction (which is an array)?
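To make the question concrete, here is a minimal sketch of how I currently read that sentence: the feature maps and the prediction are joined along the channel axis, so they must share the same height and width. This is only my interpretation, not something the paper confirms. The helper names, the 64 feature channels, the 2-channel prediction, and the 4x4 spatial size are all assumptions chosen for illustration.

```python
# Toy tensors as nested lists with layout [channel][row][col].
# All names and sizes here are hypothetical, not taken from the paper.

def make_tensor(channels, height, width, fill):
    """Build a channels x height x width nested list filled with one value."""
    return [[[fill] * width for _ in range(height)] for _ in range(channels)]

def shape(tensor):
    """Return (channels, height, width) of a nested-list tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))

def concat_channels(a, b):
    """Concatenate two tensors along the channel axis."""
    if shape(a)[1:] != shape(b)[1:]:
        raise ValueError("spatial sizes must match before concatenating")
    return a + b  # list addition stacks the channels of b after those of a

# Assumed sizes: 64 feature channels from the first U-Net's last
# deconvolution layer, and a 2-channel intermediate prediction y1.
deconv_features = make_tensor(64, 4, 4, 0.5)
y1 = make_tensor(2, 4, 4, 1.0)

second_unet_input = concat_channels(deconv_features, y1)
print(shape(second_unet_input))  # (66, 4, 4)
```

If this reading is right, the first convolution of the second U-Net would simply take 66 input channels in this toy setting. In a deep learning framework the same step would be a channel-wise concatenation of two tensors.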

The next paragraph says that:

The second U-Net finally gives a refined prediction y2, which we use as the final output of our network. We apply the same loss function to both y1 and y2 during training.

This leads to my next question: does this mean that I have to train the U-Net twice?
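One reading, which I am not sure is correct, is that both predictions come out of a single forward pass and the same loss is evaluated on each, with the two values added into one training objective. The sketch below shows only that arithmetic. The mean absolute error is a placeholder and not the paper's loss, summing the two terms with equal weight is my assumption, and all names and numbers are made up.

```python
def mean_abs_error(pred, target):
    """Placeholder loss; the paper's actual loss is not reproduced here."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)

def joint_loss(y1, y2, target, loss_fn=mean_abs_error):
    """Apply the same loss to both predictions and add the results."""
    return loss_fn(y1, target) + loss_fn(y2, target)

# Hypothetical flattened values standing in for real prediction maps.
target = [1.0, 2.0, 3.0, 4.0]
y1 = [1.5, 2.5, 3.5, 4.5]  # intermediate prediction from the first U-Net
y2 = [1.0, 2.0, 3.0, 5.0]  # refined prediction from the second U-Net

total = joint_loss(y1, y2, target)
print(total)  # 0.75
```

Under this interpretation there would be one training loop and one backward pass through the whole stacked model, rather than two separate training runs, but I would like confirmation.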

nbro
aRRay

0 Answers