I am proposing a modified version of the Sequence-to-Sequence model with dual decoders. The problem I am trying to solve is Neural Machine Translation into two languages at once. Here is a simplified illustration of the model:
/--> Decoder 1 -> Language Output 1
Language Input -> Encoder -|
\--> Decoder 2 -> Language Output 2
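To make the structure concrete, here is a rough sketch of what I have in mind in PyTorch (the GRU layers, vocabulary sizes, and dimensions are just placeholders, not a fixed design):

```python
import torch
import torch.nn as nn

class DualDecoderSeq2Seq(nn.Module):
    """One shared encoder whose final state feeds two independent decoders."""
    def __init__(self, src_vocab, tgt_vocab1, tgt_vocab2, emb_dim=256, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Decoder 1 -> Language Output 1
        self.tgt_emb1 = nn.Embedding(tgt_vocab1, emb_dim)
        self.decoder1 = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out1 = nn.Linear(hid_dim, tgt_vocab1)
        # Decoder 2 -> Language Output 2
        self.tgt_emb2 = nn.Embedding(tgt_vocab2, emb_dim)
        self.decoder2 = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out2 = nn.Linear(hid_dim, tgt_vocab2)

    def forward(self, src, tgt1, tgt2):
        # Encode the source sentence once; both decoders start from the same state
        _, h = self.encoder(self.src_emb(src))
        dec1_out, _ = self.decoder1(self.tgt_emb1(tgt1), h)
        dec2_out, _ = self.decoder2(self.tgt_emb2(tgt2), h)
        return self.out1(dec1_out), self.out2(dec2_out)
```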
My understanding of backpropagation is that we adjust the weights of the network to reduce the error with respect to the target output. However, it is not clear to me how to backpropagate through this network, because I have not been able to find similar implementations online. I am thinking of running backpropagation twice after each training batch, like this:
$$ \text{Decoder 1} \rightarrow \text{Encoder} $$ $$ \text{Decoder 2} \rightarrow \text{Encoder} $$
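Concretely, the two-pass scheme I have in mind would look something like this (a rough sketch assuming the `DualDecoderSeq2Seq` module above and placeholder vocabulary sizes; the first `backward` keeps the graph because both losses share the encoder's computation, and the encoder's gradients simply accumulate across the two calls):

```python
import torch.nn.functional as F

model = DualDecoderSeq2Seq(src_vocab=8000, tgt_vocab1=8000, tgt_vocab2=8000)
optimizer = torch.optim.Adam(model.parameters())

def train_step(src, tgt1, tgt2):
    # tgt*[:, :-1] is the teacher-forcing input, tgt*[:, 1:] the gold output
    optimizer.zero_grad()
    logits1, logits2 = model(src, tgt1[:, :-1], tgt2[:, :-1])
    loss1 = F.cross_entropy(logits1.reshape(-1, logits1.size(-1)), tgt1[:, 1:].reshape(-1))
    loss2 = F.cross_entropy(logits2.reshape(-1, logits2.size(-1)), tgt2[:, 1:].reshape(-1))
    # Backpropagate twice: Decoder 1 -> Encoder, then Decoder 2 -> Encoder.
    # Because gradients accumulate, this is equivalent to (loss1 + loss2).backward().
    loss1.backward(retain_graph=True)
    loss2.backward()
    optimizer.step()
    return loss1.item(), loss2.item()
```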
But I am not sure whether the gradients flowing back from Decoder 2 into the encoder will hurt the accuracy of the predictions made by Decoder 1. Is that a real concern?
In addition, is this structure feasible? If so, how do I properly backpropagate through the network?