Why do we have to take the dot product in Low-rank Bilinear Pooling?

Question

I was reading the paper "Hadamard Product for Low-rank Bilinear Pooling". I understand what the authors are trying to say, but I don't understand why the element-wise product has to be converted into a scalar (using the dot product with the vector of ones):

$$ \mathbb{1}^{T}\left(\mathbf{U}_{i}^{T} \mathbf{x} \circ \mathbf{V}_{i}^{T} \mathbf{y}\right)+b_{i} \tag{2}\label{2} $$

Why do we have to multiply the resulting vector by the vector of ones? We would still have the multiplicative interaction between elements even without that multiplication.

score 2 · Answer 1 · edited Jan 06 '21 at 23:38

The paper (which I was not familiar with) proposes a new pooling method, which the authors call "Low-rank Bilinear Pooling". Because it is a pooling method, its output should be a single value over some input spatial region (the goal of pooling is to down-sample an input representation). The authors therefore do not want the vector produced by the element-wise multiplication; they want a single scalar, and multiplying by $\mathbb{1}^{T}$ sums the entries of that vector into one number. Notice their initial formula:

$$ f_{i}=\sum_{j=1}^{N} \sum_{k=1}^{M} w_{i j k} x_{j} y_{k}+b_{i}=\mathbf{x}^{T} \mathbf{W}_{i} \mathbf{y}+b_{i}\label{1}\tag{1} $$
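Formula (2) follows from formula (1) once the weight matrix is given the low-rank form the paper uses, $\mathbf{W}_{i}=\mathbf{U}_{i} \mathbf{V}_{i}^{T}$: then $\mathbf{x}^{T} \mathbf{U}_{i} \mathbf{V}_{i}^{T} \mathbf{y}$ is the dot product of $\mathbf{U}_{i}^{T} \mathbf{x}$ and $\mathbf{V}_{i}^{T} \mathbf{y}$, i.e. the sum of their element-wise product. Here is a minimal NumPy sketch that checks this numerically. The sizes `N`, `M`, `d`, the random values, and the variable names are assumptions made for illustration, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): x has N entries, y has M entries, rank d.
N, M, d = 5, 4, 3
x = rng.standard_normal(N)
y = rng.standard_normal(M)
U = rng.standard_normal((N, d))  # plays the role of U_i
V = rng.standard_normal((M, d))  # plays the role of V_i
b = 0.5                          # plays the role of the bias b_i

# Formula (1) with the low-rank weight W_i = U_i V_i^T.
W = U @ V.T
f_bilinear = x @ W @ y + b

# Formula (2): element-wise product of the two projections,
# then 1^T sums the d entries into a single scalar.
hadamard = (U.T @ x) * (V.T @ y)
f_lowrank = np.ones(d) @ hadamard + b

# Both routes give the same scalar f_i.
assert np.isclose(f_bilinear, f_lowrank)
```

Without the vector of ones you would be left with the length-`d` vector `hadamard` rather than the scalar $f_i$ that formula (1) defines.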
