Kipf et al. describe in their paper that the graph convolution operation can be written like this:
$$H_{t+1} = AH_tW_t$$
where $A$ is the normalized adjacency matrix, $H_t$ is the embedded representation of the nodes and $W_t$ is the weight matrix.
Now, can I think of the same formula as first performing a 2D convolution with a fixed-size kernel over the whole feature space and then multiplying the result by the adjacency matrix?
If this is the case, I think I can build a graph convolution operation in PyTorch using just a `Conv2d` layer followed by a simple matrix multiplication with the adjacency matrix.
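To make the idea concrete, here is a minimal sketch of how I would check it numerically. The sizes `N`, `F_in`, `F_out`, the random graph, and the symmetric normalization are all my own assumptions for illustration. It treats $H_t$ as a one-channel image of shape `(N, F_in)` and uses a kernel of shape `(1, F_in)`, so each kernel covers exactly one node's full feature row:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes: 5 nodes, 4 input features, 3 output features.
N, F_in, F_out = 5, 4, 3

# Random undirected graph with self-loops, symmetrically normalized
# (an assumed normalization: A_ij / sqrt(deg_i * deg_j)).
A = (torch.rand(N, N) > 0.5).float()
A = ((A + A.T) > 0).float()
A.fill_diagonal_(1.0)
deg = A.sum(dim=1)
A_norm = A / torch.sqrt(deg[:, None] * deg[None, :])

H = torch.randn(N, F_in)      # node features H_t
W = torch.randn(F_in, F_out)  # weight matrix W_t

# Reference: the formula as written, A H W.
out_ref = A_norm @ H @ W

# Conv2d version: one input channel, F_out output channels,
# kernel spanning one node (height 1) and all features (width F_in).
conv = nn.Conv2d(1, F_out, kernel_size=(1, F_in), bias=False)
with torch.no_grad():
    # Conv2d weights have shape (out_channels, in_channels, kH, kW).
    conv.weight.copy_(W.T.reshape(F_out, 1, 1, F_in))

with torch.no_grad():
    HW = conv(H.view(1, 1, N, F_in))   # shape (1, F_out, N, 1)
    HW = HW.squeeze(0).squeeze(-1).T   # back to shape (N, F_out)
    out_conv = A_norm @ HW             # then multiply by the adjacency matrix

print(torch.allclose(out_ref, out_conv, atol=1e-4))
```

With this particular kernel shape the convolution appears to reduce to the per-node linear map $H_tW_t$, which is why the two results can be compared. A smaller kernel such as 3x3 would instead mix neighbouring rows and columns of $H_t$, and row order in the feature matrix does not reflect graph structure, so I would not expect that variant to match the formula.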