I'm trying to understand a few details about the NT-Xent loss defined in the SimCLR paper (link). The loss is defined as
$$\ell_{i,j} = -\log\frac{\exp(\mathrm{sim}(z_i,z_j)/\tau)}{\sum_{k=1}^{2N}\mathbb{1}_{[k\neq i]} \exp(\mathrm{sim}(z_i,z_k)/\tau)}$$
where $z_i$ and $z_j$ represent two augmentations of the same image. What I don't understand is the denominator: I see that the indicator function excludes $z_i$ itself, but shouldn't we also exclude $z_j$? As written, the sum includes the term $k=j$. Essentially, why do we leave the positive sample in the denominator?
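
To make the question concrete, here is a minimal NumPy sketch of how I read the formula for a single pair $(i, j)$. It assumes that $\mathrm{sim}$ is cosine similarity; the function name `nt_xent_pair`, the default `tau=0.5`, and the random inputs are only illustrative choices of mine, not something taken from the paper's code.

```python
import numpy as np

def nt_xent_pair(z, i, j, tau=0.5):
    """Loss for one positive pair (i, j), read literally from the formula.

    z   : array of shape (2N, d), one row per augmented view
    tau : temperature (0.5 is an arbitrary illustrative value)
    """
    # Assumption: sim(u, v) is cosine similarity, so normalise the rows first.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau

    # Indicator 1_[k != i]: drop only the self-similarity term.
    # Note that k = j is NOT dropped, so the positive stays in the sum.
    mask = np.ones(len(z), dtype=bool)
    mask[i] = False

    numerator = np.exp(sim[i, j])
    denominator = np.exp(sim[i][mask]).sum()
    return -np.log(numerator / denominator)

# Illustrative usage with random vectors (2N = 6 views, d = 4).
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 4))
loss = nt_xent_pair(z, i=0, j=1)
```

Written this way, the fraction is a softmax probability over the $2N-1$ candidates $k \neq i$, with $j$ as one of those candidates.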