
I have a pre-trained BERT model from Huggingface, which I fine-tune to categorize short texts (tweets or slightly longer) into several thousand categories using triplet loss.
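For concreteness, here is a minimal sketch of the kind of setup I mean (the checkpoint name, example texts, and hyperparameters are just placeholders):

```python
import torch
from torch.nn import TripletMarginLoss
from transformers import AutoModel, AutoTokenizer

# Placeholder checkpoint; in practice this is whatever pre-trained BERT I start from.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    """Mean-pool the last hidden state into one embedding vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    out = model(**batch).last_hidden_state          # (batch, seq_len, hidden)
    mask = batch["attention_mask"].unsqueeze(-1)    # (batch, seq_len, 1)
    return (out * mask).sum(1) / mask.sum(1)        # (batch, hidden)

loss_fn = TripletMarginLoss(margin=1.0)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One illustrative training step on a single (anchor, positive, negative) triplet.
anchor   = embed(["cheap flights to paris"])
positive = embed(["low cost airfare to paris"])
negative = embed(["how to bake sourdough bread"])

loss = loss_fn(anchor, positive, negative)
loss.backward()
optimizer.step()
```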

As I understand it, if I train two models on the same dataset, the resulting embeddings are not cross-comparable between the two models: the same text won't necessarily get a similar embedding vector from both models. This happens because I normally do not care about the absolute values of the embedding vectors, only about their distances and positions relative to the other vectors, for the purposes of classification. So the absolute position of a vector is an unconstrained degree of freedom.
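As a quick sanity check of that intuition: any purely distance-based loss is unchanged if the whole embedding space is rigidly rotated, so two training runs can legitimately end up in different "gauges" (the numbers below are made-up embeddings, not model outputs):

```python
import torch

# Pairwise distances, and hence any distance-based loss such as triplet loss,
# are unchanged by a rotation of the whole embedding space.
torch.manual_seed(0)
emb = torch.randn(5, 768)                     # pretend embeddings from "model A"

# Random orthogonal matrix via QR decomposition (a rigid rotation/reflection).
q, _ = torch.linalg.qr(torch.randn(768, 768))
rotated = emb @ q                             # "model B" = model A up to a rotation

d_a = torch.cdist(emb, emb)
d_b = torch.cdist(rotated, rotated)
print(torch.allclose(d_a, d_b, atol=1e-2))    # True up to float rounding:
# all pairwise distances are preserved, so the classification behaviour is the
# same, yet the absolute coordinates of every embedding are different.
```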

My question is: is there a way to "fix the gauge" of the model during training, e.g. by modifying the loss, so that the same text always maps to approximately the same embedding vector, even if I train another model on the same training set?
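To make the question concrete, the only thing I can think of is something like the sketch below, which anchors the embeddings to a frozen copy of the pre-trained encoder as a shared reference frame. The helper names and the weight `lambda_anchor` are made up, it reuses `tokenizer`, `model` and `loss_fn` from the sketch above, and I have no idea whether this is a sensible way to do it:

```python
import copy
import torch
import torch.nn.functional as F

# Frozen copy of the pre-trained encoder used as a fixed reference frame.
reference = copy.deepcopy(model).eval()
for p in reference.parameters():
    p.requires_grad_(False)

def embed_with(encoder, texts):
    """Same mean-pooling as above, but with an explicit encoder argument."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    out = encoder(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (out * mask).sum(1) / mask.sum(1)

lambda_anchor = 0.1  # made-up weight balancing the two terms

def gauge_fixed_loss(anchor_texts, positive_texts, negative_texts):
    a = embed_with(model, anchor_texts)
    p = embed_with(model, positive_texts)
    n = embed_with(model, negative_texts)
    with torch.no_grad():
        ref = embed_with(reference, anchor_texts)   # fixed target coordinates
    # The triplet term shapes relative distances; the anchor term ties absolute
    # positions to the frozen reference, removing the free rotation/translation.
    return loss_fn(a, p, n) + lambda_anchor * F.mse_loss(a, ref)
```

Is something along these lines the standard approach, or is there a better-established way to make embeddings reproducible across training runs?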
