Autoencoders are used for unsupervised anomaly detection by first training on a data set that consists mainly of "normal" data points, so that the network learns the features of normal data. A new data point is then considered anomalous if its reconstruction error is large, i.e. if the features learned from the normal data do not describe it well.
I understand that training is in effect supervised, because the target for each input is the input itself. But how is the reconstruction error computed for new data?
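
To make the question concrete, here is a minimal sketch of how I currently picture the scoring step; it is only my assumption, not a reference implementation. Instead of a real neural network it uses a linear autoencoder with a 1-D bottleneck and tied weights, whose optimum is the first principal direction, fitted here by power iteration. The toy data, the names `fit_direction` and `reconstruction_error`, and the 99th-percentile threshold are all hypothetical choices made for illustration.

```python
import random

random.seed(0)

# Hypothetical "normal" training data: 2-D points close to the line y = 2x.
train = []
for _ in range(200):
    t = random.uniform(-1.0, 1.0)
    train.append((t + random.gauss(0, 0.05), 2 * t + random.gauss(0, 0.05)))

# Centre the data using statistics of the training set only.
mean = (
    sum(p[0] for p in train) / len(train),
    sum(p[1] for p in train) / len(train),
)


def fit_direction(points, steps=100):
    """Power iteration for the leading principal direction.

    This stands in for training: a tied-weight linear autoencoder with a
    1-D bottleneck has this unit vector as its optimal weight.
    """
    w = (1.0, 0.0)
    for _ in range(steps):
        gx = gy = 0.0
        for x, y in points:
            cx, cy = x - mean[0], y - mean[1]
            dot = cx * w[0] + cy * w[1]
            gx += cx * dot
            gy += cy * dot
        norm = (gx * gx + gy * gy) ** 0.5
        w = (gx / norm, gy / norm)
    return w


w = fit_direction(train)


def reconstruction_error(point):
    """Squared distance between a point and its reconstruction."""
    cx, cy = point[0] - mean[0], point[1] - mean[1]
    code = cx * w[0] + cy * w[1]        # encode: 2-D input -> 1-D code
    rx, ry = code * w[0], code * w[1]   # decode: 1-D code -> 2-D output
    return (cx - rx) ** 2 + (cy - ry) ** 2


# Assumed decision rule: flag anything whose error exceeds the
# 99th percentile of the errors seen on the training data.
train_errors = sorted(reconstruction_error(p) for p in train)
threshold = train_errors[int(0.99 * len(train_errors))]

for candidate in [(0.5, 1.0), (1.0, -2.0)]:
    err = reconstruction_error(candidate)
    print(candidate, "anomalous" if err > threshold else "normal")
```

In this sketch a new point needs no label: it is passed through the already fitted encoder and decoder, and the error is simply the distance between the point and its own reconstruction. Whether this matches what is done in practice with a real autoencoder is exactly what I am unsure about.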