How to tell if two hotel reviews address the same thing

Question

I am working with a large dataset of hotel reviews that contains both positive and negative reviews (the reviews are labeled). I want to use this dataset to perform textual style transfer: given a positive review, output a negative review that addresses the same thing. For example, if the positive review mentions how spacious the rooms are, I want the output to be a review that complains about small, claustrophobic rooms.

However, I don't have (positive review, negative review) pairs for training. I was thinking that I could create those pairs myself, but I'm not sure what the best way to do that is. Simple heuristics such as the Jaccard index over the words of the two reviews did not produce pairs that actually discuss the same aspect of the hotel.
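For reference, a Jaccard-based pairing heuristic of the kind mentioned above might look like the following minimal sketch. The tokenizer, the function names (`tokens`, `jaccard`, `pair_reviews`), and the example reviews are illustrative assumptions, not the code actually run on the dataset.

```python
import re


def tokens(text):
    """Return the set of lowercase words in text (a deliberately crude tokenizer)."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| of two token sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_reviews(positives, negatives):
    """For each positive review, pick the negative review with the highest word overlap.

    Assumes `negatives` is non-empty. Returns (positive, negative, score) tuples.
    """
    neg_tokens = [tokens(n) for n in negatives]
    pairs = []
    for p in positives:
        p_tok = tokens(p)
        scores = [jaccard(p_tok, n_tok) for n_tok in neg_tokens]
        best = max(range(len(negatives)), key=scores.__getitem__)
        pairs.append((p, negatives[best], scores[best]))
    return pairs


# Hypothetical example reviews, made up for illustration.
positives = ["The rooms are spacious and bright."]
negatives = [
    "The rooms are small and claustrophobic.",
    "Breakfast was cold and the staff was rude.",
]
pairs = pair_reviews(positives, negatives)
```

In this toy case the room review is matched with the other room review, but most of the shared words are function words ("the", "are", "and"). That may be part of why plain word overlap is unreliable on longer, real reviews that cover several topics at once.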

This seems like a tricky data set to create automatically. Do you have a different goal you want to eventually achieve, once you have this data set? Put differently, why are you building such a data set? By the way, "style transfer" can very often be equated with "translation" in NLP. — Mathias Müller, Jan 30 '20 at 19:43
Well, my goal is to show that supervised style transfer can work in practice in complex cases, so I could get funds for labeling a very challenging dataset :) — Nadav Borenstein, Feb 03 '20 at 07:13


0 Answers