I am diving into data-to-text generation for long articles (> 1000 words). After creating a template and filling it with data, I am now working at the paragraph level: I write several variants of each paragraph, which are randomly selected and put together. At the word level, I have also added different output formats for dates, times and numbers.
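The approach described above could be sketched roughly as follows. This is only an illustration, not the actual implementation: the function names, the paragraph variants and the sample record are all hypothetical.

```python
import random
from datetime import date

# Hypothetical paragraph variants: each slot has several interchangeable templates.
PARAGRAPH_VARIANTS = {
    "intro": [
        "On {day}, {city} recorded {count} visitors.",
        "{city} welcomed {count} visitors on {day}.",
    ],
    "detail": [
        "That is {change} more than the previous day.",
        "Compared with the day before, the figure rose by {change}.",
    ],
}

# Word-level variation: several surface forms for the same date.
DATE_FORMATS = ["%Y-%m-%d", "%d %B %Y", "%B %d, %Y"]


def format_number(value, rng):
    """Pick one of two number styles at random: plain or with thousands separators."""
    return rng.choice([str(value), f"{value:,}"])


def generate_text(data, rng):
    """Fill one randomly chosen variant per paragraph slot with the given data."""
    fields = {
        "city": data["city"],
        "day": data["day"].strftime(rng.choice(DATE_FORMATS)),
        "count": format_number(data["count"], rng),
        "change": format_number(data["change"], rng),
    }
    paragraphs = [
        rng.choice(variants).format(**fields)
        for variants in PARAGRAPH_VARIANTS.values()
    ]
    return "\n\n".join(paragraphs)


rng = random.Random(42)  # seeded so that a run is reproducible
record = {"city": "Berlin", "day": date(2020, 3, 1), "count": 12500, "change": 1300}
article = generate_text(record, rng)
print(article)
```

In this sketch, two variants per paragraph slot, three date formats and two number formats for each of the two numbers give only 2 x 2 x 3 x 2 x 2 = 48 distinct surface forms per record, so the variation is exhausted quickly when many texts are generated.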
The challenge I see is that when large numbers of such texts are generated, they become boring to read because each one feels less and less unique to the reader.
Furthermore, I think it is easy to detect that such texts have been autogenerated. However, I still have to validate this hypothesis.
I was wondering whether there is a better method for bringing variability into such texts.
Can you suggest any methods, papers or resources, or share your experience in this field?
I highly appreciate your replies!