From my understanding, GPT-3 is trained to predict the next token given a sequence of tokens. Given that, how is it able to follow commands? For instance, with the input below, wouldn't the statistically most likely continuation simply be a period that ends the sentence, rather than an actual sentence fulfilling the request?
Input: write me a beautiful sentence
Output: I cannot put into words how much I love you, so I'll just say it's infinite.
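To make my mental model concrete, here is roughly how I understand the generation loop to work. This is only a sketch using GPT-2 via Hugging Face transformers as a stand-in (the actual GPT-3 decoding code isn't public, and greedy decoding is an assumption on my part):

```python
# Minimal sketch of autoregressive next-token prediction (greedy decoding),
# using GPT-2 as a stand-in for GPT-3.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "write me a beautiful sentence"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids).logits      # (1, seq_len, vocab_size)
        next_id = logits[0, -1].argmax()      # most likely next token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```

If the model is really just picking the most probable continuation at each step like this, I don't see what makes it treat the prompt as an instruction to carry out rather than text to merely continue.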