Background: I'm currently trying to use GPT to give me numerical scores and am looking for tips on prompt design; see my previous StackExchange post. To craft good prompts, it seems important to have a good understanding of how the generative model works...
Question: How many tokens ahead does GPT-3.5 look with its beam search feature?
Extra context: I found it hard to find good references for beam search; the Hugging Face blog post on text generation seemed a decent starting point.
I tried asking Bing Chat about GPT-3.5's beam search length: it replied that the length was 10 tokens, but could only give a 'reference' to an OpenAI API page which did not seem to support the claim. I couldn't find any other results online.
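For what it's worth, my reading of the Hugging Face post is that beam search is not a fixed lookahead window at all: the decoder keeps `num_beams` candidate sequences alive for the whole generation, extending each by one token per step and re-ranking. A toy sketch of that mechanism (the two-token "language model", its probabilities, and the `beam_search` helper are all invented for illustration, not anything from the OpenAI or Hugging Face APIs):

```python
import math

# Toy "language model": next-token log-probs given the previous token.
# All tokens and probabilities here are made up for illustration.
LM = {
    "<s>":  {"the": math.log(0.6), "a": math.log(0.4)},
    "the":  {"dog": math.log(0.5), "nice": math.log(0.5)},
    "a":    {"dog": math.log(0.9), "nice": math.log(0.1)},
    "dog":  {"<e>": math.log(1.0)},
    "nice": {"<e>": math.log(1.0)},
}

def beam_search(num_beams=2, max_len=5):
    # Each beam is (cumulative log-prob, token list). Beams persist for
    # the WHOLE generation; num_beams is a width, not a lookahead horizon.
    beams = [(0.0, ["<s>"])]
    for _ in range(max_len):
        candidates = []
        for score, toks in beams:
            if toks[-1] == "<e>":            # finished beams carry over
                candidates.append((score, toks))
                continue
            for tok, lp in LM[toks[-1]].items():
                candidates.append((score + lp, toks + [tok]))
        # keep only the num_beams highest-scoring candidates
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:num_beams]
        if all(toks[-1] == "<e>" for _, toks in beams):
            break
    return beams

best_score, best_tokens = beam_search()[0]
print(best_tokens)  # → ['<s>', 'a', 'dog', '<e>']
```

Note that greedy decoding here would commit to "the" (p=0.6) and end with overall probability 0.3, while the beam recovers "a dog" (0.4 × 0.9 = 0.36); that re-ranking across full candidate sequences is the point of the algorithm, independent of any "length" parameter.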
Why do I care? Suppose I have a long theatre review and want to score how impressed the critic was by the quality of the acting, on a scale from -5 (extremely unimpressed) to +5 (extremely impressed). My prompts currently ask the model to finish its reply with a sentence of the form: "Overall the critic was very impressed by the quality of acting - score 4." But perhaps by asking the model to continue generating I can make the prompts more reliable. E.g. I could ask the model to subsequently justify the score with a quotation from the text, along the lines of: "Overall the critic was very impressed by the quality of acting - score 4/5 - and indeed 'Mark Strong's performance stole the show'".
Knowing the beam search length would really help me design prompts like these (which ask the model to continue generating text after the numerical score in order to improve reliability).