
I am using GPT2LMHeadModel to change the way GPT2 chooses the next word in a sentence. At this point, I give it the initial part of a sentence and GPT2 starts predicting the most likely next word.

I want GPT2 to read an entire sentence and then start a new one based on it (like it does with translation).

Is there any kind of parameter that I need to set in order to make GPT2 start a sentence from zero, instead of completing an initial one?

Mucida
  • You should be able to do this by feeding a period at the end of the sentence you're feeding to GPT2. It will understand that it needs to start a new sentence after that (because the most probable token after a period is the start of a new sentence) – Raphael Lopez Kaufman May 05 '22 at 18:43
  • It's still trying to "complete" the context. For example, I gave "A company director has a pecuniary duty." and GPT2 came up with "A company director has a pecuniary duty. It's not to make money, but to serve customers". It didn't start a new one based on the first. I am using a paraphrase pretrained model. – Mucida May 06 '22 at 11:21
  • What would you have wanted as a completion from the model after this first sentence? This will help me understand what you want to achieve. – Raphael Lopez Kaufman May 06 '22 at 12:07
  • A paraphrase, like: "A company director has a pecuniary duty." ..... "A CEO has monetary responsibilities". Something like that. – Mucida May 06 '22 at 15:27

1 Answer


Based on our conversation in the comment section, what Mucida wants is a reformulation of the input, e.g. if the input is:

"A company director has a pecuniary duty"

the output should be:

"A CEO has monetary responsibilities".

By default, GPT2 returns what could be the next sentence in a longer paragraph, e.g.:

"It's not to make money, but to serve customers".

When you want a large language model like GPT2 to give you a certain type of answer, what usually works well is to include a few examples of what you want in the input.

For example, you'd give as input a few pairs of reformulations:

"What's a reformulation of "A company director has a pecuniary duty"? It's "A CEO has monetary responsibilities". What's a reformulation of "Stackoverflow is a great place to ask questions"? It's "Stackoverflow is where you get the best answers to your questions". What's a reformulation of "Elon Musk is the richest person on earth"? It's"

Then, the output of GPT2 should complete what comes after "It's" in the same style.
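One way to assemble such a few-shot prompt programmatically (a sketch; the helper function name and the example pairs are mine, not part of any library):

```python
def build_few_shot_prompt(pairs, query):
    """Turn (sentence, reformulation) example pairs plus a new sentence
    into a single few-shot prompt for the model to complete."""
    parts = [
        f'What\'s a reformulation of "{src}"? It\'s "{dst}".'
        for src, dst in pairs
    ]
    # Leave the last reformulation open so the model fills it in.
    parts.append(f'What\'s a reformulation of "{query}"? It\'s')
    return " ".join(parts)

examples = [
    ("A company director has a pecuniary duty",
     "A CEO has monetary responsibilities"),
    ("Stackoverflow is a great place to ask questions",
     "Stackoverflow is where you get the best answers to your questions"),
]
prompt = build_few_shot_prompt(examples, "Elon Musk is the richest person on earth")
print(prompt)
```

The string this prints can then be fed to the tokenizer and `model.generate` like any other input; the model's continuation after the final "It's" is the candidate paraphrase.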

You should also check what the context length of the GPT2 model you're using is. If you feed it an input that's longer than the context length, the model won't take the input into account in its entirety.
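With the `transformers` library you can read the context length off the model config and count your prompt's tokens before generating (a sketch assuming the base "gpt2" checkpoint, whose window is 1024 tokens; your paraphrase model may differ):

```python
from transformers import GPT2Config, GPT2TokenizerFast

# The context window is stored on the model config as `n_positions`.
config = GPT2Config.from_pretrained("gpt2")
print(config.n_positions)

# Count how many tokens the prompt occupies before generating.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
prompt = 'What\'s a reformulation of "A company director has a pecuniary duty"? It\'s'
n_tokens = len(tokenizer(prompt)["input_ids"])
if n_tokens > config.n_positions:
    print(f"Prompt is {n_tokens} tokens and will not fit in the context window.")
```

If your few-shot examples push the prompt past this limit, the earliest examples are the ones the model effectively never sees.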

  • I think I will have to create my own decoding algorithm because it didn't work. I tried with a pretrained model for translation and for paraphrase. It still tries to complete a sentence – Mucida May 07 '22 at 14:22
  • Can you show me the output you got from the model? Usually, getting what you want out of a big language model is a matter of crafting the right prompt. – Raphael Lopez Kaufman May 08 '22 at 19:00
  • If I start the sentence with "A company director has a pecuniary duty.", it completes with: "A company director has a pecuniary duty."I'm not going to say that I'm going to be a good manager," he said" ... word by word each iteration. A print would be easier to show you hehe – Mucida May 09 '22 at 20:22
  • Did you try what I suggested? Giving a few examples of the type of completion you want as an input to the model? It seems to me you just tried again with a single sentence as an input – Raphael Lopez Kaufman May 10 '22 at 20:05
  • I did, but it didn't work. Sometimes it only repeats the first sentence, or completes with a random one. But I tried with only 6 pairs of sentences as examples, I will try with more – Mucida May 12 '22 at 13:48
  • I would try to figure out what's the context length of the model you're using – Raphael Lopez Kaufman May 12 '22 at 18:35
  • Is there any way we can get in touch? – Mucida May 12 '22 at 20:33