Questions tagged [fine-tuning]

For questions related to the concept of fine-tuning a model (e.g. a neural network), which is closely related to, and sometimes used as a synonym for, transfer learning.

54 questions
12 votes, 1 answer

What is the difference between one-shot learning, transfer learning, and fine-tuning?

Lately, there have been lots of posts on one-shot learning. I tried to figure out what it is by reading some articles. To me, it looks similar to transfer learning, in which we can use pre-trained model weights to create our own model. Fine-tuning…
8 votes, 2 answers

Are GPT-3.5 series models based on GPT-3?

In the official blog post about ChatGPT from OpenAI, there is this paragraph explaining how the ChatGPT model was trained: We trained this model using Reinforcement Learning from Human Feedback (RLHF), using the same methods as InstructGPT, but with…
4 votes, 1 answer

What is the difference between fine-tuning and variants of few-shot learning?

I am trying to understand the concepts of fine-tuning and few-shot learning. I understand the need for fine-tuning: it is essentially adapting a pre-trained model to a specific downstream task. However, recently I have seen a plethora of blog posts…
4 votes, 1 answer

Why aren't the BERT layers frozen during fine-tuning tasks?

During transfer learning in computer vision, I've seen that the layers of the base model are frozen if the new images aren't too different from those the base model was trained on. However, on the NLP side, I see that the layers of the BERT…
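
For contrast, here is a minimal sketch (assuming the HuggingFace transformers library; the checkpoint name and label count are illustrative) of the CV-style frozen setup next to the full fine-tuning that is standard for BERT:

    from transformers import BertForSequenceClassification

    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )

    # CV-style transfer learning: freeze the pre-trained body and train
    # only the new classification head.
    for param in model.bert.parameters():
        param.requires_grad = False

    # The usual BERT recipe instead keeps everything trainable and relies
    # on a small learning rate (e.g. 2e-5) to avoid catastrophic forgetting.
    for param in model.parameters():
        param.requires_grad = True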
4 votes, 1 answer

How to fine-tune BERT for question answering?

I wish to train two domain-specific models. Domain 1: Constitution and related legal documents. Domain 2: Technical and related documents. For Domain 1, I have access to a text corpus with texts from the constitution and no question-context-answer…
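
A common recipe for this situation, sketched here with HuggingFace transformers (the save path is a hypothetical placeholder), is to domain-adapt on the raw corpus first and only then fine-tune a QA head on labelled triples:

    from transformers import BertForMaskedLM, BertForQuestionAnswering

    # Step 1: domain-adapt on the raw legal corpus with the masked-LM
    # objective; this needs no question-context-answer labels.
    mlm_model = BertForMaskedLM.from_pretrained("bert-base-uncased")
    # ... train mlm_model on the constitution corpus ...
    mlm_model.save_pretrained("./domain-adapted-bert")  # hypothetical path

    # Step 2: load the adapted weights with a span-extraction QA head and
    # fine-tune on SQuAD-style question-context-answer triples.
    qa_model = BertForQuestionAnswering.from_pretrained("./domain-adapted-bert")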
3 votes, 1 answer

When doing transfer learning, which initial layers do we need to freeze, and how should I change the last layer for my task?

I want to train a neural network for the detection of a single class, but I will be extending it to detect more classes. To solve this task, I selected the PyTorch framework. I came across transfer learning, where we fine-tune a pre-trained neural…
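
As a minimal PyTorch sketch (assuming a recent torchvision; the backbone and the 2-way head are illustrative choices, to be widened when more classes are added):

    import torch.nn as nn
    from torchvision import models

    # Load an ImageNet-pre-trained backbone (recent torchvision weights API).
    model = models.resnet18(weights="IMAGENET1K_V1")

    # Freeze all pre-trained layers...
    for param in model.parameters():
        param.requires_grad = False

    # ...then swap the final layer for the new task; freshly created modules
    # are trainable by default, so only this head will learn.
    model.fc = nn.Linear(model.fc.in_features, 2)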
3 votes, 0 answers

Why shouldn't batch normalisation layers be learnable during fine-tuning?

I have been reading this TensorFlow tutorial on transfer learning, where they unfreeze the whole model and then say: When you unfreeze a model that contains BatchNormalization layers in order to do fine-tuning, you should keep the…
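
The tutorial's advice boils down to calling the base model with training=False; a minimal sketch, assuming tf.keras and an illustrative MobileNetV2 backbone:

    import tensorflow as tf

    base_model = tf.keras.applications.MobileNetV2(
        input_shape=(160, 160, 3), include_top=False, weights="imagenet"
    )
    base_model.trainable = True  # unfreeze the weights for fine-tuning

    inputs = tf.keras.Input(shape=(160, 160, 3))
    # Passing training=False keeps the BatchNormalization layers in
    # inference mode, so their moving mean/variance statistics are not
    # overwritten by the (possibly very different) new data.
    x = base_model(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(1)(x)
    model = tf.keras.Model(inputs, outputs)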
3 votes, 0 answers

How does one continue the pre-training in BERT?

I need some help with continuing pre-training on BERT. I have a very specific vocabulary and lots of specific abbreviations at hand. I want to do an STS task. Let me specify my task: I have domain-specific sentences and want to pair them in terms of…
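
A hedged sketch of one way to start, assuming HuggingFace transformers (the added tokens are hypothetical placeholders for the domain abbreviations):

    from transformers import BertForMaskedLM, BertTokenizerFast

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")

    # Register the domain abbreviations so they are not shredded into
    # subword pieces, then grow the embedding matrix to match.
    tokenizer.add_tokens(["abbrev1", "abbrev2"])  # hypothetical domain terms
    model.resize_token_embeddings(len(tokenizer))

    # ... continue training with the masked-LM objective on the domain
    # sentences before moving on to the STS pairing task ...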
2 votes, 1 answer

Should I freeze layers when fine-tuning an LLM?

I've had it in my head that generally speaking, it's better to freeze layers when fine-tuning an LLM, as per this quote from HuggingFace's article: PEFT approaches only fine-tune a small number of (extra) model parameters while freezing most…
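
A minimal PEFT sketch along the lines of that quote, assuming the HuggingFace peft library (the base model is illustrative):

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative model

    # LoRA implicitly freezes the base weights and trains only small
    # low-rank adapter matrices injected into the attention layers.
    config = LoraConfig(r=8, lora_alpha=32, lora_dropout=0.05,
                        task_type="CAUSAL_LM")
    model = get_peft_model(base, config)
    model.print_trainable_parameters()  # typically well under 1% trainable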
2 votes, 1 answer

Would a transformer trained on highly specific material be as usable as a commercial product like ChatGPT?

Soft question here. I was recently learning a bit about how it is feasible to train a transformer on a personal computer like an M1 Mac. I have been told that the model could have 1-3 million parameters and the training data could be from 1GB - 1TB,…
2 votes, 1 answer

Does BERT freeze the entire model body when it does fine-tuning?

Recently, I came across the BERT model. I did some research and tried some implementations. I wanted to tackle an NER task, so I chose the BertForSequenceClassification provided by HuggingFace.
for epoch in range(1, args.epochs + 1):
    total_loss…
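
A quick way to verify the freezing behaviour, as a sketch with HuggingFace transformers: by default nothing is frozen during fine-tuning.

    from transformers import BertForSequenceClassification

    model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

    # Nothing is frozen by default: the encoder body and the freshly
    # initialised classifier head are all trainable.
    frozen = [n for n, p in model.named_parameters() if not p.requires_grad]
    print(frozen)  # -> [] unless you freeze parameters yourself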
2 votes, 2 answers

What is the difference between feature extraction and fine-tuning in transfer learning?

I'm building a model for facial expression recognition, and I want to use transfer learning. From what I understand, there are different steps to do it: the first is feature extraction and the second is fine-tuning. I want to understand more…
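
The two steps are often run back to back; a minimal tf.keras sketch, with an illustrative 7-class expression head:

    import tensorflow as tf

    base = tf.keras.applications.ResNet50(
        include_top=False, weights="imagenet", pooling="avg"
    )

    # Phase 1 - feature extraction: the pre-trained body is frozen and
    # only the new expression head learns.
    base.trainable = False
    model = tf.keras.Sequential(
        [base, tf.keras.layers.Dense(7, activation="softmax")]
    )
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="categorical_crossentropy")
    # model.fit(train_data, ...)

    # Phase 2 - fine-tuning: unfreeze the body and keep training with a
    # much smaller learning rate; recompiling is required to apply the change.
    base.trainable = True
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
                  loss="categorical_crossentropy")
    # model.fit(train_data, ...)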
2 votes, 0 answers

Adding a corpus to BERT for QA

I was wondering about SciBERT's QA abilities using SQuAD. I have a scarce textual dataset consisting of less than 100 files where doctors are discussing cancer in dialogues. I want to add it to SciBERT to see if the QA abilities will improve in the…
2 votes, 1 answer

Is my fine-tuned model learning anything at all?

I am practicing with ResNet50 fine-tuning for a binary classification task. Here is my code snippet.
base_model = ResNet50(weights='imagenet', include_top=False)
x = base_model.output
x = keras.layers.GlobalAveragePooling2D(name='avg_pool')(x)
x =…
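
For reference, a completed version of such a head as a hedged sketch; the frozen base and the sigmoid output are assumptions, not the asker's exact code:

    from tensorflow import keras
    from tensorflow.keras.applications import ResNet50

    base_model = ResNet50(weights='imagenet', include_top=False)
    base_model.trainable = False  # assumption: start from a frozen base

    x = base_model.output
    x = keras.layers.GlobalAveragePooling2D(name='avg_pool')(x)
    x = keras.layers.Dense(1, activation='sigmoid')(x)  # assumed binary head
    model = keras.Model(base_model.input, x)
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])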
1 vote, 1 answer

What research-backed findings are there for prompting LLMs / GPT-4 to give specific information or actionable plans?

I have learned a bit recently about prompting strategies. For example, there was a paper about how just saying “Let’s think step by step” can increase answer quality by around 40%. I have also come to appreciate that models like GPT-4 sometimes…