Questions tagged [fine-tuning]

For questions related to the concept of fine-tuning a model (e.g. a neural network), which is closely related to, and sometimes used as a synonym for, transfer learning.

54 questions
12 votes, 1 answer

What is the difference between one-shot learning, transfer learning, and fine-tuning?

Lately, there have been lots of posts on one-shot learning. I tried to figure out what it is by reading some articles. To me, it looks similar to transfer learning, in which we can use pre-trained model weights to create our own model. Fine-tuning…
8 votes, 2 answers

Are GPT-3.5 series models based on GPT-3?

In the official blog post about ChatGPT from OpenAI, there is this paragraph explaining how the ChatGPT model was trained: We trained this model using Reinforcement Learning from Human Feedback (RLHF), using the same methods as InstructGPT, but with…
4 votes, 1 answer

What is the difference between fine-tuning and variants of few-shot learning?

I am trying to understand the concepts of fine-tuning and few-shot learning. I understand the need for fine-tuning: it is essentially adapting a pre-trained model to a specific downstream task. However, recently I have seen a plethora of blog posts…
4 votes, 1 answer

Why aren't the BERT layers frozen during fine-tuning tasks?

During transfer learning in computer vision, I've seen that the layers of the base model are frozen if the new images aren't too different from those the base model was trained on. However, on the NLP side, I see that the layers of the BERT…
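
For contrast, here is a minimal sketch (assuming the HuggingFace transformers library; the checkpoint name and label count are illustrative) of the CV-style frozen setup next to the full fine-tuning that is standard for BERT:

    from transformers import BertForSequenceClassification

    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )

    # CV-style transfer learning: freeze the pre-trained body and train
    # only the new classification head.
    for param in model.bert.parameters():
        param.requires_grad = False

    # The usual BERT recipe instead keeps everything trainable and relies
    # on a small learning rate (e.g. 2e-5) to avoid catastrophic forgetting.
    for param in model.parameters():
        param.requires_grad = True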
4 votes, 1 answer

How to fine-tune BERT for question answering?

I wish to train two domain-specific models. Domain 1: Constitution and related legal documents. Domain 2: Technical and related documents. For Domain 1, I have access to a text corpus with texts from the constitution and no question-context-answer…
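
A common recipe for this situation, sketched here with HuggingFace transformers (the save path is a hypothetical placeholder), is to domain-adapt on the raw corpus first and only then fine-tune a QA head on labelled triples:

    from transformers import BertForMaskedLM, BertForQuestionAnswering

    # Step 1: domain-adapt on the raw legal corpus with the masked-LM
    # objective; this needs no question-context-answer labels.
    mlm_model = BertForMaskedLM.from_pretrained("bert-base-uncased")
    # ... train mlm_model on the constitution corpus ...
    mlm_model.save_pretrained("./domain-adapted-bert")  # hypothetical path

    # Step 2: load the adapted weights with a span-extraction QA head and
    # fine-tune on SQuAD-style question-context-answer triples.
    qa_model = BertForQuestionAnswering.from_pretrained("./domain-adapted-bert")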
3 votes, 1 answer

When doing transfer learning, which initial layers do we need to freeze, and how should I change the last layer for my task?

I want to train a neural network for the detection of a single class, but I will be extending it to detect more classes. To solve this task, I selected the PyTorch framework. I came across transfer learning, where we fine-tune a pre-trained neural…
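
As a minimal PyTorch sketch (assuming a recent torchvision; the backbone and the 2-way head are illustrative choices, to be widened when more classes are added):

    import torch.nn as nn
    from torchvision import models

    # Load an ImageNet-pre-trained backbone (recent torchvision weights API).
    model = models.resnet18(weights="IMAGENET1K_V1")

    # Freeze all pre-trained layers...
    for param in model.parameters():
        param.requires_grad = False

    # ...then swap the final layer for the new task; freshly created modules
    # are trainable by default, so only this head will learn.
    model.fc = nn.Linear(model.fc.in_features, 2)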
3 votes, 0 answers

Why shouldn't batch normalisation layers be learnable during fine-tuning?

I have been reading this TensorFlow tutorial on transfer learning, where they unfreeze the whole model and then say: When you unfreeze a model that contains BatchNormalization layers in order to do fine-tuning, you should keep the…
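
The tutorial's advice boils down to calling the base model with training=False; a minimal sketch, assuming tf.keras and an illustrative MobileNetV2 backbone:

    import tensorflow as tf

    base_model = tf.keras.applications.MobileNetV2(
        input_shape=(160, 160, 3), include_top=False, weights="imagenet"
    )
    base_model.trainable = True  # unfreeze the weights for fine-tuning

    inputs = tf.keras.Input(shape=(160, 160, 3))
    # Passing training=False keeps the BatchNormalization layers in
    # inference mode, so their moving mean/variance statistics are not
    # overwritten by the (possibly very different) new data.
    x = base_model(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(1)(x)
    model = tf.keras.Model(inputs, outputs)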
3 votes, 0 answers

How does one continue the pre-training in BERT?

I need some help with continuing pre-training on BERT. I have a very specific vocabulary and lots of specific abbreviations at hand. I want to do an STS task. Let me specify my task: I have domain-specific sentences and want to pair them in terms of…
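
A hedged sketch of one way to start, assuming HuggingFace transformers (the added tokens are hypothetical placeholders for the domain abbreviations):

    from transformers import BertForMaskedLM, BertTokenizerFast

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")

    # Register the domain abbreviations so they are not shredded into
    # subword pieces, then grow the embedding matrix to match.
    tokenizer.add_tokens(["abbrev1", "abbrev2"])  # hypothetical domain terms
    model.resize_token_embeddings(len(tokenizer))

    # ... continue training with the masked-LM objective on the domain
    # sentences before moving on to the STS pairing task ...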
2 votes, 1 answer

Should I freeze layers when fine-tuning an LLM?

I've had it in my head that generally speaking, it's better to freeze layers when fine-tuning an LLM, as per this quote from HuggingFace's article: PEFT approaches only fine-tune a small number of (extra) model parameters while freezing most…
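
A minimal PEFT sketch along the lines of that quote, assuming the HuggingFace peft library (the base model is illustrative):

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative model

    # LoRA implicitly freezes the base weights and trains only small
    # low-rank adapter matrices injected into the attention layers.
    config = LoraConfig(r=8, lora_alpha=32, lora_dropout=0.05,
                        task_type="CAUSAL_LM")
    model = get_peft_model(base, config)
    model.print_trainable_parameters()  # typically well under 1% trainable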
2 votes, 1 answer

Would a transformer trained on highly specific material be as usable as a commercial product like ChatGPT?

Soft question here. I was recently learning a bit about how it is feasible to train a transformer on a personal computer like an M1 Mac. I have been told that the model could have 1-3 million parameters and the training data could be from 1GB - 1TB,…
2 votes, 1 answer

Does BERT freeze the entire model body when it does fine-tuning?

Recently, I came across the BERT model. I did some research and tried some implementations. I wanted to tackle an NER task, so I chose the BertForSequenceClassification provided by HuggingFace.
for epoch in range(1, args.epochs + 1):
    total_loss…
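
A quick way to verify the freezing behaviour, as a sketch with HuggingFace transformers: by default nothing is frozen during fine-tuning.

    from transformers import BertForSequenceClassification

    model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

    # Nothing is frozen by default: the encoder body and the freshly
    # initialised classifier head are all trainable.
    frozen = [n for n, p in model.named_parameters() if not p.requires_grad]
    print(frozen)  # -> [] unless you freeze parameters yourself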
2 votes, 2 answers

What is the difference between feature extraction and fine-tuning in transfer learning?

I'm building a model for facial expression recognition, and I want to use transfer learning. From what I understand, there are different steps to do it: the first is feature extraction and the second is fine-tuning. I want to understand more…
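
The two steps are often run back to back; a minimal tf.keras sketch, with an illustrative 7-class expression head:

    import tensorflow as tf

    base = tf.keras.applications.ResNet50(
        include_top=False, weights="imagenet", pooling="avg"
    )

    # Phase 1 - feature extraction: the pre-trained body is frozen and
    # only the new expression head learns.
    base.trainable = False
    model = tf.keras.Sequential(
        [base, tf.keras.layers.Dense(7, activation="softmax")]
    )
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="categorical_crossentropy")
    # model.fit(train_data, ...)

    # Phase 2 - fine-tuning: unfreeze the body and keep training with a
    # much smaller learning rate; recompiling is required to apply the change.
    base.trainable = True
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
                  loss="categorical_crossentropy")
    # model.fit(train_data, ...)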
2 votes, 0 answers

Adding a corpus to BERT for QA

I was wondering about SciBERT's QA abilities using SQuAD. I have a scarce textual dataset consisting of less than 100 files where doctors are discussing cancer in dialogues. I want to add it to SciBERT to see if the QA abilities will improve in the…
2 votes, 1 answer

Is my fine-tuned model learning anything at all?

I am practicing with ResNet50 fine-tuning for a binary classification task. Here is my code snippet.
base_model = ResNet50(weights='imagenet', include_top=False)
x = base_model.output
x = keras.layers.GlobalAveragePooling2D(name='avg_pool')(x)
x =…
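
For reference, a completed version of such a head as a hedged sketch; the frozen base and the sigmoid output are assumptions, not the asker's exact code:

    from tensorflow import keras
    from tensorflow.keras.applications import ResNet50

    base_model = ResNet50(weights='imagenet', include_top=False)
    base_model.trainable = False  # assumption: start from a frozen base

    x = base_model.output
    x = keras.layers.GlobalAveragePooling2D(name='avg_pool')(x)
    x = keras.layers.Dense(1, activation='sigmoid')(x)  # assumed binary head
    model = keras.Model(base_model.input, x)
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])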
1 vote, 1 answer

What research-backed findings are there for prompting LLMs / GPT-4 to give specific information or actionable plans?

I have learned a bit recently about prompting strategies. For example, there was a paper about how just saying “Let’s think step by step” can increase answer quality by around 40%. I have also come to appreciate that models like GPT-4 sometimes…