Questions tagged [inference]
25 questions
12 votes, 4 answers
Why do LLMs and RNNs learn so fast during inference but, ironically, so slowly during training?
Why do LLMs learn so fast during inference but, ironically, so slowly during training? That is, if you teach an AI a new concept in a prompt, it will learn and use the concept perfectly and flawlessly through the whole prompt after just one shot.…

MaiaVictor (355 rep)
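
For context on the question above, a minimal illustration of what "learning during inference" means in practice: the model's weights are never updated, and the new concept lives only in the prompt (in-context learning). The made-up word and the expected answer are illustrative assumptions, not from the question.

```python
# In-context learning: the "training data" for the new concept is the
# prompt itself; no gradient step ever happens.
few_shot_prompt = """A 'blorp' of a word is that word spelled backwards.
Example: the blorp of 'cat' is 'tac'.
Example: the blorp of 'dog' is 'god'.
What is the blorp of 'bird'?"""

# Sent to an instruction-tuned LLM, this typically yields 'drib': the
# concept is "learned" purely by conditioning the forward pass on the
# prompt, which is why it feels instant compared to gradient training.
```
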
5 votes, 2 answers
Are both the training and inference systems required in the same application?
From what I understand, there are two stages in deep learning: the first is training and the second is inference. The first is often done on GPUs because of their massive parallelism, among other things. The second, inference, while it…

Mahmoud Abdel-Mon'em (113 rep)
5 votes, 0 answers
Training and inference for highly context-sensitive information
What is the best way to train, and to do inference, when the context strongly determines what the inferred result should be?
For example, in the image below all the people are standing upright, but because of the camera's perspective, their location…

g491 (101 rep)
5 votes, 1 answer
Is the Mask Needed for Masked Self-Attention During Inference with GPT-2?
My understanding is that masked self-attention is necessary during training of GPT-2, as otherwise it would be able to directly see the correct next output at each iteration. My question is whether the attention mask is necessary, or even possible,…

D_s (51 rep)
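
A minimal sketch of the mask the question asks about, in PyTorch (single attention head, no batching; the shapes are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

T, d = 8, 16                        # sequence length, head dimension
q, k, v = (torch.randn(T, d) for _ in range(3))

scores = q @ k.T / d ** 0.5         # (T, T) attention logits
causal = torch.tril(torch.ones(T, T, dtype=torch.bool))
masked = scores.masked_fill(~causal, float("-inf"))
out = F.softmax(masked, dim=-1) @ v

# The last row of `masked` equals the last row of `scores`: the newest
# token is allowed to attend to everything anyway. In a multi-layer
# model the mask still matters, because earlier positions (which the
# mask does change) feed into the later layers.
```
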
3 votes, 0 answers
How to use a TPU for real-time, low-latency inference?
I use Google's Cloud TPU hardware extensively with TensorFlow for training models and for inference; however, when I run inference I do it in large batches. The TPU takes about 3 minutes to warm up before it runs the inference. But when I read the…

adng (51 rep)
2 votes, 1 answer
Why is exact inference in a Bayesian network both NP-hard and #P-hard?
I should show that exact inference in a Bayesian network (BN) is NP-hard and #P-hard by using a 3-SAT problem.
So I formulated a 3-SAT problem by defining the 3-CNF formula
$$(x_1 \lor x_2) \land (\neg x_3 \lor x_2) \land (x_3 \lor x_1)$$
I reduced it to…

xava (423 rep)
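
For context, a sketch of the textbook construction (not necessarily the asker's exact reduction): give each variable $x_i$ a root node with $P(X_i = 1) = \tfrac{1}{2}$, make each clause $C_j$ a deterministic OR of its literals, and let $Y$ be the AND of all clause nodes. Then

$$P(Y = 1) = \frac{\#\{\text{satisfying assignments}\}}{2^n},$$

so deciding whether $P(Y = 1) > 0$ decides satisfiability (NP-hardness), while computing the marginal exactly counts the satisfying assignments (#P-hardness).
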
2 votes, 2 answers
What is a beam?
For example, faster-whisper's transcribe function takes an argument
beam_size: Beam size to use for decoding.
What does "beam" mean?

Geremia (163 rep)
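
A beam is one of the $k$ best partial transcriptions kept alive during decoding, instead of committing greedily to a single one. A minimal sketch (not faster-whisper's implementation; `step_log_probs` is an assumed stand-in for the model):

```python
def beam_search(step_log_probs, beam_size=5, eos=0, max_len=20):
    """Keep the `beam_size` best hypotheses at every step.
    `step_log_probs(prefix)` is assumed to return {token: log_prob}."""
    beams = [([], 0.0)]                        # (token prefix, score)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix and prefix[-1] == eos:   # finished hypothesis
                candidates.append((prefix, score))
                continue
            for tok, lp in step_log_probs(prefix).items():
                candidates.append((prefix + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]         # prune back to the beam
    return beams[0][0]
```

With `beam_size=1` this degenerates to greedy decoding; larger beams explore more alternatives at proportionally higher compute cost.
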
2 votes, 0 answers
Why does the BatchNormalization layer produce different outputs during training and inference?
I modified the ResNet50 architecture to get a regression network. I just added BatchNorm1d and ReLU layers just before the fully connected layer. During training, the output of the BatchNorm1d layer is nearly equal to 3, and this gives good results for…

Bedrick Kiq (141 rep)
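
A minimal PyTorch demonstration of the behaviour the question describes: in train mode BatchNorm normalizes with the current batch's statistics, in eval mode with the running averages accumulated during training, so the same input produces different outputs.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)
x = torch.randn(8, 4) * 5 + 3      # stats deliberately far from N(0, 1)

bn.train()
y_train = bn(x)                    # batch mean/var; also updates running stats

bn.eval()
y_eval = bn(x)                     # uses running_mean / running_var instead

print(torch.allclose(y_train, y_eval))  # False: the two modes disagree
```

Forgetting `model.eval()` at inference time, or training with tiny batches so the running statistics are noisy, is a common source of exactly this train/test mismatch.
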
1 vote, 0 answers
Inference process and flow, and the roles of the GPU, CPU, and RAM
This is a noob question.
I load a HuggingFace transformer model onto the GPU and create a HuggingFace pipeline using that model. Then I run inference on the model using the pipeline.
I would be glad to read in some depth about the actual process flow of…

ahron (131 rep)
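
A minimal sketch of the flow the question describes (the model name is an illustrative choice): weights are read from disk into CPU RAM, copied once into GPU memory, and at inference the CPU tokenizes and launches GPU kernels, so only small tensors of token IDs and logits cross the CPU-to-GPU boundary.

```python
from transformers import pipeline

# device=0 places the model on the first GPU; tokenization stays on CPU.
generator = pipeline("text-generation", model="gpt2", device=0)
print(generator("Hello, world", max_new_tokens=20)[0]["generated_text"])
```
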
1 vote, 1 answer
What if we drop the causal mask in auto-regressive Transformer?
I understand that the triangular causal mask in attention is used to prevent tokens from "looking into the future", but why do we want to prevent that?
Let's suppose we have a model with context length $T = 8$. At inference time, we want to predict…

nalzok (251 rep)
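
For context on why the mask exists: training optimizes, at every position simultaneously,

$$\mathcal{L}(\theta) = -\sum_{t=1}^{T-1} \log p_\theta(x_{t+1} \mid x_{\le t}),$$

and without the causal mask the representation at position $t$ can read $x_{t+1}$ directly, so the loss is driven to zero by copying rather than predicting. The model then fails at inference time, where future tokens genuinely do not exist yet.
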
1 vote, 2 answers
How to optimize transformer inference for prompts shorter than the maximum sequence length?
As far as I understand, a Transformer has a specific maximum input sequence length that depends on its architecture. So a model like GPT-4 has a sequence length of 8192 tokens. As such, I am interested in what happens when the input prompt is shorter than…

janekb04 (121 rep)
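
A minimal sketch of what happens with short prompts in practice (GPT-2 here stands in for any decoder-only model): the input is not padded out to the architectural maximum; when batching, shorter prompts are padded only to the longest prompt in the batch, and an attention mask marks the real tokens so padding is ignored.

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token          # GPT-2 defines no pad token by default
batch = tok(["a short prompt", "a somewhat longer prompt here"],
            padding=True, return_tensors="pt")

print(batch["input_ids"].shape)        # padded to the longest prompt, not to 1024
print(batch["attention_mask"])         # 1 = real token, 0 = padding
```
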
1 vote, 0 answers
Inference time of VGG16 when initialised with different weights
I'm trying to understand the differences in inference time and training time between two models:
VGG16 with weights initialised from a Glorot uniform distribution, and the same network with the only difference being that the weights are initialised to…

kiril avramov (11 rep)
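
A minimal timing sketch for the comparison above (torchvision's default random initialisation stands in for the Glorot case, a constant fill for the other): a dense forward pass performs the same arithmetic regardless of the weight values, so the two should time out essentially identically, barring edge cases such as subnormal floats on CPU.

```python
import time
import torch
from torchvision.models import vgg16

def time_forward(model, x, reps=10):
    model.eval()
    with torch.no_grad():
        start = time.perf_counter()
        for _ in range(reps):
            model(x)
    return (time.perf_counter() - start) / reps

x = torch.randn(1, 3, 224, 224)
a = vgg16(weights=None)                # default (random) initialisation
b = vgg16(weights=None)
for p in b.parameters():
    torch.nn.init.constant_(p, 0.01)   # same shapes, different values

print(time_forward(a, x), time_forward(b, x))
```
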
1 vote, 0 answers
How can I use this Reformer to extract entities from a new sentence?
I have been looking at the NER example with Trax in this notebook. However, the notebook only gives an example for training the model. I can't find any examples of how to use this model to extract entities from a new string of text.
I've tried the…

Alan Buxton (121 rep)
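
A hedged sketch of the missing inference step, assuming `model`, `vocab` (token to id) and `tag_map` (id to tag name) are the objects built in that notebook; the essential point is to preprocess the new sentence exactly as the training data was.

```python
import numpy as np

sentence = "Peter Parker lives in New York"
tokens = sentence.split()
token_ids = np.array([[vocab.get(t, vocab["UNK"]) for t in tokens]])  # assumed 'UNK' key

log_probs = model(token_ids)                 # shape (1, seq_len, n_tags)
pred_ids = np.argmax(log_probs[0], axis=-1)  # best tag id per token
print(list(zip(tokens, (tag_map[i] for i in pred_ids))))
```
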
1 vote, 1 answer
In RL as probabilistic inference, why do we take a probability to be $\exp(r(s_t, a_t))$?
In section 2 of the paper Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review, the author discusses formulating the RL problem as a probabilistic graphical model. They introduce a binary optimality variable…

David (4,591 rep)
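
For context: in that tutorial, a binary variable $\mathcal{O}_t$ means "the agent acted optimally at step $t$", with

$$p(\mathcal{O}_t = 1 \mid s_t, a_t) = \exp\big(r(s_t, a_t)\big),$$

which is a valid probability once rewards are shifted so that $r \le 0$. Conditioning a trajectory $\tau$ on optimality at every step then gives

$$p(\tau \mid \mathcal{O}_{1:T} = 1) \propto p(\tau)\, \exp\Big(\sum_{t=1}^{T} r(s_t, a_t)\Big),$$

so high-reward trajectories are exactly the high-posterior ones, and maximizing reward becomes inference in the graphical model.
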
1 vote, 0 answers
Algorithm which learns to select from proposed options
My goal is to write a program that automatically selects a routing from multiple proposed options.
The data consists of the proposed options, each with the attributes time, cost, and whether there is a transhipment, and also which of the…

Nui (11 rep)
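
One standard formulation for this kind of problem, as a hedged sketch (the three-feature layout of time, cost, and a transhipment flag is an assumption): score every proposed option with a shared network and train with cross-entropy over the candidate set, i.e. a listwise choice model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Shared scorer: one scalar score per option from its feature vector.
scorer = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1))

options = torch.tensor([[4.0, 120.0, 1.0],   # time, cost, transhipment?
                        [6.0,  80.0, 0.0],
                        [5.0, 100.0, 1.0]])
chosen = torch.tensor([1])                   # index the human actually picked

scores = scorer(options).squeeze(-1)         # (num_options,)
loss = F.cross_entropy(scores.unsqueeze(0), chosen)
loss.backward()                              # train; at inference: scores.argmax()
```
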