Questions tagged [natural-language-processing]

For questions related to natural language processing (NLP), which is concerned with the interactions between computers and human (or natural) languages, in particular how to create programs that process and analyze large amounts of natural language data.

See: Natural language processing (NLP) at Wikipedia.

729 questions
97 votes · 5 answers

How can neural networks deal with varying input sizes?

As far as I can tell, neural networks have a fixed number of neurons in the input layer. If neural networks are used in a context like NLP, sentences or blocks of text of varying sizes are fed to a network. How is the varying input size reconciled…
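The usual reconciliation is padding plus masking: shorter sequences are padded to a common length, and a boolean mask tells later layers which positions are real. A minimal sketch of that idea (all names here are illustrative, not from any particular framework):

```python
# Minimal sketch: pad variable-length token-ID sequences to one fixed size,
# returning a mask so padded positions can be ignored downstream.
import numpy as np

def pad_batch(sequences, pad_id=0):
    max_len = max(len(s) for s in sequences)
    batch = np.full((len(sequences), max_len), pad_id, dtype=np.int64)
    mask = np.zeros((len(sequences), max_len), dtype=bool)
    for i, seq in enumerate(sequences):
        batch[i, :len(seq)] = seq
        mask[i, :len(seq)] = True
    return batch, mask

# Two "sentences" of different lengths become one fixed-shape array.
ids, mask = pad_batch([[5, 7, 2], [9, 3, 8, 1, 4]])
print(ids)   # shape (2, 5); the first row is padded with 0s
print(mask)  # True for real tokens, False for padding
```

Recurrent and attention-based models sidestep the issue differently, by sharing the same weights across positions, so the parameters themselves never depend on the sequence length.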
65 votes · 4 answers

Why does the transformer do better than RNN and LSTM in long-range context dependencies?

I am reading the article How Transformers Work, where the author writes: Another problem with RNNs, and LSTMs, is that it’s hard to parallelize the work for processing sentences, since you have to process word by word. Not only that but there is no…
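The contrast the article draws can be seen in a few lines of toy code: an RNN must scan positions one at a time, because each hidden state depends on the previous one, while self-attention relates all positions in a single matrix product. A numpy sketch (shapes and weights are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))   # 6 tokens, 4 features each
Wx = rng.normal(size=(4, 4))
Wh = rng.normal(size=(4, 4))

# RNN: an inherently sequential scan; step t needs h from step t-1.
h = np.zeros(4)
states = []
for x_t in X:
    h = np.tanh(x_t @ Wx + h @ Wh)
    states.append(h)

# Self-attention: all pairwise interactions in one batched matrix product.
scores = X @ X.T
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
attended = weights @ X                           # (6, 4), no time loop
```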
48 votes · 2 answers

How does ChatGPT retain the context of previous questions?

One of the innovations with OpenAI's ChatGPT is how natural it is for users to interact with it. What is the technical enabler for ChatGPT to maintain the context of previous questions in its answers? For example, ChatGPT understands a prompt of…
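The answer most commonly given is that the model itself is stateless: the application resends the conversation so far with every request, and the model conditions on that transcript up to its context-window limit. An illustrative sketch of such a loop (not OpenAI's actual code; model_fn is a hypothetical stand-in for any text-completion model):

```python
# Illustrative sketch: conversational "memory" via resending the transcript.
history = []

def ask(model_fn, user_message):
    history.append(("user", user_message))
    # The model sees every previous turn concatenated into one prompt.
    prompt = "\n".join(f"{role}: {text}" for role, text in history)
    reply = model_fn(prompt)
    history.append(("assistant", reply))
    return reply
```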
34 votes · 6 answers

How does an AI like ChatGPT answer a question in a subject which it may not know?

After seeing StackOverflow's banning of ChatGPT, I explored it out of curiosity. It's marvellous, as it can write code by itself! Later, to check whether it also knows chess, like Google DeepMind's AlphaZero AI, I asked the questions below: Me: Does openai…
31 votes · 3 answers

Can BERT be used for sentence generating tasks?

I am a new learner in NLP. I am interested in the sentence-generating task. As far as I know, one state-of-the-art method is CharRNN, which uses an RNN to generate a sequence of words. However, BERT came out several weeks ago and is…
31 votes · 2 answers

How can Transformers handle arbitrary length input?

The transformer, introduced in the paper Attention Is All You Need, is a popular new neural network architecture that is commonly viewed as an alternative to recurrent neural networks, like LSTMs and GRUs. However, having gone through the paper, as…
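Part of the usual answer is that the transformer's weight matrices apply per position, so nothing in the learned parameters is tied to a fixed length, and the sinusoidal positional encodings from the paper can be computed on the fly for any position. A small sketch of those encodings (assumes an even d_model):

```python
# Sketch: sinusoidal positional encodings, PE(pos, 2i) = sin(pos / 10000^(2i/d)),
# PE(pos, 2i+1) = cos(...), generated for an arbitrary sequence length.
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

print(positional_encoding(7, 8).shape)    # works for length 7...
print(positional_encoding(123, 8).shape)  # ...or any other length
```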
30 votes · 9 answers

What is the actual quality of machine translations?

As an AI layman, I am to this day confused by the gap between the promised and the achieved improvements in automated translation. My impression is: there is still a very, very long way to go. Or are there other explanations for why the automated translations (offered and…
27 votes · 4 answers

Why is ChatGPT bad at math?

As opposed to How does ChatGPT know math?, I've been seeing some things floating around the Twitterverse about how ChatGPT can actually be very bad at math. For instance, I asked it "If it takes 5 machines 5 minutes to make 5 devices, how long would…
26 votes · 1 answer

How is BERT different from the original transformer architecture?

As far as I can tell, BERT is a type of Transformer architecture. What I do not understand is: how is BERT different from the original transformer architecture? What tasks are better suited for BERT, and what tasks are better suited for the…
22 votes · 4 answers

Why does ChatGPT fail in playing "20 questions"?

IBM Watson's success in playing "Jeopardy!" was a landmark in the history of artificial intelligence. In the seemingly simpler game of "Twenty questions" where player B has to guess a word that player A thinks of by asking questions to be answered…
19 votes · 2 answers

What are the main differences between skip-gram and continuous bag of words?

The skip-gram and continuous bag of words (CBOW) are two different types of word2vec models. What are the main differences between them? What are the pros and cons of both methods?
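The core difference is the direction of prediction: skip-gram predicts each context word from the center word, while CBOW predicts the center word from its (averaged) context. A toy sketch of the training pairs each model sees, on a made-up sentence:

```python
# Minimal sketch of the two word2vec training setups.
tokens = ["the", "quick", "brown", "fox", "jumps"]
window = 2

skipgram_pairs = []   # (input=center word, target=one context word)
cbow_pairs = []       # (input=whole context, target=center word)
for i, center in enumerate(tokens):
    context = [tokens[j]
               for j in range(max(0, i - window),
                              min(len(tokens), i + window + 1))
               if j != i]
    skipgram_pairs.extend((center, c) for c in context)
    cbow_pairs.append((context, center))

print(skipgram_pairs[:4])  # [('the', 'quick'), ('the', 'brown'), ...]
print(cbow_pairs[2])       # (['the', 'quick', 'fox', 'jumps'], 'brown')
```

A commonly cited trade-off: CBOW trains faster because it averages the context into one example per center word, while skip-gram generates more examples per position and tends to do better on rare words.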
18 votes · 2 answers

What research has been done in the domain of "identifying sarcasm in text"?

Identifying sarcasm is considered one of the most difficult open problems in ML and NLP/NLU. So, has any considerable research been done on that front? If yes, what is the accuracy like? Please also explain the NLP model…
17 votes · 1 answer

What is the intuition behind the dot product attention?

I am watching the video Attention Is All You Need by Yannic Kilcher. My question is: what is the intuition behind the dot product attention? $$A(q, K, V) = \sum_i \frac{e^{q \cdot k_i}}{\sum_j e^{q \cdot k_j}} v_i$$ becomes: $$A(Q, K, V) = \text{softmax}(QK^T)V$$
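Numerically, the second form is just the first computed for every query at once: each output row is an average of the value vectors, weighted by how strongly the query's dot product matches each key. A minimal numpy sketch (note the paper additionally scales the scores by $1/\sqrt{d_k}$, which the quoted form omits):

```python
# Minimal sketch of A(Q, K, V) = softmax(QK^T)V.
import numpy as np

def dot_product_attention(Q, K, V):
    scores = Q @ K.T                               # (n_q, n_k) dot products
    scores -= scores.max(axis=-1, keepdims=True)   # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ V                             # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 queries
K = rng.normal(size=(5, 4))   # 5 keys
V = rng.normal(size=(5, 4))   # 5 values
print(dot_product_attention(Q, K, V).shape)  # (3, 4)
```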
17 votes · 3 answers

How would an AI learn language?

I was thinking about AIs and how they would work when I realised that I couldn't think of a way that an AI could be taught language. A child tends to learn language by associating words and pictures with an object (e.g., people saying the…
16 votes · 3 answers

What roles do knowledge bases play now, and what roles will they play in the future?

Nowadays, artificial intelligence seems almost synonymous with machine learning, especially deep learning. Some have said that deep learning will replace human experts, traditionally very important for feature engineering in this field. It is said that…