Questions tagged [math]

For questions about mathematics related to artificial intelligence.

Mathematics is central to artificial intelligence. Feasibility studies, capacity planning, the evaluation of algorithms, the assessment of research directions, and the evaluation of how effective methods are in real-world use all rely on rigorous mathematical treatment.

246 questions
35
votes
6 answers

Is it possible to train a neural network to solve math equations?

I'm aware that neural networks are probably not designed to do that. However, asking hypothetically, is it possible to train a deep neural network (or similar) to solve math equations? So given the 3 inputs: 1st number, operator sign represented by…
kenorb
  • 10,423
  • 3
  • 43
  • 91
29
votes
4 answers

Can neural networks be used to prove conjectures?

Imagine I have a list (in a computer-readable form) of all problems (or statements) and proofs that math relies on. Could I train a neural network in such a way that, for example, I enter a problem and it generates a proof for it? Of course, those…
27
votes
4 answers

Why is ChatGPT bad at math?

As opposed to How does ChatGPT know math?, I've been seeing some things floating around the Twitterverse about how ChatGPT can actually be very bad at math. For instance, I asked it "If it takes 5 machines 5 minutes to make 5 devices, how long would…
Mithical
  • 2,885
  • 5
  • 27
  • 39
27
votes
1 answer

What is the Bellman operator in reinforcement learning?

In mathematics, the word operator can refer to several distinct but related concepts. An operator can be defined as a function between two vector spaces, it can be defined as a function where the domain and the codomain are the same, or it can be…
23
votes
4 answers

How does ChatGPT know math?

ChatGPT is a language model. As far as I know, it receives text as tokens and word embeddings. So, how can it do math? For example, I asked: ME: Which one is bigger, 5 or 9? ChatGPT: In this case, 9 is larger than 5. One can say,…
15
votes
4 answers

Why do activation functions need to be differentiable in the context of neural networks?

Why should an activation function of a neural network be differentiable? Is it strictly necessary or is it just advantageous?
user3642
14
votes
2 answers

Is the mean-squared error always convex in the context of neural networks?

Multiple resources I referred to mention that MSE is great because it's convex. But I don't get how, especially in the context of neural networks. Let's say we have the following: $X$: training dataset $Y$: targets $\Theta$: the set of parameters…
14
votes
3 answers

What sort of mathematical problems are there in AI that people are working on?

I recently got an 18-month postdoc position in a math department. It's a position with relatively light teaching duties and a lot of freedom about what type of research I want to do. Previously I was mostly doing some research in probability and…
faceclean
  • 261
  • 2
  • 8
12
votes
3 answers

Why is the derivative of the activation functions in neural networks important?

I'm new to neural networks and am trying to understand some of their foundations. One question that I have is: why is the derivative of an activation function important (not the function itself), and why is it the derivative that is tied to how the network performs…
11
votes
2 answers

How do we prove the n-step return error reduction property?

In section 7.1 (about n-step bootstrapping) of the book Reinforcement Learning: An Introduction (2nd edition), by Richard S. Sutton and Andrew G. Barto, the authors write about what they call the "n-step return error reduction property": But they…
11
votes
3 answers

What are the mathematical prerequisites to be able to study artificial general intelligence?

What are the mathematical prerequisites to be able to study artificial general intelligence (AGI) or strong AI?
Mark ellon
  • 489
  • 1
  • 5
  • 6
11
votes
1 answer

How does the forget layer of an LSTM work?

Can someone explain the mathematical intuition behind the forget layer of an LSTM? As far as I understand it, the cell state is essentially a long-term memory embedding (correct me if I'm wrong), but I'm also assuming it's a matrix. Then the…
user8714896
  • 717
  • 1
  • 4
  • 21
10
votes
3 answers

Can we get the inverse of the function that a neural network represents?

I was wondering if it's possible to get the inverse of a neural network. If we view an NN as a function, can we obtain its inverse? I tried to build a simple MNIST architecture, with an input of shape (784,) and an output of shape (10,), train it to reach good…
Maverick Meerkat
  • 412
  • 3
  • 11
8
votes
2 answers

Why does the "reward to go" trick in policy gradient methods work?

In the policy gradient method, there's a trick to reduce the variance of the policy gradient. We use causality, and remove part of the sum over rewards so that only the rewards obtained after an action are taken into account for that action (See here…
8
votes
3 answers

How can I start learning mathematics for machine learning?

I am an Android programmer. Now, I would like to learn machine learning. I know it requires a mathematical background, like statistics, probability, calculus and linear algebra. However, I am a bit lost. Where should I start from? Can someone…
Anko6
  • 93
  • 5