
(The math problem here just serves as an example, my question is on this type of problems in general).

Given two Schur polynomials, $s_\mu$, $s_\nu$, we know that we can decompose their product into a linear combination of other Schur polynomials.

$$s_\mu s_\nu = \sum_\lambda c_{\mu,\nu}^\lambda s_\lambda$$
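For concreteness, the simplest non-trivial instance (a special case of the Pieri rule):

```latex
s_{(1)} \, s_{(1)} = s_{(2)} + s_{(1,1)},
\qquad\text{so}\qquad
c_{(1),(1)}^{(2)} = c_{(1),(1)}^{(1,1)} = 1 .
```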

and we call $c_{\mu,\nu}^\lambda$ a Littlewood–Richardson (LR) coefficient (always a non-negative integer).

Hence, a natural supervised learning problem is: given the tuple $(\mu, \nu, \lambda)$, predict whether the LR coefficient takes a certain value. This is not difficult.
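A minimal sketch of how such a classification dataset could be set up (my own illustration: the padding scheme, `max_parts`, and the example label are assumptions; the actual labels would have to come from an external implementation of the Littlewood–Richardson rule, which is not shown here):

```python
# Encode a tuple of partitions as a fixed-length feature vector for a
# binary classifier ("is c_{mu,nu}^lambda zero or not?").

def encode(mu, nu, lam, max_parts=6):
    """Concatenate the three partitions, each zero-padded to max_parts."""
    def pad(p):
        return list(p) + [0] * (max_parts - len(p))
    return pad(mu) + pad(nu) + pad(lam)

# One hypothetical training example: features for (mu, nu, lambda).
x = encode((2, 1), (1,), (2, 1, 1))
y = 1  # hypothetical label: 1 means the LR coefficient is non-zero

print(x)  # a fixed-length vector any off-the-shelf classifier can consume
```

Any standard classifier can then be fit on such vectors; the modelling question in this post is what, if anything, its predictions tell us beyond the predictions themselves.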

My question is: in this situation, can we use ML/RL to do anything other than predict, or extract anything from the prediction results? In other words, does a statement like "I am 98% confident that this LR coefficient is 0" imply anything mathematically interesting?

nbro
SmoothKen
  • I've edited this post in order to try to clarify what you're asking. I'm still not sure what you were asking and the given answer below just confirms that your question was a bit open to interpretation, which is bad. Please, now that you probably have a clearer idea of what you had in mind, review this post (including the title that I've added) and try to clarify what you were really asking. What does it mean "do anything else other than predicting"? Are you asking whether an ML model can do something else other than what it was trained to do? – nbro Jan 17 '21 at 16:55

1 Answer


There are quite a few papers that try to 'teach' neural networks to 'learn' how to solve math problems. Most of the time, sadly, it comes down to training on a large dataset, after which the network can 'solve' basic problems of the sort it was trained on, but is unable to generalize to larger ones. That is, if you train a neural network to do addition, it will be inherently constrained by the dataset. It might semi-sufficiently handle addition with 3 or even 4 digits, depending on how big your dataset is, but throw in an addition question with two 10-digit numbers and it will almost always fail.
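A toy illustration of this length-generalization gap (my own sketch, not from any particular paper): the data a model is trained on at one length simply never contains the longer carry chains that 10-digit sums produce.

```python
import random

def make_example(n_digits):
    """One addition problem as (input string, target string), the way
    sequence-to-sequence models typically consume them."""
    a = random.randint(0, 10**n_digits - 1)
    b = random.randint(0, 10**n_digits - 1)
    return f"{a}+{b}", str(a + b)

random.seed(0)
train = [make_example(3) for _ in range(5)]   # what the network sees
test = [make_example(10) for _ in range(5)]   # what it is asked at test time

for src, tgt in train + test:
    print(src, "->", tgt)
```

A model fit on `train` has only ever emitted outputs of at most 4 digits, so there is nothing in its experience that constrains its behaviour on the `test` distribution.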

The latest example of this that I can remember is the large language model GPT-3, which was not made to solve equations per se, but does 'a decent job' on the kinds of problems that appeared in its training data. Facebook AI also built an 'advanced math solver' with a specialized architecture that I have not looked into, which might disprove my point; you can look into that.

In the end, this comes down to 'what is learning' and 'what do you want to accomplish'. Most agree that these networks are not able to generalize beyond their datasets. Some might say that not being able to generalize does not mean that a model is not learning; it might just be learning slower. I believe that these models are inherently limited to what is presented in the dataset. Given a good dataset, a model might generalize to cases 'near and in-between', but I have yet to see a case where this sort of thing generalizes to cases 'far outside' the dataset.

Robin van Hoorn
  • Are all known types of neural networks equal in this respect? In other words, regardless of their validation accuracy, are there some types of neural networks that get closer to conclusive results than others? – SmoothKen Dec 18 '20 at 22:25
  • So specific architectures will always outperform more 'general' architectures on specific problems, right. This is also what FacebookAI did with their advanced math solver. Though this will most likely not be the best approach for the specific problem once we get more data and more computational power. For a nice thought experiment, make sure to look up 'The Bitter Lesson' by Rich Sutton who also talks about this. – Robin van Hoorn Dec 21 '20 at 09:59
  • The crux of it is: yes, if we put 'our own knowledge' into structures/processes/techniques, they will perform better right now, but in 10 years someone will come along and demolish these results with more computational power in a generic structure. – Robin van Hoorn Dec 21 '20 at 09:59
  • So, not all known types of neural networks perform equally in this respect, but in my opinion this is not the way you should want to look at it. There is no 'proven' benefit of specific structures for this kind of learning, the way CNNs have one over plain feedforward networks for images. – Robin van Hoorn Dec 21 '20 at 10:01
  • How then did they manage to teach it to program? – Anixx Aug 11 '23 at 10:13
  • Sorry, but I do not understand your comment. Could you please elaborate on what exactly you are asking, and how it relates to this particular question? – Robin van Hoorn Aug 11 '23 at 10:19