4

SGD is able to jump out of local minima that would otherwise trap BGD

I don't really understand the above statement. Could someone please provide a mathematical explanation for why SGD (Stochastic Gradient Descent) is able to escape local minima, while BGD (Batch Gradient Descent) can't?

P.S.

While searching online, I read that it has something to do with "oscillations" while taking steps towards the global minima. What's that?

nbro
  • 39,006
  • 12
  • 98
  • 176
stoic-santiago
  • 1,121
  • 5
  • 18
  • 1
    This is a popular question. You can find a good explanation in the answer [here](https://stats.stackexchange.com/questions/49528/batch-gradient-descent-versus-stochastic-gradient-descent). – Kaivalya Swami May 31 '20 at 10:18
  • Where did you read that SGD is able to escape local minima? – nbro Jun 02 '20 at 10:30
  • 1
    The statement above is from a MOOC on Coursera, while I found more details here - http://proceedings.mlr.press/v80/kleinberg18a/kleinberg18a.pdf – stoic-santiago Jun 02 '20 at 11:08
  • 1
    @cogito_ai I wouldn't say that SGD can't really escape local minima, but that it can get stuck in other (better?) minima. It's not like _simulated annealing_, where sometimes you perform a random action to escape local minima. However, it's true that there's some form of stochasticity in SGD that probably helps to explore more the search space. And that's what people probably mean by that sentence. See also https://ai.stackexchange.com/a/20380/2444. – nbro Jun 02 '20 at 15:38
  • 1
    Anyway, can you please provide the link to the exact Coursera's lecture that claims that? Also, if you want, later, I could read that paper and try to provide an answer based on it. And don't forget to tag me with `@nbro` next time, so that I see the comment ;) – nbro Jun 02 '20 at 15:41
  • @nbro One of the quizzes in here claim my question's opening statement. https://www.coursera.org/learn/getting-started-with-tensor-flow2 Also, it'd be great if you could go through that paper when you have the time and write an answer based upon it! :) – stoic-santiago Jun 03 '20 at 10:54

0 Answers0