When dealing with massive amounts of data, it is often inefficient to try to compute a model update from the entire set. Instead, you want to take a subset of that data to work with at each step.
Stochastic Gradient Descent takes a single (yes, you read that right – a single) example per step and computes the gradient from that one point. Any one of these gradients is noisy, but given enough (single, stochastic) examples, the updates average out to a useful estimate of the true gradient.
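As a minimal sketch of the idea, here is SGD fitting a simple line with NumPy. The synthetic data, learning rate, and epoch count are all illustrative choices, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3x + 2 plus a little noise (illustrative values).
X = rng.uniform(-1, 1, size=200)
y = 3.0 * X + 2.0 + rng.normal(scale=0.1, size=200)

w, b = 0.0, 0.0  # parameters of the model y_hat = w*x + b
lr = 0.1         # learning rate

# Stochastic gradient descent: one example per parameter update.
for epoch in range(20):
    for i in rng.permutation(len(X)):
        err = (w * X[i] + b) - y[i]  # prediction error on ONE example
        w -= lr * err * X[i]         # gradient of 0.5*err**2 w.r.t. w
        b -= lr * err                # gradient w.r.t. b

print(w, b)  # should end up close to the true 3 and 2
```

Each update is cheap but noisy; it is the sheer number of updates that pulls the parameters toward the right values.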
Mini-batch stochastic gradient descent, instead of going down to a single example, draws a small batch (typically 10-1,000 examples) and computes the gradient averaged over that batch, reducing the noise of each update while keeping the cost far below a full pass over the data.
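The mini-batch variant changes only the inner loop: instead of one example per update, the gradient is averaged over a small slice. A sketch under the same illustrative setup as before (synthetic data, learning rate, and batch size are assumptions, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Same illustrative synthetic data: y = 3x + 2 plus noise.
X = rng.uniform(-1, 1, size=200)
y = 3.0 * X + 2.0 + rng.normal(scale=0.1, size=200)

w, b = 0.0, 0.0
lr = 0.1
batch_size = 20  # a small batch, from the 10-1,000 range above

for epoch in range(50):
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        err = (w * X[idx] + b) - y[idx]  # errors for the whole batch
        w -= lr * np.mean(err * X[idx])  # gradient averaged over the batch
        b -= lr * np.mean(err)
```

Averaging over the batch also vectorizes naturally, which is why mini-batches tend to be the practical default on modern hardware.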