When dealing with massive amounts of data, it is often inefficient to try to compute a model update from the entire set. Instead, you want to take a subset of that data to work with at each step.
Stochastic Gradient Descent takes a single (yes, you read that right – a single) example per step and computes the gradient from that one point. Any one of these gradients is noisy, but given enough (single, stochastic) examples, the updates average out to a useful estimate of the true gradient.
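As a minimal sketch of the idea, here is SGD fitting a simple line with NumPy. The synthetic data, learning rate, and epoch count are all illustrative choices, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3x + 2 plus a little noise (illustrative values).
X = rng.uniform(-1, 1, size=200)
y = 3.0 * X + 2.0 + rng.normal(scale=0.1, size=200)

w, b = 0.0, 0.0  # parameters of the model y_hat = w*x + b
lr = 0.1         # learning rate

# Stochastic gradient descent: one example per parameter update.
for epoch in range(20):
    for i in rng.permutation(len(X)):
        err = (w * X[i] + b) - y[i]  # prediction error on ONE example
        w -= lr * err * X[i]         # gradient of 0.5*err**2 w.r.t. w
        b -= lr * err                # gradient w.r.t. b

print(w, b)  # should end up close to the true 3 and 2
```

Each update is cheap but noisy; it is the sheer number of updates that pulls the parameters toward the right values.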
Mini-batch stochastic gradient descent, instead of going down to a single example, draws a small batch (typically 10-1,000 examples) and computes the gradient averaged over that batch, reducing the noise of each update while keeping the cost far below a full pass over the data.
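The mini-batch variant changes only the inner loop: instead of one example per update, the gradient is averaged over a small slice. A sketch under the same illustrative setup as before (synthetic data, learning rate, and batch size are assumptions, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Same illustrative synthetic data: y = 3x + 2 plus noise.
X = rng.uniform(-1, 1, size=200)
y = 3.0 * X + 2.0 + rng.normal(scale=0.1, size=200)

w, b = 0.0, 0.0
lr = 0.1
batch_size = 20  # a small batch, from the 10-1,000 range above

for epoch in range(50):
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        err = (w * X[idx] + b) - y[idx]  # errors for the whole batch
        w -= lr * np.mean(err * X[idx])  # gradient averaged over the batch
        b -= lr * np.mean(err)
```

Averaging over the batch also vectorizes naturally, which is why mini-batches tend to be the practical default on modern hardware.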