Stochastic gradient descent

Stochastic gradient descent is a general optimization algorithm, but is typically used to fit the parameters of a machine learning model.


In standard (or "batch") gradient descent, the true gradient is used to update the parameters of the model. The true gradient is usually the sum of the gradients contributed by the individual training examples. The parameter vector is adjusted by the negative of the true gradient multiplied by a step size. Batch gradient descent therefore requires one full sweep through the training set before any parameter can be changed.
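
As a minimal sketch of the batch update described above, the following fits a line to toy data. The data set, the squared-error cost, the step size, and the names `example_gradient` and `batch_gradient_descent` are all assumptions made for illustration; none of them come from the article.

```python
# Hypothetical toy data: points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def example_gradient(w, b, x, y):
    """Gradient of 0.5 * (w*x + b - y)**2 with respect to (w, b)."""
    err = w * x + b - y
    return err * x, err


def batch_gradient_descent(xs, ys, step_size=0.02, sweeps=2000):
    w, b = 0.0, 0.0
    for _ in range(sweeps):
        # One full sweep: sum the per-example gradients to get the true gradient.
        gw, gb = 0.0, 0.0
        for x, y in zip(xs, ys):
            dw, db = example_gradient(w, b, x, y)
            gw += dw
            gb += db
        # The parameters change only once per sweep.
        w -= step_size * gw
        b -= step_size * gb
    return w, b


batch_w, batch_b = batch_gradient_descent(xs, ys)
print(batch_w, batch_b)  # close to 2 and 1 for this toy data
```

The step size of 0.02 was chosen by hand to be stable for this particular data set; other data would generally need a different value.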


In stochastic (or "on-line") gradient descent, the true gradient is approximated by the gradient of the cost function evaluated on only a single training example. The parameters are then adjusted by an amount proportional to the negative of this approximate gradient, so the model is updated after each training example. For large data sets, on-line gradient descent can be much faster than batch gradient descent.
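
The per-example update can be sketched as follows. The toy data, the squared-error cost, the shuffling of examples on each pass, the constant step size, and the function name `stochastic_gradient_descent` are illustrative assumptions rather than part of the article.

```python
import random

# Hypothetical toy data: points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def stochastic_gradient_descent(xs, ys, step_size=0.05, epochs=1000, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a random order
        for i in order:
            # Gradient of 0.5 * (w*x + b - y)**2 on this single example.
            err = w * xs[i] + b - ys[i]
            # Update immediately, without waiting for the rest of the data.
            w -= step_size * err * xs[i]
            b -= step_size * err
    return w, b


sgd_w, sgd_b = stochastic_gradient_descent(xs, ys)
print(sgd_w, sgd_b)  # close to 2 and 1 for this toy data
```

A constant step size reaches the exact solution here only because the toy data contain no noise; with noisy data the estimates would keep fluctuating unless the step size is decreased over time.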


A compromise between the two forms, often called "mini-batch" gradient descent, approximates the true gradient by a sum over a small number of training examples.
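
A mini-batch variant might look like the sketch below, which sums the gradient over a few examples before each update. The batch size of 2, the toy data, the step size, and the name `minibatch_gradient_descent` are assumptions chosen for illustration.

```python
import random

# Hypothetical toy data: points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def minibatch_gradient_descent(xs, ys, batch_size=2, step_size=0.05,
                               epochs=1000, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            # Approximate the true gradient by a sum over the mini-batch.
            gw, gb = 0.0, 0.0
            for i in batch:
                err = w * xs[i] + b - ys[i]
                gw += err * xs[i]
                gb += err
            w -= step_size * gw
            b -= step_size * gb
    return w, b


mb_w, mb_b = minibatch_gradient_descent(xs, ys)
print(mb_w, mb_b)  # close to 2 and 1 for this toy data
```

Setting `batch_size` to 1 gives the on-line update, and setting it to the size of the training set gives the batch update, so the parameter spans the range between the two forms.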


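Stochastic gradient descent is a form of stochastic approximation, and the theory of stochastic approximation gives conditions under which it converges. A step size that scales as 1/T (where T is the number of gradient steps taken so far) satisfies the classical step-size conditions: the step sizes sum to infinity while their squares sum to a finite value. With such a schedule, and under suitable regularity assumptions on the cost function (convexity, for example), stochastic gradient descent converges.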
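
As a small illustration of a 1/T step size, consider estimating a single parameter theta by minimizing the cost 0.5 * (theta - x)**2 over a stream of samples x. With step size 1/T, each stochastic gradient step reduces to the running-average update, so theta after T steps equals the mean of the first T samples. The cost function, the sample values, and the name `sgd_mean` are assumptions made for this sketch.

```python
def sgd_mean(samples):
    theta = 0.0
    for t, x in enumerate(samples, start=1):
        # Gradient of 0.5 * (theta - x)**2 on the single sample x.
        gradient = theta - x
        # The step size 1/t shrinks as more gradient steps are taken.
        theta -= (1.0 / t) * gradient
    return theta


# Hypothetical sample values.
samples = [2.0, 4.0, 9.0, 1.0, 4.0]
estimate = sgd_mean(samples)
print(estimate)                     # 4.0
print(sum(samples) / len(samples))  # 4.0, the sample mean
```

Because the first step uses a step size of 1, the initial value of theta has no effect on the result in this example.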


The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.