Gradient Descent – Looking for the global minimum

The idea behind gradient descent is to find the minimum of the cost function, ideally the global minimum, although in practice it can settle in a local minimum instead. Starting from an initial set of parameter values, we repeatedly subtract the gradient (scaled by a learning rate) from the current values, stepping downhill each time. This process is called minimizing the cost function.
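As a concrete illustration, here is a minimal sketch of gradient descent on a one-parameter cost function; the cost J(theta) = (theta - 3)^2, the learning rate of 0.1, and the step count are illustrative choices, not values from the text above.

```python
# Minimal gradient descent sketch on J(theta) = (theta - 3)^2.
# The cost function, learning rate, and iteration count are assumed for illustration.

def cost(theta):
    return (theta - 3.0) ** 2

def gradient(theta):
    # Derivative of the cost with respect to theta: dJ/dtheta = 2 * (theta - 3)
    return 2.0 * (theta - 3.0)

theta = 0.0          # initial position
learning_rate = 0.1  # step size

for step in range(50):
    # Subtract the gradient, scaled by the learning rate, to step downhill
    theta -= learning_rate * gradient(theta)

print(theta)        # approaches 3.0, the minimum of the cost
print(cost(theta))  # approaches 0.0
```

Each step moves theta a little further toward the bottom of the curve, because the gradient always points uphill and we move against it.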

The parameter update relies on a procedure called Back Propagation, which computes the gradients that gradient descent needs.

Concretely, we take the cost function, compute its partial derivatives with respect to each parameter via back propagation, and then subtract the resulting gradient (scaled by the learning rate) from each parameter.
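To make the update rule concrete, the sketch below fits a one-input linear model y = w*x + b to a single training example with a squared-error cost; the example data, learning rate, and variable names are assumptions made for illustration, not values from the text.

```python
# Illustrative parameter update for a linear model y = w*x + b
# with squared-error cost J = (y_pred - y_true)^2.
# The training example (x, y_true) and the learning rate are assumed values.

x, y_true = 2.0, 7.0
w, b = 0.5, 0.0      # initial parameters
learning_rate = 0.05

for step in range(200):
    y_pred = w * x + b
    error = y_pred - y_true

    # Partial derivatives of the cost with respect to each parameter:
    # dJ/dw = 2 * error * x,  dJ/db = 2 * error
    grad_w = 2.0 * error * x
    grad_b = 2.0 * error

    # Subtract each gradient, scaled by the learning rate
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)  # converges to values where w*2 + b is approximately 7
```

The same pattern applies to a network with millions of parameters: every parameter gets its own partial derivative, and every parameter is nudged against it.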

In other words, back propagation carries the error signal backwards through the network, applying the chain rule layer by layer, so that each parameter receives the gradient it needs to be updated in the direction that reduces the cost.
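As a sketch of how the chain rule carries the gradient backwards, the example below pushes one input through a tiny network with a single sigmoid hidden unit and then backpropagates the error to every weight; the architecture, initial weights, input, and target are all assumptions made for illustration.

```python
import math

# Tiny network: input -> sigmoid hidden unit -> linear output.
# All weights, the input, and the target are illustrative assumptions.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 1.5, 0.0
w1, b1 = 0.8, 0.1    # hidden-layer parameters
w2, b2 = -0.4, 0.2   # output-layer parameters
learning_rate = 0.1

for step in range(100):
    # Forward pass: compute the cost
    z1 = w1 * x + b1
    h = sigmoid(z1)
    y = w2 * h + b2
    cost = (y - target) ** 2

    # Backward pass: chain rule from the cost back to each parameter
    d_cost_d_y = 2.0 * (y - target)
    d_y_d_w2, d_y_d_b2, d_y_d_h = h, 1.0, w2
    d_h_d_z1 = h * (1.0 - h)              # derivative of the sigmoid
    d_z1_d_w1, d_z1_d_b1 = x, 1.0

    grad_w2 = d_cost_d_y * d_y_d_w2
    grad_b2 = d_cost_d_y * d_y_d_b2
    grad_w1 = d_cost_d_y * d_y_d_h * d_h_d_z1 * d_z1_d_w1
    grad_b1 = d_cost_d_y * d_y_d_h * d_h_d_z1 * d_z1_d_b1

    # Gradient descent update: subtract each gradient, scaled by the learning rate
    w1 -= learning_rate * grad_w1
    b1 -= learning_rate * grad_b1
    w2 -= learning_rate * grad_w2
    b2 -= learning_rate * grad_b2

print(cost)  # the cost decreases toward 0 as the parameters are updated
```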