# Gradient

A **gradient** is a vector that stores the partial derivatives of a multivariable function, often denoted by $\nabla$. It helps us calculate the **slope** at a specific point for functions with multiple independent variables.

### Find a Gradient


Consider a function with two variables (x and y): $f(x,y) = x^2 + y^3$

1) Find the **partial derivative** with respect to **x**, treating $y$ as a constant (like a fixed number, say 12):

$$\frac{\partial f}{\partial x} = 2x$$

2) Find the **partial derivative** with respect to **y**, treating $x$ as a constant:

$$\frac{\partial f}{\partial y} = 3y^2$$

3) Store the partial derivatives in a **gradient**:

$$\nabla f(x,y) = \begin{bmatrix} \dfrac{\partial f}{\partial x} \\[6pt] \dfrac{\partial f}{\partial y} \end{bmatrix} = \begin{bmatrix} 2x \\ 3y^2 \end{bmatrix}$$
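As a quick check, here is a minimal sketch (not from the original) that uses SymPy to compute the same partial derivatives symbolically:

```python
# Minimal sketch: symbolically verify the gradient of f(x, y) = x^2 + y^3.
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**3

df_dx = sp.diff(f, x)  # partial derivative w.r.t. x -> 2*x
df_dy = sp.diff(f, y)  # partial derivative w.r.t. y -> 3*y**2

gradient = sp.Matrix([df_dx, df_dy])  # the gradient stores both partials
print(gradient)  # Matrix([[2*x], [3*y**2]])
```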

### Properties of Gradients

There are two additional properties of gradients that are especially useful in deep learning. A gradient:

* Always points in the direction of greatest increase of a function
* Is zero at a local maximum or local minimum
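Both properties are why gradient descent works: to minimize a function we step *against* the gradient, and the updates settle where the gradient is zero. Below is a minimal sketch, assuming a simple bowl-shaped function $f(x,y) = x^2 + y^2$ (chosen instead of the running example, which has no minimum in $y$):

```python
# Minimal sketch (assumed example, not from the source):
# gradient descent on f(x, y) = x^2 + y^2.
def grad_f(x, y):
    # Gradient of f(x, y) = x^2 + y^2; it points toward greatest increase.
    return 2 * x, 2 * y

x, y = 5.0, 4.0
lr = 0.1  # step size
for _ in range(200):
    gx, gy = grad_f(x, y)
    # Step opposite the gradient to *decrease* f.
    x, y = x - lr * gx, y - lr * gy

print(x, y)  # both approach 0, the minimum, where the gradient is zero
```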

## Directional Derivative

The **directional derivative** $\nabla_{\vec v} f$ is the rate at which the function $f(x,y)$ changes at a point $(x_1,y_1)$ in the direction ${\vec v}$.

The **directional derivative** is computed by taking the **dot product** of the gradient of $f$ and a unit vector $\vec v$: $\nabla_{\vec v} f = \nabla f \cdot \vec v$.

Note: The directional derivative of a function is a **scalar**, while the gradient is a **vector**.

### Find Directional Derivative

Consider a function with two variables (x and y): $f(x,y) = x^2 + y^3$

As described above, we take the **dot product** of the gradient and the direction vector $\vec v = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$:

$$\nabla_{\vec v} f = \nabla f \cdot \vec v = \begin{bmatrix} 2x \\ 3y^2 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$

We can rewrite the dot product as:

$$\nabla_{\vec v} f = 2(2x) + 3(3y^2) = 4x + 9y^2$$

Hence, the directional derivative $\nabla_{\vec v} f$ at coordinates $(5, 4)$ is: $\nabla_{\vec v} f = 4x + 9y^2 = 4(5) + 9(4)^2 = 164$
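A minimal sketch checking this result numerically (the helper names are illustrative, not from the source):

```python
# Minimal sketch: directional derivative of f(x, y) = x^2 + y^3 at (5, 4)
# in the direction v = (2, 3), computed as a dot product with the gradient.
def grad_f(x, y):
    return 2 * x, 3 * y**2  # gradient of f

def directional_derivative(x, y, v):
    gx, gy = grad_f(x, y)
    return gx * v[0] + gy * v[1]  # dot product -> a scalar

print(directional_derivative(5, 4, (2, 3)))  # 10*2 + 48*3 = 164
```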

Link: http://wiki.fast.ai/index.php/Calculus_for_Deep_Learning
