Matrix Calculus

Vector Calculus

Gradients belong to vector calculus. Rather than leaving a function's partial derivatives as a loose, unorganized collection, we arrange them into a horizontal vector, known as the gradient.

The gradient of $f(x,y) = x^2 + y^3$ is:

$$\nabla f(x,y) = \begin{bmatrix}\frac{\partial f}{\partial x} & \frac{\partial f}{\partial y}\end{bmatrix} = \begin{bmatrix}2x & 3y^2\end{bmatrix}$$
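As a quick sanity check, the gradient of $f$ can be compared against a finite-difference approximation. The sketch below is illustrative only: the helper names (`numerical_gradient`, `analytic_gradient_f`), the step size `h`, and the evaluation point `(1.5, 2.0)` are assumptions chosen for this example, not part of the original text.

```python
def f(x, y):
    # The example function from the text: f(x, y) = x^2 + y^3
    return x**2 + y**3


def numerical_gradient(func, x, y, h=1e-6):
    """Approximate [d(func)/dx, d(func)/dy] using central differences."""
    d_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    d_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return [d_dx, d_dy]


def analytic_gradient_f(x, y):
    # Partial derivatives derived by hand: [2x, 3y^2]
    return [2 * x, 3 * y**2]


# Hypothetical evaluation point, chosen only for illustration
approx = numerical_gradient(f, 1.5, 2.0)
exact = analytic_gradient_f(1.5, 2.0)  # [3.0, 12.0]

print("finite differences:", approx)
print("analytic gradient: ", exact)
```

The two results should agree up to a small numerical error, which suggests the hand-derived partial derivatives are correct at that point.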

Matrix Calculus

When we consider the derivatives of a second function $g(x,y)$ alongside those of $f(x,y)$, we move from vector calculus to matrix calculus.

The gradient of $g(x,y) = 2x + y^8$ is:

$$\nabla g(x,y) = \begin{bmatrix}\frac{\partial g}{\partial x} & \frac{\partial g}{\partial y}\end{bmatrix} = \begin{bmatrix}2 & 8y^7\end{bmatrix}$$

If we have two or more functions, we can organize their gradients into a matrix by stacking them. The result is the Jacobian matrix (or just the Jacobian), in which each gradient forms one row:

$$J = \begin{bmatrix}\nabla f(x,y) \\ \nabla g(x,y)\end{bmatrix} = \begin{bmatrix}\frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y}\end{bmatrix} = \begin{bmatrix}2x & 3y^2 \\ 2 & 8y^7\end{bmatrix}$$
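The stacking idea can be sketched in a few lines of plain Python: compute one gradient per function and use each as a row of the Jacobian. This is a minimal illustration under stated assumptions; the helper names (`numerical_gradient`, `numerical_jacobian`, `analytic_jacobian`), the step size `h`, and the evaluation point `(1.5, 2.0)` are hypothetical choices made for this example.

```python
def f(x, y):
    # f(x, y) = x^2 + y^3
    return x**2 + y**3


def g(x, y):
    # g(x, y) = 2x + y^8
    return 2 * x + y**8


def numerical_gradient(func, x, y, h=1e-6):
    """Approximate [d(func)/dx, d(func)/dy] using central differences."""
    d_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    d_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return [d_dx, d_dy]


def numerical_jacobian(funcs, x, y, h=1e-6):
    """Stack one gradient per function; each gradient becomes a row."""
    return [numerical_gradient(func, x, y, h) for func in funcs]


def analytic_jacobian(x, y):
    # Rows are the hand-derived gradients of f and g
    return [
        [2 * x, 3 * y**2],
        [2.0, 8 * y**7],
    ]


# Hypothetical evaluation point, chosen only for illustration
J_approx = numerical_jacobian([f, g], 1.5, 2.0)
J_exact = analytic_jacobian(1.5, 2.0)  # [[3.0, 12.0], [2.0, 1024.0]]

for approx_row, exact_row in zip(J_approx, J_exact):
    print(approx_row, "vs", exact_row)
```

Row 0 corresponds to the gradient of `f` and row 1 to the gradient of `g`, matching the row layout of the Jacobian above.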

