Gradient

A gradient is a vector that stores the partial derivatives of a multivariable function, often denoted by $\nabla$. It lets us calculate the rate of change at a specific point of a function with multiple independent variables.

Find a Gradient

Consider a function with two variables ($x$ and $y$): $f(x,y) = x^2 + y^3$

1) Find the partial derivative with respect to $x$ (treat $y$ as a constant, like a fixed number such as 12)

$f'_x = \frac{\partial f}{\partial x} = 2x + 0 = 2x$

2) Find the partial derivative with respect to $y$ (treat $x$ as a constant)

$f'_y = \frac{\partial f}{\partial y} = 0 + 3y^2 = 3y^2$

3) Store the partial derivatives in a gradient vector

$\nabla f(x,y) = \begin{bmatrix}\frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y}\end{bmatrix} = \begin{bmatrix}2x \\ 3y^2\end{bmatrix}$
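
As a quick check (an added sketch, not part of the original notes), SymPy reproduces the same gradient symbolically:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**3

# The gradient is the vector of partial derivatives of f
grad_f = sp.Matrix([sp.diff(f, x), sp.diff(f, y)])
print(grad_f)  # Matrix([[2*x], [3*y**2]])
```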

Properties of Gradients

There are two additional properties of gradients that are especially useful in deep learning. A gradient:

  1. Is zero at a local maximum or local minimum
  2. Always points in the direction of greatest increase of a function
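
To make the second property concrete, here is an added sketch (assuming the example function $f(x,y) = x^2 + y^3$ from above) that compares the rate of increase along the normalized gradient against the coordinate axes at the point $(5, 4)$; no direction beats the gradient, whose rate equals $\|\nabla f\|$:

```python
import numpy as np

def f(p):
    x, y = p
    return x**2 + y**3

def grad_f(p):
    x, y = p
    return np.array([2 * x, 3 * y**2])

p = np.array([5.0, 4.0])
g = grad_f(p)                   # [10, 48]
g_unit = g / np.linalg.norm(g)  # normalized gradient direction

eps = 1e-6
for name, d in [("gradient", g_unit),
                ("x-axis  ", np.array([1.0, 0.0])),
                ("y-axis  ", np.array([0.0, 1.0]))]:
    rate = (f(p + eps * d) - f(p)) / eps  # finite-difference rate of change
    print(name, round(rate, 2))
# gradient 49.03  <- largest: equals ||grad f|| at this point
# x-axis   10.0
# y-axis   48.0
```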

Directional Derivative

The directional derivative $\nabla_{\vec v} f$ is the rate at which the function $f(x,y)$ changes at a point $(x_1, y_1)$ in the direction $\vec v$.

The directional derivative is computed by taking the dot product of the gradient of $f$ and a unit vector $\vec v$ (if $\vec v$ does not have unit length, divide it by its norm $\|\vec v\|$ first).

Note: The directional derivative of a function is a scalar, while the gradient is a vector.

Find Directional Derivative

Consider a function with two variables ($x$ and $y$): $f(x,y) = x^2 + y^3$

$\vec v = \begin{bmatrix}2 \\ 3\end{bmatrix}$

As described above, we take the dot product of the gradient and the direction vector (here $\vec v$ is used as given; dividing the result by $\|\vec v\| = \sqrt{13}$ would give the true rate of change per unit length):

$\begin{bmatrix}\frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y}\end{bmatrix} \cdot \begin{bmatrix}2 \\ 3\end{bmatrix}$

We can rewrite the dot product as:

$\nabla_{\vec v} f = 2\frac{\partial f}{\partial x} + 3\frac{\partial f}{\partial y} = 2(2x) + 3(3y^2) = 4x + 9y^2$

Hence, the directional derivative $\nabla_{\vec v} f$ at coordinates $(5, 4)$ is: $\nabla_{\vec v} f = 4x + 9y^2 = 4(5) + 9(4)^2 = 164$
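
As a quick numeric check (an added sketch), dotting the gradient evaluated at $(5, 4)$ with $\vec v$ as above reproduces the result:

```python
import numpy as np

def grad_f(x, y):
    # Gradient of f(x, y) = x**2 + y**3
    return np.array([2 * x, 3 * y**2])

v = np.array([2, 3])
print(grad_f(5, 4) @ v)  # (10)(2) + (48)(3) = 164
```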


Link: http://wiki.fast.ai/index.php/Calculus_for_Deep_Learning