Andrew Gurung

Probability Distributions

Last updated 6 years ago

A distribution can be thought of as a function that describes the relationship between observations in a sample space. It can be used to calculate the probability of any individual observation from the sample space.

Density Functions

Distributions are often described in terms of their density functions.

Types of Density Functions:

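  • Probability density function (PDF): gives the relative likelihood (density) of a continuous variable at a given value. Probabilities are obtained by integrating the density over an interval.

  • Cumulative distribution function (CDF), sometimes called the cumulative density function: calculates the probability of an observation less than or equal to a given value.

Note: A PDF applies to continuous distributions. The equivalent of a PDF for a discrete distribution is called a probability mass function, or PMF, which gives the probability of each individual value.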
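To make the distinction concrete, here is a minimal sketch in plain Python. The uniform distribution on [0, 0.5] and the fair six-sided die are illustrative choices, not examples from the sources; the function names are hypothetical.

```python
def uniform_pdf(x, a=0.0, b=0.5):
    """Density of a continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a=0.0, b=0.5):
    """P(X <= x) for a continuous uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


def die_pmf(k):
    """PMF of a fair six-sided die: P(X = k)."""
    return 1.0 / 6.0 if k in (1, 2, 3, 4, 5, 6) else 0.0


# A density is not a probability: here it is 2.0, which is greater than 1.
print(uniform_pdf(0.25))

# The probability of an interval is a difference of CDF values (about 0.2).
print(uniform_cdf(0.2) - uniform_cdf(0.1))

# PMF values are probabilities, and they sum to 1 (up to float rounding).
print(sum(die_pmf(k) for k in range(1, 7)))
```

The density value above 1 is the reason a PDF is read as "probability per unit of x" rather than as a probability.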

Gaussian (Normal) Distribution

The Gaussian distribution describes the behavior of many quantities observed in nature. It is so widely found that it is also called the Normal distribution.

Characteristics:

  • The mean, median and mode of the distribution coincide

  • The curve of the distribution is bell-shaped and symmetrical about the line x=μ

  • The total area under the curve is 1

  • Exactly half of the values are to the left of the center and the other half to the right

$$P(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$$
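The density formula and the listed characteristics can be checked numerically. The sketch below assumes example values μ = 2.0 and σ = 1.5 and uses a simple midpoint rule for the integrals; the helper names are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


mu, sigma = 2.0, 1.5  # assumed example parameters


def pdf(x):
    return normal_pdf(x, mu, sigma)


# Ten standard deviations on each side capture essentially all of the mass.
total_area = integrate(pdf, mu - 10 * sigma, mu + 10 * sigma)
left_half = integrate(pdf, mu - 10 * sigma, mu)
symmetric = math.isclose(pdf(mu - 0.7), pdf(mu + 0.7))

print(total_area)  # close to 1
print(left_half)   # close to 0.5
print(symmetric)   # True: the curve is symmetric about x = mu
```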

Note: The Central Limit Theorem states that as the sample size increases, the distribution of the sample mean across multiple samples approaches a Gaussian distribution. The observations must be independent.
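A small simulation can illustrate the theorem. This is a sketch under stated assumptions: observations are drawn from a uniform distribution on [0, 1) (mean 0.5, variance 1/12), and the sample size, number of samples, and seed are arbitrary.

```python
import math
import random
import statistics


def sample_means(sample_size, num_samples, rng):
    """Draw num_samples independent samples and return the mean of each."""
    return [
        sum(rng.random() for _ in range(sample_size)) / sample_size
        for _ in range(num_samples)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
sample_size = 50
means = sample_means(sample_size, num_samples=5000, rng=rng)

mean_of_means = statistics.fmean(means)
spread = statistics.stdev(means)
# Theory: the standard deviation of the sample mean is sqrt(variance / n).
expected_spread = math.sqrt((1.0 / 12.0) / sample_size)

# For a Gaussian, roughly 68% of values lie within one standard deviation.
within_one = sum(abs(m - mean_of_means) <= spread for m in means) / len(means)

print(mean_of_means)            # close to 0.5
print(spread, expected_spread)  # both close to 0.041
print(within_one)               # close to 0.68
```

Although the individual observations are uniform, not bell-shaped, the sample means cluster around 0.5 in a roughly Gaussian pattern.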

Bernoulli Distribution

A Bernoulli distribution has only two possible outcomes, namely 1 (success) and 0 (failure), and a single trial.

The Bernoulli distribution has only one parameter - the probability of success.

Bernoulli Distribution is a special case of Binomial Distribution with a single trial.
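As a minimal sketch, the Bernoulli PMF and a single trial can be written as follows. The value p = 0.3, the seed, and the function names are assumptions for illustration.

```python
import random


def bernoulli_pmf(k, p):
    """P(X = k) for a Bernoulli variable with success probability p."""
    if k not in (0, 1):
        raise ValueError("a Bernoulli outcome is either 0 or 1")
    return p if k == 1 else 1.0 - p


def bernoulli_trial(p, rng):
    """Run a single trial: return 1 (success) with probability p, else 0."""
    return 1 if rng.random() < p else 0


p = 0.3  # assumed probability of success
rng = random.Random(1)
trials = [bernoulli_trial(p, rng) for _ in range(10000)]
success_rate = sum(trials) / len(trials)

print(bernoulli_pmf(1, p), bernoulli_pmf(0, p))
print(success_rate)  # close to p
```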

Binomial Distribution

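A binomial experiment consists of n independent Bernoulli trials, each of which is a success/failure experiment. The binomial distribution describes the number of successes across those n trials, which is the sum of n independent Bernoulli random variables with the same probability of success.

The binomial distribution has two parameters: the probability of success and the number of trials.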

Characteristics:

  • Each trial is independent

  • There are only two possible outcomes in a trial- either a success or a failure

  • A total number of n identical trials are conducted
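The standard binomial PMF is P(X = k) = C(n, k) p^k (1 - p)^(n - k). The formula is not given in the text above, so treat the sketch below as an illustration with assumed values n = 10 and p = 0.5 and hypothetical function names.

```python
import math
import random


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success prob p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def binomial_draw(n, p, rng):
    """One binomial outcome, built as the sum of n Bernoulli trials."""
    return sum(1 if rng.random() < p else 0 for _ in range(n))


n, p = 10, 0.5  # assumed example parameters

print(binomial_pmf(5, n, p))  # 0.24609375

# The probabilities over k = 0..n sum to 1, and the mean equals n * p.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print(total, mean)

# With n = 1 the PMF reduces to the Bernoulli case.
print(binomial_pmf(1, 1, 0.3))

rng = random.Random(7)
draws = [binomial_draw(n, p, rng) for _ in range(5000)]
print(sum(draws) / len(draws))  # close to n * p = 5
```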

Chi-Squared Distribution

The chi-squared distribution underlies the chi-square independence test, a procedure for testing whether two categorical variables are related.

Characteristics:

  • The mean of the distribution is equal to the number of degrees of freedom: μ = v. Note: The degrees of freedom of an estimate is the number of independent pieces of information that go into the estimate.

  • As the degrees of freedom increase, the chi-square curve approaches a normal distribution.

Poisson Distribution

The Poisson distribution applies when events occur at random points in time or space and we are interested only in the number of occurrences of the event.

Examples:

  1. The number of suicides reported in a particular city.

  2. The number of printing errors on each page of a book.

Characteristics:

  • Any successful event should not influence the outcome of another successful event

  • The rate of occurrence is constant: the probability of success in an interval depends only on the length of the interval, not on where the interval lies.

  • The probability of success in an interval approaches zero as the interval becomes smaller.
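The standard Poisson PMF is P(X = k) = λ^k e^(-λ) / k!, where λ is the average number of events per interval. Neither the formula nor the symbol λ appears in the text above, so the sketch below is an illustration; the rate of 2 printing errors per page is a made-up value.

```python
import math


def poisson_pmf(k, lam):
    """P(exactly k events in an interval with average rate lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 2.0  # hypothetical average number of printing errors per page

print(round(poisson_pmf(0, lam), 4))  # 0.1353: chance of an error-free page

# Probability of at most 2 errors on a page.
at_most_two = sum(poisson_pmf(k, lam) for k in range(3))
print(round(at_most_two, 4))  # 0.6767

# The probabilities sum to 1 and the mean equals lam.
total = sum(poisson_pmf(k, lam) for k in range(60))
mean = sum(k * poisson_pmf(k, lam) for k in range(60))
print(total, mean)

# A constant rate scales with interval length: over 3 pages use 3 * lam.
print(poisson_pmf(0, 3 * lam))  # chance of no errors across 3 pages
```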

Exponential Distribution

The exponential distribution is widely used in survival analysis, where it models the time until an event occurs, from the expected life of a machine to the expected life of a human.
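For a survival-style example, the standard exponential density is f(x) = λ e^(-λx) for x ≥ 0, with mean 1/λ. The formula is not stated in the text above, and the machine with a mean lifetime of 5 years is a made-up case, so read this as a sketch.

```python
import math


def exponential_pdf(x, rate):
    """Density of the exponential distribution at x (zero for x < 0)."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def exponential_cdf(x, rate):
    """P(lifetime <= x): probability of failure by time x."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def survival(x, rate):
    """P(lifetime > x): probability of still running at time x."""
    return math.exp(-rate * x) if x >= 0 else 1.0


mean_lifetime = 5.0          # hypothetical mean lifetime in years
rate = 1.0 / mean_lifetime   # the rate is the reciprocal of the mean

print(round(survival(5.0, rate), 4))         # 0.3679
print(round(exponential_cdf(5.0, rate), 4))  # 0.6321

# Memorylessness: having survived 3 years, the chance of lasting 4 more
# equals the chance that a new machine lasts 4 years.
conditional = survival(3.0 + 4.0, rate) / survival(3.0, rate)
print(math.isclose(conditional, survival(4.0, rate)))  # True
```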

The Chi Square $\chi^2$ distribution is the distribution of the sum of squared standard normal deviates.

The chi-square test statistic is computed as:

$$\chi^2 = \sum \frac{(Observed - Expected)^2}{Expected}$$

The variance of the chi-square distribution is equal to two times the number of degrees of freedom: $\sigma^2 = 2v$

When the degrees of freedom are greater than or equal to 2, the chi-square density reaches its maximum at $\chi^2 = v - 2$.
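The chi-square properties can be sketched in Python. The observed die-roll counts are invented for illustration, and the degrees of freedom (v = 4), sample count, and seed are arbitrary assumptions.

```python
import random


def chi_square_statistic(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_draw(dof, rng):
    """One chi-square value: the sum of dof squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dof))


# Hypothetical counts from 60 rolls of a die, against 10 expected per face.
observed = [8, 12, 10, 9, 11, 10]
expected = [10] * 6
statistic = chi_square_statistic(observed, expected)
print(statistic)  # about 1.0

# Simulate the distribution and compare with mean = v and variance = 2v.
rng = random.Random(42)
dof = 4
draws = [chi_square_draw(dof, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)

print(sample_mean)  # close to 4
print(sample_var)   # close to 8
```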

Links:

  • https://www.analyticsvidhya.com/blog/2017/09/6-probability-distributions-data-science/

  • https://machinelearningmastery.com/statistical-data-distributions/