Andrew Gurung
  • Introduction
  • Data Science
    • Natural Language Processing
      • Sentiment analysis using Twitter
    • Linear Algebra
      • Linear algebra explained in four pages
      • Vectors
        • Vector Basics
        • Vector Projection
        • Cosine Similarity
        • Vector Norms and Orthogonality
        • Linear combination and span
        • Linear independence and Basis vectors
      • Matrices
        • Matrix Arithmetic
        • Matrix Operations
        • Functions and Linear Transformations
        • Matrix types
      • Eigendecomposition, Eigenvectors and Eigenvalues
      • Principle Component Analysis (PCA)
      • Singular-Value Decomposition(SVD)
      • Linear Algebra: Deep Learning Book
    • Calculus
      • Functions, Limits, Continuity and Differentiability
      • Scalar Derivative and Partial Derivatives
      • Gradient
      • Matrix Calculus
      • Maxima and Minima using Derivatives
      • Gradient Descent and its types
    • Statistics and Probability
      • Probability Rules and Axioms
      • Types of Events
      • Frequentist vs Bayesian View
      • Random Variables
      • MLE, MAP, and Naive Bayes
      • Probability Distributions
      • P-Value and hypothesis test
    • 7 Step DS Process
      • 1: Business Requirement
      • 2: Data Acquisition
      • 3: Data Processing
        • SQL Techniques
        • Cleaning Text Data
      • 4: Data Exploration
      • 5: Modeling
      • 6: Model deployment
      • 7: Communication
    • Miscellaneous
      • LaTeX commands
  • Computer Science
    • Primer
      • Big O Notation
  • Life
    • Health
      • Minimalist Workout Routine
      • Reddit FAQ on Nootropics
      • Hiking/Biking Resources
    • Philosophy
      • Aristotle's Defense of Private Property
    • Self-improvement
      • 100 Mental Models
      • Don't break the chain
      • Cal Newport's 5 Productivity tips
      • Andrew Ng's advice on deliberate practice
      • Atomic Habits
      • Turn sound effects off in Outlook
    • Food and Travel
      • 2019 Guide to Pesticides in Produce
      • Recipe
        • Spicy Sesame Noodles
      • Travel
        • Hiking
    • Art
      • Scott Adams: 80% of the rules of good writing
      • Learn Blues Guitar
    • Tools
      • Software
        • Docker
        • Visual Studio Code
        • Terminal
        • Comparing Git Workflow
      • Life Hacks
        • DIY Deck Cleaner
  • Knowledge Vault
    • Book
      • The Almanack of Naval Ravikant
    • Media
    • Course/Training
Powered by GitBook
On this page
  • Get your environment set up
  • Create A New Database
  • SQL basics
  • Tips
  • Coalesce function:
  • Credits

Was this helpful?

  1. Data Science
  2. 7 Step DS Process
  3. 3: Data Processing

SQL Techniques

Previous3: Data ProcessingNextCleaning Text Data

Last updated 6 years ago

Was this helpful?

Get your environment set up

  1. Install which is a lightweight popular database

  2. Install

Create A New Database

  • SQLite: At a shell or DOS prompt, enter: "sqlite3 test.db". This will create a new database named "test.db". (You can use a different name if you like.)

  • DB Browser: Provides an easier user interface to execute SQL queries

SQL basics

Create a table

create table exercise_logs ( id integer primary key autoincrement,
type text, minutes integer,
calories integer,
heart_rate integer);

Insert data into a table

INSERT INTO exercise_logs (type, minutes, calories, heart_rate) VALUES ('biking', 60, 100, 110);
INSERT INTO exercise_logs (type, minutes, calories, heart_rate) VALUES ('biking', 45, 120, 100);
INSERT INTO exercise_logs (type, minutes, calories, heart_rate) VALUES ('football', 62, 200, 120);
INSERT INTO exercise_logs (type, minutes, calories, heart_rate) VALUES ('football', 90, 210, 140);
INSERT INTO exercise_logs (type, minutes, calories, heart_rate) VALUES ('swimming', 30, 110, 120);
INSERT INTO exercise_logs (type, minutes, calories, heart_rate) VALUES ('swimming', 40, 150, 130);

Select statements

SELECT type, sum(calories) as total_calories
FROM exercise_logs 
GROUP BY type;

type total_calories biking 220 football 410 swimming 260

Tips

Coalesce function:

Return the first non-null value in a list

SELECT COALESCE(NULL, NULL, NULL, 'Google', NULL, 'Twitter');
Google

Credits

SQLite
DB Browser for SQLite
Data Lit SQL Techniques