Andrew Gurung

Sentiment analysis using Twitter

  • Sentiment analysis is a technique for extracting opinions from text

  • A sample use case is using tweets from Twitter to discover people's feelings about a product or service

  • Helps find polarity (positive or negative)

  • Data sources: Review sites, blogs, forums, social media, etc.

  • Text data can be either facts (mostly neutral) or opinions (which carry polarity)

Sentiment Analysis method

  1. Rule based

    • Matches words against a lexicon (a list of words with associated polarity scores) and combines their sentiment scores

    • No training required

    • Generally less accurate

  2. Automatic

    • A trained pattern-matching algorithm predicts the sentiment of the text

    • Uses machine learning, such as a classification algorithm, to find polarity

    • More accurate and scalable

    • Requires training data
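
The rule-based method above can be sketched in a few lines of plain Python. This is only a minimal illustration: the lexicon `TOY_LEXICON`, its words and scores, and the helper names `rule_based_score` and `polarity` are all made up for this example and do not come from a real lexicon such as VADER's.

```python
# Toy lexicon: word -> polarity score. These words and scores are invented
# for illustration and are not taken from any real lexicon.
TOY_LEXICON = {
    "good": 1.0,
    "great": 2.0,
    "love": 2.0,
    "bad": -1.0,
    "terrible": -2.0,
    "hate": -2.0,
}


def rule_based_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of the words in `text`; unknown words count as 0."""
    words = text.lower().split()
    # strip simple punctuation so that "great!" still matches "great"
    words = [word.strip(".,!?;:\"'") for word in words]
    return sum(lexicon.get(word, 0.0) for word in words)


def polarity(score):
    """Map a numeric score to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sentence in [
    "I love this phone, great battery!",
    "Terrible service, I hate waiting.",
    "The package arrived on Tuesday.",
]:
    score = rule_based_score(sentence)
    print(f"{score:+.1f}  {polarity(score)}  {sentence}")
```

No training step is involved: the quality of the result depends entirely on the lexicon, which is one reason this approach tends to be less accurate than a trained model.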

Sentiment analysis using the Twitter API and the VADER Python framework

Step 1: Create a Twitter application to access Twitter data

Step 2: Get keys and access tokens to access the Twitter API

  • consumer_key = ''

  • consumer_secret = ''

  • access_token = ''

  • access_token_secret = ''
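
One way to avoid pasting these four values directly into a notebook is to read them from environment variables. The sketch below assumes that approach; the variable names in `ENV_NAMES` and the helper `load_twitter_credentials` are hypothetical and not part of Tweepy or the Twitter API.

```python
import os

# Hypothetical environment variable names; use whatever names suit your setup.
ENV_NAMES = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_twitter_credentials(environ=os.environ):
    """Return the four credentials as a dict, or raise KeyError if any is missing."""
    missing = [env for env in ENV_NAMES.values() if not environ.get(env)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[env] for name, env in ENV_NAMES.items()}


# Demo with a fake environment so the sketch runs without real keys.
fake_env = {env: "dummy-value" for env in ENV_NAMES.values()}
creds = load_twitter_credentials(fake_env)
print(sorted(creds))
```

With real variables set, calling `load_twitter_credentials()` without arguments reads from the process environment, and the resulting values can be passed to the authentication calls in Step 5.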

Step 3: Install dependencies

!pip install pandas
!pip install tweepy
# the code below imports VADER from nltk, so nltk must be installed as well
!pip install nltk
!pip install vaderSentiment

Step 4: Import dependencies

  • Tweepy: Accessing data from Twitter

  • pandas: Holding the tweets in a data frame

  • VADER (from NLTK): Sentiment analysis

import nltk
import pandas as pd
import tweepy
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# download the VADER lexicon that SentimentIntensityAnalyzer relies on
nltk.download('vader_lexicon')

Step 5: Search for tweets

# Twitter API authentication variables
consumer_key = 'Insert here..'
consumer_secret = 'Insert here..'
access_token = 'Insert here..'
access_token_secret = 'Insert here..'
# authentication
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

# Twitter API
api = tweepy.API(auth)

# search for 'Indian Surgical Attack', ignoring retweets
# (Tweepy 4.x renamed this method to api.search_tweets; the standard search
# endpoint returns at most 100 tweets per request)
tweets = api.search('Indian Surgical Attack -filter:retweets', count=200)

# extract tweet text and create a single column data frame
data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])

# display the first 10 tweets
display(data.head(10))
	Tweets
0	The Pakistani Army deployment, including radar...
1	Indian Air Force Air Strike on Jaish's Terror ...
2	When India Said Pak did:\n\n2008 Mumbai Attack...
3	@Tirthan96689171 @ReutersWorld Here we r nt di...
4	First insult martyrs by calling Pulwama Terror...
5	@TimesNow @RShivshankar This shows that @INCI...
6	''The Indian voters will lead the third surgic...
7	When India Said Pak did:\n\n2008 Mumbai Attack...
8	When India Said Pak did:\n\n2008 Mumbai Attack...
9	When India Said Pak did:\n\n2008 Mumbai Attack...

Note: Each result returned by Tweepy's search() is a tweet object that carries metadata (creation time, source, retweet count, and so on) in addition to the text.

# metadata extracted by tweepy
# https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object
print(tweets[0].text)
print(tweets[0].created_at)
print(tweets[0].source)
print(tweets[0].retweet_count)
The Pakistani Army deployment, including radars and air defense system along the LoC, was strengthened immediately… https://t.co/EiLUBi7Wgc
2019-03-12 23:30:00
TweetDeck
11
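
To keep some of this metadata alongside the text, the fields can be collected into one record per tweet. The following is a sketch rather than a definitive approach: `SimpleNamespace` stands in for a real tweet object so the code runs without API access, and the helper `tweets_to_rows` and the `FIELDS` tuple are hypothetical names introduced here.

```python
from datetime import datetime
from types import SimpleNamespace

# Stand-in for one tweet object so the sketch runs without API access.
# The attribute values are copied from the sample output above.
sample_tweets = [
    SimpleNamespace(
        text="The Pakistani Army deployment, including radars and air defense "
             "system along the LoC, was strengthened immediately… "
             "https://t.co/EiLUBi7Wgc",
        created_at=datetime(2019, 3, 12, 23, 30, 0),
        source="TweetDeck",
        retweet_count=11,
    ),
]

# Attributes to keep; these four are the ones printed in the example above.
FIELDS = ("text", "created_at", "source", "retweet_count")


def tweets_to_rows(tweets, fields=FIELDS):
    """Turn tweet-like objects into a list of dicts, one key per field."""
    return [{field: getattr(tweet, field) for field in fields} for tweet in tweets]


rows = tweets_to_rows(sample_tweets)
print(rows[0]["created_at"], rows[0]["source"], rows[0]["retweet_count"])
```

With real search results, `pd.DataFrame(tweets_to_rows(tweets))` should give a data frame with one column per field instead of the single `Tweets` column.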

Step 6: Perform sentiment analysis using Vader

# Sentiment analyser
sid = SentimentIntensityAnalyzer()

# score each tweet; polarity_scores() returns a dict with the keys
# 'neg', 'neu', 'pos' and 'compound'
# (the list is not named `list`, which would shadow the Python built-in)
scores = [sid.polarity_scores(text) for text in data['Tweets']]

# create a new column 'Polarity' with the corresponding sentiment scores
data['Polarity'] = scores

# display the first 10 rows
display(data.head(10))
	Tweets	                                            Polarity
0	The Pakistani Army deployment, including radar...	{'neg': 0.0, 'neu': 0.777, 'pos': 0.223, 'comp...
1	Indian Air Force Air Strike on Jaish's Terror ...	{'neg': 0.353, 'neu': 0.647, 'pos': 0.0, 'comp...
2	When India Said Pak did:\n\n2008 Mumbai Attack...	{'neg': 0.437, 'neu': 0.563, 'pos': 0.0, 'comp...
3	@Tirthan96689171 @ReutersWorld Here we r nt di...	{'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound...
4	First insult martyrs by calling Pulwama Terror...	{'neg': 0.502, 'neu': 0.498, 'pos': 0.0, 'comp...
5	@TimesNow @RShivshankar This shows that @INCI...	{'neg': 0.239, 'neu': 0.761, 'pos': 0.0, 'comp...
6	''The Indian voters will lead the third surgic...	{'neg': 0.154, 'neu': 0.846, 'pos': 0.0, 'comp...
7	When India Said Pak did:\n\n2008 Mumbai Attack...	{'neg': 0.437, 'neu': 0.563, 'pos': 0.0, 'comp...
8	When India Said Pak did:\n\n2008 Mumbai Attack...	{'neg': 0.437, 'neu': 0.563, 'pos': 0.0, 'comp...
9	When India Said Pak did:\n\n2008 Mumbai Attack...	{'neg': 0.437, 'neu': 0.563, 'pos': 0.0, 'comp...
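
Each `Polarity` entry is a dict, so a common follow-up is to reduce it to a single label using the `compound` score. The sketch below assumes the conventional VADER cut-offs of 0.05 and -0.05; the helper `label_from_compound` is a hypothetical name, and the example dicts are invented values because the compound scores in the table above are truncated.

```python
def label_from_compound(scores, threshold=0.05):
    """Map a VADER-style score dict to 'positive', 'negative' or 'neutral'.

    `threshold` is the cut-off on the compound score; 0.05 is the value
    commonly suggested for VADER, but it is a tunable assumption.
    """
    compound = scores["compound"]
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical score dicts, not taken from the tweets above.
examples = [
    {"neg": 0.0, "neu": 0.6, "pos": 0.4, "compound": 0.62},
    {"neg": 0.5, "neu": 0.5, "pos": 0.0, "compound": -0.71},
    {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0},
]

for scores in examples:
    print(scores["compound"], label_from_compound(scores))
```

Applied to the data frame from Step 6, this could be written as `data['Label'] = data['Polarity'].apply(label_from_compound)`.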
​
Last updated 6 years ago

Twitter application creation page (see Step 1): https://developer.twitter.com/en/apps/create