Cosine Similarity

Cosine similarity measures how alike two vectors are by computing the cosine of the angle between them.
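Written out as a formula, the definition looks like the following. The symbols are introduced here for illustration and are not from the original notes: a and b are two non-zero vectors with n components each, and θ is the angle between them.

```latex
\cos\theta
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^{2}} \; \sqrt{\sum_{i=1}^{n} b_i^{2}}}
```

The value always lies between -1 and 1.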

Cosine similarity depends on the direction of the vectors rather than their magnitude, which is helpful in NLP and sentiment analysis: a long document can have the same theme as a short sentence.
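A minimal sketch of why magnitude does not matter, in plain Python. The helper name `cosine_similarity`, the three-word vocabulary, and the term counts are all made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical term counts over the vocabulary ["good", "movie", "plot"].
short_sentence = [1, 1, 0]
long_document = [10, 10, 0]  # same proportions, ten times as many words

# Both vectors point in the same direction, so the similarity is
# (up to floating-point rounding) 1 despite the difference in length.
similarity = cosine_similarity(short_sentence, long_document)
print(round(similarity, 6))
```

A distance that depends on magnitude, such as Euclidean distance, would place these two vectors far apart even though their word proportions are identical.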


The angle between two vectors falls into one of three cases:
i) 90 degrees (orthogonal): not similar, or independent
ii) Less than 90 degrees: similar
iii) Greater than 90 degrees: opposite
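The three cases correspond to the sign of the cosine: zero at 90 degrees, positive below it, negative above it. The sketch below illustrates this with small 2-D vectors. The function names and the tolerance `tol` are illustrative choices, not part of any library.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def classify(a, b, tol=1e-9):
    """Map the sign of the cosine to one of the three cases."""
    c = cosine(a, b)
    if abs(c) <= tol:
        return "orthogonal"  # angle of 90 degrees
    return "similar" if c > 0 else "opposite"

reference = [1, 0]
print(classify(reference, [0, 1]))   # vectors at 90 degrees
print(classify(reference, [1, 1]))   # vectors at 45 degrees
print(classify(reference, [-1, 1]))  # vectors at 135 degrees
```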

Note:
- Words and sentences must be converted to vectors before cosine similarity can be calculated.
- Tools such as Word2Vec, or bag of words with either TF (term frequency) or TF-IDF (term frequency-inverse document frequency) weighting, can be used.
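To make the conversion step concrete, here is a rough bag-of-words sketch using raw term frequency. It assumes lowercasing and whitespace tokenization, which is far simpler than what real tools do. The function name `term_frequency_vectors` and the sample sentences are hypothetical.

```python
from collections import Counter

def term_frequency_vectors(documents):
    """Turn each document into a vector of raw term counts."""
    tokenized = [doc.lower().split() for doc in documents]
    # One vector component per distinct term, in sorted order.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors

docs = ["the movie was good", "the plot was good good"]
vocabulary, vectors = term_frequency_vectors(docs)
print(vocabulary)
print(vectors)
```

Each document becomes a vector with one component per vocabulary term, so any two documents can be compared by cosine similarity regardless of their length.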

Calculate Cosine Similarity using Scikit-learn
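One possible way to do this, sketched under the assumption that scikit-learn is installed, is to build TF-IDF vectors with `TfidfVectorizer` and pass them to `cosine_similarity` from `sklearn.metrics.pairwise`. The three sample documents are made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical documents: the first two share words, the third shares none.
documents = [
    "the cat sat on the mat",
    "a cat sat on a mat in the sun",
    "stock prices fell sharply today",
]

# Each row of the sparse matrix is the TF-IDF vector of one document.
vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(documents)

# Pairwise similarities: entry [i][j] compares document i with document j.
similarity_matrix = cosine_similarity(tfidf_matrix)
print(similarity_matrix.round(2))
```

The diagonal is 1 because every document is identical to itself. The third document shares no terms with the other two, so its off-diagonal entries are 0.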

Link: http://blog.christianperone.com/2013/09/machine-learning-cosine-similarity-for-vector-space-models-part-iii/
