Cosine Similarity
The cosine similarity between two vectors is the cosine of the angle between them.
Cosine similarity focuses on direction rather than magnitude, which is helpful in NLP and sentiment analysis: a long document can have the same theme as a short sentence.
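Written as a formula, the definition looks like this. The symbols A and B (the two vectors) and θ (the angle between them) are notation assumed here for illustration and do not come from the text above:

```latex
\[
\cos\theta \;=\; \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
\;=\; \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}} \; \sqrt{\sum_{i=1}^{n} B_i^{2}}}
\]
```

Because both vectors are divided by their lengths, scaling either one by a positive constant leaves the value unchanged.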
The angle between two vectors falls into one of three cases:
i) 90 degrees (orthogonal): not similar, i.e. independent
ii) Less than 90 degrees: similar
iii) Greater than 90 degrees: opposite
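The three cases can be checked with a minimal sketch in plain Python. The helper name cosine_similarity and the example vectors are made up for illustration; the helper also assumes neither vector is all zeros.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 2-D vectors, one pair per case
orthogonal = cosine_similarity([1, 0], [0, 1])    # angle of 90 degrees
similar = cosine_similarity([1, 1], [2, 3])       # angle below 90 degrees
opposite = cosine_similarity([1, 1], [-1, -1])    # angle of 180 degrees

print(orthogonal, similar, opposite)
```

The orthogonal pair scores 0, the similar pair scores close to 1, and the opposite pair scores close to -1 (up to floating-point rounding).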
Note:
- Words and sentences must be converted to vectors before cosine similarity can be calculated.
- Tools such as Word2Vec, or bag of words with either TF (term frequency) or TF-IDF (term frequency-inverse document frequency), can be used for this.
Calculate Cosine Similarity using Scikit-learn
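One possible way to do this, sketched under the assumption that the text is vectorized with TF-IDF as mentioned in the note above. The three example documents are hypothetical; TfidfVectorizer and cosine_similarity are existing scikit-learn utilities, used here with their default settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical example documents
documents = [
    "the movie was great",
    "the movie was great the movie was great",  # same theme, twice as long
    "stock prices fell sharply",                # unrelated vocabulary
]

# Convert each document to a TF-IDF vector (one row per document)
vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(documents)

# Pairwise cosine similarity between all rows, returned as a 3x3 array
similarity_matrix = cosine_similarity(tfidf_matrix)

print(similarity_matrix.round(2))
```

The first two documents score about 1.0 against each other even though one is twice as long, because only direction matters. The third shares no words with them and scores 0.0.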