Cosine Similarity in Python – How to Calculate

Cosine similarity measures the similarity between two vectors of an inner product space by calculating the cosine of the angle between the two vectors.

Cosine similarity is one of the most widely used similarity measures.

In Python, you can calculate it with the dot() and norm() functions of the NumPy package.

Cosine Similarity Formula

For two vectors A and B of the same length, the cosine similarity is calculated as:

$$\text{Cosine Similarity} = \frac{\sum_{i} A_i B_i}{\sqrt{\sum_{i} A_i^2}\,\sqrt{\sum_{i} B_i^2}}$$
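
To make the formula concrete, here is a minimal sketch that evaluates it term by term using only the Python standard library. The function name cosine_similarity_plain and the sample vectors are assumptions made for illustration and do not appear elsewhere in this article.

```python
import math

def cosine_similarity_plain(a, b):
    """Evaluate the cosine similarity formula directly (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Numerator: sum of the element-wise products A_i * B_i
    dot_product = sum(ai * bi for ai, bi in zip(a, b))
    # Denominator: product of the two Euclidean norms
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return dot_product / (norm_a * norm_b)

# Illustrative vectors (assumed, not from the article)
parallel = cosine_similarity_plain([1, 2, 3], [2, 4, 6])   # same direction
orthogonal = cosine_similarity_plain([1, 0], [0, 1])       # perpendicular

print(parallel)    # very close to 1.0, up to floating-point rounding
print(orthogonal)
```

Vectors that point in the same direction give a value close to 1, and perpendicular vectors give 0. Note that this sketch divides by zero if either vector contains only zeros.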

In this article, we will discuss how to calculate cosine similarity in Python and work through several examples.

Using Numpy for Cosine Similarity

We will use the NumPy library to calculate the cosine similarity between two vectors.

If NumPy is not installed, install it by running the following command in a terminal or command prompt:

pip install numpy

The examples below show how to calculate cosine similarity in Python.

Calculate Cosine Similarity in Python

Let’s assume we have the following data:

x = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y = [0, 0, 1, 1, 1, 1, 0, 0, 0]

Using the numpy.array() function, we create the arrays x and y, which have the same length.

# import NumPy and the two functions used below
import numpy as np
from numpy import dot
from numpy.linalg import norm

# define two arrays of the same length
x = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0])
y = np.array([0, 0, 1, 1, 1, 1, 0, 0, 0])

# cosine similarity: dot product divided by the product of the norms
result = dot(x, y)/(norm(x)*norm(y))

print("The Cosine Similarity between two vectors is: ",result)

In the above code, we import the dot() function from NumPy and the norm() function from numpy.linalg.

The expression dot(x, y)/(norm(x)*norm(y)) divides the dot product of x and y by the product of their Euclidean norms, which gives the cosine similarity between the two vectors.

The output of the above code is:

#output
The Cosine Similarity between two vectors is:  0.5
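
The same calculation can be wrapped in a reusable function. The sketch below is one possible way to do that. The function name cosine_similarity and the zero-vector check are assumptions added for illustration. The check is there because a vector of all zeros has a norm of 0, which makes the denominator zero.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denominator = np.linalg.norm(a) * np.linalg.norm(b)
    # Assumption: treat a zero vector as an error instead of dividing by zero
    if denominator == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denominator)

# Same data as in the example above
x = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y = [0, 0, 1, 1, 1, 1, 0, 0, 0]

print(cosine_similarity(x, y))  # 0.5
```

Here the dot product is 2 and each norm is 2, so the result is 2 / (2 * 2) = 0.5, which matches the output above.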

Calculate Cosine Similarity between arrays of same length in Python

In this example, we will calculate the cosine similarity between two randomly generated arrays of the same length, using the code below.

#import modules
import numpy as np
from numpy import dot
from numpy.linalg import norm

#define arrays
x = np.random.randint(10, size=100)
y = np.random.randint(10, size=100)

# Calculate Cosine Similarity Python
result = dot(x, y)/(norm(x)*norm(y))

print("The Cosine Similarity between two vectors is: ",result)

In the above code, numpy.random.randint(10, size=100) creates each array with 100 random integers from 0 to 9.

Using dot(x, y)/(norm(x)*norm(y)), we calculate the cosine similarity between the two vectors x and y.

Because the arrays are random, the result changes on every run. One example output is:

#output
The Cosine Similarity between two vectors is:  0.6373168018459651
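
If you want the random example to give the same result on every run, one option is a seeded random generator. The sketch below assumes NumPy's numpy.random.default_rng() generator interface. The function name seeded_similarity and the seed value 42 are arbitrary choices made for illustration.

```python
import numpy as np

def seeded_similarity(seed):
    """Cosine similarity of two random vectors from a seeded generator (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # 100 random integers from 0 to 9, like randint(10, size=100) above
    x = rng.integers(0, 10, size=100)
    y = rng.integers(0, 10, size=100)
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

first = seeded_similarity(42)   # 42 is an arbitrary seed
second = seeded_similarity(42)

print("The Cosine Similarity between two vectors is: ", first)
print("Same result on a second call:", first == second)
```

Since every element is non-negative, the dot product cannot be negative, so the result for vectors like these lies between 0 and 1.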

Calculate Cosine Similarity between arrays of different length in Python

In this example, we will try to calculate the cosine similarity between two randomly generated arrays of different lengths, using the code below.

import numpy as np
from numpy import dot
from numpy.linalg import norm

#define arrays
x = np.random.randint(10, size=90) #length=90
y = np.random.randint(10, size=100) #length=100

#calculate Cosine Similarity Python
result = dot(x, y)/(norm(x)*norm(y))

print("The Cosine Similarity between two vectors is: ",result)

The above code raises a ValueError because dot() requires both arrays to have the same length.

Note: Cosine similarity is not defined for vectors of different lengths, so it cannot be calculated for these arrays.

The error raised by the above code is shown below.

#ERROR
ValueError: shapes (90,) and (100,) not aligned: 90 (dim 0) != 100 (dim 0)
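
One way to handle this case is to compare the array shapes before doing the calculation. The sketch below is a minimal illustration of that idea. The function name checked_cosine_similarity and the wording of its error message are assumptions, not part of NumPy.

```python
import numpy as np
from numpy import dot
from numpy.linalg import norm

def checked_cosine_similarity(x, y):
    """Cosine similarity with an explicit shape check (hypothetical helper)."""
    x = np.asarray(x)
    y = np.asarray(y)
    # Fail early with a clear message when the shapes differ
    if x.shape != y.shape:
        raise ValueError(f"cannot compare arrays of shapes {x.shape} and {y.shape}")
    return dot(x, y) / (norm(x) * norm(y))

x = np.random.randint(10, size=90)   # length=90
y = np.random.randint(10, size=100)  # length=100

try:
    result = checked_cosine_similarity(x, y)
    print("The Cosine Similarity between two vectors is: ", result)
except ValueError as error:
    message = str(error)
    print("Could not calculate cosine similarity:", message)
```

With arrays of lengths 90 and 100, the except branch runs and the program keeps going instead of stopping with a traceback.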

Conclusion

I hope you found this step-by-step tutorial on calculating cosine similarity in Python, with its examples, educational and helpful.

Using the NumPy package in Python, cosine similarity can be calculated with the dot() and norm() functions.
