
# Cosine Similarity in Python – How to Calculate

Cosine similarity measures the similarity between two vectors of an inner product space by calculating the cosine of the angle between the two vectors.

Cosine similarity is one of the most widely used similarity measures. Because it depends only on the angle between the vectors, it is unaffected by their magnitudes.

Use the `dot()` and `norm()` functions of the NumPy package to calculate cosine similarity in Python.

## Cosine Similarity Formula

For two vectors A and B, the cosine similarity is calculated as:

$$\text{Cosine Similarity} = \frac{\sum_{i} A_i B_i}{\sqrt{\sum_{i} A_i^2}\,\sqrt{\sum_{i} B_i^2}}$$
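To make the formula concrete, here is a minimal sketch that computes it term by term in plain Python, without NumPy. The function name `cosine_similarity_plain` is only an illustrative choice, and the sketch assumes both inputs are non-zero sequences of numbers.

```python
import math

def cosine_similarity_plain(a, b):
    """Cosine similarity of two equal-length sequences (illustrative helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Numerator: sum of element-wise products (the dot product)
    dot_product = sum(ai * bi for ai, bi in zip(a, b))
    # Denominator: product of the two Euclidean norms
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return dot_product / (norm_a * norm_b)

print(cosine_similarity_plain([3, 4], [6, 8]))  # same direction -> 1.0
print(cosine_similarity_plain([1, 0], [0, 1]))  # perpendicular -> 0.0
```

Vectors pointing in the same direction give 1, and perpendicular vectors give 0, regardless of their lengths.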

In this article, we will discuss how to calculate cosine similarity in Python and work through several examples.


## Using Numpy for Cosine Similarity

We will use the `numpy` library to calculate the cosine similarity between two vectors.

If you don't have the `numpy` library installed, run the command below in the Windows command prompt (or any other terminal) to install it:

`pip install numpy`

Let's walk through how to calculate cosine similarity in Python using the examples below.

## Calculate Cosine Similarity in Python

Let's assume we have the following data: `x = [1, 1, 1, 1, 0, 0, 0, 0, 0]` and `y = [0, 0, 1, 1, 1, 1, 0, 0, 0]`.

Using the `numpy.array()` function, we will create the x and y arrays, which have the same length.

```python
#import modules
import numpy as np
from numpy import dot
from numpy.linalg import norm

#define arrays
x = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0])
y = np.array([0, 0, 1, 1, 1, 1, 0, 0, 0])

#calculate Cosine Similarity python
result = dot(x, y)/(norm(x)*norm(y))

print("The Cosine Similarity between two vectors is: ",result)
```

In the above code, we import the `numpy` package to use its `dot()` and `norm()` functions to calculate cosine similarity in Python.

Using `dot(x, y)/(norm(x)*norm(y))`, we calculate the cosine similarity between the two vectors x and y.

The output of the above code is:

```
The Cosine Similarity between two vectors is:  0.5
```
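The same calculation can be wrapped in a small reusable function. The sketch below is one possible way to do it, not a NumPy built-in: the name `cosine_similarity` and the zero-vector check are assumptions added for illustration, since the formula divides by the product of the norms and is undefined when either vector is all zeros.

```python
import numpy as np

def cosine_similarity(x, y):
    """Cosine similarity of two 1-D arrays of the same length (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denominator = np.linalg.norm(x) * np.linalg.norm(y)
    if denominator == 0:
        # A zero vector has no direction, so the angle is undefined
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(x, y) / denominator)

x = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y = [0, 0, 1, 1, 1, 1, 0, 0, 0]
print(cosine_similarity(x, y))  # 0.5
```

Here the dot product is 2 and each norm is 2, so the result is 2 / (2 * 2) = 0.5, matching the output above.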


## Calculate Cosine Similarity between arrays of same length in Python

In this example, we will calculate the cosine similarity between two randomly generated arrays of the same length using the code below.

```python
#import modules
import numpy as np
from numpy import dot
from numpy.linalg import norm

#define arrays
x = np.random.randint(10, size=100)
y = np.random.randint(10, size=100)

# Calculate Cosine Similarity Python
result = dot(x, y)/(norm(x)*norm(y))

print("The Cosine Similarity between two vectors is: ",result)
```

In the above code, we use `numpy.random.randint()` to create two arrays of 100 random integers each, with values from 0 to 9.

Using `dot(x, y)/(norm(x)*norm(y))` we calculate the cosine similarity between two vectors x & y in Python.

Because the arrays are random, the output differs on every run. One sample run produced:

```
The Cosine Similarity between two vectors is:  0.6373168018459651
```
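Since `np.random.randint()` is not seeded in this example, the result changes from run to run. If a repeatable result is needed, one option is NumPy's seeded generator, `np.random.default_rng()`. The sketch below assumes an arbitrary seed of 42, and the helper name `similarity_for_seed` is hypothetical.

```python
import numpy as np
from numpy.linalg import norm

def similarity_for_seed(seed):
    """Cosine similarity of two seeded random integer arrays (illustrative helper)."""
    rng = np.random.default_rng(seed)
    # Integers from 0 to 9, same range as np.random.randint(10, size=100)
    x = rng.integers(0, 10, size=100)
    y = rng.integers(0, 10, size=100)
    return float(np.dot(x, y) / (norm(x) * norm(y)))

# The same seed gives the same arrays, and therefore the same similarity
print(similarity_for_seed(42) == similarity_for_seed(42))  # True
```

Because every element is non-negative, the dot product cannot be negative, so the similarity for arrays like these always falls between 0 and 1.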


## Calculate Cosine Similarity between arrays of different length in Python

In this example, we will try to calculate the cosine similarity between two randomly generated arrays of different lengths using the code below.

```python
import numpy as np
from numpy import dot
from numpy.linalg import norm

#define arrays
x = np.random.randint(10, size=90) #length=90
y = np.random.randint(10, size=100) #length=100

#calculate Cosine Similarity Python
result = dot(x, y)/(norm(x)*norm(y))

print("The Cosine Similarity between two vectors is: ",result)
```

The above code raises a `ValueError` because the arrays have different lengths.

Note: Cosine similarity is defined only for vectors with the same number of elements, so it cannot be calculated between arrays of different lengths.

The error raised by the above code is shown below.

```
ValueError: shapes (90,) and (100,) not aligned: 90 (dim 0) != 100 (dim 0)
```
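One way to handle this case more gracefully is to compare the array shapes before calling `dot()` and report a clearer message. The sketch below is only an illustration: the helper name `checked_cosine_similarity` and the wording of its error message are assumptions, not part of NumPy.

```python
import numpy as np
from numpy.linalg import norm

def checked_cosine_similarity(x, y):
    """Cosine similarity with an explicit length check (illustrative helper)."""
    x = np.asarray(x)
    y = np.asarray(y)
    if x.shape != y.shape:
        raise ValueError(f"arrays must have the same shape, got {x.shape} and {y.shape}")
    return np.dot(x, y) / (norm(x) * norm(y))

x = np.random.randint(10, size=90)   # length 90
y = np.random.randint(10, size=100)  # length 100

try:
    checked_cosine_similarity(x, y)
except ValueError as err:
    print("Could not compute cosine similarity:", err)
```

Catching the `ValueError` lets the program continue, for example by skipping the mismatched pair, instead of stopping with a traceback.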


## Conclusion

I hope you found this step-by-step tutorial on calculating cosine similarity in Python, along with its examples, educational and helpful.

Using the NumPy package in Python, cosine similarity can be calculated with the `dot()` and `norm()` functions.