The interquartile range (IQR) is a measure of variability based on dividing a data set into quartiles. Quartiles split the sorted data into four equal parts; the three values that separate the parts are called the first, second, and third quartiles, and they are denoted by Q1, Q2, and Q3, respectively.
The difference between the third quartile Q3 (the 75th percentile) and the first quartile Q1 (the 25th percentile) is called the interquartile range.
IQR = Q3 − Q1
where
- Q1 is the first quartile
- Q3 is the third quartile
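To make the formula concrete, here is a minimal sketch that computes Q1, Q3, and the IQR by hand in plain Python. It assumes the linear-interpolation convention for percentiles (the same convention numpy.percentile() uses by default); other quartile conventions exist and can give slightly different values. The helper name percentile_linear and the sample list are hypothetical, chosen only for illustration.

```python
import math

def percentile_linear(values, p):
    """Return the p-th percentile (0-100) using linear interpolation.

    Hypothetical helper for illustration; assumes a non-empty list.
    """
    ordered = sorted(values)
    # fractional position of the percentile in the sorted data
    pos = (len(ordered) - 1) * p / 100
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    # interpolate between the two neighboring data points
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

sample = [1, 2, 3, 4, 5, 6, 7, 8]  # illustrative data, not from the tutorial
q1 = percentile_linear(sample, 25)
q3 = percentile_linear(sample, 75)
iqr = q3 - q1

print('Q1:', q1)    # Q1: 2.75
print('Q3:', q3)    # Q3: 6.25
print('IQR:', iqr)  # IQR: 3.5
```

In practice you would normally rely on a library function rather than a hand-written helper; the sketch only shows what happens under the hood.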
You can calculate the IQR in Python with a single line of code.
In this tutorial, we will discuss two methods you can use to calculate the interquartile range (IQR) in Python, with step-by-step examples.
Method 1: Interquartile Range using NumPy
We will use the NumPy library, which provides the numpy.percentile() function. We can use it to compute the quartiles needed for the interquartile range.
If you don’t have the numpy library installed, run the command below in a terminal or command prompt to install it.
pip install numpy
Calculate Interquartile Range of an Array in Python
Let’s assume we have the data below.
data = [87,80,68,72,56,58,60,63,82,70,58,55,48,50,77]
Let’s calculate the interquartile range for the data using the Python code below.
# import modules
import numpy as np

# define array of data
data = np.array([87, 80, 68, 72, 56, 58, 60, 63, 82, 70, 58, 55, 48, 50, 77])

# calculate interquartile range (IQR)
q3, q1 = np.percentile(data, [75, 25])
iqr = q3 - q1

# print the interquartile range
print('The Interquartile Range for the data is:', iqr)
In the above code, we import the NumPy package and create an array for the data.
The numpy.percentile() function accepts the dataset and the percentiles to compute (75 and 25 here) as input parameters and returns the corresponding quartiles.
Subtracting the first quartile from the third quartile gives the interquartile range for the dataset. The output of the above code is shown below.
#Output
The Interquartile Range for the data is: 17.5
The interquartile range for the above dataset is 17.5. This is the spread of the middle 50% of the values in the dataset.
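As a rough illustration of the "middle 50%" idea and of the single-line calculation mentioned in the introduction, the sketch below reuses the same data. The variable names iqr_one_liner and inside are hypothetical. Note that with a small dataset the share of values between Q1 and Q3 is only approximately half.

```python
import numpy as np

data = np.array([87, 80, 68, 72, 56, 58, 60, 63, 82, 70, 58, 55, 48, 50, 77])

# quartiles with NumPy's default (linear) interpolation
q1, q3 = np.percentile(data, [25, 75])

# the whole IQR calculation condensed into one line
iqr_one_liner = np.subtract(*np.percentile(data, [75, 25]))

# values that lie between Q1 and Q3 (inclusive)
inside = data[(data >= q1) & (data <= q3)]

print('Q1:', q1, 'Q3:', q3)                        # Q1: 57.0 Q3: 74.5
print('IQR:', iqr_one_liner)                       # IQR: 17.5
print(len(inside), 'of', len(data), 'values lie between Q1 and Q3')  # 7 of 15
```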
Method 2: Use SciPy for Interquartile Range
We will use the SciPy library, which provides the scipy.stats.iqr() function to calculate the interquartile range.
If you don’t have the scipy library installed, run the command below in a terminal or command prompt to install it.
pip install scipy
Calculate Interquartile Range of an Array in Python
In this example, we will use the same dataset defined for the above example.
Let’s calculate the interquartile range for the dataset using the scipy library in the code below.
# import numpy and the stats module from the scipy library
import numpy as np
from scipy import stats

# define array of data
data = np.array([87, 80, 68, 72, 56, 58, 60, 63, 82, 70, 58, 55, 48, 50, 77])

# get interquartile range (IQR)
iqr = stats.iqr(data, interpolation='midpoint')

# print the interquartile range
print('The Interquartile Range for the data is:', iqr)
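In the above code, we import the stats module from the scipy package and then create an array.
The scipy.stats.iqr() function calculates the interquartile range for the dataset directly. The interpolation = 'midpoint' argument tells it to take the midpoint of the two neighboring values when a quartile falls between two data points. The output of the above code is shown below.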
#Output
The Interquartile Range for the data is: 17.5
The interquartile range for the above dataset is 17.5, the same value we obtained with NumPy. This is the spread of the middle 50% of the values in the dataset.
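The choice of interpolation does not change the result for this particular dataset, but it can for others. The sketch below compares the default 'linear' setting of scipy.stats.iqr() with 'midpoint' on two arrays: the data used above and the maths scores from the dataframe examples in this tutorial. The variable names are hypothetical and chosen for illustration.

```python
import numpy as np
from scipy import stats

data = np.array([87, 80, 68, 72, 56, 58, 60, 63, 82, 70, 58, 55, 48, 50, 77])
maths = np.array([90, 65, 72, 48, 94, 90, 46, 65, 87, 56])

# default interpolation is 'linear'
data_linear = stats.iqr(data)
data_midpoint = stats.iqr(data, interpolation='midpoint')

maths_linear = stats.iqr(maths)
maths_midpoint = stats.iqr(maths, interpolation='midpoint')

print('data  linear:', data_linear, ' midpoint:', data_midpoint)    # 17.5 and 17.5
print('maths linear:', maths_linear, ' midpoint:', maths_midpoint)  # 31.0 and 28.0
```

When comparing IQR values produced by different tools, it is worth checking that they use the same interpolation setting.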
Calculate IQR (Interquartile Range) for a Dataframe Column in Python
In this example, we create a data frame using the pandas package.
Then we calculate the interquartile range for a specific column, i.e. ‘maths’, in the data frame using the Python code below.
# import modules
import numpy as np
import pandas as pd

# create data frame
df = pd.DataFrame({'maths': [90, 65, 72, 48, 94, 90, 46, 65, 87, 56],
                   'science': [85, 70, 64, 66, 57, 70, 52, 65, 64, 59],
                   'english': [85, 67, 90, 58, 75, 67, 86, 49, 59, 55]})

# calculate interquartile range (IQR) for the 'maths' column
q3, q1 = np.percentile(df['maths'], [75, 25])
iqr = q3 - q1

# print the interquartile range for the maths column
print('The Interquartile Range for the maths is:', iqr)
In the above code, we use the numpy.percentile() function to calculate the quartiles for the maths column of the data frame.
Taking the difference between them gives the interquartile range for the maths column. The output of the above code is shown below.
#Output
The Interquartile Range for the maths is: 31.0
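As an alternative sketch, pandas can compute the quartiles itself through the Series.quantile() method, which also uses linear interpolation by default. This is not the approach used in the example above, just an equivalent way to get the same number without calling NumPy directly.

```python
import pandas as pd

# same maths scores as in the example above
df = pd.DataFrame({'maths': [90, 65, 72, 48, 94, 90, 46, 65, 87, 56]})

# Series.quantile() takes fractions (0 to 1) rather than percentages
q1 = df['maths'].quantile(0.25)
q3 = df['maths'].quantile(0.75)
iqr = q3 - q1

print('Q1:', q1, 'Q3:', q3)                        # Q1: 58.25 Q3: 89.25
print('The Interquartile Range for the maths is:', iqr)  # 31.0
```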
Calculate Interquartile Range (IQR) for Multiple Columns of a Dataframe in Python
In this example, we calculate the interquartile range for all the columns using the Python code below.
# import modules
import numpy as np
import pandas as pd

# create data frame
df = pd.DataFrame({'maths': [90, 65, 72, 48, 94, 90, 46, 65, 87, 56],
                   'science': [85, 70, 64, 66, 57, 70, 52, 65, 64, 59],
                   'english': [85, 67, 90, 58, 75, 67, 86, 49, 59, 55]})

# define function to calculate interquartile range
def calculateIQR(x):
    q3, q1 = np.percentile(x, [75, 25])
    return np.subtract(q3, q1)

# calculate IQR for all columns
result = df.apply(calculateIQR)
print('\nThe Interquartile Range for all columns are as follows:\n', result)
In the above example, we create a data frame using the DataFrame() constructor from the pandas package.
Our custom Python function calculateIQR() accepts one column of the data frame as its input parameter and returns the interquartile range for that column.
We call this function using df.apply(calculateIQR), which applies it to every column.
The interquartile range for each column is shown below.
#Output
The Interquartile Range for all columns are as follows:
 maths      31.00
science     8.75
english    24.25
dtype: float64
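For comparison, here is a shorter sketch that gets the same per-column values without a custom function, using the DataFrame.quantile() method that pandas provides. It assumes all columns are numeric, as they are in this data frame; the variable name iqr_all is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'maths': [90, 65, 72, 48, 94, 90, 46, 65, 87, 56],
                   'science': [85, 70, 64, 66, 57, 70, 52, 65, 64, 59],
                   'english': [85, 67, 90, 58, 75, 67, 86, 49, 59, 55]})

# DataFrame.quantile() returns one value per column;
# subtracting the two results gives the IQR of every column at once
iqr_all = df.quantile(0.75) - df.quantile(0.25)

print(iqr_all)
```

The result is a pandas Series indexed by column name, so a single value can be read with, for example, iqr_all['science'].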
Conclusion
I hope you found this tutorial on how to calculate the interquartile range in Python, with its step-by-step examples, educational and helpful.