Home » Python » How to Calculate the Standard Error of the Mean in Python

How to Calculate the Standard Error of the Mean in Python

The Standard Error of the Mean (SEM) describes how far a sample mean varies from the actual population mean.

It is used to estimate the approximate confidence intervals for the mean.

In this tutorial, we will discuss two methods you can use to calculate the Standard Error of the Mean in python with step-by-step examples.

Standard Error of the Mean Formula

The Standard error of the mean for a sample is calculated using below formula:

Standard error of the mean (SEM) = s / √n

where:

s : sample standard deviation

n : sample size

Method 1: Use Numpy

We will be using the numpy available in python, it provides std() function to calculate the standard error of the mean.

If you don’t have numpy package installed, use the below command on windows command prompt for numpy library installation.

pip install numpy

Example 1: How to calculate SEM in Python

Let’s understand, how to calculate the standard error of mean (SEM) with the given below python code.

#import modules
import numpy as np

#define dataset
data = np.array([4,7,3,9,12,8,14,10,12,12])

#calculate standard error of the mean 
result = np.std(data, ddof=1) / np.sqrt(np.size(data))

#Print the result
print("The Standard error of the mean : %.3f"%result)

In the above code, we import numpy library to define the dataset.

Using std() function we calculated the standard error of the mean.

Note that we must specify ddof=1 in the argument for std() function to calculate the sample standard deviation instead of population standard deviation.

The Output of the above code is shown below.

#Output 
The Standard error of the mean : 1.149

The Standard error of the mean is 1.149.

Method 2: Use Scipy

We will be using Scipy library available in python, it provides sem() function to calculate the standard error of the mean.

If you don’t have the scipy library installed then use the below command on windows command prompt for scipy library installation.

pip install scipy

Example 2: How to calculate SEM in Python

Lets assume we have dataset as below

data = [4,7,3,9,12,8,14,10,12,12]

Lets calculate the standard error of mean by using below python code.

#import modules
import scipy.stats as stat

#define dataset
data = [4,7,3,9,12,8,14,10,12,12]

#calculate standard error of the mean 
result = stat.sem(data)

#Print the result
print("The Standard error of the mean : %.3f"%result)

In the above code, we import numpy library to define the dataset.

Using sem() function we calculated the standard error of the mean.

The Output of the above code is shown below.

#Output 
The Standard error of the mean : 1.149

How to Interpret the Standard Error of the Mean

The two important factors to keep in mind while interpreting the SEM are as follows:-

1 Sample Size:- With the increase in sample size, the standard error of mean tends to decrease.

Let’s see this with below example:-

#import modules
import scipy.stats as stat

#define dataset 1
data1 = [4,7,3,9,12,8,14,10,12,12]

#define dataset 2 by repeated the first dataset twice
data2 = [4,7,3,9,12,8,14,10,12,12,4,7,3,9,12,8,14,10,12,12]

#calculate standard error of the mean 
result1 = stat.sem(data1)
result2 = stat.sem(data2)

#Print the result
print("The Standard error of the mean for the original dataset: %.3f"%result1)
print("The Standard error of the mean for the repeated dataset : %.3f"%result2)

In the above example, we created the two datasets i.e. data1 & data2 where data2 is just the twice of data1.

The Output of the above code is shown below:-

# Output
The Standard error of the mean for the original dataset: 1.149
The Standard error of the mean for the repeated dataset : 0.791

We seen that for data1 the SEM is 1.149 and for data2 SEM is 0.791.

It clearly shows that with an increase in size the SEM decreases.

Values of data2 are less spread out around the mean as compared to data1, although both have the same mean value.

2 The Value of SEM : The larger value of the SEM indicates that the values are more spread around the mean .

Let’s discuss this with below example:-

#import modules
import scipy.stats as stat

#define dataset 1
data1 = [4,7,3,9,12,8,14,10,12,12]

#define dataset 2 by replace last value with 120
data2 = [4,7,3,9,12,8,14,10,12,120]

#calculate standard error of the mean 
result1 = stat.sem(data1)
result2 = stat.sem(data2)

#Print the result
print("The Standard error of the mean for the original dataset: %.3f"%result1)
print("The Standard error of the mean for the repeated dataset : %.3f"%result2)

In the above example, we created the two datasets i.e. data1 & data2 where data2 is created by replacing the last value with 120.

The Output of the above code is shown below:-

#Output
The Standard error of the mean for the original dataset: 1.149
The Standard error of the mean for the repeated dataset : 11.177

We seen that for data1 the SEM is 1.149 and for data2 SEM is 11.177.

It clearly shows that SEM for data2 is larger as compared to data1.

It means the values of data2 are more spread out around the mean as compared to data1.

Conclusion

I hope, you may find how to calculate the Standard Error of the Mean in the python tutorial with a step-by-step illustration of examples educational and helpful.