How to Calculate the Standard Error of the Mean in Python

The Standard Error of the Mean (SEM) estimates how much a sample mean is likely to vary from the actual population mean.

It is used to estimate the approximate confidence intervals for the mean.

In this tutorial, we will discuss two methods you can use to calculate the Standard Error of the Mean in Python, with step-by-step examples.

Standard Error of the Mean Formula

The standard error of the mean for a sample is calculated using the formula below:

Standard error of the mean (SEM) = s / √n

where:

s : sample standard deviation

n : sample size
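
To make the formula concrete, here is a minimal sketch that applies it by hand using only the Python standard library. The sample values are the same ones used in the examples in this tutorial; the variable names are only illustrative.

```python
import math
import statistics

# Sample values (the same dataset used in the examples in this tutorial)
data = [4, 7, 3, 9, 12, 8, 14, 10, 12, 12]

n = len(data)                 # sample size
s = statistics.stdev(data)    # sample standard deviation (divides by n - 1)
sem = s / math.sqrt(n)        # SEM = s / sqrt(n)

print(f"n = {n}, s = {s:.3f}, SEM = {sem:.3f}")
```

For this dataset, s is about 3.635 and n is 10, which gives an SEM of about 1.149.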

Method 1: Use Numpy

NumPy does not have a dedicated SEM function, but its std() function gives the standard deviation, from which the standard error of the mean follows directly.

If you don't have the numpy package installed, install it by running the command below in a terminal or command prompt.

pip install numpy

Example 1: How to calculate SEM in Python

Let's see how to calculate the standard error of the mean (SEM) with the Python code below.

#import modules
import numpy as np

#define dataset
data = np.array([4,7,3,9,12,8,14,10,12,12])

#calculate standard error of the mean 
result = np.std(data, ddof=1) / np.sqrt(np.size(data))

#Print the result
print("The Standard error of the mean : %.3f"%result)

In the above code, we import the NumPy library and use it to define the dataset.

The std() function returns the standard deviation, which we divide by the square root of the sample size to get the standard error of the mean.

Note that we must pass ddof=1 to std() so that it calculates the sample standard deviation instead of the population standard deviation (the default, ddof=0).

The output of the above code is shown below.

#Output 
The Standard error of the mean : 1.149

The Standard error of the mean is 1.149.
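
The effect of ddof is easy to check directly. The sketch below, which assumes the same dataset as above, compares the default np.std() (ddof=0, population formula) with ddof=1 (sample formula) and shows how the choice changes the resulting SEM. The variable names are illustrative.

```python
import numpy as np

data = np.array([4, 7, 3, 9, 12, 8, 14, 10, 12, 12])
n = data.size

# Default ddof=0 divides by n (population standard deviation)
sem_population = np.std(data) / np.sqrt(n)

# ddof=1 divides by n - 1 (sample standard deviation)
sem_sample = np.std(data, ddof=1) / np.sqrt(n)

print(f"SEM with ddof=0: {sem_population:.3f}")
print(f"SEM with ddof=1: {sem_sample:.3f}")
```

With ddof=0 the result is about 1.090, which is smaller than the 1.149 obtained with the sample formula.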

Method 2: Use Scipy

The SciPy library provides the sem() function in its scipy.stats module, which calculates the standard error of the mean directly.

If you don't have the scipy library installed, install it by running the command below in a terminal or command prompt.

pip install scipy

Example 2: How to calculate SEM in Python

Let's assume we have the following dataset:

data = [4,7,3,9,12,8,14,10,12,12]

Let's calculate the standard error of the mean using the Python code below.

#import modules
import scipy.stats as stat

#define dataset
data = [4,7,3,9,12,8,14,10,12,12]

#calculate standard error of the mean 
result = stat.sem(data)

#Print the result
print("The Standard error of the mean : %.3f"%result)

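In the above code, we import the scipy.stats module and define the dataset as a plain Python list.

The sem() function then calculates the standard error of the mean. By default it uses ddof=1, so it is based on the sample standard deviation.

The output of the above code is shown below.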

#Output 
The Standard error of the mean : 1.149
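
As a quick cross-check, the sketch below compares sem() with the manual NumPy calculation from Method 1. Since sem() uses ddof=1 by default, the two results should agree; the ddof keyword can be passed explicitly if the population formula is wanted instead. The variable names are illustrative.

```python
import numpy as np
import scipy.stats as stat

data = [4, 7, 3, 9, 12, 8, 14, 10, 12, 12]

# SciPy: sem() defaults to ddof=1 (sample standard deviation)
sem_scipy = stat.sem(data)

# NumPy: the same quantity computed from the formula s / sqrt(n)
sem_numpy = np.std(data, ddof=1) / np.sqrt(len(data))

# Passing ddof=0 switches sem() to the population standard deviation
sem_scipy_pop = stat.sem(data, ddof=0)

print(f"scipy sem():       {sem_scipy:.3f}")
print(f"numpy formula:     {sem_numpy:.3f}")
print(f"scipy sem(ddof=0): {sem_scipy_pop:.3f}")
```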

How to Interpret the Standard Error of the Mean

Two important factors to keep in mind while interpreting the SEM are as follows:

1 Sample Size: As the sample size increases, the standard error of the mean tends to decrease.

Let's see this with the example below:

#import modules
import scipy.stats as stat

#define dataset 1
data1 = [4,7,3,9,12,8,14,10,12,12]

#define dataset 2 by repeating the first dataset twice
data2 = [4,7,3,9,12,8,14,10,12,12,4,7,3,9,12,8,14,10,12,12]

#calculate standard error of the mean 
result1 = stat.sem(data1)
result2 = stat.sem(data2)

#Print the result
print("The Standard error of the mean for the original dataset: %.3f"%result1)
print("The Standard error of the mean for the repeated dataset : %.3f"%result2)

In the above example, we created two datasets, data1 and data2, where data2 is simply data1 repeated twice.

The output of the above code is shown below:

# Output
The Standard error of the mean for the original dataset: 1.149
The Standard error of the mean for the repeated dataset : 0.791

We see that the SEM is 1.149 for data1 and 0.791 for data2.

This shows that the SEM decreases as the sample size increases.

Both datasets have the same mean and almost the same spread (the sample standard deviation is about 3.63 for data1 and 3.54 for data2), so the drop in SEM comes from the larger n in the denominator of the formula.
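
Repeating a dataset is only a convenient illustration, since the repeated values are not new independent observations. As a hedged alternative, the sketch below draws independent samples of increasing size from a hypothetical normal population and computes the SEM for each. The population mean of 9, the standard deviation of 3.6, the seed and the helper name standard_error are all arbitrary assumptions made for this illustration.

```python
import math
import random
import statistics

random.seed(0)  # arbitrary seed so the run is repeatable

def standard_error(values):
    """Return the standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical population parameters, chosen only for illustration
POP_MEAN, POP_STD = 9.0, 3.6

results = {}
for n in (10, 100, 1000):
    sample = [random.gauss(POP_MEAN, POP_STD) for _ in range(n)]
    results[n] = standard_error(sample)
    print(f"n = {n:4d}  SEM = {results[n]:.3f}")
```

Each tenfold increase in n should shrink the SEM by a factor of roughly √10 ≈ 3.16, in line with the formula.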

2 The Value of SEM: For a given sample size, a larger SEM indicates that the values are more spread out around the mean.

Let's see this with the example below:

#import modules
import scipy.stats as stat

#define dataset 1
data1 = [4,7,3,9,12,8,14,10,12,12]

#define dataset 2 by replacing the last value with 120
data2 = [4,7,3,9,12,8,14,10,12,120]

#calculate standard error of the mean
result1 = stat.sem(data1)
result2 = stat.sem(data2)

#Print the result
print("The Standard error of the mean for the original dataset: %.3f"%result1)
print("The Standard error of the mean for the modified dataset : %.3f"%result2)

In the above example, we created two datasets, data1 and data2, where data2 is data1 with the last value replaced by 120.

The output of the above code is shown below:

#Output
The Standard error of the mean for the original dataset: 1.149
The Standard error of the mean for the modified dataset : 11.177

We see that the SEM is 1.149 for data1 and 11.177 for data2.

The SEM for data2 is much larger than for data1, even though the two datasets have the same size.

This means the values of data2 are more spread out around the mean than those of data1. Here a single extreme value (120) is enough to cause the increase.
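
One common way to put an SEM value to use is an approximate confidence interval for the mean, as mentioned at the start of this tutorial. The sketch below is a minimal illustration that assumes a normal approximation (mean ± z × SEM) and uses only the Python standard library. For a sample as small as this one, a t-based critical value would give a somewhat wider interval. The variable names are illustrative.

```python
import math
import statistics

data = [4, 7, 3, 9, 12, 8, 14, 10, 12, 12]

mean = statistics.mean(data)
sem = statistics.stdev(data) / math.sqrt(len(data))

# z critical value for a two-sided 95% interval (about 1.96)
z = statistics.NormalDist().inv_cdf(0.975)

lower = mean - z * sem
upper = mean + z * sem

print(f"Mean: {mean:.3f}")
print(f"Approximate 95% CI: ({lower:.3f}, {upper:.3f})")
```

A smaller SEM leads to a narrower interval, which is why both the sample size and the spread of the data matter when interpreting it.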

Conclusion

I hope you found this tutorial on calculating the Standard Error of the Mean in Python, with its step-by-step examples, educational and helpful.