Suppose I need to remove the outlier, which is (40, 10) in this case (see the plot below), using the IQR rule. How do I do that?
Compared to the neighbouring points, (40, 10) is definitely an outlier. However,
Q1 = 11.25,
Q3 = 35.75
1.5 * IQR = 1.5 * (Q3 - Q1) = 36.75
Only points with a y-value lower than 11.25 - 36.75 = -25.5 or greater than 35.75 + 36.75 = 72.5 are considered outliers, so (40, 10) is not flagged.
How do I find and remove (40, 10) if I must use the IQR rule?
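The arithmetic above can be checked directly. This is a minimal sketch that rebuilds the same `test` data frame and applies the IQR rule to the raw y-values; the names `lower`, `upper` and `flagged` are only illustrative, and the quartiles assume pandas' default linear interpolation.

```python
import pandas as pd

# Same data as in the question: y equals x, except that y = 10 at x = 40.
test = pd.DataFrame({'x': range(50), 'y': [i if i != 40 else 10 for i in range(50)]})

# Quartiles of the raw y-values (pandas' default linear interpolation).
q1, q3 = test['y'].quantile([0.25, 0.75])
iqr = q3 - q1

# Fences of the 1.5 * IQR rule.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Rows whose y-value falls outside the fences.
flagged = test[(test['y'] < lower) | (test['y'] > upper)]

print(q1, q3, lower, upper)
print(len(flagged), "rows flagged")
```

No row falls outside the fences, which is the difficulty described here: the rule is being applied to the marginal distribution of y, where 10 is an ordinary value.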
Here's my code:
import pandas as pd
import matplotlib.pyplot as plt
test = pd.DataFrame({'x': range(50), 'y': [i if i != 40 else 10 for i in range(50)]})
plt.figure()
plt.scatter(test['x'], test['y'], marker='x')
plt.show()
Here's the plot generated by the above code. Please look at the plot; the question does not make sense without it.

@Henry is correct. The point you show is not an outlier among the $x$s nor among the $Y$s. It is an outlier among the residuals from the regression line of $Y$ on $x.$
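For the data in the question, that idea might be sketched in Python as follows. This is only one possible reading of "use the IQR rule": it fits an ordinary least-squares line with `numpy.polyfit` and applies the 1.5 * IQR fences to the residuals rather than to y. The names `resid`, `is_outlier`, `outliers` and `cleaned` are my own.

```python
import numpy as np
import pandas as pd

# Data from the question.
test = pd.DataFrame({'x': range(50), 'y': [i if i != 40 else 10 for i in range(50)]})

# Least-squares line of y on x; for degree 1, polyfit returns [slope, intercept].
slope, intercept = np.polyfit(test['x'].to_numpy(), test['y'].to_numpy(), 1)

# Residuals: vertical distances from the fitted line.
resid = test['y'] - (slope * test['x'] + intercept)

# IQR rule applied to the residuals instead of the raw y-values.
q1, q3 = resid.quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (resid < q1 - 1.5 * iqr) | (resid > q3 + 1.5 * iqr)

outliers = test[is_outlier]   # rows flagged by the rule
cleaned = test[~is_outlier]   # data with the flagged rows removed

print(outliers)
```

Because the other points lie almost exactly on the line, their residuals are tiny and tightly packed, so the residual at x = 40 falls far below the lower fence. One caveat: the outlier also pulls the fitted line a little, so in noisier data a robust fit may be preferable.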
I do not have access to your data, so here is a somewhat similar illustration using data simulated in R, along with a regression analysis and a boxplot of the residuals.
Generate data for regression according to the model $Y_i = 3x_i + 10 + e_i,$ where $e_i$ are IID $\mathsf{Norm}(0, \sigma), \sigma = 5.$ An outlier from the regression line is introduced as point $(80,50).$
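A rough Python analogue of this data-generation step is sketched below. Several details are assumptions, not taken from the simulation above: the design points $x_i = 1, 2, \dots, 100,$ the seed, and the choice to introduce the outlier by overwriting the response at $x = 80$ rather than appending a point.

```python
import numpy as np

rng = np.random.default_rng(2024)  # arbitrary seed, for reproducibility only

n = 100
x = np.arange(1, n + 1, dtype=float)        # assumed design: x = 1, 2, ..., 100
e = rng.normal(loc=0.0, scale=5.0, size=n)  # IID Norm(0, sigma = 5) errors
y = 3 * x + 10 + e                          # model: Y_i = 3 x_i + 10 + e_i

# Introduce the outlier (80, 50) by overwriting the response at x = 80
# (an assumed mechanism; the point could equally be appended).
idx = int(np.where(x == 80)[0][0])
y[idx] = 50.0

print(x[idx], y[idx])
```

Under the model the expected response at $x = 80$ is $3(80) + 10 = 250,$ so the point $(80, 50)$ sits about 200 units below the true line.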
The left panel of the figure below shows the $n=100$ points. Subsequently, the regression line is plotted through the data.
Important information about the regression of $Y$ on $x:$ Notice the extremely negative minimum residual, at about $-196.$
In the regression equation $Y_i = \alpha x_i + \beta + e_i,$ the estimate of slope $\alpha$ is $\hat\alpha = 2.9251$ (close to $3),$ the estimate of the $y$-intercept $\beta$ is $\hat \beta = 12.3146$ (close to $10),$ and $\sigma$ is estimated by the residual standard error $\hat\sigma = 20.81$ (well above $5,$ because the single residual near $-196$ alone contributes about $196^2/98 \approx 392$ to $\hat\sigma^2).$ The artificially introduced outlier thus interferes only slightly with estimation of the slope and intercept, but it inflates the estimate of $\sigma.$ The t tests show that neither slope nor intercept is $0.$
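The quantities in such a regression summary can be computed by hand. The sketch below does so in Python with NumPy only, on data simulated under my own assumptions ($x_i = 1, \dots, 100,$ arbitrary seed, outlier written over the response at $x = 80$), so its numbers will not match the ones quoted above.

```python
import numpy as np

# Simulated data (assumed design and seed; see the lead-in).
rng = np.random.default_rng(2024)
n = 100
x = np.arange(1, n + 1, dtype=float)
y = 3 * x + 10 + rng.normal(0.0, 5.0, n)
y[79] = 50.0  # x[79] == 80, so this is the outlier (80, 50)

# Least-squares estimates; polyfit returns [slope, intercept] for degree 1.
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

# Residual standard error: sqrt(RSS / (n - 2)).
rss = np.sum(resid ** 2)
s = np.sqrt(rss / (n - 2))

# Standard errors and t statistics for H0: slope = 0 and H0: intercept = 0.
sxx = np.sum((x - x.mean()) ** 2)
se_slope = s / np.sqrt(sxx)
se_intercept = s * np.sqrt(1.0 / n + x.mean() ** 2 / sxx)
t_slope = slope / se_slope
t_intercept = intercept / se_intercept

print(f"slope={slope:.4f}  intercept={intercept:.4f}  s={s:.2f}")
print(f"t_slope={t_slope:.2f}  t_intercept={t_intercept:.2f}")
print(f"min residual={resid.min():.1f} at x={x[resid.argmin()]:.0f}")
```

Since each residual enters the residual sum of squares squared, one very large residual can dominate the residual standard error even when slope and intercept are barely affected.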
In the left panel below, the (blue) regression line $\hat Y_i = \hat\alpha x_i + \hat\beta$ is plotted through the data. Residuals $r_i = Y_i - (\hat\alpha x_i + \hat \beta)$ show vertical distances between each of the points and the regression line. Values of the $n=100$ residuals are stored in the vector `r`.
The right panel below shows a boxplot of the 100 residuals. Our artificially introduced outlier-residual is shown at the bottom of the boxplot. The procedure `boxplot.stats` prints out the value of this residual.
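A Python counterpart of that last step might look like the sketch below. It is not a port of R's `boxplot.stats`: that function uses Tukey's hinges, whereas this sketch uses quartiles from `numpy.percentile`, so the fences can differ slightly. The simulated data rest on the same assumptions as before ($x_i = 1, \dots, 100,$ arbitrary seed, outlier written over the response at $x = 80$).

```python
import numpy as np

# Simulated data (assumed design and seed; see the lead-in).
rng = np.random.default_rng(2024)
n = 100
x = np.arange(1, n + 1, dtype=float)
y = 3 * x + 10 + rng.normal(0.0, 5.0, n)
y[79] = 50.0  # outlier at (80, 50)

# Residuals from the least-squares line.
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

# 1.5 * IQR fences on the residuals (quartiles, not Tukey's hinges).
q1, q3 = np.percentile(resid, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

flag = (resid < lower) | (resid > upper)
out = resid[flag]                   # residuals a boxplot would draw as outliers
x_clean, y_clean = x[~flag], y[~flag]  # data with the flagged points removed

print("flagged residuals:", np.sort(out))
print("flagged x values:", x[flag])
```

With normal errors the rule can occasionally flag an ordinary point as well, so it is worth looking at what was flagged before deleting anything.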