Basic question on regression analysis


I am a mixed methods researcher with no formal training in statistics/math. I produced the plot below (using seaborn in Python). It is based on a topic model and shows that the number of strong topics (topics with probability above a given threshold) increases over time (the x-axis). I interpret this as topic complexity increasing over time.

If I presented this plot and the above analysis to an expert (stat/math) audience, what would be the obvious problems or issues?

[Figure: plot of the number of strong topics against time]


Best answer

The basic questions would be about the estimated parameters (the intercept and the slope) and their standard errors. Next would probably be the model's $R^2$ or adjusted $R^2$. In fact, you can answer all of the basic questions simply by presenting the standard output of a linear regression procedure.
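As a rough illustration of what that standard output contains, here is a minimal sketch in Python using only NumPy. The data is synthetic and made up for illustration (it is not the data from the question), and the names `simple_ols`, `time`, and `n_topics` are hypothetical.

```python
import numpy as np


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares and return summary statistics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_mean, y_mean = x.mean(), y.mean()

    # Least-squares estimates of the slope and the intercept.
    sxx = np.sum((x - x_mean) ** 2)
    slope = np.sum((x - x_mean) * (y - y_mean)) / sxx
    intercept = y_mean - slope * x_mean

    # Residual and total sums of squares.
    residuals = y - (intercept + slope * x)
    rss = np.sum(residuals ** 2)
    tss = np.sum((y - y_mean) ** 2)

    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sigma2 = rss / (n - 2)
    se_slope = np.sqrt(sigma2 / sxx)
    se_intercept = np.sqrt(sigma2 * (1.0 / n + x_mean ** 2 / sxx))

    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)

    return {
        "intercept": float(intercept),
        "se_intercept": float(se_intercept),
        "slope": float(slope),
        "se_slope": float(se_slope),
        "r2": float(r2),
        "adj_r2": float(adj_r2),
    }


# Synthetic stand-in for (time, number of strong topics); an assumption, not real data.
rng = np.random.default_rng(0)
time = np.arange(50)
n_topics = 2.0 + 0.1 * time + rng.normal(scale=2.0, size=time.size)

for name, value in simple_ols(time, n_topics).items():
    print(f"{name}: {value:.4f}")
```

Regression routines such as the OLS summary in statsmodels report these same quantities, so in practice you would normally quote that table rather than compute the values by hand.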

The next issue might be the residual plots. However, since your data has only two dimensions (one predictor and one response), a separate residual plot adds little: the validity of the model can largely be judged from the given plot itself.
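If someone in the audience asks about the residuals anyway, a quick numerical check is easy to produce. The following is only a sketch under assumptions: the data is synthetic, the helper `residual_summary` is a hypothetical name, and comparing the residual spread in the early and late halves of the time axis is just one simple way to look for changing variance, not a formal test.

```python
import numpy as np


def residual_summary(x, y):
    """Fit a straight line and summarise its residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # np.polyfit with deg=1 returns the coefficients as [slope, intercept].
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (intercept + slope * x)

    # Split the residuals into the early and late halves of the x range.
    order = np.argsort(x)
    half = x.size // 2
    early = residuals[order[:half]]
    late = residuals[order[half:]]

    return {
        "mean_residual": float(residuals.mean()),
        "sd_early": float(early.std(ddof=1)),
        "sd_late": float(late.std(ddof=1)),
    }


# Synthetic stand-in for (time, number of strong topics); an assumption, not real data.
rng = np.random.default_rng(1)
time = np.arange(60)
n_topics = 2.0 + 0.1 * time + rng.normal(scale=2.0, size=time.size)

for name, value in residual_summary(time, n_topics).items():
    print(f"{name}: {value:.4f}")
```

A mean residual near zero is guaranteed by the fit itself; the more informative part is whether the two standard deviations are of similar size, since a large difference would suggest that the noise level changes over time.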

That said, if I were in the audience at your presentation, I would probably ask why you have not considered more explanatory variables. Your data is very noisy, so the $R^2$ will be low. If you just want to show that the number of "strong topics" increases over time, then reporting the estimated parameters with their standard errors will be enough.