Consider the following panel data regression model:
$$y_{it}=X_{it}\beta+\alpha_{i}+u_{it},$$ where $\alpha_{i}$ denotes the individual-specific nuisance parameters, $i$ indexes individuals and $t$ indexes time periods. Stacked over all observations, $y_{it}$ is an $nt\times1$ vector of the dependent variable, $X_{it}$ is an $nt\times k$ matrix of regressors, $\beta$ is a $k\times1$ vector of coefficients to be estimated, and $u_{it}$ is an $nt\times1$ vector of error terms assumed to be orthogonal to the regressors, conditional on $\alpha_{i}$. When we estimate this panel data model by fixed effects, we include a dummy variable for every individual, effectively removing the nuisance parameters. This is equivalent to demeaning the data and estimating $(y_{it}-\bar{y}_{i})=(X_{it}-\bar{X}_{i})\beta+(u_{it}-\bar{u}_{i})$. In a sense, we are 'controlling' for time-invariant individual unobservables in order to get unbiased estimates of our parameters of interest. Intuitively, are the parameter estimates we obtain equivalent to estimating the coefficients separately for each individual and then averaging over all individuals?
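For concreteness, the equivalence between the dummy-variable regression and the demeaned regression can be checked numerically. The following is only a minimal sketch on simulated data: the panel dimensions, the true coefficients, and all variable names (`ids`, `D`, `demean`, and so on) are assumptions made for illustration, and the panel is assumed to be balanced.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t, k = 5, 8, 2                 # hypothetical panel dimensions (balanced)
beta = np.array([1.0, -0.5])      # hypothetical true coefficients

ids = np.repeat(np.arange(n), t)  # individual index for each stacked row
alpha = rng.normal(size=n)        # individual effects
# Regressors are shifted by alpha_i, so they are correlated with the effects
X = rng.normal(size=(n * t, k)) + alpha[ids, None]
y = X @ beta + alpha[ids] + rng.normal(scale=0.1, size=n * t)

# Dummy-variable regression: y on X plus one dummy per individual
D = (ids[:, None] == np.arange(n)).astype(float)
coef_lsdv = np.linalg.lstsq(np.hstack([X, D]), y, rcond=None)[0][:k]

def demean(a):
    """Subtract individual means (valid for a balanced panel with t periods)."""
    return a - D @ (D.T @ a / t)

# Within regression: OLS on the demeaned data
coef_within = np.linalg.lstsq(demean(X), demean(y), rcond=None)[0]

print(coef_lsdv)
print(coef_within)
```

The two printed coefficient vectors agree up to floating-point error, which is the Frisch-Waugh-Lovell result applied to the individual dummies.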
If the true model has homogeneity across the units, i.e. in the models $$y_{it}=X_{it}\beta_i+\alpha_i+u_{it}$$ we have $$\beta_i=\beta,$$ then the pooled mean group estimator, i.e. pooling everything and estimating a single $\beta$, is, up to an error that vanishes statistically as $(N,T)$ grow, the same as estimating each equation separately and then averaging (the mean group estimator). Take a look at the 1999 JASA paper by Pesaran et al. It is a paper on dynamic models, but it gives you a lot of insight!
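One way to see why the two approaches are close but not numerically identical: writing $\tilde{X}_i$ and $\tilde{y}_i$ for the demeaned data of unit $i$ and $\hat{\beta}_i=(\tilde{X}_i'\tilde{X}_i)^{-1}\tilde{X}_i'\tilde{y}_i$ for the unit-by-unit estimate, the pooled within estimator is a matrix-weighted average of the $\hat{\beta}_i$ with weights $\tilde{X}_i'\tilde{X}_i$, whereas the mean group estimator weights every unit equally. The sketch below illustrates this on simulated data with a homogeneous $\beta$; the dimensions, the true coefficients, and all names are assumptions chosen for illustration, not anything taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, t, k = 50, 40, 2               # hypothetical number of units and periods
beta = np.array([1.0, -0.5])      # hypothetical homogeneous slope vector

W_sum = np.zeros((k, k))          # sum of X_i'X_i over units (demeaned)
Xy_sum = np.zeros(k)              # sum of X_i'y_i over units (demeaned)
Wb_sum = np.zeros(k)              # sum of W_i @ b_i over units
b_all = []                        # unit-by-unit slope estimates

for i in range(n):
    alpha_i = rng.normal()
    X_i = rng.normal(size=(t, k)) + alpha_i      # correlated with alpha_i
    y_i = X_i @ beta + alpha_i + rng.normal(size=t)

    Xd = X_i - X_i.mean(axis=0)   # demean within unit i
    yd = y_i - y_i.mean()
    W_i = Xd.T @ Xd
    b_i = np.linalg.solve(W_i, Xd.T @ yd)        # separate regression for unit i

    W_sum += W_i
    Xy_sum += Xd.T @ yd
    Wb_sum += W_i @ b_i
    b_all.append(b_i)

beta_fe = np.linalg.solve(W_sum, Xy_sum)         # pooled within (fixed effects)
beta_weighted = np.linalg.solve(W_sum, Wb_sum)   # matrix-weighted average of b_i
beta_mg = np.mean(b_all, axis=0)                 # mean group: simple average

print(beta_fe, beta_weighted, beta_mg)
```

Here `beta_fe` and `beta_weighted` coincide up to floating-point error, while `beta_mg` differs from them slightly in any finite sample; under homogeneity that gap shrinks as the panel grows.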