I'm presently in Computational Statistics. I'm struggling a bit with truly understanding the concepts of estimators and estimates. I understand the difference on a simple level: an estimator is a function used to generate estimates, while an estimate is a single value, such as $\hat{\theta}$, as opposed to the actual value $\theta$. I've taken Statistics and must admit to not fully grasping the topic at that time.
I have this homework problem: Given observations $Y_1,...,Y_n$ described by the relationship $Y_i = \theta x_i^2 + \epsilon_i$, where $x_1,...,x_n$ are fixed constants (observed values) and $\epsilon_1,...,\epsilon_n$ are i.i.d. $N(\theta, \sigma^2)$.
I'm asked to first find the least squares estimator for this model. I do not fully understand what I'm truly being asked to do. The textbook, Computational Statistics by Givens and Hoeting, seems to use the terms estimator and estimate interchangeably. I see that, because the observations are independent and identically distributed, the likelihood function is the product of the individual p.d.f.s. But now enters my lack of understanding: how are estimators tied or linked to the likelihood function? What is it I'm really calculating?
When I took Foundations of Analysis, the author of that textbook stated that most students have difficulty with topics until they have fully "internalized" them. I think this is my struggle here: internalization. What is it that I'm truly being asked to do?
I'll give a formal definition of these notions. Let $X$ be an observed random element with values in a measurable space $(\mathcal{X},\mathcal{H})$ ($\mathcal{X}$ is called the sampling space). Let $\{P_{\theta}:\theta\in\Theta\}$ be a family of probability measures on $(\mathcal{X},\mathcal{H})$; typically $\Theta$ is a subset of $\mathbb{R}^d$. We assume that the distribution of $X$ is $P_{\theta}$ for some unknown $\theta$. Given a realization $x$ of $X$, one wants to estimate the unknown parameter $\theta$.
A measurable function $T:(\mathcal{X},\mathcal{H})\to (\mathcal{Y},\mathcal{G})$ is called a statistic. When $\mathcal{Y}=\Theta$, the statistic $T$ becomes an estimator. The random variable $T(X)$ is also referred to as an estimator of $\theta$. For a specific realization $x$ of $X$ one obtains an estimate $T(x)$.
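To make the estimator/estimate distinction concrete, here is a minimal sketch in Python. It is my own toy example, not part of your problem: the sample mean as an estimator of the mean $\theta$ of a $N(\theta,1)$ distribution, and the single number it produces on one realized data set as the estimate.

```python
import random

# Toy illustration (my own example): the sample mean
# T(x) = (x_1 + ... + x_n)/n is an *estimator* of the mean theta of a
# N(theta, 1) distribution. It is a function of the data, not a number.
def T(sample):
    return sum(sample) / len(sample)

random.seed(0)
theta = 2.0                                  # true (normally unknown) parameter
data = [random.gauss(theta, 1.0) for _ in range(1000)]  # one realization x of X

estimate = T(data)                           # a single number: the *estimate*
print(estimate)                              # close to 2.0, but not exactly 2.0
```

The point of the example: `T` itself is the estimator; the value `estimate` obtained by applying `T` to one realized data set is the estimate.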
In your case the observed random element is $X=\{(Y_i,x_i)\}_{i=1}^n$. The least squares estimator of $\theta$ is given by $$ T(X)=\underset{b\in \Theta}{\operatorname{argmin}}\left(\sum_{i=1}^n(Y_i-bx_i^2)^2\right). $$
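Because the objective above is quadratic in $b$, setting its derivative to zero gives the closed form $T(X)=\sum_{i=1}^n x_i^2 Y_i \big/ \sum_{i=1}^n x_i^4$. Here is a short simulation sketch of that estimator; the design points, the value of $\theta$, and the mean-zero noise are my own illustrative choices, not part of your problem (with the $N(\theta,\sigma^2)$ noise as stated, the same formula applies but the estimate would be biased for $\theta$).

```python
import random

# For the model Y_i = theta * x_i^2 + eps_i, the least-squares objective
# sum_i (Y_i - b * x_i^2)^2 is quadratic in b; the minimizer is
#   b_hat = sum_i x_i^2 * Y_i / sum_i x_i^4.
def least_squares_estimator(xs, ys):
    num = sum(x**2 * y for x, y in zip(xs, ys))
    den = sum(x**4 for x in xs)
    return num / den

random.seed(1)
theta = 3.0                                     # true parameter (illustrative)
xs = [0.1 * i for i in range(1, 51)]            # fixed design points (my choice)
# mean-zero noise used here for simplicity, instead of the N(theta, sigma^2)
# noise in the problem statement
ys = [theta * x**2 + random.gauss(0.0, 0.5) for x in xs]

theta_hat = least_squares_estimator(xs, ys)     # the estimate on this data set
print(theta_hat)                                # should be close to 3.0
```

Again, `least_squares_estimator` is the estimator $T$; `theta_hat` is the estimate $T(x)$ for this particular realization.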