Here's an excerpt from my notes:
Define the likelihood function: $$\mathcal{L}(\vec{x};\theta)=\prod_{i=1}^{n} f(x_i;\theta)$$
where $f$ is the pdf of the distribution from which the $x_i$ are sampled. Caution: the likelihood function $\mathcal{L}(\vec{x};\theta)$ is not the same as the joint pdf $L(\vec{x};\theta)$.
Could someone explain to me where this "caution" comes from? Is it purely because the samples were not assumed independent?
The likelihood function $\mathcal L$ is not the same as the joint pdf $L$ because they are functions of different variables. The likelihood function $\mathcal L$ is a function of the parameter $\theta$ for a fixed value of the observations $\vec{x}$, whereas $L$ is a function of the observations $\vec{x}$ for a fixed value of the parameter $\theta$. This is why I prefer to write $\mathcal{L}(\theta;\vec{x})$ instead of $\mathcal{L}(\vec{x};\theta)$.
The general definition of the likelihood function when $\vec{x}$ admits a joint pdf is actually in terms of that joint pdf: $$ \mathcal{L}(\theta;\vec{x}):= L(\vec{x};\theta) $$ but in the case of independent samples this simplifies to $\prod\limits_{i=1}^n f(x_i;\theta)$.
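To make the "same formula, different variable" point concrete, here is a minimal sketch of my own (not from the notes), using i.i.d. Normal$(\theta, 1)$ samples: the code evaluates $\prod_i f(x_i;\theta)$ with the observations held fixed and $\theta$ varying, which is exactly the likelihood reading, and the resulting function of $\theta$ peaks at the sample mean.

```python
import math

def normal_pdf(x, theta, sigma=1.0):
    """pdf f(x; theta) of a Normal(theta, sigma^2) distribution."""
    return math.exp(-(x - theta) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def likelihood(theta, xs):
    """L(theta; xs): product of f(x_i; theta) over the *fixed* sample xs."""
    prod = 1.0
    for x in xs:
        prod *= normal_pdf(x, theta)
    return prod

xs = [1.9, 2.1, 2.3]  # fixed observations; theta is the free variable now
thetas = [i / 10 for i in range(0, 41)]  # grid of candidate theta values
best = max(thetas, key=lambda t: likelihood(t, xs))
print(best)  # 2.1 -- the sample mean, as expected for a normal model
```

Reading the same code with `theta` fixed and `xs` varying would instead give the joint pdf $L(\vec{x};\theta)$, i.e. a density that integrates to $1$ over $\vec{x}$; the likelihood, by contrast, need not integrate to $1$ over $\theta$.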