I cannot figure this out myself. Suppose I have some distribution $f(y; \theta)$ from the exponential family. Following Wikipedia (https://en.wikipedia.org/wiki/Exponential_family), let us assume this form:
$ f(y; \theta) = h(y)\exp\big[\eta(\theta)T(y) - A(\theta)\big] $
The article defines the canonical form as follows:
If $\eta(\theta) = \theta$, then the exponential family is said to be in canonical form.
Then, it adds the part I do not get:
By defining a transformed parameter $\eta = \eta(\theta)$, it is always possible to convert an exponential family to canonical form.
I do not get what they do with the parameters $\theta$. For example, take the Normal distribution with unknown mean $\mu$ AND unknown variance $\sigma^2$. The article gives exactly this example right afterwards, where they define:
$\eta = \big(\frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2}\big)$
and then conclude that the form is canonical. Why is that? They initially define the normal as $f(y; \mu, \sigma^2)$! According to the previous definition, it is clear that $\eta(\theta) \neq (\mu, \sigma^2)$! Why do they call it a canonical form then?
In fact, the same seems to hold for almost all distributions: when written in canonical form, their parameter vector $\eta$ is clearly not equal to the parameters in their usual definition.
Could someone please clarify?
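For concreteness, this is my own sketch of the algebra behind that example. I am assuming $T(y) = (y, y^2)$ and $h(y) = \frac{1}{\sqrt{2\pi}}$, and I write the product $\eta(\theta)T(y)$ as $\eta_1 y + \eta_2 y^2$; this is my reading of the article, not a quote from it:

$$
\begin{aligned}
f(y; \mu, \sigma^2) &= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big[-\frac{(y-\mu)^2}{2\sigma^2}\Big] \\
&= \frac{1}{\sqrt{2\pi}} \exp\Big[\frac{\mu}{\sigma^2}\, y - \frac{1}{2\sigma^2}\, y^2 - \frac{\mu^2}{2\sigma^2} - \log\sigma\Big] \\
&= h(y) \exp\big[\eta_1 y + \eta_2 y^2 - A(\eta)\big], \qquad A(\eta) = -\frac{\eta_1^2}{4\eta_2} - \frac{1}{2}\log(-2\eta_2)
\end{aligned}
$$

So with $\eta_1 = \frac{\mu}{\sigma^2}$ and $\eta_2 = -\frac{1}{2\sigma^2}$ the exponent is linear in $\eta$, but $\eta$ is a function of $(\mu, \sigma^2)$, not equal to it.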