As I understand it, when we have a parametric pdf and need to estimate its parameter from some observed data, we tend to choose a conjugate prior for that parameter. A conjugate prior has the nice property of ensuring that the posterior is of the same form as the prior, which makes the calculation simple.
But how can we know that the posterior and the prior must be of the same type? Can we really pick the conjugate prior just for the sake of computational simplicity?
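For concreteness, here is the kind of conjugacy I have in mind, written out for the standard Beta-Binomial pair (just an illustrative example, not the particular model I am working with):
$$
p(\theta) = \mathrm{Beta}(\theta \mid \alpha, \beta), \qquad p(x \mid \theta) = \mathrm{Binomial}(x \mid n, \theta),
$$
$$
p(\theta \mid x) \;\propto\; \theta^{x}(1-\theta)^{n-x}\,\theta^{\alpha-1}(1-\theta)^{\beta-1} \;=\; \theta^{(\alpha+x)-1}(1-\theta)^{(\beta+n-x)-1},
$$
so the posterior is again a Beta distribution, $\mathrm{Beta}(\alpha+x,\ \beta+n-x)$. My doubt is whether restricting the prior to a family with this closure property is justified by anything other than convenience.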
BTW, as I read more about probability and statistical inference, I find it more subjective than objective... are we getting closer to the secret of God, or are we just engaging in some self-amusement?
The question, in my opinion, extends to the debate of "Bayesian vs. classical (or frequentist) statistics". According to Wikipedia, there are two major differences between the frequentist and Bayesian approaches to inference:
In a frequentist approach to inference, unknown parameters are often, but not always, treated as having fixed but unknown values that are not capable of being treated as random variates in any sense, and hence there is no way that probabilities can be associated with them. In contrast, a Bayesian approach to inference does allow probabilities to be associated with unknown parameters, where these probabilities can sometimes have a frequency probability interpretation as well as a Bayesian one. The Bayesian approach allows these probabilities to have an interpretation as representing the scientist's belief that given values of the parameter are true.
While "probabilities" are involved in both approaches to inference, the probabilities are associated with different types of things. The result of a Bayesian approach can be a probability distribution for what is known about the parameters given the results of the experiment or study. The result of a frequentist approach is either a "true or false" conclusion from a significance test or a conclusion in the form that a given sample-derived confidence interval covers the true value: either of these conclusions has a given probability of being correct, where this probability has either a frequency probability interpretation or a pre-experiment interpretation.
So, returning to your question, the main reason for using a Bayesian approach is to model the personal beliefs of the scientist (assuming they have a reason to hold these beliefs prior to the experiment). In my opinion, the use of a conjugate prior is justified mainly in cases where a standard distribution is known from past experiments and the scientist wants to update the values of its parameters. For example, assume that from the data of the past decades it can be inferred that the global temperature follows a normal distribution with mean $\mu$. Now the scientist, instead of testing the hypothesis that global temperature is rising (vs. not rising), can use a Bayesian approach to update the parameter $\mu$. Simplicity of calculation is desirable, and I do not think the cost to the model is so big, since there is indeed a great variety of conjugate priors to choose from.
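To make the temperature example concrete, here is a minimal sketch (with made-up numbers and a hypothetical data set) of the conjugate update for a Normal mean when the observation variance is assumed known; the closed-form posterior is exactly what the conjugacy buys you:

```python
import numpy as np

# Illustrative numbers only: prior belief about the mean global temperature (deg C).
# A Normal prior on mu combined with a Normal likelihood (known variance sigma^2)
# is conjugate, so the posterior for mu is again Normal.
mu0, tau0_sq = 14.0, 0.5**2                 # prior mean and prior variance of mu
sigma_sq = 0.3**2                           # observation variance, assumed known
data = np.array([14.4, 14.6, 14.5, 14.7])   # hypothetical yearly measurements

n, xbar = len(data), data.mean()

# Conjugate update: posterior precision = prior precision + n / sigma^2,
# posterior mean = precision-weighted average of prior mean and sample mean.
post_var = 1.0 / (1.0 / tau0_sq + n / sigma_sq)
post_mean = post_var * (mu0 / tau0_sq + n * xbar / sigma_sq)

print(f"posterior for mu: Normal(mean={post_mean:.3f}, var={post_var:.4f})")
```

No new functional form appears: the scientist only moves the prior's two numbers to their updated values, which is exactly the "update the parameters of a known distribution" situation described above.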
In cases where there is no evidence about the underlying distribution, it is indeed audacious to simply choose a conjugate prior and just update its parameters. It is like the mistake we make in early math exercises when we have to prove that something holds and say: "Assume that what we have to show holds. Then we can infer that it indeed holds!"