I am reading a text that makes the following argument.
Supervised learning can be done by choosing the hypothesis $h^*$ that is most probable given the data: $h^* = \underset{h \in H}{\operatorname{argmax}}\, P(h \mid \text{data})$.
By Bayes' rule, this is equivalent to
$h^* = \underset{h \in H}{\operatorname{argmax}}\, P(\text{data} \mid h)\, P(h)$.
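The step between the two expressions can be written out as follows. This is only a sketch of the standard argument; $P(\text{data})$ denotes the marginal probability of the data, a symbol the text itself does not use.

$$
h^* = \underset{h \in H}{\operatorname{argmax}}\, P(h \mid \text{data})
    = \underset{h \in H}{\operatorname{argmax}}\, \frac{P(\text{data} \mid h)\, P(h)}{P(\text{data})}
    = \underset{h \in H}{\operatorname{argmax}}\, P(\text{data} \mid h)\, P(h)
$$

The denominator $P(\text{data})$ does not depend on $h$, so dropping it does not change which hypothesis attains the maximum.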
Then we can say that the prior probability $P(h)$ is high for a degree-1 or degree-2 polynomial and lower for a degree-7 polynomial.
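To make the role of such a prior concrete, here is a minimal Python sketch. It assumes a hypothetical prior that halves with each extra degree, $P(h) \propto 2^{-\text{degree}}$, and made-up log-likelihood values. Neither comes from the text, and the function names `degree_prior` and `map_degree` are illustrative only.

```python
import math

def degree_prior(max_degree=7, base=2.0):
    """Hypothetical prior over polynomial degree: P(h) proportional to base**(-degree).

    The decay rate is an assumption chosen for illustration; the text only
    says that low degrees get a higher prior than high degrees.
    """
    weights = {d: base ** (-d) for d in range(1, max_degree + 1)}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}

def map_degree(log_likelihood, prior):
    """Return the degree maximizing log P(data | h) + log P(h)."""
    return max(prior, key=lambda d: log_likelihood[d] + math.log(prior[d]))

prior = degree_prior()

# Made-up log-likelihoods: higher degrees fit the data slightly better.
log_likelihood = {1: -12.0, 2: -10.5, 3: -10.4, 4: -10.3,
                  5: -10.2, 6: -10.1, 7: -10.0}

ml_best = max(log_likelihood, key=log_likelihood.get)   # likelihood only
map_best = map_degree(log_likelihood, prior)            # likelihood times prior

print("maximum-likelihood degree:", ml_best)
print("MAP degree:", map_best)
```

With these assumed numbers, the likelihood alone favors the degree-7 polynomial, while the product $P(\text{data} \mid h)\,P(h)$ favors degree 2, because the small gain in fit does not outweigh the lower prior.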
I do not understand why $P(h)$ is higher for a degree-1 polynomial than for a degree-7 polynomial.