In every algebra (or basic analysis) book that I've seen, three properties of real numbers are taken as axiomatic: commutativity, associativity, and distribution of multiplication over addition [$a(b + c) = ab + ac$].
What's bothered me for a long time is that while combining like terms [$ax + bx = (a + b)x$] is equivalent to distribution, it seems more basic and fundamental. It's used as an additive process (instead of a multiplicative one); it seems commonsensical in terms of unit addition (3 feet + 5 feet = 8 feet), which is referred to by some as the "great principle of similitude"; and so forth.
So what is the rationale for taking distribution as axiomatic, and proving combination afterward? Why is it not better pedagogy to take combination as fundamental, and then prove distribution from it?
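For reference, here is a short sketch of the equivalence mentioned above. It assumes only commutativity of multiplication, and the letters $x, y, z$ are just placeholders. Taking combination, $ax + bx = (a + b)x$, as the starting point, distribution follows:

$$
\begin{aligned}
x(y + z) &= (y + z)x && \text{(commutativity)} \\
         &= yx + zx && \text{(combination, read right to left)} \\
         &= xy + xz && \text{(commutativity, applied to each term)}
\end{aligned}
$$

The same three steps run in the other direction recover combination from distribution, so neither statement is logically stronger; the choice of which to call the axiom is one of presentation.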
(Edit) I've cross-posted this question on the Mathematics Educators site: https://matheducators.stackexchange.com/questions/9601/why-is-distribution-prioritized-over-combining
(The question was cross-posted to MESE; below, I cross-post my answer, as well.)
This grew a bit long for a comment.
(My first note is similar to Alexander Woo's remark about factoring polynomials; perhaps he intended "polynomials" to subsume the case here, in which we add constant functions...)
Given $413 + 91$, it may not be clear that this can be re-written as $7 \times 59 + 7 \times 13 = 7(59+13)$.
(Plenty of people seem to believe that $91$ is prime; an earlier comment cites J. Conway as observing as much, and I would guess it is related to memorizing the $10\times10$ or even $12\times12$ times tables.)
Meanwhile, $7(59+13)$ can have the $7$ distributed mechanically.
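As a quick check of the arithmetic in this example, both routes give the same value:

$$
\begin{aligned}
413 + 91 &= 504, \\
7(59 + 13) &= 7 \times 72 = 504, \\
7 \times 59 + 7 \times 13 &= 413 + 91 = 504.
\end{aligned}
$$

Going from the last line to the second requires spotting the common factor $7$, whereas going from the second to the last requires no insight at all.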
Still, viewing an equation from both perspectives is certainly important. I think what you observe here is not altogether different from the tendency to write $1 + 1 = 2$ significantly more often than $2 = 1 + 1$, which is known to lead students to view the equal sign, $=$, operationally (as an operator) rather than relationally; cf. MESE 7964 and the nice response of D. Hast.
You did ask for some rationale; perhaps we can look historically to Euclid's Elements. If we are to believe the translation provided here, then we have Euclid remarking in the same order. A similar translation is found, e.g., in the following source:
A note from the aforecited:
Finally, a separate but related comment: I observe a prioritizing in the presentation of the difference of squares and their factorization, i.e., more often than not I see: $a^2 - b^2 = (a+b)(a-b)$.
This phenomenon seems to prioritize "combining" over "distribution," and is often accompanied by the same for the sum and difference of cubes. Anyhow, there are consequences in this case, as well. Few students consider such a property in computing, e.g., $47 \times 53$.
(Ask students to compute that product in a few ways and see if it even comes up!)
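For concreteness, here is how the difference of squares, read from right to left, handles that product (taking $a = 50$ and $b = 3$):

$$
\begin{aligned}
47 \times 53 &= (50 - 3)(50 + 3) \\
             &= 50^2 - 3^2 \\
             &= 2500 - 9 \\
             &= 2491.
\end{aligned}
$$

A student who has only ever seen $a^2 - b^2$ on the left-hand side is unlikely to recognize $47 \times 53$ as an instance of the right-hand side.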
I am confident that what matters most is for students to see that an equality conveys the same mathematical information regardless of which expression is on which side, and for them to have the opportunity to consider both presentations and their various ramifications.