By linearity of $f^*$, $$f^*\left(\sum_I a_I \, dx_I\right) = \sum_I f^* (a_I \, dx_I)$$
And if I want $df_I$, I would have to use the formula $f^*dx_i = df_i$. So, according to my computation, $f^*$ disappears once $df_I$ is introduced:
$$f^*\left(\sum_I a_I \, dx_I\right) = \sum_I a_I (f^*dx_I) = \sum_I a_I \, df_I.$$
But according to the book, it is still there: $$f^*\left(\sum_I a_I \, dx_I\right) = \sum_I (f^*a_I) \, df_I.$$
Can anyone demystify it for me?
By linearity of $f^*$, $$f^*\left(\sum_I a_I \, dx_I\right) = \sum_I f^* (a_I \, dx_I)$$
To obtain $df_I$, apply the formula $f^*dx_i = df_i$ to each factor of $dx_I$:
$$f^*\left(\sum_I a_I \, dx_I\right) = \sum_I (f^* a_I) (f^*dx_I) = \sum_I (f^* a_I) \, df_I.$$
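Note: $f^*$ acts on the coefficient $a_I$ as well, since $f^*(a_I \, dx_I) = (f^* a_I)(f^* dx_I)$ with $f^* a_I = a_I \circ f$. This is the step that the computation in the question leaves out.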
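
As a quick sanity check, here is a small worked example; the map $f$ and the form $\omega$ are chosen purely for illustration and are not taken from the book. Take $f:\mathbb{R}\to\mathbb{R}$, $f(t) = t^2$, and $\omega = x \, dx$, so the only coefficient is $a = x$. Then
$$f^*\omega = (f^*a) \, df = (x \circ f)(t) \, d(t^2) = t^2 \cdot 2t \, dt = 2t^3 \, dt.$$
This agrees with the direct computation: $\omega = d(x^2/2)$, and $d\big((t^2)^2/2\big) = 2t^3 \, dt$. Leaving the coefficient as $a = x$ would instead give $x \cdot 2t \, dt$, which is not a form on the domain, because $x$ is a coordinate on the target. Pulling back $a$ is exactly what rewrites it in terms of $t$.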