Suppose we have a language that consists of just a constant $a$ and a successor function $s(x)$. We know that $p(a)$ is true and that $\forall x: p(x) \to p(s(x))$ is true. Suppose we don't have the induction rule: $p(a)$, $\forall x: p(x) \to p(s(x)) \vdash \forall x: p(x)$. Then we couldn't prove the conclusion $\forall x: p(x)$, could we?
I ask because in some course the prof says: For the language introduced above, our rule of inference [induction rule] is sound. Suppose we know that a schema is true of a and suppose that we know that, whenever the schema is true of an arbitrary ground term τ, it is also true of the term s(τ). Then, the schema must be true of everything, since there are no other terms in the language.
IMO he can't write that, because he is basically claiming that he can prove the soundness of the induction rule. But as far as I understand, you cannot prove the soundness of this rule: you have to assume it as an axiom, and then it follows trivially from that assumption (and of course the assumption could always be wrong, so you could never be sure that the "schema must be true of everything"). Am I right?

You are right. Counterexample to the argument that $\forall x \ p(x)$ logically follows from $p(a)$ and $\forall x (p(x) \to p(s(x)))$:
Take as the domain all non-negative integers, interpret $a$ as $1$, $s$ as the successor function, and $p(x)$ as "$x$ is a positive integer". Then the premises hold: $1$ is positive, and the successor of a positive integer is positive. But since $0$ is in the domain and is not a positive integer, the conclusion does not hold.
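
If it helps to see the interpretation concretely, here is a minimal Python sketch of it. It only checks a finite initial segment of the non-negative integers, so it illustrates the counterexample rather than proving anything; the names `A`, `s`, `p`, and `BOUND` are my own choices for illustration.

```python
# Interpretation from the counterexample above (names are illustrative):
#   domain: non-negative integers (only a finite initial segment is checked here)
#   a  ->  1
#   s  ->  successor function
#   p  ->  "x is a positive integer"

A = 1          # interpretation of the constant a
BOUND = 1000   # assumed cut-off; the real domain is infinite


def s(x):
    """Interpretation of the successor function."""
    return x + 1


def p(x):
    """Interpretation of the predicate p: x is a positive integer."""
    return x > 0


domain = range(BOUND)  # 0, 1, ..., BOUND - 1

# Premise 1: p(a)
base_case = p(A)

# Premise 2: for all x, p(x) -> p(s(x)), checked on the finite segment
step_case = all((not p(x)) or p(s(x)) for x in domain)

# Conclusion: for all x, p(x)
conclusion = all(p(x) for x in domain)

# Elements of the checked segment where p fails
counterexamples = [x for x in domain if not p(x)]

print(base_case, step_case, conclusion, counterexamples)
```

Both premises come out true while the conclusion fails, and the only failing element is $0$, which is not the value of any of the terms $a, s(a), s(s(a)), \dots$ under this interpretation.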