I was implementing Bishop's predictive distribution (Section 1.2.6) over the weekend and stumbled upon an oddity.
Here is an equation for your convenience: $$p(t|x,\mathbf{x},\mathbf{t})=\mathcal{N}(t|m(x),s^2(x)) \tag{1}$$
where $m(x)$ and $s^2(x)$ can be found here (note that I have corrected equation (1.72), as there was a typo in older printings of Bishop's book).
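In case the link goes stale, the corrected equations from §1.2.6 (as they appear in later printings) are, with $\phi(x) = (x^0, x^1, \ldots, x^M)^{\mathsf T}$:

$$m(x)=\beta\,\phi(x)^{\mathsf T}\mathbf{S}\sum_{n=1}^{N}\phi(x_n)\,t_n \tag{1.70}$$

$$s^2(x)=\beta^{-1}+\phi(x)^{\mathsf T}\mathbf{S}\,\phi(x) \tag{1.71}$$

$$\mathbf{S}^{-1}=\alpha\mathbf{I}+\beta\sum_{n=1}^{N}\phi(x_n)\,\phi(x_n)^{\mathsf T} \tag{1.72}$$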
I would like to walk through the steps I took to arrive at the "jagged line", in the hope that you can spot an error, because I am fairly certain the code I have written is robust.
1) I first generated $\mathbf{x}$ along with $\mathbf{t}$ (note that for the random noise I used normal random variables with variance $\beta^{-1}=\sigma^2=1/11.1$):
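My generating code isn't shown, so here is a minimal sketch of this step; the target function $\sin(2\pi x)$ and the sample size are my assumptions (Bishop's running example), not something stated above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup (Bishop Ch. 1): N inputs on [0, 1], targets are
# sin(2*pi*x) plus Gaussian noise with variance 1/beta = 1/11.1.
beta = 11.1                      # noise precision, sigma^2 = 1/11.1
N = 10                           # hypothetical sample size
x_train = np.linspace(0, 1, N)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(0, np.sqrt(1 / beta), N)
```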
2) I then generated the data for which I wanted to predict mean $t$ as given by (1):
Now there are two things I have tried:
1) I iterated through each new observation above, supplying each new $x$ and $t$ to my code to calculate the mean, keeping $\mathbf{x}$ fixed throughout all iterations (so I was not appending new data to either $\mathbf{x}$ or $\mathbf{t}$). And I get very strange results (which may suggest something):

Here is the other thing I tried, and this is where I get the jagged line. 1) Iterate through each new $(x,t)$, compute $m(x)$ and $s^2(x)$, and at the start of the next iteration append the previous $x$ and $t$ to $\mathbf{x}$ and $\mathbf{t}$. When I plot, I get this:
Looks bad...
2) However, if I rerun the algorithm on the $\mathbf{x}$ and $\mathbf{t}$ from above (i.e. with the new data now incorporated into $\mathbf{x}$) and repeat the procedure, I get this:

Which is what I want on the first run, not the second. Any ideas?



Without seeing your code, what I will say is that your plots appear to be correct, and that any problems lie in extrapolating these polynomial models into regions where you have no training data, and in the interpretation of what the jagged-line plot displays.
To help explain, I simulated a data set similar to the one you are working with. Here is a fit on the full data, which corresponds to your final figure, and everything looks good.
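Here is a minimal sketch of how such a full-data fit can be computed from Bishop's equations (1.70)–(1.72). The degree-9 basis and the prior precision `alpha` are illustrative choices of mine (the values I believe Bishop uses around Figure 1.17), and the noise-free targets are just to keep the sketch short:

```python
import numpy as np

def predictive(x_new, x_train, t_train, M=9, alpha=5e-3, beta=11.1):
    """Predictive mean m(x) and variance s^2(x), Bishop eqs (1.70)-(1.72).
    M and alpha are illustrative; beta matches the stated noise precision."""
    phi = lambda x: np.vander(np.atleast_1d(x), M + 1, increasing=True)
    Phi = phi(x_train)                                     # rows are phi(x_n)^T
    S = np.linalg.inv(alpha * np.eye(M + 1) + beta * Phi.T @ Phi)  # eq (1.72)
    Phi_new = phi(x_new)
    m = beta * Phi_new @ S @ Phi.T @ t_train               # eq (1.70)
    s2 = 1 / beta + np.sum(Phi_new @ S * Phi_new, axis=1)  # eq (1.71)
    return m, s2

# Fit on the full training set, evaluated on a dense grid for plotting.
x_tr = np.linspace(0, 1, 10)
t_tr = np.sin(2 * np.pi * x_tr)      # noise-free targets, to keep this short
grid = np.linspace(0, 1, 100)
m, s2 = predictive(grid, x_tr, t_tr)
```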
Now, going back to the "jagged line" plot, what you seem to be doing here is the following:
However, the mean at every point $x \in \mathbf{x}$ will change as soon as you add the new training pair $(x,t)$ to $\mathbf{x},\mathbf{t}$. So what is the jagged line showing? It is the collection of means, $m(x_n)$, each conditional only on the training data up to $x_{n-1}$, that is $$ y_n = m(x_n) \;|\; (x_1,\ldots,x_{n-1}, t_1, \ldots, t_{n-1} ). $$ There is no reason to expect the sequence $(y_1,\ldots,y_n,\ldots)$ to vary smoothly, because adding a new point changes the mean prediction over the whole range. In the next plot, the red line is $\{ y_n \}$, equivalent to what you displayed in your fourth plot; I have also plotted the predicted mean over all of the data points, recalculated after updating the training set to include the points up to $x_n$, shown as the (smooth) dotted black lines. Note how the end point of each black line is equal to the red line at $x_n$.
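To make the distinction concrete, here is a self-contained sketch of the two quantities: the jagged sequence $y_n$, and one fully recomputed mean curve. The training set, basis degree, and `alpha` are all my assumptions:

```python
import numpy as np

def mean_pred(x_query, x_tr, t_tr, M=9, alpha=5e-3, beta=11.1):
    # Predictive mean m(x) = phi(x)^T w, where
    # w = (alpha*I + beta*Phi^T Phi)^{-1} beta*Phi^T t  (Bishop 1.70/1.72).
    Phi = np.vander(x_tr, M + 1, increasing=True)
    w = np.linalg.solve(alpha * np.eye(M + 1) + beta * Phi.T @ Phi,
                        beta * Phi.T @ t_tr)
    return np.vander(np.atleast_1d(x_query), M + 1, increasing=True) @ w

rng = np.random.default_rng(1)
xs = np.sort(rng.uniform(0, 1, 20))
ts = np.sin(2 * np.pi * xs) + rng.normal(0, np.sqrt(1 / 11.1), 20)

# Red "jagged" line: y_n = m(x_n), conditioned only on points 1..n-1.
jagged = np.array([mean_pred(xs[n], xs[:n], ts[:n])[0] for n in range(2, 20)])

# One smooth black curve: the mean recomputed over the whole range after
# seeing the first 10 points; it agrees with the jagged line only at x_10.
grid = np.linspace(0, 1, 200)
curve_after_10 = mean_pred(grid, xs[:10], ts[:10])
```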
Finally, with regards to the explosive behaviour: that is simply the poor behaviour of these polynomials when extrapolating far outside the region for which you have data, and I have plotted another example displaying the same phenomenon. In summary, as far as I can tell, what you are doing is working as intended.