I have an electric heater and close to it a temperature sensor. And I could use some help with controlling the heater.
The amount of power supplied to the heater can be changed at any moment. Let $0 \le u(t) \le 1$ be the relative amount of power supplied to the heater at time $t$.
Let $y(t)$ be the temperature of the sensor minus the environment temperature.
By doing some modelling, measurements, curve fitting and die-hard mathematics, I came to the conclusion that the relation between $u(t)$ and $y(t)$ can be approximately described using a transfer function. $$ \begin{align} \hat{f}(s) &= \int_0^\infty f(t)e^{-st} \ \mathrm{d}t \qquad \text{(Laplace transform)} \\ G(s) &= \frac{0.595}{s^3+0.489 s^2+0.0613 s+0.000859} \qquad \text{(transfer function)} \\ \hat{y}(s) &= G(s)\hat{u}(s) \end{align} $$ Eventually I want to use a feedback mechanism to control $u(t)$ in order to let $y(t)$ reach a target temperature. For now, I'm interested in open loop control. Let's assume the model above is not an approximation, but is exactly correct.
If $y(0) = 0$ and a target temperature $y_T > 0$ is given, what function $u(t)$ should I choose to minimize the error $E$ while keeping $y(t) \le y_T$ at all times? $$ E = \int_0^\infty y_T - y(t) \ \mathrm{d}t $$ I know that eventually the steady state must be reached, so: $$ \lim_{t \to \infty} u(t) = \frac{y_T}{G(0)} $$ For example, with target $y_T = 200$, I could use this function: $$ u(t) = \begin{cases} 1 & t < 21.33 \\ \frac{y_T}{G(0)} \approx 0.29 & \text{otherwise} \end{cases} $$ I got the number $21.33$ by trying out different numbers and picking the highest number that does not make $y(t)$ exceed $y_T$. Here is the plot of $y(t)$:
However, from more experiments I know this is not the optimal solution. The area between the blue and the orange line, $E$, can be made smaller.

The optimal control input that solves the optimal control problem
$$\min_{u(\cdot)\in[0,1]}\int_0^T(y_T-y(s))ds$$ such that $y(t)\le y_T$ for all $t\in[0,1]$ does not have a closed-form expression because the optimal control input is a bang-bang control since the optimization problem is linear. A direct numerical optimization approach with discretization step of 10ms yields the following solution
where the control input is in blue (and multiplied by 100 to make it easily readable) and the corresponding temperature trajectory is in blue. The minimum cost is given by 3641.9, up to solution accuracy. Things are starting to get a bit off a bit before 40s because the input starts taking intermediary values while it should not. This is likely a numerical issue.
If we reduce the discretization step to 1ms, we get the following results
where we can observe the expected chattering behavior for the control input. This chattering phase has an average value of 0.2887 which is equal to $y_T/G(0)$.
In the end, OP is not too far from the optimal with their simple control strategy.
This is work under progress and may not be completely correct. I will update it when I find a bit of time.
However, it may still be possible to find an implicit characterization of the optimal control input using Pontryagin's Maximum Principle.
To do so, let us consider the system in state-space form $(A,B,C,D)$. By the Maximum Principle we get that necessary conditions are given by
$$ \begin{array}{rcl} \dot{x}^*(t)&=&Ax^*(t)+Bu^*(t),\ x^*(0)=x_0\\ \dot{\lambda}^*(t)&=&-A^T\lambda^*(t)+C^T-\mu(t)C^T,\ \lambda^*(T)=0 \end{array} $$ where $\mu(t)$ is the Lagrange multiplier associated with the constraint $y(t)-y_T\le0$.
We also have the condition that $$y_T-Cx^*(t)+(Ax^*(t)+Bu^*(t))^T\lambda^*(t)+\mu(t)(y(t)-y_T)\le y_T-Cx^*(t)+(Ax^*(t)+Bu(t))^T\lambda^*(t)+\mu(t)(y(t)-y_T)$$ for all $u(t)\in[0,1]$. This means that
$$ u^*(t)=\left\{ \begin{array}{rcl} 0,&&\mathrm{if\ }B^T\lambda^*(t)>0\\ 1,&&\mathrm{if\ }B^T\lambda^*(t)<0\\ ?,&&\mathrm{if\ }B^T\lambda^*(t)=0 \end{array} \right. $$ where the question mark indicates that further analysis is needed in this case (singular arcs).
We have that
$$ \lambda^*(t)=-\int_{0}^{T-t}\exp(A^Ts)C^Tds+\int_t^T\exp(-A^T(t-s)C^T\mu(s)ds. $$
Therefore, we have that $$ \begin{array}{rcl} B^T\lambda^*(t)&=&\displaystyle B^T\left(-\int_{0}^{T-t}\exp(A^Ts)C^Tds+\int_t^T\exp(-A^T(t-s)C^T\mu(s)ds.\right)\\ % &=&\displaystyle -\int_0^{T-t}h(s)ds+\int_0^{T-t}h(s)\mu(t+s)ds\\ &=&\displaystyle \int_0^{T-t}h(s)(\mu(t+s)-1)ds. \end{array} $$
Now we face the difficulty of computing $\mu$ which must satisfy the complementary slackness condition of the Karush-Kuhn-Tucker conditions, that is, we must have that
$$\mu(t)(y(t)-y_T)=0,$$ for all $t\in[0,T]$.
In other words, we have to solve $\mu(\cdot)$ in the system
$$ \begin{array}{rcl} \dot{x}^*(t)&=&Ax^*(t)+Bu^*(t),\ x^*(0)=x_0\\ u^*(t)&=&\left\{ \begin{array}{rcl} 0,&&\mathrm{if\ }\int_0^{T-t}h(s)(\mu(t+s)-1)ds>0\\ 1,&&\mathrm{if\ }\int_0^{T-t}h(s)(\mu(t+s)-1)ds<0\\ ?,&&\mathrm{if\ }\int_0^{T-t}h(s)(\mu(t+s)-1)ds=0 \end{array}\right.\\ \mu(t)(Cx(t)-y_T)&=&0. \end{array} $$
That does not look really solvable analytically. I am currently working on it.