The definition of an orthogonal system of functions on $[a,b]$ given in Baby Rudin is the following:
$\textbf{8.10 Definition}$ Let $\{\phi_n\}_{n \in \mathbb{N}}$ be a sequence of complex functions on $[a,b]$, such that $$\int_a^b \phi_n(x)\overline{\phi_m(x)}\,dx = 0 \qquad (n \neq m).$$ Then $\{\phi_n\}$ is said to be an orthogonal system of functions on $[a,b]$. ...
I am wondering why the conjugate is needed on the second function in the integral, i.e., $\overline{\phi_m(x)}$, and what the meaning behind it is. Why is orthogonality defined this way?
I don't have a formal education in complex analysis, so I would greatly appreciate an explanation at an undergraduate level, before complex analysis.
To add some details to the argument originally suggested by @user247327, I am posting this as an answer.
As with vectors, we can call two functions $f$ and $g$ "orthogonal" when their inner product (the function analogue of the dot product), written $\langle f, g\rangle$, is $0$.
And when $f=g$, $\langle f, g\rangle=\langle f, f\rangle=\|f\|^2$ is the square of the norm of the function.
So the definition in your book says that functions $f$ and $g$ are orthogonal when $\langle f, g\rangle=0$.
So your question can be interpreted as: why is the inner product built from $f\overline{g}$ rather than $fg$?
That's because the squared norm of a complex number $z = a+bi$ should be $a^2+b^2$, not ${(a+bi)}^2$. So we use $z \bar z$ instead of $z^2$.
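As a quick check of this point on concrete values (both chosen only for illustration), take $z = 1 + i$:

$$z^2 = (1+i)^2 = 2i, \qquad z\bar z = (1+i)(1-i) = 1^2 + 1^2 = 2.$$

Only the second is a nonnegative real number, so only it can play the role of a squared length. The same thing happens for functions. With $f(x) = e^{ix}$ on $[0, 2\pi]$,

$$\int_0^{2\pi} f(x)^2\,dx = \int_0^{2\pi} e^{2ix}\,dx = 0, \qquad \int_0^{2\pi} f(x)\overline{f(x)}\,dx = \int_0^{2\pi} 1\,dx = 2\pi,$$

so without the conjugate a function that is nowhere zero would end up with "norm" $0$.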
Then what does $\langle f, g\rangle=0$ mean once $\overline g$ is used in place of $g$?
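You can see that easily by summing $f(x) \overline{g(x)}$ over $x$.
Let's discretize $x$ to be intuitive.
So let $x_i$ be the $i$th sample value of $x$; if we treat the integral like a dot product over these values, we get:
$$ \langle f, g\rangle= \lim_{n \to \infty} { \sum_{i=1}^n {f(x_i) \overline{g(x_i)}}\frac 1n}$$
(up to a constant factor, the length of the interval, which does not affect whether the result is $0$).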
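The limit of sums above can be tried numerically. The following is a minimal sketch, not something taken from Rudin: the helper names `inner` and `phi` are made up for illustration, the test functions $e^{inx}$ on $[0, 2\pi]$ are an assumed example, and the step width is taken as $(b-a)/n$ so that the sum approximates the integral itself.

```python
import cmath
import math

def inner(f, g, a, b, n=10_000, conjugate=True):
    """Left Riemann sum approximating the integral of f * conj(g) over [a, b].

    With conjugate=False it approximates the integral of f * g instead,
    to show what goes wrong without the conjugate.
    """
    h = (b - a) / n          # width of each subinterval
    total = 0j
    for k in range(n):
        x = a + k * h        # k-th sample point x_k
        gx = g(x)
        total += f(x) * (gx.conjugate() if conjugate else gx)
    return total * h

def phi(n):
    """Hypothetical example system: phi_n(x) = exp(i n x)."""
    return lambda x: cmath.exp(1j * n * x)

a, b = 0.0, 2 * math.pi
print(abs(inner(phi(1), phi(2), a, b)))                   # close to 0: orthogonal
print(inner(phi(1), phi(1), a, b).real)                   # close to 2*pi: squared norm
print(abs(inner(phi(1), phi(1), a, b, conjugate=False)))  # close to 0 for a nonzero function
```

The last line is the interesting one: without the conjugate, the nonzero function $e^{ix}$ would be "orthogonal to itself", which is why the definition puts the bar on the second factor.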
And if we let $$f(x_i)=a_i+b_i i, \quad g(x_i)=c_i+d_i i, \qquad \text{where } a_i, b_i, c_i, d_i \in \mathbb R,$$
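it suffices to expand each term of the sum:
$$f(x_i)\overline{g(x_i)} = (a_i+b_i i)(c_i-d_i i) = (a_ic_i + b_id_i) + (b_ic_i-a_id_i)i.$$
So to finish, a complex number is $0$ exactly when its real part and its imaginary part are both $0$. Hence $\langle f, g\rangle = 0$ says that, in the limit $n \to \infty$,
$$\sum_{i=1}^n (a_ic_i + b_id_i)\frac 1n \to 0, \qquad \sum_{i=1}^n (b_ic_i - a_id_i)\frac 1n \to 0.$$
The real part $a_ic_i + b_id_i$ is exactly the ordinary dot product of $(a_i, b_i)$ and $(c_i, d_i)$ as vectors in the plane, and that is what the conjugate is there to produce.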
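As a sanity check of this expansion on concrete numbers (chosen arbitrarily for illustration), take $a_i = 1$, $b_i = 2$, $c_i = 2$, $d_i = -1$, i.e. $f(x_i) = 1+2i$ and $g(x_i) = 2-i$:

$$(1+2i)\overline{(2-i)} = (1+2i)(2+i) = \bigl(1\cdot 2 + 2\cdot(-1)\bigr) + \bigl(2\cdot 2 - 1\cdot(-1)\bigr)i = 5i.$$

The real part is $1\cdot 2 + 2\cdot(-1) = 0$, the usual dot product of the plane vectors $(1,2)$ and $(2,-1)$. Without the conjugate, the product would instead be $(1+2i)(2-i) = 4+3i$, whose real part no longer matches that dot product.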