Examples of solving a problem by introducing new parameters


First let's see some examples:

  1. I've seen many examples of computing definite integrals by introducing a suitable parameter, but I still cannot understand why this method is so effective when computing the integral directly would be incredibly difficult.

  2. I remember that in linear algebra we can prove the following fact by introducing a new parameter $x$:

    Fact: For any $A,B\in M_n(\mathbb R),$ we have $\chi_{A\cdot B}(t)=\chi_{B\cdot A} (t).$

Here for any $A\in M_n(\mathbb R),$ $\chi_A(t):=\det(tE-A)$ is the characteristic polynomial of the matrix $A,$ and $E$ is the identity matrix in $M_n(\mathbb R).$

Proof: If at least one of $A,B$ is non-degenerate, say $A$, then we have $$\chi_{AB}(t)=\det(tE-AB)=\det(A\cdot(tE-BA)\cdot A^{-1})=\det(tE-BA)=\chi_{BA}(t).$$
Now suppose both $A$ and $B$ are degenerate. Since the polynomial $f(x)=\det(A+xE)$ has only finitely many roots and $f(0)=\det A=0,$ there exists a deleted neighbourhood $U(0)$ of $0$ such that $f(x)\neq 0,\ \forall\ x\in U(0);$ that is, $A+xE$ is non-degenerate for all $x\in U(0).$

Denote by $\mathbb R[t]$ the ring of real-coefficient polynomials, and notice that each coefficient of $\varphi(x):=\chi_{(A+xE)B}(t)-\chi_{B(A+xE)}(t)$ is a polynomial in $x$, so the map $\varphi:\mathbb R\to \mathbb R[t]$ is continuous (coefficient-wise). By the non-degenerate case, $\varphi(x)=0,\ \forall\ x\in U(0),$ so by continuity $\varphi(0)=0,$ i.e. $\chi_{A\cdot B}(t)=\chi_{B\cdot A} (t).$
End of proof

  3. Another interesting example: https://math.stackexchange.com/q/1395378
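The matrix fact in example 2 is easy to test numerically. The following is a hedged sanity-check sketch (the random singular matrices and seed are my own choices, not part of the original argument); it deliberately uses degenerate matrices, the case handled by the perturbation trick:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Build two singular ("degenerate") matrices by zeroing out a row.
A = rng.standard_normal((n, n)); A[0] = 0.0
B = rng.standard_normal((n, n)); B[0] = 0.0

# np.poly applied to a 2-D array returns the coefficients of det(tI - M).
coeffs_AB = np.poly(A @ B)
coeffs_BA = np.poly(B @ A)

assert np.allclose(coeffs_AB, coeffs_BA)
print("characteristic polynomials of AB and BA agree")
```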

I'm really interested in the motivation of introducing the new parameter and would like to see more examples of solving problems through introducing a new parameter.

Question: Give (as many as possible) examples of solving or simplifying a problem by introducing new parameters. Explanations of why the method works are also welcome.

There are 4 answers below.

---

(I will adopt the convention of giving different responses in different answers, labeled community wiki.)

Here's an example I quite like. The problem is to prove that the following definitions of $e$ are equivalent:

  1. The unique positive real $e$ such that $\displaystyle \ln e = \int_1^e \frac{dt}{t} = 1$.
  2. $\displaystyle \sum_{n=0}^{\infty} \frac{1}{n!}$
  3. $\displaystyle \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n$.

The easiest way to approach this is to generalize it to proving that the following definitions of $e^x$ (introducing the new parameter $x$) are equivalent:

  1. The unique positive real $e^x$ such that $\displaystyle \ln e^x = \int_1^{e^x} \frac{dt}{t} = x$.
  2. $\displaystyle \sum_{n=0}^{\infty} \frac{x^n}{n!}$.
  3. $\displaystyle \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n$.

The reason is that we now have the freedom to differentiate with respect to $x$, and introduce a fourth equivalent definition:

  4. The unique function $e^x$ such that $\displaystyle \frac{d}{dx} e^x = e^x$ and $e^0 = 1$.

Once you've established that there is indeed a unique function with these properties, which is not hard, showing that the other three definitions give the same function amounts to showing that they all satisfy these properties, which just involves differentiating all of the definitions above with respect to $x$.

To my mind, this also explains what the point of $e$ is: it's really $e^x$ that is consistently the star player in mathematics, and $e$ just happens to be its value at $x = 1$.

Another nice exercise from here is to use definition 4 to prove the exponential property $e^{x + y} = e^x e^y$. This can be done from any of the other three definitions, but it's tedious to do directly from definition 2 or 3, although not bad with definition 1.
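For concreteness, definitions 2 and 3 can be compared numerically at a sample point against the standard library's exponential. This is a rough illustration, not part of the proof; the truncation depth and the value of $n$ in the limit form are arbitrary choices:

```python
import math

x = 1.0  # sample point; at x = 1 all forms should approximate e

# Definition 2: truncated power series sum of x^n / n!
series = sum(x**n / math.factorial(n) for n in range(30))

# Definition 3: (1 + x/n)^n for a large n (converges slowly, error ~ O(1/n))
limit_form = (1 + x / 10**7) ** (10**7)

# Reference value (definitions 1/4, as implemented by the standard library)
ref = math.exp(x)

print(series, limit_form, ref)
assert abs(series - ref) < 1e-12
assert abs(limit_form - ref) < 1e-4
```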

---

Shortest path algorithms. The aim is to find the shortest path from $s$ (for source) to some $v$, where $s$ and $v$ are vertices in a weighted directed graph. All the algorithms below solve this problem by way of solving a more general problem.

Examples:

Bellman-Ford: Find the shortest path from vertex $s$ to every other vertex $v$ using at most $k$ edges. The algorithm uses the fact that once the answer is known for $k$, it's easy to find it for $k+1$.

Dijkstra's algorithm: Find the $k$ closest vertices to vertex $s$. Again, if the answer is known for $k$ then it's easier to find it for $k+1$.
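The Bellman-Ford parametrization can be sketched directly in code: the invariant after the $k$-th iteration is "shortest $s\to v$ path using at most $k$ edges". The small example graph below is hypothetical, chosen only to illustrate the update:

```python
import math

def bellman_ford(edges, s, num_vertices):
    """After the k-th iteration, dist[v] is the length of the shortest
    s -> v path using at most k edges.  The answer for k+1 is a cheap
    one-pass update of the answer for k -- the generalization above.

    edges: list of (u, v, weight) triples of a directed graph.
    """
    dist = [math.inf] * num_vertices
    dist[s] = 0.0
    for _ in range(num_vertices - 1):   # k = 1 .. n-1 suffices
        new = dist[:]
        for u, v, w in edges:
            if dist[u] + w < new[v]:
                new[v] = dist[u] + w
        dist = new
    return dist

# Hypothetical graph: 0 -> 1 -> 2 (cheap), plus a heavier direct edge 0 -> 2.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 5.0)]
print(bellman_ford(edges, 0, 3))  # [0.0, 1.0, 2.0]
```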


All pairs shortest path. The aim is to output a matrix that gives the shortest path between every pair of vertices.

Examples:

Floyd-Warshall: Find the shortest path between every pair of vertices, subject to the constraint that all the intermediate vertices in the paths are limited to $\{1,\dotsc,k\}$. Keep increasing $k$. See also Kleene's algorithm for turning a finite automaton into a regular expression.
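Floyd-Warshall makes the auxiliary parameter especially visible: after the $k$-th outer iteration, $d_{ij}$ is the shortest $i\to j$ path whose intermediate vertices all lie in $\{0,\dotsc,k\}$. A minimal sketch (the weight matrix is a made-up example):

```python
import math

def floyd_warshall(w):
    """w: n x n weight matrix (math.inf for missing edges, 0 on the diagonal).

    Invariant: after the k-th outer iteration, d[i][j] is the shortest
    i -> j path whose intermediate vertices are restricted to {0, ..., k}.
    Increasing k is the parametrization described above.
    """
    n = len(w)
    d = [row[:] for row in w]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Either avoid vertex k entirely, or pass through it once.
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

INF = math.inf
w = [[0, 3, INF],
     [INF, 0, 1],
     [1, INF, 0]]
print(floyd_warshall(w))  # [[0, 3, 4], [2, 0, 1], [1, 4, 0]]
```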

---

A fact from the theory of continued fractions:

Let $$\frac{p_n}{q_n} = a_0 + \frac{1}{a_1 + \frac{1}{\ddots \frac{1}{a_n}}} $$ in lowest terms, where $a_0, a_1, \ldots$ are positive integers. Then for each $n$, $$p_n q_{n+1} - p_{n+1} q_n = (-1)^{n+1}.$$ It follows that $$\frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} = \frac{(-1)^n}{q_n q_{n+1}}.$$

My favorite proof of this involves introducing an extra variable $x$ and defining $$ f_n(x) = a_0 + \frac{1}{a_1 + \frac{1}{ \ddots \frac{1}{a_n + x}}}. $$ Then, if you work a few examples, you are naturally led to the conjecture that $f_n(x) = \frac{p_n + p_{n-1} x}{q_n + q_{n-1} x}$. The point of introducing the extra variable $x$ is that it allows you to substitute on the very inside: $$ f_{n+1}(x) = f_n\left(\frac{1}{a_{n+1} + x}\right). $$ This lets you express $p_{n+1}$ and $q_{n+1}$ in terms of $a_{n+1}$, $p_n$, $p_{n-1}$, $q_n$, $q_{n-1}$, and from there it becomes straightforward to show the desired result by induction. Without this trick, working directly with the original continued fraction would require restarting the calculation of $\frac{p_n}{q_n}$ from scratch for each new value of $n$, which would make it very difficult to find relations between the various $p_n$ and $q_n$.

(Actually, once you've done the work above, you're still left with a gap: you don't know whether $\frac{p_n}{q_n}$ is actually a fraction in lowest terms. The equation $p_n q_{n+1} - p_{n+1} q_n = (-1)^{n+1}$ would imply that $p_n$ and $q_n$ are relatively prime; however, at this point, we have what appears to be a circular argument. To correct the circular dependencies, the final proof technically goes: define sequences $p_n$ and $q_n$ by the recursion relations you found using the argument above. Then by induction, $f_n(x) = \frac{p_n + p_{n-1} x}{q_n + q_{n-1} x}$ and $p_n q_{n+1} - p_{n+1} q_n = (-1)^{n+1}$. Now the latter equation implies that $p_n$ and $q_n$ are relatively prime, and therefore we conclude that $\frac{p_n}{q_n} = f_n(0)$ is indeed the lowest terms representation of the $n$th partial term of the continued fraction.)
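Carrying out the substitution $f_{n+1}(x) = f_n\!\left(\frac{1}{a_{n+1}+x}\right)$ on $f_n(x) = \frac{p_n + p_{n-1} x}{q_n + q_{n-1} x}$ yields the recurrences $p_{n+1} = a_{n+1} p_n + p_{n-1}$ and $q_{n+1} = a_{n+1} q_n + q_{n-1}$, which can be checked against the determinant identity directly. A small verification sketch (the sample partial quotients, those of the golden ratio and of $\sqrt 2$, are my own choices):

```python
from math import gcd

def convergents(a):
    """Given partial quotients a = [a0, a1, ...] (len >= 2), return the
    sequences (p, q) defined by the recurrences derived from the
    substitution f_{n+1}(x) = f_n(1/(a_{n+1}+x)):
        p_{n+1} = a_{n+1} p_n + p_{n-1},  q_{n+1} = a_{n+1} q_n + q_{n-1}.
    """
    p = [a[0], a[1] * a[0] + 1]
    q = [1, a[1]]
    for k in range(2, len(a)):
        p.append(a[k] * p[-1] + p[-2])
        q.append(a[k] * q[-1] + q[-2])
    return p, q

# Partial quotients of the golden ratio and of sqrt(2), respectively.
for a in ([1, 1, 1, 1, 1, 1], [1, 2, 2, 2, 2, 2]):
    p, q = convergents(a)
    for n in range(len(p) - 1):
        assert p[n] * q[n + 1] - p[n + 1] * q[n] == (-1) ** (n + 1)
        assert gcd(p[n], q[n]) == 1   # hence p_n/q_n is in lowest terms
print("determinant identity verified")
```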

---

In Littlewood's highly entertaining "A Mathematician's Miscellany", on page 21, he writes about a theorem due to M. Riesz for which an elegant proof (G. Thorin, "Convexity Theorems", Communications du seminaire mathematique de l'Universite de Lund, 9) involves nested upper bounds, where the innermost upper bound is taken with respect to a variable that is not there.

In the proof, an upper bound with respect to a real $\sigma$ has $\sigma$ replaced by $s+it$ and a new upper bound is inserted that is taken with respect to $t$.