Why is it okay to treat $(\mathrm d x)^n=0$ for any $n$ greater than $1$? I can understand that $\mathrm d x$ is infinitesimally small (but greater than $0$), so its square or cube should be approximately equal to $0$, not exactly $0$.
But if that is so, how can we expect the results obtained from calculus (like the slope of, or the area under, a curve) to be exact rather than approximate?
I have also noticed some anomalies: $\sqrt{(\mathrm d x)^2 + (\mathrm d y)^2}$ is $0$, but $\mathrm d x\sqrt{1+(\mathrm d y/\mathrm d x)^2}$ is not $0$, even though these two expressions are apparently the same. Moreover, we can claim that
$$(\mathrm d x)^2=(\mathrm d x)^3=(\mathrm d x)^4 = \cdots = 0$$
which is quite hard to believe.
Can you help me figure out the logic behind these things?

In standard analysis there are no infinitesimals. $dx$ is merely an element of syntax used in expressing $\frac{df}{dx}$ and $\int f(x)\, dx$. Instead, everything gets defined in terms of bounds on real numbers. In particular, limits are defined in terms of bounds on real numbers, which gets you derivatives and integrals. In this setting, a situation where you would see $dx^2$ if you were using infinitesimals might be differentiating $x^2$: there you find $\frac{(x+h)^2-x^2}{h}=2x+h$. This $h$ term is not zero, but if $x$ is not zero and $h$ is going to zero, then it is much smaller than the $2x$ to which it is being added. That is, the leading-order term of $(x+h)^2$ is $x^2$; the first-order correction is $2xh$.
Much of calculus is purely concerned with leading-order terms and first-order corrections, and much of the rest confines attention to second-order corrections. Despite this, if you had $h^k$ by itself for some large integer $k$, you would not think of it as actually being zero; you only neglect it when it is being added to something much larger than itself. Thus in the infinitesimal language you shouldn't really think of $dx^2$ as being zero, but rather as so much smaller than $dx$ that $dx+dx^2$ can be treated like $dx$. (In particular, under normal circumstances $\sqrt{(dx)^2+(dy)^2}$ can be interpreted as $|dx| \sqrt{1+(dy/dx)^2}$.)
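Indeed, the two expressions from the question agree by ordinary algebra (assuming $dx \neq 0$, so we may factor $(dx)^2$ out of the square root):
$$\sqrt{(dx)^2+(dy)^2} = \sqrt{(dx)^2\left(1+\left(\frac{dy}{dx}\right)^2\right)} = |dx|\,\sqrt{1+\left(\frac{dy}{dx}\right)^2}.$$
The result is infinitesimal but not zero; it would collapse to $0$ only if $dx$ itself were exactly $0$.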
This infinitesimal language can be formalized, resulting in theories which are broadly referred to as nonstandard analysis. There are basically two ways to do this: include nilpotent infinitesimals or don't. One way to do the former is smooth infinitesimal analysis (SIA): this system contains nonzero numbers some power of which is zero. For instance, for a nilsquare infinitesimal $dx$ (one satisfying $(dx)^2=0$), you have $f(x+dx)=f(x)+f'(x)\,dx$ as an exact equality in SIA.
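For example, take $f(x)=x^2$ and a nilsquare infinitesimal $dx$, so that $(dx)^2=0$ holds exactly:
$$f(x+dx) = (x+dx)^2 = x^2 + 2x\,dx + (dx)^2 = x^2 + 2x\,dx = f(x) + f'(x)\,dx.$$
The derivative $2x$ is read off algebraically, with no limit taken anywhere.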
SIA is a somewhat foreign theory, for at least two reasons. First, some finesse with logic is required to make it work without contradictions. You can't formulate SIA in classical logic, where it is inconsistent, because (as you hinted at) one can use the law of excluded middle and the field axioms to prove that $(dx)^2=0$ implies $dx=0$. Intuitionistic logic dodges this issue. Second, SIA, as the name suggests, describes a "smooth universe": all the functions in it are infinitely differentiable. Standard analysis deals with less regular functions quite routinely.
The main system containing infinitesimals all of whose powers are nonzero is hyperreal analysis. Hyperreal analysis is suited to describe exactly the same things as standard analysis, in a certain precise and very strong sense (the transfer principle). Rather than using nilpotent infinitesimals to implement things like linear approximation, hyperreal analysis uses the "standard part" operation, which takes a number with an ordinary real part and an infinitesimal part and "discards" the infinitesimal part.
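For instance, with a nonzero infinitesimal $\varepsilon$, the derivative of $x^2$ in the hyperreal setting is
$$\operatorname{st}\!\left(\frac{(x+\varepsilon)^2 - x^2}{\varepsilon}\right) = \operatorname{st}(2x+\varepsilon) = 2x,$$
where $\operatorname{st}$ discards the infinitesimal part at the end, rather than $\varepsilon^2$ ever being set to zero along the way.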
I only mention these so that you know that there is some power beyond just intuition in the use of infinitesimals. Nevertheless I would strongly encourage you to learn the meaning of everything in the standard framework.
Revising based on the bounty commentary: first of all, one should not view $\sqrt{dx^2+dy^2}$ (intuitively the length of an infinitesimal line segment) as being zero. It is exactly the same as $|dx| \sqrt{1+(dy/dx)^2}$. (We might need the absolute value because $x$ might rise or fall along the path.) A more general way to handle this is to parametrize the curve by an additional variable $t$, so that $\sqrt{dx^2+dy^2}=dt \sqrt{(dx/dt)^2 + (dy/dt)^2}$. Now $t$ only increases (by our choice), so no absolute value is required.
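For instance, parametrizing the unit circle by $x=\cos t$, $y=\sin t$ for $t \in [0, 2\pi]$ gives $dx = -\sin t\,dt$ and $dy = \cos t\,dt$, so
$$\sqrt{dx^2+dy^2} = dt\,\sqrt{\sin^2 t + \cos^2 t} = dt,$$
and integrating $dt$ from $0$ to $2\pi$ recovers the circumference $2\pi$, with no absolute values needed even though $x$ both falls and rises along the path.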
As for writing $dx+dx^2 \approx dx$, it really depends on the context. With derivatives, the whole point is not to write the function down exactly; it's all about linear approximation. Thus, for instance, when I write $(x+h)^2 \approx x^2+2xh$, I am doing so because I don't want to pay attention to terms of higher order than $h$: those first two terms (the largest ones, if $h$ is small enough) are enough for whatever purpose I have.
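Concretely, with $x=1$ and $h=0.01$:
$$(1.01)^2 = 1.0201 \approx 1 + 2(0.01) = 1.02,$$
and the discrepancy is exactly $h^2 = 10^{-4}$, an order smaller than the $2xh$ term we kept.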
On the other hand, a basic philosophy in calculus and (standard) analysis is that one can prove that two things are equal by proving that they are arbitrarily close together. So, to follow your example, when you expand out a proof that $\int_0^\pi \sin(x)\, dx = 2$, you might show that for each $\epsilon>0$ there is a lower sum for $\int_0^\pi \sin(x)\, dx$ which is at least $2-\epsilon$ and an upper sum which is at most $2+\epsilon$. The partition depends on $\epsilon$, and that dependence is exactly where the "limit" operation is hidden. (In practice we don't do this; we just use the fundamental theorem of calculus, but that theorem is itself proven in this fashion.)
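As a numerical illustration of this squeeze (a quick sketch, not part of the proof; the function and the partition size $n$ are my own choices for demonstration):

```python
import math

def darboux_sums(f, a, b, n, crit=()):
    """Lower and upper Darboux sums of f on [a, b] over a uniform
    partition with n subintervals. `crit` lists interior critical
    points of f, so the sup/inf on each subinterval can be found by
    checking its endpoints and any critical points it contains
    (valid here because sin is concave on [0, pi])."""
    h = (b - a) / n
    lower = upper = 0.0
    for i in range(n):
        x0, x1 = a + i * h, a + (i + 1) * h
        values = [f(x0), f(x1)] + [f(c) for c in crit if x0 < c < x1]
        lower += min(values) * h  # inf of f on the subinterval
        upper += max(values) * h  # sup of f on the subinterval
    return lower, upper

# sin has one interior critical point on [0, pi], at pi/2.
lower, upper = darboux_sums(math.sin, 0.0, math.pi, 1000, crit=(math.pi / 2,))
# lower <= 2 <= upper, and the gap between them shrinks as n grows.
```

Increasing $n$ tightens both bounds around $2$; the role played by $n$ here is exactly the role played by the $\epsilon$-dependent partition in the proof.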