The question I'm asking might be rather simple, but I couldn't find relevant information (maybe it's too trivial?). Here's the question that baffled me.
Let $f:X\rightarrow Y$ and $g:Y\rightarrow Z$ be functions. If $g$ and $g \circ f$ are invertible, is $f$ also invertible?
Now, the reason I'm confused is that I'm currently learning set theory. I'm using the textbook "Introduction to Set Theory" by Karel Hrbacek and Thomas Jech. In the book, the composite function is defined as follows:
$g \circ f = \{(x, y) \mid \exists z\,(f(x)=z \land g(z)=y)\}$, where $\textrm{dom}(g \circ f) = \textrm{dom } f \cap f^{-1}[\textrm{dom } g]$.
Notice that we only need the intermediate $z$ to find the elements of the composite function, and only its domain is specified (nothing is said about a codomain). Now, consider the case where $f(1)=1$ and $g(k)=k$ for all natural numbers $k$ less than or equal to a certain natural number $n$.
In this case, clearly $g$ is bijective, hence invertible. The problem is the composite function. Since we defined only the domain of a composite function, the domain of the composite function in this case is {1} and the range is {1}.
Now, should we regard this range as the codomain, so that the composite function is surjective and hence invertible? Or should we say that the codomain of the composite function is $Z$, the codomain of $g$? This ambiguity arises because the definition of the composite function seems somewhat incomplete.
My second question is: what if the problem didn't specify the domains and codomains of the functions? Would the conclusion then be different from the first case?
Lastly, I've heard from a fellow student that in some textbooks the domain of the composite function is defined simply as $\textrm{dom } f$. Why do authors give different definitions of such an important concept? I'm confused!
Thanks in advance.
Your set-theoretic definition of $g \circ f$ is more general than the usual definition. Usually, if one writes $g \circ f$, that carries the implicit assumption that $g(f(x))$ is defined for every $x \in \textrm{dom } f$. In other words, one assumes that $f(\textrm{dom } f) \subset \textrm{dom } g$. Your set-theoretic definition, on the other hand, simply removes those $x$ for which $f(x) \notin \textrm{dom } g$ from the domain of $g \circ f$.
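To see the difference concretely, here is a minimal Python sketch that models functions as dictionaries, i.e. as finite sets of pairs. The helper name `compose` and the sample values are my own illustrative choices, not anything from the textbook.

```python
def compose(g, f):
    """Set-theoretic g∘f for functions given as dicts (sets of pairs).

    Any x with f(x) outside dom(g) is silently dropped, so
    dom(g∘f) = dom(f) ∩ f^{-1}[dom(g)].
    """
    return {x: g[f[x]] for x in f if f[x] in g}


f = {1: "a", 2: "b", 3: "c"}
g = {"a": 10, "b": 20}          # "c" is not in dom(g)

gf = compose(g, f)
print(gf)                        # {1: 10, 2: 20}
print(set(gf) == {x for x in f if f[x] in g})   # True
```

Under the usual convention, where $f(\textrm{dom } f) \subset \textrm{dom } g$ is assumed, the filter `if f[x] in g` never removes anything and the domain of the composite is just $\textrm{dom } f$.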
There's also an ambiguity in what invertible means here. If you take invertible to be a synonym for bijective, then for $f \,: X \to Y$ to be invertible, it in particular needs to be surjective, i.e. $f(\textrm{dom } f) = Y$. In that case, there's a $g \,:\, Y \to X$ such that $g \circ f = \textrm{id}_X$ and $f\circ g = \textrm{id}_Y$. But people will, quite often, call an $f$ invertible if it is merely injective, even when $f(\textrm{dom } f)$ is a proper subset of $Y$. You can then still find a $g \,:\, Y \to X$ with $g \circ f = \textrm{id}_X$ (provided $X$ is non-empty), but you won't have $f \circ g = \textrm{id}_Y$. Such a $g$ is called a left-inverse of $f$. You can always make such an $f$ bijective by restricting its codomain to its actual range, i.e. redefining it as $f \,: \textrm{dom } f \to f(\textrm{dom } f)$. Since the set-theoretic definition of a function doesn't explicitly specify the codomain, these two $f$ are, set-theoretically, the same function: they both contain, after all, the same pairs of values. In other words, saying "$f$ is surjective" without naming a codomain doesn't make much sense from a set-theoretical viewpoint, and so interpreting invertible to mean injective is a sensible way to go.
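A small Python sketch of a left-inverse may help here. The sets `X`, `Y`, the map `f`, and the arbitrary `default` value are all made up for illustration.

```python
X = {1, 2}
Y = {"a", "b", "c"}
f = {1: "a", 2: "b"}            # injective, but "c" is never hit

# Build a left-inverse Y -> X: undo f on its range and send every
# other element of Y to an arbitrary point of X (assumed choice).
default = 1
left_inv = {y: default for y in Y}
left_inv.update({y: x for x, y in f.items()})

g_after_f = {x: left_inv[f[x]] for x in X}
f_after_g = {y: f[left_inv[y]] for y in Y}

print(g_after_f == {x: x for x in X})   # True:  g∘f = id_X
print(f_after_g == {y: y for y in Y})   # False: f(g("c")) = "a", not "c"
```

Shrinking `Y` to `{"a", "b"}`, the actual range of `f`, makes the second comparison true as well, which is the codomain restriction described above.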
Let's now assume that invertible refers to the weaker definition here, i.e. simply means injective. Then, if $g\circ f$ is injective, $f$ must be injective too: if $f(x)=f(y)=z$ for some $x \neq y$, then $g(f(x)) = g(z) = g(f(y))$, contradicting the injectivity of $g\circ f$. So in that case, you don't even need to assume that $g$ is invertible. But you need to be careful if you use your set-theoretic definition of $\circ$. It might be that $f$ is injective when restricted to those $x$ with $f(x) \in \textrm{dom } g$, but not injective on its whole domain. So to be safe, strictly speaking you can only say that the restriction of $f$ to $\textrm{dom}(g \circ f) = \textrm{dom } f \cap f^{-1}[\textrm{dom } g]$ is injective.
If, on the other hand, invertible means bijective, then $f$ is bijective, since $f = g^{-1} \circ (g \circ f)$ and the composition of bijective functions is bijective. In this case, $f^{-1} = (g \circ f)^{-1} \circ g$.
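The caveat about the set-theoretic $\circ$ can be illustrated with a tiny Python sketch, again treating functions as dictionaries. The helpers `compose` and `is_injective` and the sample values are hypothetical, chosen only to exhibit the phenomenon.

```python
def compose(g, f):
    # Set-theoretic composition: drop x whenever f(x) is not in dom(g).
    return {x: g[f[x]] for x in f if f[x] in g}


def is_injective(h):
    # Injective iff no two keys share a value.
    return len(set(h.values())) == len(h)


f = {1: "a", 2: "b", 3: "b"}    # f(2) = f(3), so f is not injective
g = {"a": 10}                    # "b" is not in dom(g)

gf = compose(g, f)               # only x = 1 survives
restricted_f = {x: f[x] for x in gf}

print(is_injective(gf))            # True
print(is_injective(f))             # False
print(is_injective(restricted_f))  # True
```

So injectivity of $g \circ f$ here only tells you something about $f$ on the part of its domain that survives the composition.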
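As a sanity check of the two identities in the bijective case, here is a short Python sketch on a three-element example. The helpers `compose` and `inverse` and the sample maps are assumptions for illustration, and every composition here is total.

```python
def compose(g, f):
    # Ordinary composition; assumes f(x) is in dom(g) for every x.
    return {x: g[f[x]] for x in f}


def inverse(h):
    # Swap the pairs; only valid when h is injective.
    inv = {v: k for k, v in h.items()}
    assert len(inv) == len(h), "h is not injective"
    return inv


f = {1: "a", 2: "b", 3: "c"}            # bijection X -> Y
g = {"a": "x", "b": "y", "c": "z"}      # bijection Y -> Z
gf = compose(g, f)                        # bijection X -> Z

print(compose(inverse(g), gf) == f)            # True: f = g^{-1} ∘ (g∘f)
print(compose(inverse(gf), g) == inverse(f))   # True: f^{-1} = (g∘f)^{-1} ∘ g
```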