I know the following example doesn't make what I am saying rigorous, but hopefully it clarifies to some extent what I mean.
For various computer implementations, dividing by 2 and multiplying by 0.5 require a different number of CPU cycles, even though the two operations are mathematically equivalent. (The first is performing the inverse operation of multiplication with the number 2, which is defined to be multiplication by the multiplicative inverse of 2 which is $\frac{1}{2}$ or 0.5 in decimal.)
Google "practical difference between currying and partial application" and at least the entire first page of results explains some of my skepticism about whether there is a mathematical difference between currying and partial application -- none of the results treat the subject mathematically, i.e. by defining them in terms of Hom functors, and instead discuss how currying and partial application are implemented differently in most functional programming languages.
(In a nod to other websites in the StackExchange network, I will post the results from them:
Notice that in both of the examples given above, they fail to explain any mathematical difference -- instead the difference is explained with examples of lines of code.
While in practice, as in the case when differentiating dividng by 2 and multiplying by 0.5, there is a difference in implementation, it does not seem to amount to a theoretical difference.
Currying and partial application are two different concepts, which impose different requirements on the category $\mathcal{C}$. Currying is a strictly stronger concept: one can express partial application using currying, but partial application is not sufficient for currying.
Currying (in cartesian closed category $\mathcal{C}\,$)
In order to talk about currying, the category has to be cartesian closed. In particular, it has to have exponential objects $Y^X$ for any two objects $X$ and $Y$. Currying is then one direction of the adjunction
$$ \mathcal{C}(X \times Y, Z) \cong \mathcal{C}(X, Z^Y)\qquad,$$
meaning that whenever we are given an $f: X \times Y \to Z$, we can obtain the curried version $\lambda f: X \to Z^Y$.
Partial application (in category with finite products)
For partial application, one needs only a category with all finite products (not necessarily cartesian closed). Since there is a terminal object $\mathbb{T}$ (nullary product), it makes sense to talk about elements of objects $X$: an element of $X$ can be defined as a morphism from $\mathbb{T} \to X$. Given an $f: X \times Y \to Z$, we can partially apply $f$ to $x$ as follows: $$ f \circ \langle x \circ \dagger_Y, Id_Y \rangle $$ where $\dagger_Y$ is the terminal morphism $Y \to \mathbb{T}$, the little circle $\circ$ denotes morphism composition, and $\langle - , - \rangle$ denotes the mediating morphism for products. If we now take any element of $Y$, that is, a morphism $y: \mathbb{T} \to Y$, and feed it to the above construction, we obtain: $$ f \circ \langle x \circ \dagger_Y, Id_Y \rangle \circ y = f \circ \langle x \circ \dagger_Y \circ y, Id_Y \circ y \rangle = f \circ \langle x \circ Id_{\mathbb{T}}, y \rangle = f \circ \langle x, y \rangle \quad,$$ which is exactly what we would expect from a "partially applied function". Notice: we have built the morphism for "f partially applied to x" without using $\lambda$ or exponential objects. We don't need the full-blown cartesian closedness for that.
Expressing partial application through currying
Suppose that $\mathcal{C}$ is cartesian closed, let $f$, $x$, $y$ as above. We can curry $f$ and apply it to $x$: $$ \lambda f \circ x \quad ,$$ obtaining an element of $Z^Y$. In order to be able to postcompose it with $y$, we have to "unpack" the element of $Z^Y$ using the evaluation morphism $\epsilon: Z^Y \times Y \to Z$. Thus, the morphism that represents "f partially applied to x" is: $$ \epsilon \circ \langle \lambda f \circ x \circ \dagger_Y, Id_Y\rangle \,.$$ Using the universal property of $\lambda f$ and $\epsilon$ (the commuting triangle that one sees in the definition of exponential objects), we can calculate: $$ \epsilon \circ \langle \lambda f \circ x \circ \dagger_Y, Id_Y\rangle = \epsilon \circ (\lambda f \times Id_Y) \circ \langle x \circ \dagger_Y, Id_Y \rangle = f \circ \langle x \circ \dagger_Y, Id_Y \rangle \,,$$ thus our notion of "partial application" defined directly coincides with the notion of "partial application" defined through currying. That means, it does not matter which defintion we take in cartesian closed categories.
When does the difference matter ...
...in programming:
Since categories built into functional programming languages (like $\mathrm{Hask}$) are cartesian closed, it does not really matter when creating simple functional programs.
However, as soon as we start to implement other categories inside our functional programming languages, the difference between "currying" and "partial application" begins to influence our design decisions. In general, it should be much easier to provide only finite products (which allow us to implement partial application), without providing currying. If we use those categories to represent certain computations, providing exponential objects and currying makes the framework more expressive, but we usually lose the possibility to inspect the structure of the computation before evaluating it. This is somewhat similar to trade-off between monads and applicatives: monads are more powerful, but applicatives have a more rigid structure, which can be analysed and optimized much better.
...in mathematics:
There are countless examples when one wants to have partial application, but does not want to think about whether the category one is working in is cartesian closed.
For example, in classical multivariate calculus, we look at smooth functions from $\mathbb{R}^n \to \mathbb{R}^m$, and often want to fix some of the variables, and compute partial derivatives. It is crucial that we do this using only the product structure and partial application, but no currying: if we would take a multivariate function $f: \mathbb{R}^2 \to \mathbb{R}$, and then curry it, we would obtain something like a function from $\mathbb{R}$ to the function space $\mathbb{R} \to \mathbb{R}$, and would thereby fall out of the realm of classical multivariate calculus, and land in the realm of functional analysis.
Same with topology: we often want to take some continuous function that depends on multiple variables (some homotopy or something), then fix some of the variables, and argue that the partially applied function is still continuous, and work with that. In basic topology, we do not want to spend lot of time on defining topologies on function spaces, and trying to make the category of topological spaces cartesian closed. We don't have to, because partial application works without currying.
Code examples
The fact that we do not need currying in order to implement partial application can be illustrated in any functional programming language. The following code snippets show how to implement
partialApplicationin terms ofcompose,identityandproduct, without using lambdas.Scala (with detailed comments):
The same in Haskell (same comments as in Scala apply):