Conceptual explanation for the identity of Hochschild about derivations

While reading Katz's paper "Algebraic Solutions of Differential Equations (p-Curvature and the Hodge Filtration)", I came across a mysterious identity about derivations of commutative algebras in characteristic $p$, stated on page $3$.

Namely, if $A$ is a commutative algebra over $\Bbb F_p$ and $D$ is a derivation of $A$ (i.e. an $\Bbb F_p$-linear map satisfying the Leibniz rule), then $D^{p-1}(X^{p-1} DX)+(D(X))^p=X^{p-1}D^p(X)$ for any $X \in A$. I checked it for polynomial rings, but why does it hold in such generality? Hochschild's original paper does not state this as a lemma but only regards it as a technical identity. How can one prove it in modern language?

There are 2 best solutions below

The case of an arbitrary commutative algebra $A$ follows automatically from the case of a polynomial algebra. It is likely that a proof for polynomial algebras is also a proof for other algebras, but I'll still explain below why one follows from the other.


Rewrite the formula

First of all, let's rewrite the formula in a more compact form:

$$(DX)^p = [X^{p-1},D^{p-1}]DX$$

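This notation is a little abusive: here $[X^{p-1}, D^{p-1}]$ means the operator commutator $\mu_X^{p-1}\circ D^{p-1} - D^{p-1}\circ \mu_X^{p-1}$, where $\mu_X(Y) = X\cdot Y$ is multiplication by $X$. Applying it to $DX$ gives $X^{p-1}D^pX - D^{p-1}(X^{p-1}DX)$, which is exactly the original identity solved for $(DX)^p$.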
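As a quick sanity check of the identity in the polynomial case, here is a minimal Python sketch. It assumes $A = \Bbb F_p[x]$ with the derivation $D = g\cdot\frac{d}{dx}$ for a polynomial $g$; the helper names (`trim`, `add`, `mul`, `deriv`, `hochschild_holds`, `check_many`) are made up for illustration, and polynomials are stored as coefficient lists, lowest degree first. It is only a numerical experiment, not a proof.

```python
import random

def trim(f):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def add(f, g, p):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([(a + b) % p for a, b in zip(f, g)])

def mul(f, g, p):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return trim(out)

def power(f, n, p):
    out = [1]
    for _ in range(n):
        out = mul(out, f, p)
    return out

def deriv(f, p):
    """Formal derivative d/dx over F_p."""
    return trim([(i * c) % p for i, c in enumerate(f)][1:])

def hochschild_holds(p, g, X):
    """Check D^{p-1}(X^{p-1} DX) + (DX)^p == X^{p-1} D^p X for D = g * d/dx."""
    D = lambda f: mul(g, deriv(f, p), p)

    def iterate(n, f):
        for _ in range(n):
            f = D(f)
        return f

    DX = D(X)
    Xpow = power(X, p - 1, p)
    lhs = add(iterate(p - 1, mul(Xpow, DX, p)), power(DX, p, p), p)
    rhs = mul(Xpow, iterate(p, X), p)
    return lhs == rhs

def check_many(p, trials=20, max_deg=3, seed=0):
    """Try random g and X of small degree (an arbitrary choice for this sketch)."""
    rng = random.Random(seed)
    rand = lambda: trim([rng.randrange(p) for _ in range(max_deg + 1)])
    return all(hochschild_holds(p, rand(), rand()) for _ in range(trials))

if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        print(p, check_many(p))
```

Every derivation of $\Bbb F_p[x]$ has the form $g\cdot\frac{d}{dx}$, so for one variable this test family covers all derivations, up to the degree bound chosen above.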


Reduction of "general case" to polynomial algebra case.

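Let $k = \Bbb F_p$. Choose a surjective $k$-algebra homomorphism $\varphi: k[X] \to A$ from a polynomial algebra (in as many variables as needed) to $A$, and a $k$-linear section $\bar{\varphi}^{-1}: A \to k[X]$, i.e. a linear map satisfying $\varphi \circ \bar \varphi^{-1} = \text{id}_A$. (In $k[X]$ the letter $X$ stands for the set of variables, not for the element $X \in A$ of the identity.)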

Now suppose you know the identity for all derivations in $\text{Der}_k(k[X],k[X])$. Given $D \in \text{Der}_k(A,A)$, you can choose a (not unique) lift $\tilde D \in \text{Der}_k(k[X],k[X])$ such that $\varphi \circ \tilde D = D \circ \varphi$, and hence $D = \varphi \circ \tilde D \circ \bar\varphi^{-1}$. (Finding such a $\tilde D$ is not hard: a derivation of a polynomial algebra can be prescribed freely on the variables, so send each variable $x$ to $\bar\varphi^{-1}(D(\varphi(x)))$ and extend by the Leibniz rule. The composite $\bar\varphi^{-1} \circ D \circ \varphi$ itself is only $k$-linear in general, so it cannot be used directly as the lift.)

Now we have identities like $D^n = \varphi \circ \tilde D^n \circ \bar\varphi^{-1}$ and can basically transport the equation from $A$ to $k[X]$ where we know it is true, and back. You will also need to use that $\varphi$ is a ring homomorphism and $\varphi\circ \bar\varphi^{-1} = \text{id}_A$ of course. Anyway, you probably see where this is going but I'll write it out just in case:

\begin{align*} (DX)^p - [X^{p-1},D^{p-1}]DX &= \varphi \Big(\big(\tilde D (\bar\varphi^{-1}X)\big)^p\Big) - \varphi\Big([(\bar\varphi^{-1}X)^{p-1}, \tilde D^{p-1}]\tilde D(\bar\varphi^{-1}X)\Big)\\ &= \varphi\bigg(\big(\tilde D(\bar\varphi^{-1}X)\big)^p - [(\bar\varphi^{-1}X)^{p-1},\tilde D^{p-1}]\tilde D(\bar\varphi^{-1}X)\bigg)\\ &= \varphi(0) = 0 \end{align*}


Case of $p=2$.

Next, let's check the formula in the easiest case, $p=2$. Namely, \begin{align} [D,X](Y) &= D(XY) - XDY\\ &= XDY + YDX - XDY\\ &= YDX, \end{align} so $[D,X]$ is the operator of multiplication by $DX$. Thus $[D,X](DX) = DX\cdot DX = (DX)^2$, as desired (in characteristic $2$ the sign of the commutator does not matter, so $[X,D] = [D,X]$).

Case of $p=3$.

Now try a less trivial, still easily computable case.

\begin{align} [D^2,X^2](Y) &= D^2(X^2Y) - X^2D^2Y\\ &= D(2XYDX + X^2DY) - X^2D^2Y\\ &= 2D(XYDX) + 2XDX\cdot DY\\ &= 4XDXDY + 2Y(DX)^2 +2XYD^2X \end{align}

This might not look like much, but when $Y = DX$ the first and last terms combine to $6X\,DX\,D^2X$, which vanishes mod $3$, and you are left with $2(DX)^3 = -(DX)^3$. Since $[X^2,D^2] = -[D^2,X^2]$, this gives $[X^2,D^2]DX = (DX)^3$, as desired.

Now there is a general approach: expand $[X^{p-1},D^{p-1}]Y$, and match terms that become the same when setting $Y = DX$. Compute their coefficients mod $p$ and make sure they all cancel, except the middle term. Finally make sure the coefficient on the middle term is 1.

I wrote a proof of the derivation of Hochschild's formula, even though I suspect it's not what you're looking for. Maybe it is of some use to you, then again maybe not. :)


Warm up. Let's write the equation like this: $$D^{p-1}(X^{p-1}DX) = X^{p-1}D^pX - (DX)^p$$ Then on the left hand side if we use the Leibniz rule (a lot) we expect to get a lot of terms, but we only see two of them on the right hand side. So, this equation is saying that there are very many terms having coefficients in $\mathbb Z$ which are divisible by $p$.


Plan. I know this isn't the answer you wanted, but you can obtain the coefficients of the expansion of the left hand side of the above formula over $\mathbb Z$ pretty easily and in full generality just using the Leibniz rule for iterated derivations: $$D^n(X_1\cdots X_r) = \sum_{\substack{n\, =\, n_1+\cdots+n_r,\\0\,\leq\, n_i\, \leq \,n}} \frac{n!}{n_1!\cdots n_r!}D^{n_1}X_1\cdots D^{n_r}X_r$$ and it is also easy to see that the formulas you get reduce to what we want mod $p$. We'll do this below.


Execution. For our use we set $n = p-1$, $r = p$, $X_1 = \cdots = X_{r-1} = X$ and $X_r = DX$. Then we group like terms: $$D^{p-1}(X^{p-1} DX) = \sum_{\substack{p\, =\, e_1m_1+\cdots+e_km_k,\\ \\ 0\,\leq\, m_1 < m_2 < \cdots < m_k,\\ \\0\,<\, e_i,\ e_1+\cdots+e_k = p}} A_{\bar m, \bar e}\cdot(D^{m_1}X)^{e_1}\cdots (D^{m_k}X)^{e_k}$$ where $A_{\bar m, \bar e}$ is a coefficient, and here $\bar m = (m_1,\ldots, m_k)$ and similarly for $\bar e$. This coefficient $A_{\bar m, \bar e}$ is obtained by summing the possible multinomial coefficients from the Leibniz rule above (second factors below), each multiplied by a factor counting the orderings (first factors below). The index $i$ records which factor $D^{m_i}X$ comes from the original $DX$; a summand with $m_i = 0$ is read as $0$:

\begin{align*} A_{\bar m, \bar e} &= \sum_{i=1}^k \frac{(p-1)!}{e_1!\cdots (e_i-1)!\cdots e_k!}\cdot\frac{(p-1)!}{(m_1!)^{e_1}\cdots (m_i!)^{e_i-1}(m_i-1)!\cdots (m_k!)^{e_k}}\\ &= \sum_{i=1}^k \frac{(p-1)!^2e_im_i}{\prod_{j=1}^k e_j! (m_j!)^{e_j}}\\ &= \frac{(p-1)!^2\cdot p}{\prod_{i=1}^k e_i! (m_i!)^{e_i}} \end{align*}
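The closed form for $A_{\bar m, \bar e}$ can be compared with a brute-force expansion for small $p$. The following Python sketch enumerates every term of the iterated Leibniz rule for $D^{p-1}(X\cdots X\cdot DX)$ over $\mathbb Z$ and groups like terms; a monomial is encoded by the sorted tuple of derivative orders of its $p$ factors. The function names `leibniz_expansion` and `closed_form` are illustrative only.

```python
from collections import Counter
from itertools import product
from math import factorial

def leibniz_expansion(p):
    """Integer coefficients of D^{p-1}(X^{p-1} DX), grouped by monomial.

    A key like (0, 1, 2) stands for X * DX * D^2 X.
    """
    n = p - 1
    coeffs = Counter()
    # orders[j] = number of derivatives landing on the j-th factor
    for orders in product(range(n + 1), repeat=p):
        if sum(orders) != n:
            continue
        denom = 1
        for k in orders:
            denom *= factorial(k)
        weight = factorial(n) // denom  # multinomial coefficient
        # the last factor is DX, so it carries one extra derivative
        key = tuple(sorted(orders[:-1] + (orders[-1] + 1,)))
        coeffs[key] += weight
    return coeffs

def closed_form(key, p):
    """(p-1)! * p! / prod_i e_i! * (m_i!)^{e_i} for the monomial `key`."""
    denom = 1
    for m, e in Counter(key).items():
        denom *= factorial(e) * factorial(m) ** e
    num = factorial(p - 1) * factorial(p)
    assert num % denom == 0
    return num // denom

if __name__ == "__main__":
    for p in (2, 3, 5):
        expansion = leibniz_expansion(p)
        ok = all(expansion[key] == closed_form(key, p) for key in expansion)
        print(p, ok, dict(expansion))
```

For $p = 3$ this reproduces $D^2(X^2DX) = 2(DX)^3 + 6X\,DX\,D^2X + X^2D^3X$. The enumeration grows like $p^p$, so it is only practical for very small primes.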

OK, here's what we accomplished:

Formula. Over any commutative ring $R$, for any derivation $D \in Der(R,R)$ and $X \in R$ we have $$D^{p-1}(X^{p-1} DX) = \sum_{\substack{p\, =\, e_1m_1+\cdots+e_km_k,\\ \\ 0\,\leq\, m_1 < m_2 < \cdots < m_k,\\ \\0\,<\, e_i,\ e_1+\cdots+e_k = p}} \frac{(p-1)!\cdot p!}{\prod_{i=1}^k e_i! (m_i!)^{e_i}}\cdot(D^{m_1}X)^{e_1}\cdots (D^{m_k}X)^{e_k} $$

This formula looks good because of the $p$ in the numerator, meaning most of the terms will die. What terms don't die? There are just two options:

  1. We could have $e_i = p$. This forces $k=1, m_1 = 1$, for which $$A_{\bar m, \bar e} = (p-1)!p!/p! = (p-1)!$$ which is congruent to $-1$ mod $p$ (by Wilson's theorem). This is the correct coefficient for $(D^1X)^p$, so that checks out.

  2. We could have $m_i = p$. Because $\sum m_ie_i = p$, this forces $e_i = 1$ and every other $m_j$ to be $0$. But because $\sum e_i = p$, we need to have $k = 2$, $m_1 = 0$ and $e_1 = p-1$. Thus $$A_{\bar m, \bar e} = \frac{(p-1)!\,p!}{(p-1)!\,p!} = 1$$ which is indeed the correct coefficient for $X^{p-1}D^pX$.

Done!
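To see the two surviving terms concretely, one can tabulate the coefficients mod $p$. The monomials in the formula correspond to partitions of $p$ (the nonzero derivative orders, with the remaining factors being undifferentiated copies of $X$), so the sketch below loops over partitions and keeps those whose coefficient is nonzero mod $p$. The helper names `partitions`, `coefficient` and `surviving_terms` are hypothetical, chosen for this illustration.

```python
from collections import Counter
from math import factorial

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def coefficient(part, p):
    """(p-1)! * p! / prod_i e_i! * (m_i!)^{e_i} for the monomial given by `part`."""
    orders = Counter(part)
    orders[0] = p - len(part)  # number of undifferentiated factors X
    denom = 1
    for m, e in orders.items():
        denom *= factorial(e) * factorial(m) ** e
    return factorial(p - 1) * factorial(p) // denom

def surviving_terms(p):
    """Monomials whose coefficient is nonzero mod p, with that residue."""
    result = {}
    for part in partitions(p):
        c = coefficient(part, p) % p
        if c:
            result[part] = c
    return result

if __name__ == "__main__":
    for p in (2, 3, 5, 7, 11):
        print(p, surviving_terms(p))
```

For each prime tried, the only survivors should be the partition $(p)$, i.e. $X^{p-1}D^pX$ with coefficient $1$, and $(1,\ldots,1)$, i.e. $(DX)^p$ with coefficient $p-1 \equiv -1$.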


Conclusion. I have the impression that you are looking for a different style of proof, and I am not the person to ask whether one exists. In fact, I think this is not the right website for that!

If your question is "What has been the progress in the past 40 years with regards to the remark that 'the proof of this fact is unfortunately computational...' from Katz's paper?" then that is a research-level question, and you will attract a more appropriate set of viewers by asking on MathOverflow. You might even be able to find someone there who could say "No one has done anything about that!" or "Yes, someone did that, see this paper..."