Given that an SVM will have the following function:

And if I was to use a kernel, this would become:
Where the kernel can be the Gaussian kernel:
How Would I go about finding its gradient wrt the input? I need to know this as to then apply this to a larger problem of a CNN with its last layer being this SVM, so I can then find the gradient of this output wrt the input of the CNN.

