
Intuitively, the gradient is the vector pointing in the direction of the maximum rate of change. But this can be either up or down. How would the gradient point on this surface?

Let $f(x,y)=-x^2-y^2$
The gradient: $\nabla f(x,y)=(-2x,-2y)$
The gradient of $f$ at a point $(x,y)$ gives the direction of the maximum rate of increase of $f$ at that point. Notice that the gradient is a two-dimensional vector lying in the horizontal $xy$-plane, so it can point neither up nor down. But if you ask "if I follow the gradient direction, will the value of $f$ increase or decrease?", the answer is that it will increase.
Take for example the point $(1,1)$, where $\nabla f(1,1)=(-2,-2)$. The gradient points towards the origin, and the value of $f$ increases in that direction.
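The example at $(1,1)$ can be checked numerically. The following is a minimal sketch, not part of the original answer: the helper `step_along_gradient` and the step size `h=0.1` are hypothetical choices made only for illustration.

```python
def f(x, y):
    """The function from the answer: f(x, y) = -x^2 - y^2."""
    return -x**2 - y**2


def grad_f(x, y):
    """Analytic gradient of f: (-2x, -2y)."""
    return (-2 * x, -2 * y)


def step_along_gradient(x, y, h=0.1):
    """Move from (x, y) by h times the gradient (h is an assumed small step)."""
    gx, gy = grad_f(x, y)
    return (x + h * gx, y + h * gy)


x0, y0 = 1.0, 1.0
x1, y1 = step_along_gradient(x0, y0)

print("gradient at (1, 1):", grad_f(x0, y0))   # both components are negative
print("f before step:", f(x0, y0))
print("f after step: ", f(x1, y1))             # larger than before the step
```

The step moves the point closer to the origin in the $xy$-plane, and the value of $f$ goes up, even though the gradient itself has no vertical component.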
The gradient has the direction in which the function increases (as you say, at the maximum rate of change). If the surface is given by $F(x,y,z)=x^2+y^2+z=0$, then (formally and mathematically, using coordinates) the gradient $\nabla F=(2x,2y,1)$ depends on $x$ and $y$. (In general $\nabla F(x,y,z)=(\frac{\partial F}{\partial x},\frac{\partial F}{\partial y},\frac{\partial F}{\partial z})$ depends on the point $(x,y,z)$ on the surface.) At the point $(x,y,z)=(0,0,0)$ we have $\nabla F=(0,0,1)$, which does point exactly up. At, say, $(x,y,z)=(1,1,-2)$ we have $\nabla F=(2,2,1)$, which you might say points up, but not exactly vertically up (where we interpret "vertical" as having the same direction as the $z$-axis).
So the gradient depends on the point of the surface at which you evaluate it, but the third (i.e. $z$) coordinate of the gradient is always $1$ (for the function $F$ in your question), so it is positive. You might perhaps interpret this as the gradient pointing up (in addition to possibly also pointing a little to the left/right and back/front), but this interpretation might be a bit confusing and is not quite standard. You might also say that "locally" it always points up from the surface (in the direction from the "inside" to the "outside").
It depends on how you define "vertically" or "up". If you are a little bug on the surface then, regardless of the point you stand on, "vertically" or "up" might mean perpendicular (orthogonal) to the surface and pointing away from it, from the "inside" to the "outside" (though sometimes there may be global obstacles, as for the Klein bottle, to defining which side is inside and which is outside).
Compare what the word "vertically" means for people living on Earth (which could be modelled as the sphere $G(x,y,z)=x^2+y^2+z^2=1$ with $\nabla G=(2x,2y,2z)$). For people in Australia and for people in the USA, vertically means "up and perpendicular to the ground", yet for someone looking from a spaceship these two directions are certainly not the same, not parallel, and may not both qualify as vertical. People perhaps tend to associate "vertical" with the North Pole, but that is just a convention related to the way maps of the Earth are usually drawn. More precisely, it depends on what kind of coordinate system you use (e.g. Cartesian vs. spherical) and how you orient it. It is always the case that the gradient is perpendicular to the surface (assuming the partial derivatives exist and the gradient is nonzero), and it seems that one may interpret this as the gradient being vertical ("up" from the surface, locally). This is of course not the same as the other meaning of "vertical" (having the same direction as the $z$-axis). Rather, if you stand on the outside of the surface, then the gradient goes up, orthogonal to the surface.
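The Earth comparison can be made concrete with a small sketch. The two sample points below are hypothetical stand-ins for two places on the unit sphere, chosen only because their coordinates are simple; the helper `cross` is likewise an illustrative addition.

```python
def grad_G(x, y, z):
    """Gradient of G(x, y, z) = x^2 + y^2 + z^2, i.e. (2x, 2y, 2z)."""
    return (2 * x, 2 * y, 2 * z)


def cross(u, v):
    """Cross product of two 3D vectors; it is zero when they are parallel."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


# Two assumed sample points on the unit sphere x^2 + y^2 + z^2 = 1.
place_a = (0.0, 0.0, 1.0)
place_b = (0.6, 0.0, -0.8)

up_a = grad_G(*place_a)
up_b = grad_G(*place_b)

# Each gradient is parallel to the radius through its own point ...
print("up_a x place_a:", cross(up_a, place_a))
print("up_b x place_b:", cross(up_b, place_b))
# ... but the two local "up" directions are not parallel to each other.
print("up_a x up_b:   ", cross(up_a, up_b))
```

Both gradients point radially outward, which is each observer's local "up", while the nonzero last cross product shows that the two directions differ when viewed from outside.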
Here is an image with three gradient vectors:
at the point $(0,0,0)$ the gradient is $(0,0,1)$,
at the point $(\frac12,\frac12,-\frac12)$ the gradient is $(1,1,1)$, and
at the point $(\frac12,-\frac12,-\frac12)$ the gradient is $(1,-1,1)$ (shown with its shadow).
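These three gradient vectors can be reproduced, and their perpendicularity to the surface checked, with a short sketch. The helper `tangents` is an illustrative assumption: it returns the two tangent vectors of the graph $z=-(x^2+y^2)$, obtained by differentiating $(x, y, -(x^2+y^2))$ with respect to $x$ and $y$.

```python
def F(x, y, z):
    """F(x, y, z) = x^2 + y^2 + z; the surface is the level set F = 0."""
    return x**2 + y**2 + z


def grad_F(x, y, z):
    """Gradient of F: (2x, 2y, 1)."""
    return (2 * x, 2 * y, 1)


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def tangents(x, y):
    """Tangent vectors of the graph z = -(x^2 + y^2) at (x, y)."""
    return (1, 0, -2 * x), (0, 1, -2 * y)


points = [(0, 0, 0), (0.5, 0.5, -0.5), (0.5, -0.5, -0.5)]

for p in points:
    g = grad_F(*p)
    t_x, t_y = tangents(p[0], p[1])
    # F(p) should be 0 (point lies on the surface) and both dot products
    # should be 0 (gradient is orthogonal to the tangent plane).
    print(p, "F =", F(*p), "grad =", g, "dots =", dot(g, t_x), dot(g, t_y))
```

All three points satisfy $F=0$, and each gradient has zero dot product with both tangent vectors, which is the sense in which it is normal to the surface.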

The gradient does not point up or down, or in any other direction, in an absolute sense. The gradient of the scalar function $F = z + x^2 + y^2$ is a vector field that is normal to the level surface $F = 0$. For a convex surface, a sign convention can be chosen so that the normal points "out".
For the 2D surface $z = -(x^2 + y^2)$ in 3D, the gradient describes the rate of change at a given point of the surface. It makes no sense to talk about the rate of change of the entire paraboloid. For a chosen orientation of the $x$- and $y$-directions, the cross product of the two tangent vectors built from the partial derivatives $z_x$ and $z_y$, that is $\mathbf{r}_x \times \mathbf{r}_y$ with $\mathbf{r}(x,y) = (x, y, z(x,y))$, gives the local normal.
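As a worked check, reading the cross product above as the cross product of the two tangent vectors of the parametrization $\mathbf{r}(x,y) = (x, y, -(x^2+y^2))$ (this parametrization is an assumption about what the answer intends), the computation is:

$$
\mathbf{r}_x = (1, 0, z_x) = (1, 0, -2x), \qquad
\mathbf{r}_y = (0, 1, z_y) = (0, 1, -2y),
$$

$$
\mathbf{r}_x \times \mathbf{r}_y = (-z_x, -z_y, 1) = (2x, 2y, 1) = \nabla\left(z + x^2 + y^2\right).
$$

So with this orientation the cross product and the gradient of $z + x^2 + y^2$ give the same normal vector at every point of the paraboloid.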
The same holds for the 1D curve $y = -x^2$ in the 2D plane: the slope is the rate of change at a given point of the curve, and it makes no sense to talk about the rate of change of the entire curve.