From a mathematical perspective, cameras convert 3D shapes into 2D shapes in photos. Consider a 3D coordinate system X-Y-Z whose origin is at the camera (its lens, or thereabouts), with the directions chosen like this:

(In the figure: Blue = X, Green = Y, Red = Z.)
And say this camera gives us an image with a 2D coordinate system X'-Y' whose origin is at the center:

How is it possible to get a general equation that converts the 3D location of every point in the X-Y-Z coordinate system into the X'-Y' coordinate system? Of course the reverse is not simply possible (one cannot simply reconstruct 3D objects from a single 2D image). Hope it makes sense...
The general equation for how a camera converts a 3D point (x, y, z), specified in the camera's coordinate system (where z is the optical axis), into a 2D point (u, v) is:

u = f·x/z
v = f·y/z

where f is the focal length of the lens. To get meaningful 2D coordinates you may have to multiply by another constant for the sensor size/resolution; you can roll all such linear scaling factors into a single value of f.
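As a minimal sketch, here is the pinhole projection in Python. The function name `project` and the sample focal length of 1000 (f expressed in pixel units, with all scaling factors rolled in) are illustrative assumptions:

```python
def project(point, f):
    """Project a 3D camera-space point (x, y, z) to 2D image coords (u, v).

    Assumes z points along the optical axis and f already includes
    any sensor/resolution scaling factors.
    """
    x, y, z = point
    if z <= 0:
        # Points at or behind the camera plane have no valid projection.
        raise ValueError("point is behind the camera")
    return (f * x / z, f * y / z)

# A point 2 units in front of the camera and 1 unit to the right,
# with an assumed focal length of 1000 pixels:
u, v = project((1.0, 0.0, 2.0), f=1000.0)
# u = 500.0, v = 0.0
```

Note the division by z: doubling a point's distance halves its image coordinates, which is exactly why depth cannot be recovered from a single image.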