Representing everywhere a camera can see as a matrix


I'm learning about Computer Graphics and there is one point really puzzling me.

I understand that vertices (vectors) represent points in space and that transformation matrices represent changes that could be made to said vertices.

However, when it comes to defining what is visible to a camera, I don't understand how a single matrix can represent the entire space in view.

For example, suppose I have a cube and a cylinder somewhere in 3D space, how can I create a camera that "sees" them? How can I describe a matrix that fits the entire frustum?

I am finding this concept incredibly confusing.


There are 2 best solutions below

2
On

AFAIK you wouldn't consider the concept of a camera as you describe it as an object inside your matrix. There are obviously many ways to implement a camera, but let me loosely describe one. This isn't meant to be technically rigorous, just to give you an idea of how it can work.

Think of it more like the following:

Your (model-view) matrix represents your world state. Your "camera" just looks at that world from a fixed position. Instead of moving the camera, you move the entire world (by changing the matrix).

You don't rotate the camera left, you rotate your world right.

You don't move your camera forward, you move your world backward.

This way you could define a plane like z=0 as your camera's position and not consider anything with z>0 ("behind the camera") as visible. Starting from there you could extend your frustum into the distance (up to some arbitrary "max view distance") to determine what would be visible.
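To make the "move the world, not the camera" idea concrete, here is a minimal sketch in plain Python. It assumes the OpenGL-style convention used above (the camera sits at the origin and looks down the negative z axis) and a camera that can only translate and turn about the y axis. The names `world_to_camera` and `in_front` are made up for this illustration.

```python
import math

def world_to_camera(point, cam_pos, cam_yaw_deg):
    """Express a world-space point in camera space.

    The world receives the opposite of the camera's motion:
    translate by -cam_pos, then rotate by -cam_yaw about the y axis.
    """
    x, y, z = (p - c for p, c in zip(point, cam_pos))
    a = math.radians(-cam_yaw_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    # Rotation about y by angle a:
    #   x' =  x*cos(a) + z*sin(a)
    #   z' = -x*sin(a) + z*cos(a)
    return (x * cos_a + z * sin_a, y, -x * sin_a + z * cos_a)

def in_front(point_cam, max_view_distance):
    """True if the point is in front of the plane z = 0 and within range."""
    z = point_cam[2]
    return -max_view_distance <= z < 0

cube = (0.0, 0.0, -5.0)

# Camera steps 2 units forward (toward -z): the cube ends up 2 units closer.
closer = world_to_camera(cube, cam_pos=(0.0, 0.0, -2.0), cam_yaw_deg=0.0)

# Camera turns around: the cube now has z > 0, so it is behind the camera.
behind = world_to_camera(cube, cam_pos=(0.0, 0.0, 0.0), cam_yaw_deg=180.0)
```

With these values `in_front(closer, 100.0)` is true and `in_front(behind, 100.0)` is false, even though the cube itself never moved in world space.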

So, no matter what the camera sees, it all stays in your world matrix. Your camera is more of a manipulation applied to your world state, which OpenGL stores in a separate matrix. I would recommend reading up on the difference between the Model Matrix and the View Matrix in OpenGL; there are some nice visual explanations.

3
On

The matrix is only part of the representation of what the camera can see. The visible region is defined by two things: the matrix and a cone (or, more precisely, a frustum). The frustum typically has four parameters (angular width, angular height, near clipping plane, far clipping plane).
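As a rough sketch of how those four parameters describe a region, the function below tests whether a point lies inside the frustum. It assumes the point has already been transformed into camera space, that the camera looks down the negative z axis (an OpenGL-style convention, not something the four parameters dictate), and that the angles are full opening angles in degrees. The name `in_frustum` is hypothetical.

```python
import math

def in_frustum(point_cam, fov_x_deg, fov_y_deg, near, far):
    """Return True if a camera-space point lies inside the view frustum."""
    x, y, z = point_cam
    depth = -z  # distance in front of the camera along its viewing axis
    if not (near <= depth <= far):
        return False  # in front of the near plane or beyond the far plane
    # At this depth the visible cross-section is a rectangle whose
    # half-extents grow linearly with depth.
    half_width = depth * math.tan(math.radians(fov_x_deg) / 2)
    half_height = depth * math.tan(math.radians(fov_y_deg) / 2)
    return abs(x) <= half_width and abs(y) <= half_height
```

For example, with a 90 by 60 degree frustum, near 0.1 and far 100, the point (4, 0, -5) is inside, while (10, 0, -5) is too far to the side and (0, 0, 5) is behind the camera.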

You're right that one way to see the matrix action is that it transforms vertices. Another way to see it is to think about all of the vertices simultaneously and say that the matrix transforms space (while the camera remains fixed, so the effect is to move things in and out of the frustum). And a third way to think about it is that it does the inverse of that transformation only on the camera (so space stays fixed and the camera moves).
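Here is a small sketch showing that these viewpoints give the same numbers. It uses a pure translation as the matrix, 4x4 homogeneous coordinates, and two made-up vertices; the helper names `mat_vec` and `translation` are just for this illustration.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a homogeneous 4-vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translation(tx, ty, tz):
    """4x4 homogeneous matrix that translates by (tx, ty, tz)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

vertices = [[1.0, 0.0, -4.0, 1.0], [0.0, 2.0, -6.0, 1.0]]

# Views 1 and 2: apply the matrix to every vertex, i.e. move space
# while the camera stays at the origin.
m = translation(0.0, 0.0, 3.0)
moved_world = [mat_vec(m, v) for v in vertices]

# View 3: leave space alone and apply the inverse translation to the
# camera instead, then measure each vertex relative to the camera.
camera = [0.0, 0.0, -3.0]
relative_to_camera = [[v[i] - camera[i] for i in range(3)] + [1.0]
                      for v in vertices]
```

Both lists come out identical, which is why it does not matter whether you say "the world moved" or "the camera moved the other way".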

So if you want the camera to see an object you need to find a transformation which moves that object into the frustum, or alternatively a transformation which moves the camera such that the object is in the frustum.
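One possible way to find such a camera placement, sketched under some assumptions: wrap the object in a bounding sphere, keep the camera looking down the negative z axis, and back the camera away along +z until the sphere fits inside the narrower of the two opening angles. The function name and the bounding-sphere approach are illustrative choices, not the only way to do it, and the near and far planes still need to be checked separately.

```python
import math

def camera_position_to_frame(center, radius, fov_x_deg, fov_y_deg):
    """Place a camera (looking down -z) so a bounding sphere fits in view.

    A sphere centred on the viewing axis at distance d touches a side
    plane of half-angle t when d * sin(t) == radius, so the narrower
    angle determines how far back the camera has to go.
    """
    half_angle = math.radians(min(fov_x_deg, fov_y_deg)) / 2
    distance = radius / math.sin(half_angle)
    cx, cy, cz = center
    # Step back along +z so the sphere's centre sits straight ahead.
    return (cx, cy, cz + distance)
```

For a sphere of radius 1 centred at (2, 1, -10) and a 60 by 60 degree frustum, this puts the camera at roughly (2, 1, -8), two units back from the centre.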