# Camera Models

## Pinhole Model

Let $P = [X, Y, Z]$ be a point on the object in 3D space, $P' = [X', Y', Z']$ its projection onto the imaging plane (still expressed in 3D coordinates), and $P'' = [u, v]$ the corresponding point on the image in pixel coordinates.

Since most modern digital cameras automatically flip the image upright for us, we can equivalently place the imaging plane in front of the pinhole, on the same side as the object. This simplifies the math (the factor of $-1$ caused by the inverted image becomes $1$), and by similar triangles we have $X' = f \frac{X}{Z}$ and $Y' = f \frac{Y}{Z}$, where $f$ is the focal length.

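The three reference frames ($O$, $O'$, and $O''$) are related as follows, where $\alpha$ and $\beta$ are the scale factors from imaging-plane units to pixels along the $u$ and $v$ axes (so $f_x = \alpha f$ and $f_y = \beta f$), and $(c_x, c_y)$ is the offset of the pixel origin.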

\begin{align*} u &= \alpha X' + c_x = f_x \frac{X}{Z} + c_x \\ v &= \beta Y' + c_y = f_y \frac{Y}{Z} + c_y \end{align*}
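The two projection equations can be sketched directly in code. This is a minimal illustration, not a reference implementation: the function name `project_pinhole` and the intrinsic values below are hypothetical, chosen only so the arithmetic is easy to check by hand, and the point is assumed to be expressed in the camera frame with $Z > 0$.

```python
def project_pinhole(point, fx, fy, cx, cy):
    """Project a 3D point (X, Y, Z) in the camera frame to pixel coordinates (u, v)."""
    X, Y, Z = point
    if Z <= 0:
        # The model above only covers points in front of the camera.
        raise ValueError("point must lie in front of the camera (Z > 0)")
    u = fx * X / Z + cx  # u = f_x * X / Z + c_x
    v = fy * Y / Z + cy  # v = f_y * Y / Z + c_y
    return u, v


# Hypothetical intrinsics (in pixels), for illustration only.
u, v = project_pinhole((0.5, -0.25, 2.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(u, v)  # 445.0 177.5
```

Note that scaling $P$ by any positive factor leaves $(u, v)$ unchanged, since only the ratios $X/Z$ and $Y/Z$ enter the equations; depth is lost in the projection.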

Writing these two equations in matrix form, with the pixel coordinates in homogeneous form, gives the camera intrinsic matrix $K$.

$\begin{equation*} Z\left(\begin{array}{l} u \\ v \\ 1 \end{array}\right)=\left(\begin{array}{ccc} f_{x} & 0 & c_{x} \\ 0 & f_{y} & c_{y} \\ 0 & 0 & 1 \end{array}\right)\left(\begin{array}{l} X \\ Y \\ Z \end{array}\right) \stackrel{\text { def }}{=} \boldsymbol{K} \boldsymbol{P} \end{equation*}$
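The matrix form can be sketched as follows, using plain Python lists to stay dependency-free. The helper names (`intrinsic_matrix`, `mat_vec`, `project`, `back_project`) and the numeric intrinsics are assumptions made for illustration. The back-projection step is not part of the text above; it simply inverts $Z [u, v, 1]^T = K P$ for the case where the depth $Z$ is already known.

```python
def intrinsic_matrix(fx, fy, cx, cy):
    """Build the 3x3 intrinsic matrix K as nested lists."""
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]


def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def project(K, P):
    """Compute K @ P = Z * [u, v, 1], then divide by Z to get (u, v)."""
    x, y, z = mat_vec(K, P)
    return x / z, y / z


def back_project(K, u, v, depth):
    """Recover P = [X, Y, Z] from a pixel (u, v) and a known depth Z."""
    fx, cx = K[0][0], K[0][2]
    fy, cy = K[1][1], K[1][2]
    return [(u - cx) * depth / fx, (v - cy) * depth / fy, depth]


# Hypothetical intrinsics (in pixels), for illustration only.
K = intrinsic_matrix(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
P = [0.5, -0.25, 2.0]
u, v = project(K, P)
print(u, v)                        # 445.0 177.5
print(back_project(K, u, v, 2.0))  # [0.5, -0.25, 2.0]
```

The last row of $K$ just copies $Z$ through, which is why dividing the product by its third component recovers the pixel coordinates.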

## References

• The Double Sphere Camera Model, Daniel Cremers group, 3DV 2018. A very good discussion and summary of camera models; read this paper first!

• 14 Lectures in Visual SLAM
