
[Help] Getting the depth of the image plane #47

Open
cs-mshah opened this issue Apr 3, 2024 · 4 comments
cs-mshah commented Apr 3, 2024

Firstly, thanks a ton for making this library. It is extremely helpful for performing common operations; I wasn't able to find anything else this simple.
I have the following problem: I want to back-project the image plane to world coordinates. Basically, the depth map should contain the depth of the image plane itself. How can I compute this? Will the values all be 1? Can you show this with an example?

yxlao (Owner) commented Apr 7, 2024

I think what you mean by the "depth of the image plane" is the distance from the camera center to the image plane. This distance is referred to as the focal length, and there are two types of focal length representations: focal length in pixels and physical focal length in metric space.

TLDR: Typically, the focal length is expressed in pixels in computer vision, as specified in the intrinsic camera matrix $K$. If you want to compute the physical focal length, you'll need additional information, including the sensor size (in metric units) and the resolution of the camera.

Let's break it down. Assume you have the camera intrinsic matrix $K$:

$$ K=\left[\begin{array}{ccc} f_{x} & 0 & c_{x} \\ 0 & f_{y} & c_{y} \\ 0 & 0 & 1 \end{array}\right] $$

  • Focal Length in Pixels: $f_x$ and $f_y$ in the intrinsic camera matrix $K$ are the focal lengths in pixels. These are unitless values that do not have any physical scale. You can imagine that scaling the focal length and the sensor size by the same factor will not change the image projection relationship at all. This is the most common representation in computer vision, as we usually don't care about the physical size of the sensor or the physical focal length.
  • Physical Focal Length: The physical focal length is the focal length of the lens in metric units (e.g., millimeters). To convert between the focal length in pixels and the physical focal length, you'll need to know the sensor size in metric units and the resolution of the camera. To compute the metric focal length from the pixel focal length, use the following formulas:

$$ f_{metric_x} = \frac{f_x}{resolution_x} \times sensor_x $$

$$ f_{metric_y} = \frac{f_y}{resolution_y} \times sensor_y $$

Where:

  • $f_x$ and $f_y$ are the given focal lengths in pixels along the x and y axes, respectively.
  • $resolution_x$ and $resolution_y$ are the resolution of the camera sensor in pixels along the width (x-axis) and height (y-axis).
  • $sensor_x$ and $sensor_y$ are the physical sizes of the sensor along the width and height in metric units (typically millimeters).
  • $f_{metric_x}$ and $f_{metric_y}$ are the calculated physical focal lengths in metric units along the x and y dimensions, respectively.

Also, for typical cameras (uniform square pixels, symmetric lens), you may assume $f_x = f_y$.
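As a minimal numeric sketch of the formulas above: the pixel focal length and resolution below are typical for a 640x480 RGB-D camera, but the sensor size in mm is a made-up assumption — you'd have to look it up in your camera's datasheet.

```python
# Hypothetical values for illustration only.
fx, fy = 525.0, 525.0              # focal lengths in pixels (from K)
resolution_x, resolution_y = 640, 480
sensor_x, sensor_y = 4.8, 3.6      # physical sensor size in mm (assumed)

# Metric focal length = pixel focal length / resolution * sensor size.
f_metric_x = fx / resolution_x * sensor_x
f_metric_y = fy / resolution_y * sensor_y
print(f_metric_x, f_metric_y)  # 3.9375 3.9375 (mm)
```

Note that $f_{metric_x}$ and $f_{metric_y}$ come out equal here because the pixels are square (sensor size and resolution have the same aspect ratio).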

yxlao (Owner) commented Apr 7, 2024

If you want to project depth images to 3D as point clouds, you may use the functions in ct.project. Typically you'll need the intrinsic and extrinsic camera parameters to project a depth image to 3D. Also, pay attention to the depth image format, as it could be in different units or different scales.

import open3d as o3d
import camtools as ct
import json
import numpy as np

from pathlib import Path


def main():
    # Get paths.
    redwood = o3d.data.SampleRedwoodRGBDImages()
    im_color_path = Path(redwood.color_paths[0])
    im_depth_path = Path(redwood.depth_paths[0])
    camera_intrinsic_path = Path(redwood.camera_intrinsic_path)

    # Load K (intrinsic).
    with open(camera_intrinsic_path, "r") as f:
        camera_intrinsic = json.load(f)
    K = np.array(camera_intrinsic["intrinsic_matrix"]).reshape(3, 3).T

    # Load T (extrinsic), assume identity.
    T = np.eye(4)

    # Load images and depths.
    im_color = ct.io.imread(im_color_path)
    im_depth = ct.io.imread_depth(im_depth_path, depth_scale=1000.0)

    # Create point cloud.
    points, colors = ct.project.im_depth_im_color_to_points_colors(
        im_depth=im_depth, im_color=im_color, K=K, T=T
    )

    # Visualize.
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.colors = o3d.utility.Vector3dVector(colors)
    o3d.visualization.draw_geometries([pcd])


if __name__ == "__main__":
    main()

This should give you:
[Screenshot from 2024-04-07: the reconstructed colored point cloud]

yxlao closed this as completed Apr 7, 2024
cs-mshah (Author) commented

Thanks. The explanation was really helpful. But I wanted to know the actual focal length in mm since I want to back-project my points to the depth of the image plane itself. Is there a way to know the size of the pixel in mm or the $sensor_x$, $sensor_y$ for a SIMPLE_PINHOLE or PINHOLE camera used by colmap? Or should I just assume the standard: 1px = 0.264mm

yxlao reopened this Apr 23, 2024
yxlao (Owner) commented Apr 23, 2024

> Is there a way to know the size of the pixel in mm or the $sensor_x$, $sensor_y$ for a SIMPLE_PINHOLE or PINHOLE camera used by colmap?

As far as I know, COLMAP's reconstruction of points and cameras is not physically scaled. That is, the scale is relative (or arbitrary) as we don't know the physical scale of COLMAP's reconstruction.

  • Extrinsic properties: You have to manually obtain a physical scale, or provide physical-scale camera poses to COLMAP, for it to reconstruct physical-scale points.
  • Intrinsic properties: The same applies to your question about the pixel scale in mm. You either have to know the physical specifications of your camera in advance, or use a camera calibration technique that captures a known pattern in physical space.
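For the extrinsic side, one common way to recover a metric scale is to physically measure a single known distance, e.g. the baseline between two camera positions, and rescale the whole reconstruction by it. A hypothetical sketch (all values below are made up for illustration; in practice the camera centers would come from COLMAP's reconstruction):

```python
import numpy as np

# Two camera centers in COLMAP's arbitrary (relative) units.
camera_centers = np.array([[0.0, 0.0, 0.0],
                           [2.0, 0.0, 0.0]])
# Some reconstructed 3D points in the same arbitrary units.
points = np.array([[1.0, 1.0, 4.0],
                   [1.5, -0.5, 3.0]])

# Physically measured distance between the two cameras, in meters.
measured_baseline_m = 0.5

# Meters per COLMAP unit.
colmap_baseline = np.linalg.norm(camera_centers[1] - camera_centers[0])
scale = measured_baseline_m / colmap_baseline

# Uniformly rescale the reconstruction to metric units.
points_metric = points * scale
camera_centers_metric = camera_centers * scale
print(scale)  # 0.25
```

Since the reconstruction is only determined up to a global similarity transform, a single uniform scale factor applied to all points and camera translations is enough to make it metric.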
