# Generalized ICP

- Classic ICP uses a point-to-point distance metric, which can lead to a non-quadratic error surface, especially when the point clouds are not well aligned initially. This is because the closest point correspondence can change dramatically for different transformations, leading to a discontinuous, and hence non-quadratic, error surface.
- GICP, on the other hand, uses a plane-to-plane distance metric, which can result in a smoother and more quadratic-like error surface. This is because the plane-to-plane distance changes more smoothly with the transformation, leading to a more continuous error surface.
- The above two statements about two methods are generated by GPT and to be verified by experiments.

- Both ICP and GICP establish error terms in the least-square forms (with or without covariance matrix as the weight), and this seems to fit GN and LM use cases. However, in practice, despite least-square error terms, the point cloud registration problem can still be highly non-linear, due to the nature of transformation matrix (in SE(3) space), and how the correspondence points are established in each iteration. Therefore, in practice, BFGS may run more robustly than GN and LM.

- Gauss-Newton (GN): It assumes that the error surface is quadratic, which allows it to solve directly for the step that minimizes the error.
- Levenberg-Marquardt (LM): This is a modification of Gauss-Newton that introduces a damping factor, making it more robust to non-quadratic error surfaces. When the damping factor is high, LM behaves more like gradient descent, taking small, reliable steps. When the damping factor is low, LM behaves more like Gauss-Newton, taking large, confident steps. This adaptability makes LM more reliable than Gauss-Newton in practice.
- Although both GN and LM are iterative methods for nonlinear problem, at each iteration of the state, they approximate the model using first-order Taylor expansion (i.e. linearization), then the linearized Jacobian-residual equation can be solved by a linear solver in the form of Ax=b.
- Even in the colored point cloud registration algorithm, the color term can also be formulated as linearized, local approximation and solved by GN method.

- BFGS: This is a quasi-Newton method that approximates the Hessian matrix using gradient information instead of computing it directly. This makes it more efficient than Gauss-Newton and LM for problems
*with a large number of parameters*(e.g., quadrotor motion planning). BFGS also includes a line search to find the optimal step, which can make it more robust to non-quadratic error surfaces.- This is a general unconstrained nonlinear optimization method that does not assume least-square error terms. It does not linearize the model (which contains first-order info only), but instead makes use of the second order information (i.e. curvature, encoded in the approximated Hessian matrix).

PCL GICP algorithm uses BFGS optimizer to perform unconstrained nonlinear optimization. It represent state in 6-by-1 vector (XYZ, RPY) as the compact form, and restores to transformation matrix when needed.

The gradient is computed separately for XYZ and RPY. For XYZ, it computes the distance residual between corresponding points, multiplied by the weights from pre-computed covariance matrices (in

`df`

and `fdf`

functions). For RPY, it expresses the rotation matrix R as ZYX Euler angles, and then takes the derivative of the elements of the rotation matrix with respect to the Euler angles (in `computeRDerivative`

function.)Essentially, the distance metric used in plane-to-plane ICP or GICP is the Mahalanobis distance between two distributions.

The standard Mahalanobis distance is used to measure point-to-distribution distance. It is formulated as the following:

$D(\mathbf{p}) = \sqrt{(\mathbf{p} - \boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{p} - \boldsymbol{\mu})}$

Intuitively, Mahalanobis distance is the distance of the test point from the center of mass divided by the width of the ellipsoid in the direction of the test point.

If we extend this concept to measure the distance between two distributions (such as two planes), the formula becomes the following:

$D(\boldsymbol{\mu}_1, \boldsymbol{\mu}_2) = \sqrt{(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)^\top (\Sigma_1 + \Sigma_2)^{-1} (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)}$

This is to measure the residual between two mean points, weighted by the sum of their respective covariance matrices.

If we take into account that one of the point cloud will be transformed at each iteration, so as the covariance matrix of that point cloud, then the measure becomes:

$D(\boldsymbol{\mu}_1, \boldsymbol{\mu}_2') = \sqrt{(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2')^\top (\Sigma_1 + T \Sigma_2 T^\top)^{-1} (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2')}$

This is exactly what the GICP algorithm is designed to minimize (Eq. 2 from the original paper):

$T = \underset{T}{\arg\min} \sum_i d_i^{(T)^\top} \left( C_i^B + T C_i^A T^\top \right)^{-1} d_i^{(T)}$

.double gicp_loss(const Eigen::Vector3d& mean_A, const Eigen::Matrix3d& cov_A, const Eigen::Vector3d& mean_B, const Eigen::Matrix3d& cov_B, const Eigen::Matrix3d& R, const Eigen::Vector3d& t) {

Eigen::Vector3d d = mean_B - (R * mean_A + t);

Eigen::Matrix3d RCR = cov_B + R * cov_A * R.transpose();

Eigen::Matrix3d RCR_inv = RCR.inverse();

Eigen::Vector3d RCRd = RCR_inv * d;

double loss = d.dot(RCRd);

}

// PCL registration/impl/gicp.hpp

// in function computeTransformation()

if (nn_dist_sq[0] < dist_threshold) {

Eigen::Matrix3d& C1 = (*input_covariances_)[i];

Eigen::Matrix3d& C2 = (*target_covariances_)[nn_indices[0]];

Eigen::Matrix3d& M = mahalanobis_[i];

// M = R*C1

M = R * C1;

// temp = M*R' + C2 = R*C1*R' + C2

Eigen::Matrix3d temp = M * R.transpose();

temp += C2;

// M = temp^-1

M = temp.inverse();

source_indices[i] = static_cast<int>(i);

target_indices[i] = nn_indices[0];

}

// in function df() or fdf()

for (int i = 0; i < m; ++i) {

// The last coordinate, p_src[3] is guaranteed to be set to 1.0 in registration.hpp

Vector4fMapConst p_src = gicp_->tmp_src_->points[(*gicp_->tmp_idx_src_)[i]].getVector4fMap();

// The last coordinate, p_tgt[3] is guaranteed to be set to 1.0 in registration.hpp

Vector4fMapConst p_tgt = gicp_->tmp_tgt_->points[(*gicp_->tmp_idx_tgt_)[i]].getVector4fMap();

Eigen::Vector4f pp(transformation_matrix * p_src);

// The last coordinate is still guaranteed to be set to 1.0

Eigen::Vector3d res(pp[0] - p_tgt[0], pp[1] - p_tgt[1], pp[2] - p_tgt[2]);

// temp = M*res

Eigen::Vector3d temp(gicp_->mahalanobis((*gicp_->tmp_idx_src_)[i]) * res);

// Increment total error

f += double(res.transpose() * temp);

}

References: ChatGPT, https://en.wikipedia.org/wiki/Mahalanobis_distance, Fast-GICP loss function, PCL GICP implementation.

Since GICP algorithm will regularize the covariance matrices by (0.001, 1, 1) diagonal matrix, where the only information retained herein is the normal direction, it is possible to represent this covariance matrix by its point normal, and reconstruct this regularized covariance back at a later time.

To this end, we can estimate the point normals using the same method as estimating these covariance matrices, and then save this normal information in each point itself.

In fact, the normal estimation algorithm will need to first estimate a covariance matrix (using exactly the same way as GICP), and then perform eigenvalue decomposition on this covariance matrix, to obtain its smallest eigenvalue as the normal direction. Lastly, this algorithm may flip the normal to point toward viewpoint (or sensor origin, in most cases).

GICP algorithm will estimate covariance matrices as usual, and then regularize the covariance matrix by (0.001, 1, 1) diagonal matrix. Two points from source and target clouds, along with their regularized covariance matrices, will be used to compute the Mahalanobis distance between each other, and contribute to the overall optimization problem as an error term.

There are difference ways to reconstruct this. Below we compared a few methods (using Python code equivalent to original C++ implementation) and the results (in terms of Mahalanobis distance) are identical, or numerically the same.

import numpy as np

# Function to generate points on a plane

def generate_plane_points(mean, normal, n=20, noise_level=0.001):

points = np.random.rand(n, 3)

for i in range(n):

points[i] = points[i] - np.dot(points[i] - mean, normal) * normal

points[i] += np.random.normal(scale=noise_level, size=3)

return points

# Function to calculate Mahalanobis distance

def mahalanobis_distance(mu1, mu2, sigma1, sigma2):

cov_inv = np.linalg.inv(sigma1 + sigma2)

delta = mu1 - mu2

return np.sqrt(np.dot(np.dot(delta.T, cov_inv), delta))

# Function to estimate plane normal

def estimate_plane_normal(points):

covariance_matrix = np.cov(points, rowvar=False)

eigenvalues, eigenvectors = np.linalg.eig(covariance_matrix)

# The normal is the eigenvector corresponding to the smallest eigenvalue

normal = eigenvectors[:, np.argmin(eigenvalues)]

return normal

# Regularize covariance matrix

def regularize_covariance_matrix(cov, gicp_epsilon=0.001):

U, _, _ = np.linalg.svd(cov)

S = np.array([1, 1, gicp_epsilon]) # Modified singular values

cov_modified = np.dot(U, np.dot(np.diag(S), U.T))

return cov_modified

# Reconstruct regularized covariance matrix from normal

def reconstruct_covariance_from_normal(normal, gicp_epsilon=0.001):

# Ensure the normal is a unit vector

normal = normal / np.linalg.norm(normal)

# Find two vectors perpendicular to the normal

if (normal[0] == 0) and (normal[1] == 0):

vec1 = np.array([1, 0, 0])

else:

vec1 = np.array([-normal[1], normal[0], 0])

vec1 = vec1 / np.linalg.norm(vec1)

vec2 = np.cross(normal, vec1)

vec2 = vec2 / np.linalg.norm(vec2)

# Form an orthonormal basis

U = np.column_stack([vec1, vec2, normal])

# Apply regularization to the singular values

S = np.array([1, 1, gicp_epsilon]) # Regularized singular values

# Reconstruct the covariance matrix

cov_modified = np.dot(U, np.dot(np.diag(S), U.T))

return cov_modified

# LOCUS method to reconstruct covariance matrix from normal

def locus_two_plane_vectors_from_normal(normal):

normal_eig = normal / np.linalg.norm(normal)

if abs(normal_eig[2]) < 1e-7:

normal_eig[2] = 1e-7

vec1 = np.array([1.0, 0.0, -normal_eig[0] / normal_eig[2]])

vec1 = vec1 / np.linalg.norm(vec1)

vec2 = np.cross(normal_eig, vec1)

vec2 = vec2 / np.linalg.norm(vec2)

return 0.001 * np.outer(normal_eig, normal_eig) + np.outer(vec1, vec1) + np.outer(vec2, vec2)

# Define two planes

mean1 = np.array([0, 0, 0])

normal1 = np.array([1, 0, 0])

plane1 = generate_plane_points(mean1, normal1)

mean2 = np.array([1, 0, 0])

normal2 = np.array([-1, 0, 0])

plane2 = generate_plane_points(mean2, normal2)

# Calculate means and covariances

mu1 = np.mean(plane1, axis=0)

sigma1 = np.cov(plane1, rowvar=False)

mu2 = np.mean(plane2, axis=0)

sigma2 = np.cov(plane2, rowvar=False)

# Estimate normals using eigenvalue decomposition

normal1_estimated = estimate_plane_normal(plane1)

normal2_estimated = estimate_plane_normal(plane2)

# Calculate Mahalanobis distance before transformation

distance_before = mahalanobis_distance(mu1, mu2, sigma1, sigma2)

print("Mahalanobis distance before transformation: {:.3f}".format(distance_before))

# Regularize covariance matrices and recalculate Mahalanobis distance

sigma1_regular = regularize_covariance_matrix(sigma1)

sigma2_regular = regularize_covariance_matrix(sigma2)

distance_before_regular = mahalanobis_distance(mu1, mu2, sigma1_regular, sigma2_regular)

print("Mahalanobis distance before transformation (regularized): {:.3f}".format(distance_before_regular))

# Reconstruct covariance matrices from normals and recalculate Mahalanobis distance

sigma1_reconstructed = reconstruct_covariance_from_normal(normal1_estimated)

sigma2_reconstructed = reconstruct_covariance_from_normal(normal2_estimated)

distance_before_reconstructed = mahalanobis_distance(mu1, mu2, sigma1_reconstructed, sigma2_reconstructed)

print("Mahalanobis distance before transformation (reconstructed): {:.3f}".format(distance_before_reconstructed))

# Reconstruct covariance matrices from normals using LOCUS method and recalculate Mahalanobis distance

sigma1_reconstructed_locus = locus_two_plane_vectors_from_normal(normal1_estimated)

sigma2_reconstructed_locus = locus_two_plane_vectors_from_normal(normal2_estimated)

distance_before_reconstructed_locus = mahalanobis_distance(mu1, mu2, sigma1_reconstructed_locus, sigma2_reconstructed_locus)

print("Mahalanobis distance before transformation (reconstructed using LOCUS method): {:.3f}".format(distance_before_reconstructed_locus))

The related code snippets are:

// in gicp.hpp, function computeTransformation()

previous_transformation_ = transformation_;

try {

rigid_transformation_estimation_(

output, source_indices, *target_, target_indices, transformation_);

/* compute the delta from this iteration */

delta = 0.;

for (int k = 0; k < 4; k++) {

for (int l = 0; l < 4; l++) {

double ratio = 1;

if (k < 3 && l < 3) { // rotation part of the transform

ratio = 1. / rotation_epsilon_;

} else {

ratio = 1. / transformation_epsilon_;

}

double c_delta = ratio *

fabs(previous_transformation_(k, l) - transformation_(k, l));

if (c_delta > delta) {

delta = c_delta;

}

}

}

}

if (nr_iterations_ >= max_iterations_ || delta < 1) {

converged_ = true;

}

// default values in initialization

rotation_epsilon_(2e-3),

transformation_epsilon_ = 5e-4;

This is to compute the difference in the transformation matrix, represented by

`delta`

variable.The algorithm is deemed converged, if no element in the rotation matrix has a difference (computed against last iteration) greater than the

`rotation_epsilon_`

, and no element in the translation vector has a difference greater than the `transformation_epsilon_`

.`transformation_epsilon_`

can be matched to meters directly, i.e. 5e-4 means no difference greater than 0.0005 meter.`rotation_epsilon_`

corresponds to the max difference in the elements of the 3x3 rotation matrix. For example, if a rotation matrix represents a z-axis rotation of alpha degree, then the converge condition isrotation_epsilon_ = sin(alpha * M_PI / 180), or

alpha = arcsin(rotation_epsilon_) * 180 / M_PI.

When rotation_epsilon_ = 2e-3, the corresponding angle in degree is 0.11 degree. Other values can be inferred accordingly.

- Objective function:$T = \underset{T}{\arg\min} \sum_i d_i^{(T)^\top} \left( C_i^B + T C_i^A T^\top \right)^{-1} d_i^{(T)}$.
- We basically omit the derivative of the Mahalanobis matrix in the middle, and simplify it as a constant matrix denoted by M. We substitute d with its original definition into the objective function and obtain the following.
- After rearrangement:$f_i(\mathbf{R}, \mathbf{t}) = (\mathbf{p}_i - \mathbf{R} \mathbf{q}_i - \mathbf{t})^{\top} \mathbf{M} (\mathbf{p}_i - \mathbf{R} \mathbf{q}_i - \mathbf{t}).$
- Derivative of R: on:$\frac{\mathrm{d} f_i(\mathbf{R}, \mathbf{t})}{\mathrm{d} \mathbf{R}} = 2 (- \mathbf{q}_i) \times \left(\mathbf{M} (\mathbf{p}_i - \mathbf{R} \mathbf{q}_i - \mathbf{t}) \right)^{\top}.$
- Derivative of t: on:$\frac{\mathrm{d} f_i(\mathbf{R}, \mathbf{t})}{\mathrm{d} \mathbf{t}} = - 2 \mathbf{M} (\mathbf{p}_i - \mathbf{R} \mathbf{q}_i - \mathbf{t}).$
- The above two derivatives are consistent with the PCL GICP implementation in C++.
- See the math notes of GICP on overleaf for more information and derivations.

- Linearization of transformation matrix
- ZYX Euler angle for rotation
- Quaternion for rotation
- Cayley–Rodrigues Parameters (in the plane adjustment paper)

Last modified 1mo ago