【svopro】图像对齐align2D

1. 数学描述

对齐2D图像的过程可以看作是通过最小化重投影误差来更新当前图像中的特征点位置和几何变换参数。具体来说，这个过程可以分为以下几个步骤：

预处理

定义半个块的大小 $halfpatch\_size$ 、块的大小 $patch\_size$ 和块的面积 $patch\_area$ 。定义参考块导数数组 $ref\_patch\_dx$ 和 $ref\_patch\_dy$ ，以及一个4x4的零矩阵 $H$ 。

计算梯度和Hessian矩阵

遍历参考块的像素，计算每个像素点的梯度值，并根据是否使用仿射补偿来设置雅可比矩阵 $J$ 的值。同时累加导数矩阵的乘积，得到Hessian矩阵 $H$ 。

具体地，对于参考块中的像素 $(x, y)$ , 其灰度值为 $I (x, y)$ ，则其梯度值为：

$\begin{aligned} \nabla I&=[I_x,I_y] \\ I_x&=\frac{1}{2} (I(x+1,y)-I(x-1,y)) \\ I_y&=\frac{1}{2} (I(x,y+1)-I(x,y-1)) \end{aligned}$

如果不使用仿射补偿，则将Hessian矩阵 $H$ 的仿射参数块置为单位矩阵，并将残差块置为零。

如果使用仿射补偿，则雅可比矩阵 $J$ 的形式为：

$\left(\begin{matrix} I_x & 0 & x & 0 \\ 0 & I_y & 0 & -I \end{matrix}\right)$

其中， $x, y$ 为像素位置， $I$ 为像素灰度值。

迭代优化

初始化像素位置为初始估计值，更新量为零。进入迭代循环，对于每次迭代:

3.1 计算像素在当前图像中的位置

$u_r=\lfloor u \rfloor, v_r=\lfloor v \rfloor$

其中， $(u, v)$ 为当前像素位置。

3.2 计算插值权重

$\begin{aligned} subpix\_x&=u-u_r \\ subpix\_y&=v-v_r \\ wTL&=(1-subpix\_x)(1-subpix\_y) \\ wTR&=subpix\_x(1-subpix\_y) \\ wBL&=(1-subpix\_x)subpix\_y \\ wBR&=subpix\_xsubpix\_y \end{aligned}$

3.3 对搜索块区域进行插值

$search\_pixel=wTL\times I(u_r,v_r)+wTR\times I(u_r+1,v_r)+wBL\times I(u_r,v_r+1) + wBR\times I(u_r+1,v_r+1)$

3.4 计算残差

$res=search\_pixel-\alpha\times ref\_patch+mean\_diff$

其中， $\alpha$ 为增益系数。

3.5 计算Jacobian矩阵

$J=\left(\begin{matrix} I_x \\ I_y \\ 1 & 0 \\ 0 & -I \end{matrix}\right)$

其中， $I_x,I_y$ 为参考块中对应像素点的导数。

如果不使用仿射补偿，则将 $J$ 的仿射参数置为零。

3.6 更新Hessian矩阵和残差

$\begin{aligned} H &= H + J \times J^T \\ Jres &= J^T \times res \\ \end{aligned}$

3.7 求解更新量

$update=H^{-1}\times Jres$

3.8 更新像素位置和几何变换参数

$\begin{aligned} u &\leftarrow u + update[0] \\ v &\leftarrow v + update[1] \\ mean\_diff &\leftarrow mean\_diff + update[2] \\ \alpha &\leftarrow \alpha + update[3] \end{aligned}$

3.9 判断是否收敛

如果更新量的平方和小于指定的终止条件，则认为对齐收敛。

最后更新估计的像素位置 $cur\_px\_estimate$ ，并返回是否收敛的结果。

2. 代码描述

这个接口的作用是对2D图像进行对齐，使其在参考块区域内与当前图像最匹配。
cur_img：当前图像。
ref_patch_with_border：带边界的参考块。
ref_patch：不带边界的参考块。
n_iter：迭代次数。
affine_est_offset：是否估计平移偏移。
affine_est_gain：是否估计增益。
cur_px_estimate：当前像素点的估计。
no_simd：是否禁用SIMD指令。
each_step：每一步的位置信息（可选）。

bool align2D(const cv::Mat& cur_img, uint8_t* ref_patch_with_border, uint8_t* ref_patch,
    const int n_iter, const bool affine_est_offset, const bool affine_est_gain,
    Keypoint& cur_px_estimate, bool no_simd, std::vector<Eigen::Vector2f>* each_step) {
    
    

#ifdef __ARM_NEON__
  if (!no_simd)
    return align2D_NEON(cur_img, ref_patch_with_border, ref_patch, n_iter, cur_px_estimate);
#endif
如果支持ARM NEON指令集并且不禁用SIMD指令，则调用align2D_NEON函数进行处理。

  if (each_step)
    each_step->clear();

如果提供了每一步的位置信息的容器each_step，则清空该容器。

  const int halfpatch_size_ = 4;
  const int patch_size_ = 8;
  const int patch_area_ = 64;
  bool converged = false;

定义半个块的大小、块的大小和块的面积，并初始化是否收敛的标志为假。

  // We optimize feature position and two affine parameters.
  // compute derivative of template and prepare inverse compositional
  float __attribute__((__aligned__(16))) ref_patch_dx[patch_area_];
  float __attribute__((__aligned__(16))) ref_patch_dy[patch_area_];
  Eigen::Matrix4f H;
  H.setZero();

我们优化特征的位置以及两个仿射参数。计算模板的导数，并准备逆组合。定义带有对齐内存对齐属性的参考块导数数组，以及一个4x4的零矩阵H。

  // compute gradient and hessian
  const int ref_step = patch_size_ + 2;
  float* it_dx = ref_patch_dx;
  float* it_dy = ref_patch_dy;
  for (int y = 0; y < patch_size_; ++y) {
    
    
    uint8_t* it = ref_patch_with_border + (y + 1) * ref_step + 1;
    for (int x = 0; x < patch_size_; ++x, ++it, ++it_dx, ++it_dy) {
    
    
      Eigen::Vector4f J;
      J[0] = 0.5 * (it[1] - it[-1]);
      J[1] = 0.5 * (it[ref_step] - it[-ref_step]);

      // If not using the affine compensation, force the jacobian to be zero.
      J[2] = affine_est_offset ? 1.0 : 0.0;
      J[3] = affine_est_gain ? -1.0 * it[0] : 0.0;

      *it_dx = J[0];
      *it_dy = J[1];
      H += J * J.transpose();
    }
  }

计算梯度和Hessian矩阵。遍历参考块的像素，计算每个像素点的梯度值，并根据是否使用仿射补偿来设置雅可比矩阵J的值。同时累加导数矩阵的乘积，得到Hessian矩阵H。

  // If not use affine compensation, force update to be zero by
  // * setting the affine parameter block in H to identity
  // * setting the residual block to zero (see below)
  if (!affine_est_offset) {
    
    
    H(2, 2) = 1.0;
  }
  if (!affine_est_gain) {
    
    
    H(3, 3) = 1.0;
  }
  Eigen::Matrix4f Hinv = H.inverse();
  float mean_diff = 0;
  float alpha = 1.0;

如果不使用仿射补偿，则将Hessian矩阵H的仿射参数块置为单位矩阵，并将残差块置为零。计算Hessian矩阵的逆矩阵Hinv，并初始化平均差mean_diff和alpha。

  // Compute pixel location in new image:
  float u = cur_px_estimate.x();
  float v = cur_px_estimate.y();

  if (each_step)
    each_step->push_back(Eigen::Vector2f(u, v));

计算像素在新图像中的位置，并将其加入每一步的位置信息（如果提供了容器）。

  // termination condition
  const float min_update_squared =
      0.03 * 0.03;  // TODO I suppose this depends on the size of the image (ate)
  const int cur_step = cur_img.step.p[0];
  Eigen::Vector4f update;
  update.setZero();
  for (int iter = 0; iter < n_iter; ++iter) {
    
    
    int u_r = std::floor(u);
    int v_r = std::floor(v);
    if (u_r < halfpatch_size_ || v_r < halfpatch_size_ || u_r >= cur_img.cols - halfpatch_size_ ||
        v_r >= cur_img.rows - halfpatch_size_)
      break;

    if (std::isnan(u) || std::isnan(v))
      return false;

定义迭代终止条件、当前图像的步长，以及更新量的初始值。进入迭代循环，对于每次迭代，计算像素在当前图像中的位置（舍入到整数）。如果超出图像边界或出现无效的位置，则退出迭代并返回false。


    // compute interpolation weights
    float subpix_x = u - u_r;
    float subpix_y = v - v_r;
    float wTL = (1.0 - subpix_x) * (1.0 - subpix_y);
    float wTR = subpix_x * (1.0 - subpix_y);
    float wBL = (1.0 - subpix_x) * subpix_y;
    float wBR = subpix_x * subpix_y;

    // loop through search_patch, interpolate
    uint8_t* it_ref = ref_patch;
    float* it_ref_dx = ref_patch_dx;
    float* it_ref_dy = ref_patch_dy;
    Eigen::Vector4f Jres;
    Jres.setZero();
    for (int y = 0; y < patch_size_; ++y) {
    
    
      uint8_t* it =
          (uint8_t*)cur_img.data + (v_r + y - halfpatch_size_) * cur_step + u_r - halfpatch_size_;
      for (int x = 0; x < patch_size_; ++x, ++it, ++it_ref, ++it_ref_dx, ++it_ref_dy) {
    
    
        float search_pixel =
            wTL * it[0] + wTR * it[1] + wBL * it[cur_step] + wBR * it[cur_step + 1];
        float res = search_pixel - alpha * (*it_ref) + mean_diff;
        Jres[0] -= res * (*it_ref_dx);
        Jres[1] -= res * (*it_ref_dy);

        if (affine_est_offset) {
    
    
          Jres[2] -= res;
        }

        if (affine_est_gain) {
    
    
          Jres[3] -= (-1) * res * (*it_ref);
        }
      }
    }

    if (!affine_est_offset) {
    
    
      Jres[2] = 0.0;
    }
    if (!affine_est_gain) {
    
    
      Jres[3] = 0.0;
    }

计算插值权重，并对搜索块区域进行插值。根据权重和像素差异计算残差Jres。如果不使用仿射补偿，则将Jres的仿射参数置为零。

    update = Hinv * Jres;
    u += update[0];
    v += update[1];
    mean_diff += update[2];
    alpha += update[3];

    if (each_step)
      each_step->push_back(Eigen::Vector2f(u, v));

    if (update[0] * update[0] + update[1] * update[1] < min_update_squared) {
    
    
      converged = true;
      break;
    }
  }

  cur_px_estimate << u, v;
  (void)no_simd;

  return converged;
}

根据Hessian矩阵的逆矩阵Hinv、残差Jres和更新量update，更新像素的位置、平均差和增益系数alpha。如果更新量的平方和小于指定的终止条件，则认为对齐收敛。最后更新估计的像素位置cur_px_estimate，并返回是否收敛的结果。

3. 参考文献：

L. Lucchi, C. Gasser, T. Hofmann, and V. Friebel, “Efficient deformable registration of non-linearly distorted 3D point-sets,” in Proc. IEEE Int. Conf. Comput. Vis., 2013, pp. 3059–3066.
OpenCV官方文档