Ceres Solver Official Tutorial Study Notes (11): Modeling Non-linear Least Squares (Part 1)

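This page is largely a translation of the official tutorial.

Introduction

Ceres consists of two parts. One is the modeling API, which provides a rich set of tools for quickly constructing an optimization problem. The other is the solver API, which controls the minimization algorithm. This chapter covers how to model optimization problems with Ceres. The next chapter, Solving Non-linear Least Squares, discusses the various solution methods.

A long passage that repeats the introduction of the first chapter is omitted here.

Cost Function (CostFunction)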

A cost function is responsible for computing a vector of residuals and the Jacobian matrices. Concretely, consider a function $f\left(x_{1},...,x_{k}\right)$ that depends on the parameter blocks $\left[x_{1}, ... , x_{k}\right]$ . For given parameter blocks $\left[x_{1}, ... , x_{k}\right]$ , the job of the cost function is to compute the vector $f\left(x_{1},...,x_{k}\right)$ and the Jacobian matrices

$J_i = \frac{\partial}{\partial x_i} f(x_1, ..., x_k) \quad \forall i \in \{1, \ldots, k\}$

The original text is a bit long-winded here; padding the word count, perhaps?

class CostFunction {
 public:
  virtual bool Evaluate(double const* const* parameters,
                        double* residuals,
                        double** jacobians) const = 0;
  const vector<int32>& parameter_block_sizes();
  int num_residuals() const;

 protected:
  vector<int32>* mutable_parameter_block_sizes();
  void set_num_residuals(int num_residuals);
};

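The number and sizes of the parameter blocks are recorded in CostFunction::parameter_block_sizes_, and the number of residuals is recorded in CostFunction::num_residuals_. User code inheriting from this class is expected to set these two members through the corresponding accessors. This information is verified by Problem when a residual block is added to it.

Evaluation Function (CostFunction::Evaluate)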

bool CostFunction::Evaluate(double const *const *parameters,
                double *residuals,
                double **jacobians)

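CostFunction::Evaluate computes the residual vector and the Jacobian matrices.

parameters is an array of arrays. It contains parameter_block_sizes_.size() sub-arrays, one per parameter block. Each sub-array parameters[i] holds the parameters of the $i$-th block and has size parameter_block_sizes_[i]. parameters is never NULL.
residuals is an array of size num_residuals_. It is never NULL either.
jacobians is also an array of arrays, of size parameter_block_sizes_.size(). If it is NULL, the caller only wants the residuals. Each element jacobians[i] is a row-major array of size num_residuals x parameter_block_sizes_[i]. If jacobians[i] is not NULL, the implementation is required to compute the Jacobian of the residual vector with respect to parameters[i] and store it in this array, i.e.
jacobians[i][r * parameter_block_sizes_[i] + c] $=\frac{\displaystyle \partial \text{residual}[r]}{\displaystyle \partial \text{parameters}[i][c]}$
The return value indicates whether the computation of the residuals and/or Jacobians succeeded.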
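
The indexing convention above can be made concrete with a small sketch. The function below is a hypothetical free function (EvaluateDotResidual is an illustrative name, not part of Ceres) that follows the same contract as Evaluate for the residual e = k - x^T y with two 2-dimensional parameter blocks. It assumes only the row-major layout and the NULL rules described above.

```cpp
#include <cassert>

// Hypothetical stand-in for an Evaluate implementation (not a Ceres class).
// One residual, e = k - x^T y, and two parameter blocks of size 2.
bool EvaluateDotResidual(double k,
                         double const* const* parameters,
                         double* residuals,
                         double** jacobians) {
  // parameters and residuals are never null under the contract.
  assert(parameters != nullptr && residuals != nullptr);

  const int kNumResiduals = 1;
  const int kBlockSize = 2;
  const double* x = parameters[0];
  const double* y = parameters[1];

  residuals[0] = k - x[0] * y[0] - x[1] * y[1];

  // A null jacobians pointer means "residuals only".
  if (jacobians == nullptr) return true;

  for (int r = 0; r < kNumResiduals; ++r) {
    for (int c = 0; c < kBlockSize; ++c) {
      // Row-major: entry (r, c) of block i lives at r * block_size + c.
      if (jacobians[0] != nullptr) {
        jacobians[0][r * kBlockSize + c] = -y[c];  // d e / d x[c]
      }
      if (jacobians[1] != nullptr) {
        jacobians[1][r * kBlockSize + c] = -x[c];  // d e / d y[c]
      }
    }
  }
  return true;
}
```

Passing a null pointer for jacobians, or for an individual jacobians[i], skips the corresponding derivative computation.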

Sized Cost Function (SizedCostFunction)

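This class inherits from CostFunction. If the sizes of the parameter blocks and the number of residuals are known at compile time, the user can specify them as template parameters and use SizedCostFunction. The user then only needs to implement CostFunction::Evaluate(). The listing below is the declaration of SizedCostFunction itself, not a usage example.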

template<int kNumResiduals,
         int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
         int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
class SizedCostFunction : public CostFunction {
 public:
  virtual bool Evaluate(double const* const* parameters,
                        double* residuals,
                        double** jacobians) const = 0;
};

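Automatic Differentiation (AutoDiffCostFunction)

Defining a CostFunction or a SizedCostFunction can be a tedious and error-prone process, especially when computing derivatives. To this end, Ceres provides AutoDiffCostFunction.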

template <typename CostFunctor,
       int kNumResiduals,  // Number of residuals, or ceres::DYNAMIC.
       int N0,       // Number of parameters in block 0.
       int N1 = 0,   // Number of parameters in block 1.
       int N2 = 0,   // Number of parameters in block 2.
       int N3 = 0,   // Number of parameters in block 3.
       int N4 = 0,   // Number of parameters in block 4.
       int N5 = 0,   // Number of parameters in block 5.
       int N6 = 0,   // Number of parameters in block 6.
       int N7 = 0,   // Number of parameters in block 7.
       int N8 = 0,   // Number of parameters in block 8.
       int N9 = 0>   // Number of parameters in block 9.
class AutoDiffCostFunction : public
SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
 public:
  explicit AutoDiffCostFunction(CostFunctor* functor);
  // Ignore the template parameter kNumResiduals and use
  // num_residuals instead.
  AutoDiffCostFunction(CostFunctor* functor, int num_residuals);
};

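To get an automatically differentiated cost function, you must define a class with a templated operator() (a functor) that computes the cost function in terms of the template parameter T. The autodiff framework substitutes Jet objects for T as needed to compute the derivatives, but this is hidden, and you should write the function as if T were a double-precision scalar. The function must write the computed value in the last argument (the only non-const one) and return true to indicate success.
For example, consider a scalar error $e = k - x^\top y$ , where both $x$ and $y$ are two-dimensional vector parameters and $k$ is a constant. This form of error, the difference between a constant and an expression, is common in least squares problems. For example, $x^\top y$ might be the model expectation for a series of measurements, with one instance of the cost functor for each measurement $k$ . What is actually added to the Problem is $e^2$ , that is, $(k - x^\top y)^2$ ; the squaring is done by the Ceres optimization framework. The code for this example is as follows: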

class MyScalarCostFunctor {
 public:
  MyScalarCostFunctor(double k): k_(k) {}

  template <typename T>
  bool operator()(const T* const x , const T* const y, T* e) const {
    // e = k - x^T y for two-dimensional x and y.
    e[0] = k_ - x[0] * y[0] - x[1] * y[1];
    return true;
  }

 private:
  double k_;
};

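Note that in the declaration of operator() the input parameters come first; they are all const pointers to arrays of T. If there were more input parameters, they would follow y. The output is always the last parameter and is also a pointer to an array. In the example above, e is a scalar, so only e[0] is set.
Given this class definition, the automatically differentiated cost function for it can be constructed as follows: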

CostFunction* cost_function
    = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
        new MyScalarCostFunctor(1.0));              ^  ^  ^
                                                    |  |  |
                        Dimension of residual ------+  |  |
                        Dimension of x ----------------+  |
                        Dimension of y -------------------+

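In this example there is one instance for each measurement k. The template parameters 1, 2, 2 describe the functor as computing a 1-dimensional output from two 2-dimensional inputs. AutoDiffCostFunction also supports cost functions whose number of residuals is determined at run time. For example: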

CostFunction* cost_function
    = new AutoDiffCostFunction<CostFunctorWithDynamicNumResiduals, DYNAMIC, 2, 2>(
        new CostFunctorWithDynamicNumResiduals(1.0),                  ^     ^  ^
        runtime_number_of_residuals); <----+                          |     |  |
                                           |                          |     |  |
                                           |                          |     |  |
          Actual number of residuals ------+                          |     |  |
          Indicate dynamic number of residuals -----------------------+     |  |
          Dimension of x ---------------------------------------------------+  |
          Dimension of y ------------------------------------------------------+

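Ceres currently supports cost functions with up to 10 independent parameter blocks, with no limit on the dimension of each block.

Note that a common beginner's mistake is to read the numbers in the template parameters as the number of parameters. In fact, each number is the dimension of one parameter block, and the two notions must not be confused. In this example x and y are both two-dimensional, so the template parameters contain two 2s.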
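
To see why the functor is templated on T, it can help to evaluate it with a toy dual-number type. The sketch below assumes a hypothetical struct Dual that carries a value and a single derivative component. It is a simplified stand-in for illustration, not the Jet type used by Ceres, and DerivativeWrtX is likewise an illustrative helper.

```cpp
#include <cassert>

// Hypothetical minimal dual number: a value plus one derivative component.
// Illustration only; this is not the Jet type that Ceres uses.
struct Dual {
  double value;
  double deriv;
};

// Product rule: (ab)' = a'b + ab'.
inline Dual operator*(const Dual& a, const Dual& b) {
  return {a.value * b.value, a.deriv * b.value + a.value * b.deriv};
}
inline Dual operator-(const Dual& a, const Dual& b) {
  return {a.value - b.value, a.deriv - b.deriv};
}
// A plain double acts as a constant with zero derivative.
inline Dual operator-(double a, const Dual& b) {
  return {a - b.value, -b.deriv};
}

// Same functor shape as in the text: e = k - x^T y.
class MyScalarCostFunctor {
 public:
  explicit MyScalarCostFunctor(double k) : k_(k) {}

  template <typename T>
  bool operator()(const T* const x, const T* const y, T* e) const {
    e[0] = k_ - x[0] * y[0] - x[1] * y[1];
    return true;
  }

 private:
  double k_;
};

// Derivative of e with respect to x[index], obtained by seeding that
// component's derivative with 1 and running the unchanged functor.
double DerivativeWrtX(const MyScalarCostFunctor& functor,
                      const double* x, const double* y, int index) {
  assert(index == 0 || index == 1);
  Dual xd[2] = {{x[0], 0.0}, {x[1], 0.0}};
  Dual yd[2] = {{y[0], 0.0}, {y[1], 0.0}};
  xd[index].deriv = 1.0;
  Dual e{0.0, 0.0};
  functor(xd, yd, &e);
  return e.deriv;
}
```

The same operator() body runs with T = double to get the residual and with T = Dual to get a derivative, which is the reason the functor must not hard-code double.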

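Dynamic Automatic Differentiation (DynamicAutoDiffCostFunction)

AutoDiffCostFunction requires that the number of parameter blocks and their sizes be known at compile time. It also has an upper limit of 10 parameter blocks. In a number of applications this is not enough, for example Bezier curve fitting or neural network training. In such cases DynamicAutoDiffCostFunction can be used. As with AutoDiffCostFunction, the user must define a templated functor, but the signature is slightly different. The expected interface for the cost functor is: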

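struct MyCostFunctor {
  template<typename T>
  bool operator()(T const* const* parameters, T* residuals) const {
    // The individual parameter arguments are replaced by a single
    // array of parameter blocks: parameters[i] is the i-th block.
    // Compute the residuals here, then report success.
    return true;
  }
};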

Since the sizes of the parameter blocks are determined at run time, you must also specify them after creating the DynamicAutoDiffCostFunction.

DynamicAutoDiffCostFunction<MyCostFunctor, 4>* cost_function =
  new DynamicAutoDiffCostFunction<MyCostFunctor, 4>(
    new MyCostFunctor());
cost_function->AddParameterBlock(5);
cost_function->AddParameterBlock(10);
cost_function->SetNumResiduals(21);

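Under the hood, the implementation evaluates the cost function multiple times, computing a small set of the derivatives on each pass (four by default, controlled by the Stride template parameter). There is a performance tradeoff in the size of the passes: smaller strides are more cache efficient but result in a larger number of passes, while larger strides reduce the number of passes but can destroy cache locality. The optimal value depends on the number and sizes of the parameter blocks. As a rule of thumb, try AutoDiffCostFunction before resorting to DynamicAutoDiffCostFunction.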
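
As a sketch of the array-of-arrays functor interface used in this section, the example below fits a polynomial whose number of coefficients is only known at run time. PolynomialResidual and EvaluateResidual are hypothetical names chosen for illustration; the code only exercises the functor with T = double and does not call into Ceres.

```cpp
#include <cassert>
#include <vector>

// Hypothetical functor with the (T const* const*, T*) interface.
// parameters[0] holds num_coeffs polynomial coefficients c0, c1, ...;
// the single residual is y - (c0 + c1 * x + c2 * x^2 + ...).
struct PolynomialResidual {
  PolynomialResidual(double x, double y, int num_coeffs)
      : x_(x), y_(y), num_coeffs_(num_coeffs) {
    assert(num_coeffs > 0);
  }

  template <typename T>
  bool operator()(T const* const* parameters, T* residuals) const {
    const T* coeffs = parameters[0];
    // Horner's scheme, starting from the highest-order coefficient.
    T value = T(0.0);
    for (int i = num_coeffs_ - 1; i >= 0; --i) {
      value = value * x_ + coeffs[i];
    }
    residuals[0] = y_ - value;
    return true;
  }

  double x_;
  double y_;
  int num_coeffs_;
};

// Illustrative helper: evaluates the functor with plain doubles.
double EvaluateResidual(const PolynomialResidual& functor,
                        const std::vector<double>& coeffs) {
  const double* const blocks[] = {coeffs.data()};
  double residual = 0.0;
  functor(blocks, &residual);
  return residual;
}
```

With a functor of this shape, the block size passed to AddParameterBlock would be the same num_coeffs value and SetNumResiduals would receive 1.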

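Numeric Differentiation (NumericDiffCostFunction)

In some cases it is not practical to define a templated cost functor, for example when evaluating the residuals involves a call to an external library function. In such situations numeric differentiation is needed.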

template <typename CostFunctor,
          NumericDiffMethodType method = CENTRAL,
          int kNumResiduals,  // Number of residuals, or ceres::DYNAMIC.
          int N0,       // Number of parameters in block 0.
          int N1 = 0,   // Number of parameters in block 1.
          int N2 = 0,   // Number of parameters in block 2.
          int N3 = 0,   // Number of parameters in block 3.
          int N4 = 0,   // Number of parameters in block 4.
          int N5 = 0,   // Number of parameters in block 5.
          int N6 = 0,   // Number of parameters in block 6.
          int N7 = 0,   // Number of parameters in block 7.
          int N8 = 0,   // Number of parameters in block 8.
          int N9 = 0>   // Number of parameters in block 9.
class NumericDiffCostFunction : public
SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
};

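To get a numerically differentiated cost function, you must define a class with an operator() (a functor) that computes the residuals. The functor must write the computed values in the last argument (the only non-const one) and return true to indicate success. See the CostFunction section for details on how the parameters are laid out. Here is a small example: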

struct ScalarFunctor {
 public:
  bool operator()(const double* const x1,
                  const double* const x2,
                  double* residuals) const;
};

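Consider a scalar error $e = k - x'y$ , where $x$ and $y$ are two-dimensional column vectors, $'$ denotes transposition, and $k$ is a constant. This form of error, the difference between a constant and an expression, is very common in least squares problems. For example, $x'y$ might be the model expectation for a series of measurements, with one instance of the cost functor for each measurement $k$ . The code is as follows: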

class MyScalarCostFunctor {
 public:
  MyScalarCostFunctor(double k): k_(k) {}

  bool operator()(const double* const x,
                  const double* const y,
                  double* residuals) const {
    // e = k - x'y; both products are subtracted.
    residuals[0] = k_ - x[0] * y[0] - x[1] * y[1];
    return true;
  }

 private:
  double k_;
};

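Note that the input parameters $x$ and $y$ come first in the parameter list, and both are pointers to arrays of doubles. Any further inputs would follow $y$ . The output is always the last parameter and is also a pointer to an array. In the example above the residual is a scalar, so only residuals[0] is set. Given this functor definition, a cost function that uses central differences can be constructed as follows: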

CostFunction* cost_function
    = new NumericDiffCostFunction<MyScalarCostFunctor, CENTRAL, 1, 2, 2>(
        new MyScalarCostFunctor(1.0));                    ^     ^  ^  ^
                                                          |     |  |  |
                              Finite Differencing Scheme -+     |  |  |
                              Dimension of residual ------------+  |  |
                              Dimension of x ----------------------+  |
                              Dimension of y -------------------------+

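The template parameters have the same meaning as for automatic differentiation, with one extra parameter for the differencing scheme. Numeric differentiation likewise supports a number of residuals that is determined at run time: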

CostFunction* cost_function
    = new NumericDiffCostFunction<CostFunctorWithDynamicNumResiduals, CENTRAL, DYNAMIC, 2, 2>(
        new CostFunctorWithDynamicNumResiduals(1.0),                              ^     ^  ^
        TAKE_OWNERSHIP,                                                           |     |  |
        runtime_number_of_residuals); <----+                                      |     |  |
                                           |                                      |     |  |
                                           |                                      |     |  |
          Actual number of residuals ------+                                      |     |  |
          Indicate dynamic number of residuals -----------------------------------+     |  |
          Dimension of x ---------------------------------------------------------------+  |
          Dimension of y ------------------------------------------------------------------+

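Ceres supports up to ten parameter blocks, with no limit on the dimension of each. Three differencing schemes are available:
FORWARD (forward differences), CENTRAL (central differences), and RIDDERS (Ridders' method). The principles and use cases of the three were explained in an earlier tutorial and are not repeated here. In general, try central differences first; then, based on the results, switch to forward differences for speed or to Ridders' method for accuracy.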
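
The speed versus accuracy tradeoff between the first two schemes can be illustrated with the textbook formulas. The sketch below is not the Ceres implementation; ForwardDifference and CentralDifference are hypothetical helper names, the step h is passed in by hand, and the way Ceres itself chooses step sizes is not reproduced. Ridders' method is omitted.

```cpp
#include <cassert>
#include <functional>

// Forward difference: one extra function evaluation per parameter,
// truncation error proportional to h.
double ForwardDifference(const std::function<double(double)>& f,
                         double x, double h) {
  assert(h > 0.0);
  return (f(x + h) - f(x)) / h;
}

// Central difference: two extra function evaluations per parameter,
// truncation error proportional to h * h.
double CentralDifference(const std::function<double(double)>& f,
                         double x, double h) {
  assert(h > 0.0);
  return (f(x + h) - f(x - h)) / (2.0 * h);
}
```

For a smooth function and a moderate step, the central estimate is typically closer to the true derivative, at the cost of roughly twice as many evaluations of the functor.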

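Numeric Differentiation & LocalParameterization

If your cost function depends on a parameter block that must lie on a manifold, and the functor cannot be evaluated for values of that parameter block that are not on the manifold, then you may have problems numerically differentiating such functors.
This part goes beyond what I need, so I skip it.
For the original text, see http://ceres-solver.org/nnls_modeling.html#numeric-differentiation-localparameterization

Dynamic Numeric Differentiation (DynamicNumericDiffCostFunction)

If there are more than 10 parameter blocks, the ordinary NumericDiffCostFunction no longer works. In that case DynamicNumericDiffCostFunction can be used. As with ordinary numeric differentiation, the user must supply a functor, which can be defined like this: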

struct MyCostFunctor {
  bool operator()(double const* const* parameters, double* residuals) const {
    // Compute the residuals from parameters[i][j] here, then report success.
    return true;
  }
};

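Since the sizes of the parameter blocks are only known at run time, the user must specify them after creating the cost function, as follows: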

DynamicNumericDiffCostFunction<MyCostFunctor>* cost_function =
  new DynamicNumericDiffCostFunction<MyCostFunctor>(new MyCostFunctor);
cost_function->AddParameterBlock(5);
cost_function->AddParameterBlock(10);
cost_function->SetNumResiduals(21);

As a rule of thumb, likewise, try NumericDiffCostFunction before resorting to DynamicNumericDiffCostFunction.
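
To make the idea of numeric differentiation over run-time-sized blocks more tangible, here is a rough sketch that builds a row-major Jacobian for one parameter block with central differences. ResidualFn and NumericJacobian are hypothetical names, and the fixed step h is an assumption for illustration; this is not how DynamicNumericDiffCostFunction is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical residual callback with the array-of-arrays interface.
using ResidualFn = std::function<bool(double const* const*, double*)>;

// Central-difference Jacobian of all residuals with respect to one
// parameter block, stored row-major as num_residuals x block size.
// params is taken by value so it can be perturbed locally.
// Returns an empty vector if the callback reports failure.
std::vector<double> NumericJacobian(const ResidualFn& f,
                                    std::vector<std::vector<double>> params,
                                    std::size_t num_residuals,
                                    std::size_t block,
                                    double h) {
  assert(block < params.size() && h > 0.0);
  const std::size_t size = params[block].size();

  // Pointers into the local copy; in-place edits never reallocate.
  std::vector<const double*> ptrs;
  for (const auto& p : params) ptrs.push_back(p.data());

  std::vector<double> plus(num_residuals), minus(num_residuals);
  std::vector<double> jacobian(num_residuals * size);
  for (std::size_t c = 0; c < size; ++c) {
    const double original = params[block][c];
    params[block][c] = original + h;
    if (!f(ptrs.data(), plus.data())) return {};
    params[block][c] = original - h;
    if (!f(ptrs.data(), minus.data())) return {};
    params[block][c] = original;
    for (std::size_t r = 0; r < num_residuals; ++r) {
      jacobian[r * size + c] = (plus[r] - minus[r]) / (2.0 * h);
    }
  }
  return jacobian;
}
```

Every column costs two evaluations of the functor, so the total work grows with the sum of the block sizes.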

CostFunctionToFunctor

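CostFunctionToFunctor is an adapter class. It lets users turn CostFunction objects into templated functors that can be used for automatic differentiation. This allows automatically, analytically, and numerically differentiated cost functions to be mixed freely. Here is an example: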

class IntrinsicProjection : public SizedCostFunction<2, 5, 3> {
  public:
    IntrinsicProjection(const double* observation);
    virtual bool Evaluate(double const* const* parameters,
                          double* residuals,
                          double** jacobians) const;
};

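The listing above declares a cost function that projects a point in space onto the image plane (the predicted projection) and takes the difference from the observed projection. It can compute its residuals, and its Jacobians can be computed either analytically or by numeric differentiation.

Now we would like to compose this cost function with the camera extrinsics, i.e., rotation and translation. Suppose we have a templated function like this: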

template<typename T>
void RotateAndTranslatePoint(const T* rotation,
                             const T* translation,
                             const T* point,
                             T* result);

Then we can write:

struct CameraProjection {
  CameraProjection(double* observation)
  : intrinsic_projection_(new IntrinsicProjection(observation)) {
  }

  template <typename T>
  bool operator()(const T* rotation,
                  const T* translation,
                  const T* intrinsics,
                  const T* point,
                  T* residual) const {
    T transformed_point[3];
    RotateAndTranslatePoint(rotation, translation, point, transformed_point);

    // Note that intrinsic_projection_ is called here just like any other templated functor.
    return intrinsic_projection_(intrinsics, transformed_point, residual);
  }

 private:
  CostFunctionToFunctor<2,5,3> intrinsic_projection_;
};

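Note that CostFunctionToFunctor takes ownership of the CostFunction that is passed to its constructor.

In the example above, we assumed that IntrinsicProjection is a CostFunction capable of evaluating its residuals and their derivatives. Suppose that is not the case and IntrinsicProjection is instead defined as follows: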

struct IntrinsicProjection {
  IntrinsicProjection(const double* observation) {
    observation_[0] = observation[0];
    observation_[1] = observation[1];
  }

  // const, so the functor can be invoked through a const object.
  bool operator()(const double* calibration,
                  const double* point,
                  double* residuals) const {
    double projection[2];
    ThirdPartyProjectionFunction(calibration, point, projection);
    residuals[0] = observation_[0] - projection[0];
    residuals[1] = observation_[1] - projection[1];
    return true;
  }
  double observation_[2];
};

ThirdPartyProjectionFunction here is a third-party library function that we cannot modify. We therefore want to compute its derivatives by numeric differentiation.

struct CameraProjection {
  CameraProjection(double* observation)
    : intrinsic_projection_(
        new NumericDiffCostFunction<IntrinsicProjection, CENTRAL, 2, 5, 3>(
          new IntrinsicProjection(observation))) {
  }

  template <typename T>
  bool operator()(const T* rotation,
                  const T* translation,
                  const T* intrinsics,
                  const T* point,
                  T* residuals) const {
    T transformed_point[3];
    RotateAndTranslatePoint(rotation, translation, point, transformed_point);
    return intrinsic_projection_(intrinsics, transformed_point, residuals);
  }

 private:
  CostFunctionToFunctor<2,5,3> intrinsic_projection_;
};

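I suspect the last code listing in the official tutorial was pasted incorrectly. The real code should wrap the third-party library function in a functor, build a numerically differentiated CostFunction from it in the usual way, and then adapt that with CostFunctionToFunctor into a new templated functor that automatic differentiation can use. There is an example of this in the earlier tutorial, Ceres Solver Official Tutorial Study Notes (10): Interfacing with Automatic Differentiation.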