tf.nn.l2_loss and tf.nn.l2_normalize

1. L2 regularization:

tf.nn.l2_loss computes a quantity of the form $\frac{1}{2}\sum_i w_i^2$. It is typically used as the regularization term in an optimization objective, to keep an over-parameterized, overly complex model from overfitting.

tf.nn.l2_loss(t, name=None)
Computes half the L2 norm of a tensor without the `sqrt`:

      output = sum(t ** 2) / 2
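A minimal sketch (TF 1.x style, matching the API above; tensor values chosen arbitrarily) verifying that l2_loss equals sum(t ** 2) / 2:

    import tensorflow as tf

    t = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    loss = tf.nn.l2_loss(t)  # (1 + 4 + 9 + 16) / 2 = 15.0

    with tf.Session() as sess:
        print(sess.run(loss))                       # 15.0
        print(sess.run(tf.reduce_sum(t ** 2) / 2))  # same value, computed manually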

But a question arises here: what exactly are the L1/L2 norms versus the L1/L2 regularization terms? We have all heard these names, and they sound very similar, so let's sort them out below:
L1 norm – (Lasso Regression)

The L1 norm of a vector is the sum of the absolute values of its elements:

$\|x\|_1 = |x_1| + \cdots + |x_m|$

L2 norm – (Ridge Regression)

The L2 norm is simply the Euclidean length:

$\|x\|_2 = \left( |x_1|^2 + \cdots + |x_m|^2 \right)^{\frac{1}{2}} = \sqrt{x^\top x}$
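
To make both formulas concrete, here is a small sketch (plain NumPy, values arbitrary) computing the two norms of one vector:

    import numpy as np

    x = np.array([3.0, -4.0])
    print(np.abs(x).sum())                             # L1 norm: |3| + |-4| = 7.0
    print(np.sqrt((x ** 2).sum()))                     # L2 norm: sqrt(9 + 16) = 5.0
    print(np.linalg.norm(x, 1), np.linalg.norm(x, 2))  # same results via NumPy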

  • L1 regularization is the sum of the absolute values of the elements of the weight vector w, usually written ||w||1
  • L2 regularization is the square root of the sum of the squares of the elements of w (note that the L2 regularization term in Ridge regression carries a square), usually written ||w||2
    In other words, an Lx norm applied to the optimization objective is called Lx regularization, as the sketch below shows.
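
A minimal sketch of an L2-regularized objective, assuming the TF 1.x API; data_loss and the weight values here are stand-ins for a real model:

    import tensorflow as tf

    w = tf.Variable([[1.0, -2.0], [0.5, 3.0]])  # toy weight matrix
    data_loss = tf.constant(0.25)               # stand-in for a real data-fit loss
    lam = 0.01                                  # regularization strength (hyperparameter)

    # total objective = data-fit term + lambda * (1/2) * sum(w ** 2)
    total_loss = data_loss + lam * tf.nn.l2_loss(w)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(total_loss))  # 0.25 + 0.01 * 14.25 / 2 = 0.32125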

2. tf.nn.l2_normalize

This op scales the input by its L2 norm; the output has the same shape as the input.
The source (TF 1.x) is as follows:

def l2_normalize(x, dim, epsilon=1e-12, name=None):
  """Normalizes along dimension `dim` using an L2 norm.

  For a 1-D tensor with `dim = 0`, computes

      output = x / sqrt(max(sum(x**2), epsilon))

  For `x` with more dimensions, independently normalizes each 1-D slice along
  dimension `dim`.

  Args:
    x: A `Tensor`.
    dim: Dimension along which to normalize.  A scalar or a vector of
      integers.
    epsilon: A lower bound value for the norm. Will use `sqrt(epsilon)` as the
      divisor if `norm < sqrt(epsilon)`.
    name: A name for this operation (optional).

  Returns:
    A `Tensor` with the same shape as `x`.
  """
  with ops.name_scope(name, "l2_normalize", [x]) as name:
    x = ops.convert_to_tensor(x, name="x")
    square_sum = math_ops.reduce_sum(math_ops.square(x), dim, keep_dims=True)
    x_inv_norm = math_ops.rsqrt(math_ops.maximum(square_sum, epsilon))
    return math_ops.multiply(x, x_inv_norm, name=name)
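
A quick 1-D check (a sketch, values arbitrary) that the op matches the docstring formula x / sqrt(max(sum(x**2), epsilon)):

    import tensorflow as tf

    x = tf.constant([3.0, 4.0])
    y = tf.nn.l2_normalize(x, dim=0)

    with tf.Session() as sess:
        print(sess.run(y))  # [0.6, 0.8], i.e. x / 5.0
        # the docstring formula, written out by hand:
        print(sess.run(x / tf.sqrt(tf.maximum(tf.reduce_sum(x ** 2), 1e-12))))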

l2_normalize(x, dim, epsilon=1e-12, name=None)
The key argument is dim. As the code above shows, dim is handled through reduce_sum, where "reduce" can be read as collapsing that dimension while summing the squares. Let's verify how dim behaves with an experiment.


x = tf.nn.l2_normalize(x_init, dim=[0, 1, 2])

Since dim may be a vector of integers, dim=[0, 1, 2] sums the squares over all three axes, so the entire 3-D tensor is divided by one global L2 norm; a single axis, e.g. dim=0, instead normalizes each 1-D slice along that axis independently.


Reposted from blog.csdn.net/m0_37561765/article/details/79645026