1. L2 regularization:
tf.nn.l2_loss is typically used as the regularization term in an optimization objective, to keep the parameters from becoming so large or numerous that the model overfits.
tf.nn.l2_loss(var)
Computes half the L2 norm of a tensor without the `sqrt`:
output = sum(t ** 2) / 2
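As a quick sanity check, the same quantity can be computed in plain NumPy (a sketch of the formula above, not TensorFlow's actual implementation):

```python
import numpy as np

def l2_loss(t):
    # Half the sum of squares: output = sum(t ** 2) / 2.
    # Note there is no sqrt -- this is half the *squared* L2 norm.
    return np.sum(np.square(t)) / 2

w = np.array([1.0, 2.0, 3.0])
print(l2_loss(w))  # 0.5 * (1 + 4 + 9) = 7.0
```

Because the sqrt is skipped, the gradient is simply `t` itself, which is one reason this form is preferred as a penalty term.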
A question naturally comes up here: what exactly are the L1/L2 norms versus the L1/L2 regularization terms? The names sound very similar, so let's sort them out:
L1 norm – (Lasso Regression)
The L1 norm is the sum of the absolute values of a vector's elements: ||x||1 = Σi |xi|
L2 norm – (Ridge Regression)
The L2 norm is the Euclidean length: ||x||2 = sqrt(Σi xi^2)
- L1 regularization penalizes the sum of the absolute values of the elements of the weight vector w, usually written ||w||1
- L2 regularization penalizes the square root of the sum of squares of the elements of w, usually written ||w||2 (note that in Ridge regression the penalty term is actually the squared norm, ||w||2^2, which is why a square appears on the L2 term)
In other words, an Lx norm applied as a penalty in the optimization objective is called Lx regularization.
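To make the distinction concrete, here is a minimal NumPy sketch computing both norms of the same vector:

```python
import numpy as np

w = np.array([3.0, -4.0])

l1 = np.sum(np.abs(w))        # L1 norm: sum of absolute values -> 7.0
l2 = np.sqrt(np.sum(w ** 2))  # L2 norm: Euclidean length -> 5.0

# np.linalg.norm agrees with the hand-written versions.
assert l1 == np.linalg.norm(w, ord=1)
assert l2 == np.linalg.norm(w, ord=2)
print(l1, l2)  # 7.0 5.0
```

Adding `lambda * l1` or `lambda * l2**2` to a loss function is exactly what "L1/L2 regularization" means in practice.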
2. tf.nn.l2_normalize
This op scales the input by its L2 norm; the output has the same shape as the input.
The source is as follows:
def l2_normalize(x, dim, epsilon=1e-12, name=None):
  """Normalizes along dimension `dim` using an L2 norm.

  For a 1-D tensor with `dim = 0`, computes

      output = x / sqrt(max(sum(x**2), epsilon))

  For `x` with more dimensions, independently normalizes each 1-D slice along
  dimension `dim`.

  Args:
    x: A `Tensor`.
    dim: Dimension along which to normalize. A scalar or a vector of
      integers.
    epsilon: A lower bound value for the norm. Will use `sqrt(epsilon)` as the
      divisor if `norm < sqrt(epsilon)`.
    name: A name for this operation (optional).

  Returns:
    A `Tensor` with the same shape as `x`.
  """
  with ops.name_scope(name, "l2_normalize", [x]) as name:
    x = ops.convert_to_tensor(x, name="x")
    square_sum = math_ops.reduce_sum(math_ops.square(x), dim, keep_dims=True)
    x_inv_norm = math_ops.rsqrt(math_ops.maximum(square_sum, epsilon))
    return math_ops.multiply(x, x_inv_norm, name=name)
l2_normalize(x, dim, epsilon=1e-12, name=None)
The main argument is dim. As the code above shows, dim is interpreted the same way as in reduce_sum — "reduce" meaning the given dimension(s) are collapsed when summing the squares (keep_dims=True keeps them as size 1). Let's verify how dim behaves with an experiment.
x = tf.nn.l2_normalize(x_init, dim=[0, 1, 2])
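The dim semantics can be checked without TensorFlow by re-implementing the four-line body above in NumPy (a sketch mirroring the source; `x_init` here is a hypothetical 2-D input, smaller than the 3-D one in the call above):

```python
import numpy as np

def l2_normalize(x, dim, epsilon=1e-12):
    # Mirrors the TF body: reduce_sum(square(x), dim, keep_dims=True),
    # then multiply by the reciprocal square root of that sum.
    square_sum = np.sum(np.square(x), axis=dim, keepdims=True)
    x_inv_norm = 1.0 / np.sqrt(np.maximum(square_sum, epsilon))
    return x * x_inv_norm

x_init = np.arange(1.0, 7.0).reshape(2, 3)  # hypothetical input

# dim=0: every column becomes a unit vector; dim=1: every row does.
col_normed = l2_normalize(x_init, dim=0)
row_normed = l2_normalize(x_init, dim=1)
print(np.sum(col_normed ** 2, axis=0))  # ~[1. 1. 1.]
print(np.sum(row_normed ** 2, axis=1))  # ~[1. 1.]

# dim=(0, 1): a vector of axes normalizes by the norm of the whole tensor.
all_normed = l2_normalize(x_init, dim=(0, 1))
print(np.sum(all_normed ** 2))  # ~1.0
```

So passing all axes, as in `dim=[0, 1, 2]` above, divides every element by the L2 norm of the entire tensor, while a single axis normalizes each 1-D slice along that axis independently.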