1. The variable-scoping mechanism is implemented mainly by two functions:
tf.get_variable(<name>, <shape>, <initializer>)
tf.variable_scope(<scope_name>)
2. Commonly used initializers:
tf.constant_initializer(value)  # initializes to a constant value
tf.random_uniform_initializer(a, b)  # initializes uniformly at random from a to b
tf.random_normal_initializer(mean, stddev)  # initializes from a normal distribution with the given mean and standard deviation
3. tf.variable_scope() takes a name, which is used as a prefix for the names of variables created inside it, plus a reuse flag (discussed below) that distinguishes creating a new variable from reusing an existing one. Nested scopes concatenate their names following rules much like file-system directories.
For a network built with variable scopes, the structure looks like this:
import tensorflow as tf

def my_image_filter():
    with tf.variable_scope("conv1"):
        weights = tf.get_variable("weights", [1], initializer=tf.random_normal_initializer())
        print("weights:%s" % weights.name)
    with tf.variable_scope("conv2"):
        biases = tf.get_variable("biases", [1], initializer=tf.constant_initializer(0.3))
        print("biases:%s" % biases.name)

result1 = my_image_filter()
Output:
weights:conv1/weights:0
biases:conv2/biases:0
4. Calling my_image_filter() twice in a row raises a ValueError:
result1 = my_image_filter()
result2 = my_image_filter()
ValueError: Variable conv1/weights already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
Solution: if variable scopes are not used in the network, no error occurs, but two separate sets of variables are created instead of one shared set. To actually share variables, set the scope's reuse flag:
a. When tf.get_variable_scope().reuse == True: tf.get_variable only looks up an existing variable (e.g. an existing "foo/v" is found and returned as v1); if the variable does not exist, a ValueError is raised.
b. When tf.get_variable_scope().reuse == tf.AUTO_REUSE: this mode is designed for variable reuse; the variable is created if it does not exist and reused otherwise, so no ValueError is raised:
import tensorflow as tf

def my_image_filter():
    with tf.variable_scope("conv1", reuse=tf.AUTO_REUSE):
        # The variable created (or reused) here is named "conv1/weights".
        weights = tf.get_variable("weights", [1], initializer=tf.random_normal_initializer())
        print("weights:%s" % weights.name)
    with tf.variable_scope("conv2", reuse=tf.AUTO_REUSE):
        # The variable created (or reused) here is named "conv2/biases".
        biases = tf.get_variable("biases", [1], initializer=tf.constant_initializer(0.3))
        print("biases:%s" % biases.name)

result1 = my_image_filter()
result2 = my_image_filter()
5. The same problem can appear when loading a model whose network uses variable scopes: Variable conv1/weights already exists, disallowed. Did you mean to set reuse=True
Solution:
Restarting the kernel (i.e. restarting Spyder) makes the next run succeed, but this does not fix the root cause, and restarting repeatedly is inconvenient.
The problem arises because the computation graph from the previous run still exists, so running the code again conflicts with the variables already defined in it. The fix:
Add this line before the graph-building code: tf.reset_default_graph()
tf.reset_default_graph()
ckpt_file = tf.train.latest_checkpoint(model_path)
print(ckpt_file)
paths['model_path'] = ckpt_file
model = BiLSTM_CRF(args, embeddings, tag2label, word2id, paths, config=config)
model.build_graph()