TensorFlow Pitfalls (an ongoing series...), 4 entries so far

1. TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, numpy ndarrays, or TensorHandles. For reference,

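Cause: every value passed through feed_dict must be concrete data (a Python scalar, string, list, or NumPy array), not a tensor from the graph. In other words, it must be something that print() would show as actual values.

For example: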

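xs, ys = mnist.train.next_batch(BATCH_SIZE)  # xs and ys are NumPy arrays
# print(xs)
# tf.reshape returns a tf.Tensor, not a NumPy array.
reshped_xs = tf.reshape(xs,
                        [BATCH_SIZE,
                         mnist_inference.IMAGE_SIZE,
                         mnist_inference.IMAGE_SIZE,
                         mnist_inference.NUM_CHANNELS])
# Evaluate the tensor to get concrete values before feeding it.
xx = sess.run(reshped_xs)
_, loss_value, step = sess.run([train_step, loss, global_step],
                               feed_dict={x: xx, y_: ys})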

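xs starts out as plain data (a NumPy array), not a tensor, but tf.reshape turns it into a tensor. It therefore has to be evaluated first with xx = sess.run(reshped_xs) before it can be passed to feed_dict.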
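Another option is to reshape with NumPy, so the data never becomes a tensor and no extra sess.run call is needed. The sketch below is an assumption, not the original code: a zero-filled array stands in for the MNIST batch, and the values of BATCH_SIZE, IMAGE_SIZE and NUM_CHANNELS are made up for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for this illustration.
BATCH_SIZE = 4
IMAGE_SIZE = 28
NUM_CHANNELS = 1

# Stand-in for the flat batch returned by mnist.train.next_batch(BATCH_SIZE).
xs = np.zeros((BATCH_SIZE, IMAGE_SIZE * IMAGE_SIZE), dtype=np.float32)

# np.reshape returns a NumPy array, which feed_dict accepts directly.
reshaped_xs = np.reshape(xs, (BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS))

print(type(reshaped_xs).__name__, reshaped_xs.shape)
```

Reshaping in NumPy also keeps the training loop free of tf. calls, so no new graph nodes are created on each iteration.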

2. ValueError: setting an array element with a sequence.

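Cause: this error usually means that the shape of the value in feed_dict does not match the shape of the placeholder, so that one element of the fed value ends up being a sequence instead of a single number.

For example:

feed_dict does not accept sparse representations (for instance, a SciPy sparse matrix). Convert the data with .toarray() first; otherwise this error is raised.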

sess.run([train_step, loss, global_step],
         feed_dict={x: X[start:end].toarray(), y_: Y[start:end]})
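As far as I can tell, the message itself comes from NumPy: the fed value is converted to an array of the placeholder's dtype, and that conversion is what fails. The sketch below reproduces the same ValueError with NumPy alone. The ragged list is a made-up stand-in for a badly shaped feed value, not the original data.

```python
import numpy as np

# Hypothetical batch whose rows have different lengths.
ragged_batch = [[1.0, 2.0, 3.0], [4.0, 5.0]]

try:
    np.asarray(ragged_batch, dtype=np.float32)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)

# Once every row has the same length, the conversion succeeds.
fixed_batch = np.asarray([[1.0, 2.0, 3.0], [4.0, 5.0, 0.0]], dtype=np.float32)
print(fixed_batch.shape)
```

Running np.asarray on a batch by hand, with the placeholder's dtype, is a quick way to check a value before feeding it.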

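3. Memory keeps growing until it is exhausted while training with TensorFlow

August 9, 2018, 16:55:48

While running my program today, memory usage just kept climbing. When my local machine could not cope I moved the job to a server, and 62 GB of RAM was gone within minutes; I had no idea where the problem was. After some searching online I finally figured it out. In one sentence: the body of the training loop must not contain any expression that builds tensors, including calls to functions starting with tf. (such as tf.nn.embedding_lookup).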

If you really need such a computation, define the expression outside the loop and simply run it inside the loop.

Example:

import tensorflow as tf

a = tf.Variable(tf.truncated_normal(shape=[100,1000]),name='a')
b = tf.Variable(tf.truncated_normal(shape=[100,1000]),name='b')

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    while True:
        print(sess.run(a+b))  # a+b adds a new node to the graph on every iteration

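As you can see, the expression a+b appears inside the loop body. When you run this program, memory grows slowly (this particular program does not grow fast enough to crash). The reason is that in TensorFlow every tensor expression (every function call that creates an op) is added to the computation graph as a node. If such an expression sits inside a loop, new nodes keep being added to the graph and memory usage keeps rising.

The correct approach is to define the expression outside the loop: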

import tensorflow as tf

a = tf.Variable(tf.truncated_normal(shape=[100,1000]),name='a')
b = tf.Variable(tf.truncated_normal(shape=[100,1000]),name='b')
z = a + b  # defined once, outside the loop
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    while True:
        print(sess.run(z))

TensorFlow also provides a way to check for this problem:

import tensorflow as tf

a = tf.Variable(tf.truncated_normal(shape=[100,1000]),name='a')
b = tf.Variable(tf.truncated_normal(shape=[100,1000]),name='b')

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    while True:
        print(sess.run(a+b))
        sess.graph.finalize()

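This now raises: RuntimeError: Graph is finalized and cannot be modified.
sess.graph.finalize() tells TensorFlow that the graph definition is complete, so the error is raised on the second iteration of the loop, when a+b tries to add another node.

Another example: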

import tensorflow as tf

a = tf.Variable(tf.truncated_normal(shape=[2, 3]), name='a')
b = tf.Variable(tf.truncated_normal(shape=[2, 3]), name='b')

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    sess.graph.finalize()
    c = tf.concat([a, b], axis=0)
    print(sess.run(c))

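This program also fails, because tf.concat() adds a node to the graph after I have already declared the graph complete. It also confirms that functions starting with tf. add nodes to the graph. The fix is the same as above.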
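When the computation depends on data that changes every iteration (the tf.nn.embedding_lookup case), one way to keep it outside the loop is to build the op once against a placeholder and feed the changing values. The sketch below is an assumption rather than the original program: the embedding table size, the ids, and the names embeddings, ids, looked_up and op_counts are all made up, and it imports tensorflow.compat.v1 so that it also runs on TensorFlow 2 installations.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use TF1-style graphs and sessions

graph = tf.Graph()
with graph.as_default():
    # Hypothetical embedding table: 10 rows, 4 columns.
    embeddings = tf.Variable(tf.truncated_normal(shape=[10, 4]), name='embeddings')
    # Placeholder for the ids, which change on every iteration.
    ids = tf.placeholder(dtype=tf.int32, shape=[None], name='ids')
    # The lookup op is created once, outside the loop.
    looked_up = tf.nn.embedding_lookup(embeddings, ids)
    init = tf.global_variables_initializer()

op_counts = []
with tf.Session(graph=graph) as sess:
    sess.run(init)
    graph.finalize()  # any later attempt to add a node raises RuntimeError
    for step in range(3):
        batch_ids = np.array([step, step + 1], dtype=np.int32)
        rows = sess.run(looked_up, feed_dict={ids: batch_ids})
        # Record the graph size to confirm it does not grow.
        op_counts.append(len(graph.get_operations()))

print(op_counts)
```

If the recorded counts stay constant across iterations, the loop is not adding nodes; a growing count points to an op being created inside the loop.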

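4. TypeError: unhashable type: 'numpy.ndarray' while training with TensorFlow
September 25, 2018, 12:18

The error was raised by the statement feed_dic = {x: batch_X, y: batch_y}. I spent a long time debugging without success and only found the cause after searching on Baidu.
The problem is that the placeholder's variable name was reused elsewhere. In my case, y was duplicated: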

y = tf.placeholder(dtype=tf.float32, shape=[None], name='y-input')
X, y = gen_data()
feed_dic = {x: batch_X, y: batch_y}

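That is, the y in {x: batch_X, y: batch_y} is no longer the placeholder but a NumPy array, which cannot be used as a dictionary key; hence the unhashable type: 'numpy.ndarray' error. Removing the name clash fixes it: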

y_ = tf.placeholder(dtype=tf.float32, shape=[None], name='y-input')
X, y = gen_data()
feed_dic = {x: batch_X, y_: batch_y}
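The error can be reproduced without TensorFlow at all, because it is raised while Python builds the dictionary, before anything reaches sess.run. In the sketch below the arrays are made-up stand-ins: y plays the role of the labels that overwrote the placeholder name.

```python
import numpy as np

# y was meant to be a placeholder, but the name now refers to a NumPy array.
y = np.array([0.0, 1.0])
batch_y = np.array([1.0, 0.0])

try:
    feed_dic = {y: batch_y}  # dictionary keys must be hashable
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
```

Checking type(y) just before building the feed dictionary is a quick way to spot this kind of name clash.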