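This is a convolutional autoencoder for image denoising, built by a seasoned competition veteran: the training samples are noisy images, and the labels are the same images without noise.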
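To make the sample/label pairing concrete, here is a minimal sketch that builds one (noisy, clean) pair with NumPy. The additive Gaussian noise, the `make_noisy_pair` helper, and its `noise_std` and `seed` parameters are assumptions for illustration only; the post's own data are captcha images that already come with their own noise.

```python
import numpy as np


def make_noisy_pair(clean, noise_std=0.2, seed=0):
    """Return (noisy, target) float32 images scaled to [0, 1].

    clean: uint8 grayscale image of shape (H, W).
    noise_std and seed are illustrative parameters, not from the post.
    """
    rng = np.random.default_rng(seed)
    # The label is the clean image, scaled to [0, 1].
    target = clean.astype(np.float32) / 255.0
    # The sample is the same image with noise added, clipped back to [0, 1].
    noise = rng.normal(0.0, noise_std, size=target.shape).astype(np.float32)
    noisy = np.clip(target + noise, 0.0, 1.0)
    return noisy, target


# A synthetic 60x180 image standing in for a real captcha.
clean = np.zeros((60, 180), dtype=np.uint8)
clean[20:40, 30:150] = 255
noisy, target = make_noisy_pair(clean)
print(noisy.shape, target.shape)
```

The network is then trained to map `noisy` to `target`, which is what separates a denoising setup from a plain autoencoder that reconstructs its own input.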
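The overall structure is simple and closely resembles an ordinary autoencoder: both reduce the dimensionality and then restore it (encoder-decoder). The purpose, however, is entirely different.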
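The "reduce, then restore" idea can be checked with a little shape arithmetic. The sketch below assumes a 60x180 input and three 2x2 pooling layers with stride 2 and 'same' padding, as in the code later in this post; the helper names `same_pool` and `encoder_shapes` are made up for this example.

```python
import math


def same_pool(size, stride=2):
    """Output length of a 'same'-padded pooling layer: ceil(size / stride)."""
    return math.ceil(size / stride)


def encoder_shapes(height, width, num_pools=3):
    """List the spatial size of the feature map after each pooling layer."""
    shapes = [(height, width)]
    for _ in range(num_pools):
        height, width = same_pool(height), same_pool(width)
        shapes.append((height, width))
    return shapes


shapes = encoder_shapes(60, 180)
print(shapes)  # [(60, 180), (30, 90), (15, 45), (8, 23)]
```

The encoder therefore compresses each image to an 8x23 feature map, and the decoder has to upsample from there back to 60x180.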
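This is only a prototype, and there is plenty of room for optimization. I recently worked on image semantic segmentation, which rests on a similar principle. Semantic segmentation has since spread into many fields, such as autonomous driving and image-based 3D reconstruction, and a line of models came with it: the pioneering FCN, then CRF, SegNet/DeconvNet, the DeepLab series, U-Net, RefineNet, Mask R-CNN, and so on. All of these, however, build on transfer learning. A major advantage of transfer learning is that it can reach good results with only a small number of samples; if you doubt it, build a model and try it yourself. So these days, whenever transfer learning is an option, nobody trains from scratch.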
Here is what the results look like:
To generate your own training samples, see: tf20: CNN for recognizing character CAPTCHAs
The code is as follows:
# Written against the TensorFlow 1.x API (tf.placeholder, tf.layers, tf.Session).
import re

import cv2 as cv
import numpy as np
import tensorflow as tf


def normalize(img):
    """Scale 8-bit pixel values from [0, 255] to [0, 1] as float32."""
    return img.astype('float32') / 255.0


# ---- Load data ----
def equalizeHistColor(img):
    """Apply histogram equalization to each channel of a color image.

    img: the image to process
    """
    channels = [cv.equalizeHist(channel) for channel in cv.split(img)]
    return cv.merge(channels)


def readLabels(folder='./CaptchaImages/labels.txt'):
    """Parse lines of the form 'noise_path,source_path:label'."""
    labels = []
    with open(folder, 'r') as flabel:
        for line in flabel:
            line = line.strip('\n')
            noise, source = re.split(',', line)
            source, label = re.split(':', source)
            labels.append([noise, source, label])
    return np.array(labels)


labels = readLabels()


def readGray(path):
    """Read an image, equalize its histogram, and convert it to grayscale."""
    # cv.imread returns channels in BGR order, hence COLOR_BGR2GRAY.
    return cv.cvtColor(equalizeHistColor(cv.imread(path)), cv.COLOR_BGR2GRAY)


def readImages(start, num):
    """Return (noisy, clean) uint8 arrays for labels[start:start + num]."""
    noiseimg = []
    sourceimg = []
    for i in range(start, start + num):
        noiseimg.append(readGray(labels[i, 0]))
        sourceimg.append(readGray(labels[i, 1]))
    return np.array(noiseimg), np.array(sourceimg)


input('Data loaded, press Enter to continue...')

# ---- Build the model ----
inputs_ = tf.placeholder(tf.float32, (None, 60, 180, 1), name='inputs_')
targets_ = tf.placeholder(tf.float32, (None, 60, 180, 1), name='targets_')

# Encoder: three convolution + pooling stages (60x180 -> 30x90 -> 15x45 -> 8x23)
conv1 = tf.layers.conv2d(inputs_, 64, (3,3), padding='same', activation=tf.nn.relu)
conv1 = tf.layers.max_pooling2d(conv1, (2,2), (2,2), padding='same')
conv2 = tf.layers.conv2d(conv1, 64, (3,3), padding='same', activation=tf.nn.relu)
conv2 = tf.layers.max_pooling2d(conv2, (2,2), (2,2), padding='same')
conv3 = tf.layers.conv2d(conv2, 32, (3,3), padding='same', activation=tf.nn.relu)
conv3 = tf.layers.max_pooling2d(conv3, (2,2), (2,2), padding='same')

# Decoder: three upsampling + convolution stages (8x23 -> 15x45 -> 30x90 -> 60x180)
conv7 = tf.image.resize_nearest_neighbor(conv3, (15,45))
conv7 = tf.layers.conv2d(conv7, 32, (3,3), padding='same', activation=tf.nn.relu)
conv8 = tf.image.resize_nearest_neighbor(conv7, (30,90))
conv8 = tf.layers.conv2d(conv8, 64, (3,3), padding='same', activation=tf.nn.relu)
conv9 = tf.image.resize_nearest_neighbor(conv8, (60,180))
conv9 = tf.layers.conv2d(conv9, 64, (3,3), padding='same', activation=tf.nn.relu)

# Logits and outputs
logits_ = tf.layers.conv2d(conv9, 1, (3,3), padding='same', activation=None)
outputs_ = tf.nn.sigmoid(logits_, name='outputs_')

# Loss and optimizer
loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=targets_, logits=logits_)
cost = tf.reduce_mean(loss)
optimizer = tf.train.AdamOptimizer(0.001).minimize(cost)

input('Model built, press Enter to start training...')

# ---- Train ----
sess = tf.Session()
saver = tf.train.Saver(tf.trainable_variables())
epochs = 3
batch_size = 70
sess.run(tf.global_variables_initializer())
for e in range(epochs):
    n = 0  # number of consecutive batches with loss below 0.1
    for idx in range(labels.shape[0] // batch_size):
        noisy_imgs, imgs = readImages(idx * batch_size, batch_size)
        # Scale pixels to [0, 1] and add the channel dimension.
        noisy_imgs = normalize(noisy_imgs).reshape(-1, 60, 180, 1)
        imgs = normalize(imgs).reshape(-1, 60, 180, 1)
        batch_cost, _ = sess.run([cost, optimizer],
                                 feed_dict={inputs_: noisy_imgs, targets_: imgs})
        print("Epoch: {}/{} ".format(e+1, epochs),
              "Training loss: {:.9f}".format(batch_cost))
        if batch_cost < 0.1:
            n = n + 1
        else:
            n = 0
        # End the current epoch early after more than 30 consecutive low-loss batches.
        if n > 30:
            break

input('Training finished, press Enter to start prediction...')

# ---- Predict ----
noisy_imgs, imgs = readImages(3000, 5)
noisy_imgs = normalize(noisy_imgs).reshape(-1, 60, 180, 1)
imgs = normalize(imgs)
reconstructed = sess.run(outputs_, feed_dict={inputs_: noisy_imgs})
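The training code scores each pixel with `tf.nn.sigmoid_cross_entropy_with_logits`, which is why both the inputs and the targets need to be scaled to [0, 1]. As a rough illustration of what that loss computes, here is a NumPy sketch of the numerically stable form `max(x, 0) - x*z + log(1 + exp(-|x|))` given in the TensorFlow documentation, checked against the textbook definition. The function names and sample values are made up for this example.

```python
import numpy as np


def sigmoid_cross_entropy(labels, logits):
    """Element-wise sigmoid cross-entropy in a numerically stable form."""
    return (np.maximum(logits, 0) - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))


def naive_cross_entropy(labels, logits):
    """Textbook definition; fine for small logits, unstable for large ones."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return -(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p))


# Targets are pixel intensities in [0, 1], so soft labels such as 0.5 are valid.
logits = np.array([-2.0, 0.0, 3.0])
labels = np.array([0.0, 0.5, 1.0])
stable = sigmoid_cross_entropy(labels, logits)
print(stable)
```

Averaging these per-pixel values over a batch gives the scalar `cost` that the optimizer minimizes.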
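TensorFlow series:
1. Installing TensorFlow on Ubuntu 16.04 (with GPU support)
8. tf6: autoencoder for indoor positioning with WiFi fingerprints
9. tf7: RNN for classical Chinese poetry
10. tf8: RNN for generating music
11. tf9: PixelCNN
13. tf11: retraining Google's Inception model
14. tf12: telling male and female voices apart
15. tf13: a simple chatbot
16. tf14: colorizing black-and-white images
17. tf15: Chinese speech recognition
19. tf17: "The Great Voice Shift"
20. tf18: predicting gender from names
21. tf19: forecasting railway passenger volume
26. tf24: GANs for generating celebrity faces
28. tf26: AI trader
29. tensorflow_cookbook--preface
36. 04 Support Vector Machines
37. tf API study 1: overview of tf.nn, tf.layers, and tf.contrib
38. tf API study 2: math
39. Upsampling (unpool) and transposed convolution (conv2d_transpose) in TensorFlow
40. tf API study 3: Building Graphs
41. tf API study 4: Inputs and Readers
44. Differences between tf.contrib.rnn.static_rnn and tf.nn.dynamic_rnn
45. Prediction and training in TensorFlow with the pretrained resnet_v2_50, resnet_v2_101, and resnet_v2_152 models
46. Selecting a specific GPU, multiple GPUs, or the CPU in TensorFlow
47. Industrial component detection and recognition
48. Saving TF-trained weights as CKPT and PB, converting CKPT to PB, freezing the weights into the graph, and predicting with the resulting model
49. Quickly fetching all variables and computing the L2 norm in TensorFlow
51. TensorFlow in practice: study notes
53. tf28: handwritten Chinese character recognition
54. tf29: visualizing inception_v4 with TensorBoard
55. tf30: center loss and its application to MNIST
58. tf33: image denoising with a convolutional autoencoder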