Personal Notes on Chapter 4: Depth Estimation and Segmentation (SGBM, GrabCut, Watershed)

1. Introduction

These are my personal notes on Chapter 4, Depth Estimation and Segmentation, of OpenCV 3 Computer Vision with Python.
The chapter does not use a depth camera; it has three main parts: computing disparity maps with epipolar geometry (using the StereoSGBM algorithm), the GrabCut algorithm, and the watershed algorithm.
I ran into some difficulty understanding the watershed algorithm; fortunately it is resolved now, so I am recording it here.

2. Computing Disparity Maps with Epipolar Geometry

2.1 StereoSGBM Code

import numpy as np
import cv2


def update():
    # blockSize must be odd and >= 1, so round even trackbar values up
    block_size = cv2.getTrackbarPos('window_size', 'disparity')
    stereo.setBlockSize(block_size if block_size % 2 == 1 else block_size + 1)
    stereo.setUniquenessRatio(cv2.getTrackbarPos('uniquenessRatio', 'disparity'))
    stereo.setSpeckleWindowSize(cv2.getTrackbarPos('speckleWindowSize', 'disparity'))
    stereo.setSpeckleRange(cv2.getTrackbarPos('speckleRange', 'disparity'))
    stereo.setDisp12MaxDiff(cv2.getTrackbarPos('disp12MaxDiff', 'disparity'))

    print('computing disparity...')
    disp = stereo.compute(imgL, imgR).astype(np.float32) / 16.0

    cv2.imshow('left', imgL)
    cv2.imshow('right', imgR)
    # scale the disparity to [0, 1] so the floating-point image displays correctly
    cv2.imshow('disparity', (disp - min_disp) / num_disp)


if __name__ == "__main__":
    window_size = 3
    min_disp = 16
    num_disp = 192 - min_disp
    blockSize = window_size
    uniquenessRatio = 1
    speckleRange = 12
    speckleWindowSize = 3
    disp12MaxDiff = 200
    P1 = 600
    P2 = 2400

    imgL = cv2.imread('depth1.jpg')
    imgR = cv2.imread('depth2.jpg')
    imgL = cv2.resize(imgL, (800, 450))
    imgR = cv2.resize(imgR, (800, 450))
    print(imgL.shape)
    print(imgR.shape)

    cv2.namedWindow('disparity')
    cv2.createTrackbar('speckleRange', 'disparity', speckleRange, 50, update)
    cv2.createTrackbar('window_size', 'disparity', window_size, 10, update)
    cv2.createTrackbar('speckleWindowSize', 'disparity', speckleWindowSize, 200, update)
    cv2.createTrackbar('uniquenessRatio', 'disparity', uniquenessRatio, 50, update)
    cv2.createTrackbar('disp12MaxDiff', 'disparity', disp12MaxDiff, 250, update)

    '''
cv2.StereoSGBM_create(minDisparity, numDisparities, blockSize[, P1[, P2[, disp12MaxDiff[, preFilterCap[, uniquenessRatio[, speckleWindowSize[, speckleRange[, mode]]]]]]]]) → retval
    Parameters: 
        minDisparity – Minimum possible disparity value. Normally, it is zero but sometimes rectification algorithms can shift images, so this parameter needs to be adjusted accordingly.
        numDisparities – Maximum disparity minus minimum disparity. The value is always greater than zero. In the current implementation, this parameter must be divisible by 16.
        blockSize – Matched block size. It must be an odd number >=1 . Normally, it should be somewhere in the 3..11 range.
        P1 – The first parameter controlling the disparity smoothness. See below.
        P2 – The second parameter controlling the disparity smoothness. The larger the values are, the smoother the disparity is. P1 is the penalty on the disparity change by plus or minus 1 between neighbor pixels. P2 is the penalty on the disparity change by more than 1 between neighbor pixels. The algorithm requires P2 > P1 . See stereo_match.cpp sample where some reasonably good P1 and P2 values are shown (like 8*number_of_image_channels*SADWindowSize*SADWindowSize and 32*number_of_image_channels*SADWindowSize*SADWindowSize , respectively).
        disp12MaxDiff – Maximum allowed difference (in integer pixel units) in the left-right disparity check. Set it to a non-positive value to disable the check.
        preFilterCap – Truncation value for the prefiltered image pixels. The algorithm first computes x-derivative at each pixel and clips its value by [-preFilterCap, preFilterCap] interval. The result values are passed to the Birchfield-Tomasi pixel cost function.
        uniquenessRatio – Margin in percentage by which the best (minimum) computed cost function value should “win” the second best value to consider the found match correct. Normally, a value within the 5-15 range is good enough.
        speckleWindowSize – Maximum size of smooth disparity regions to consider their noise speckles and invalidate. Set it to 0 to disable speckle filtering. Otherwise, set it somewhere in the 50-200 range.
        speckleRange – Maximum disparity variation within each connected component. If you do speckle filtering, set the parameter to a positive value, it will be implicitly multiplied by 16. Normally, 1 or 2 is good enough.
        mode – Set it to StereoSGBM::MODE_HH to run the full-scale two-pass dynamic programming algorithm. It will consume O(W*H*numDisparities) bytes, which is large for 640x480 stereo and huge for HD-size pictures. By default, it is set to false .    
    '''

    stereo = cv2.StereoSGBM_create(
        minDisparity=min_disp,
        numDisparities=num_disp,
        blockSize=window_size,
        uniquenessRatio=uniquenessRatio,
        speckleRange=speckleRange,
        speckleWindowSize=speckleWindowSize,
        disp12MaxDiff=disp12MaxDiff,
        P1=P1,
        P2=P2
    )
    update()
    cv2.waitKey(0)
    cv2.destroyAllWindows()

2.2 StereoSGBM Code Walkthrough

1. Import the numpy and cv2 modules.
2. Main block:
    set initial values for the parameters
    read in the two images
    create the 'disparity' window and add five trackbars to it
    create a StereoSGBM instance, stereo
3. The update function:
    pass the current trackbar values to the stereo instance
    call the compute method to calculate the disparity map
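Incidentally, the P1 = 600 and P2 = 2400 used in the script match the rule of thumb quoted in the docstring above (8 and, respectively, 32 times the number of image channels times the block size squared) for a 3-channel image with a block size of 5. A quick check:

```python
# rule of thumb from the OpenCV docs for the smoothness penalties (P2 > P1):
# P1 = 8  * channels * blockSize^2
# P2 = 32 * channels * blockSize^2
channels, block_size = 3, 5
P1 = 8 * channels * block_size ** 2
P2 = 32 * channels * block_size ** 2
print(P1, P2)  # 600 2400
```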

2.3 Open Questions

  1. Why is the computed disp divided by 16?
    According to reference link 1, for precision all disparities are scaled up by a factor of 16 (2^4) on output; stereo.compute returns a 16-bit fixed-point map with 4 fractional bits, so dividing by 16.0 recovers the true sub-pixel disparities.
  2. Why is disparity normalized at the end?
    See reference link 4: in an integer color space the value range is 0-255, but for a floating-point image it is 0-1, so (disp - min_disp) / num_disp maps the values into the displayable [0, 1] range.
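To make both answers concrete, here is a numpy-only sketch; raw is a made-up stand-in for the int16 output of stereo.compute:

```python
import numpy as np

min_disp, num_disp = 16, 176  # matching the script above

# made-up raw output of stereo.compute: int16, scaled up by 16 (fixed point)
raw = np.array([[256, 512], [1024, 3072]], dtype=np.int16)

# divide by 16.0 to recover the true sub-pixel disparities
disp = raw.astype(np.float32) / 16.0
print(disp)  # [[ 16.  32.] [ 64. 192.]]

# map into [0, 1] so the float image can be displayed
shown = (disp - min_disp) / num_disp
print(shown.min(), shown.max())  # 0.0 1.0
```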

3. GrabCut Code

import numpy as np
import cv2
from matplotlib import pyplot as plt

# read the image and create a mask of the same size
img = cv2.imread('statue_small.jpg')
mask = np.zeros(img.shape[:2], np.uint8)
print(img.shape)

# create the background and foreground models (internal arrays used by GrabCut)
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)

# define the segmentation rectangle and run GrabCut
rect = (100, 50, 421, 378)
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 10, cv2.GC_INIT_WITH_RECT)

# in the mask, values 0 and 2 mean (probable) background: map them to 0, everything else to 1
mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
# broadcast the mask to three channels and multiply with img to keep only the foreground
img = img*mask2[:, :, np.newaxis]

# show the results (convert from BGR to RGB so matplotlib displays the colors correctly)
plt.subplot(121), plt.imshow(cv2.cvtColor(cv2.imread('statue_small.jpg'), cv2.COLOR_BGR2RGB))
plt.title("original"), plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.title("grabcut"), plt.xticks([]), plt.yticks([])
plt.show()

Reference link 5's description of the grabCut function and its parameters is already detailed enough, and I have added thorough comments to the code above, so I will not walk through the algorithm separately.
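A side note on the mask2 line in the code above: after cv2.grabCut, each mask pixel holds one of four labels, 0 (cv2.GC_BGD, sure background), 1 (cv2.GC_FGD, sure foreground), 2 (cv2.GC_PR_BGD, probable background), or 3 (cv2.GC_PR_FGD, probable foreground). A numpy-only sketch of collapsing these to a binary mask (the sample values are made up):

```python
import numpy as np

# made-up GrabCut output labels: 0=sure bg, 1=sure fg, 2=probable bg, 3=probable fg
mask = np.array([[0, 2, 2],
                 [1, 3, 2],
                 [1, 1, 0]], dtype=np.uint8)

# collapse to binary: background (0 or 2) -> 0, foreground (1 or 3) -> 1
mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
print(mask2)
# [[0 0 0]
#  [1 1 0]
#  [1 1 0]]
```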

4. Watershed Algorithm

4.1 Watershed Code

This code is adapted from the book's sample and modified for my own needs.

import numpy as np
import cv2
from matplotlib import pyplot as plt

# read the image and convert to grayscale
img = cv2.imread('basil.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# inverse-binary threshold with Otsu: pixels above the threshold become 0 (black), the rest 255 (white)
_, ret = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)

# remove noise with morphological opening
kernel = np.ones((3, 3), np.uint8)
opening = cv2.morphologyEx(ret, cv2.MORPH_OPEN, kernel, iterations = 2)

# determine the sure background area
sure_bg = cv2.dilate(opening, kernel, iterations=3)

# find the sure foreground area via the distance transform
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist_transform, 0.7*dist_transform.max(), 255, 0)

# find the unknown region (sure background minus sure foreground)
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg, sure_fg)

# display the different regions
plt.figure("ret")
plt.subplot(2, 2, 1), plt.title('ret')
plt.imshow(ret, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 2), plt.title('sure_bg')
plt.imshow(sure_bg, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 3), plt.title('sure_fg')
plt.imshow(sure_fg, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 4), plt.title('unknown')
plt.imshow(unknown, cmap='gray'), plt.axis('off')
plt.show()

# marker labelling: get markers and the number of regions (including the background)
num, markers = cv2.connectedComponents(sure_fg)
print(num)

# multiply the labels in markers by 30 so they are easier to see
markers2 = np.uint8(markers * 30)

# add one to all labels so that the sure background is 1, not 0
markers = markers + 1
markers3 = np.uint8(markers * 30)

# now mark the region of unknown with 0
markers[unknown == 255] = 0
markers3 = np.uint8(markers * 30)

# run marker-based segmentation with the watershed algorithm
markers = cv2.watershed(img, markers)
markers4 = np.array(markers, dtype=np.uint8)

# compare the markers from the different stages
plt.figure("markers")
plt.subplot(2, 2, 1)
plt.imshow(markers2, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 2)
plt.imshow(markers3, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 3)
plt.imshow(markers4, cmap='gray'), plt.axis('off')
plt.subplot(2, 2, 4)
plt.imshow(markers, cmap='gray'), plt.axis('off')
plt.show()

# watershed marks the boundaries between regions with -1; color those pixels (img is BGR, so matplotlib shows them as red)
img[markers == -1] = [255, 0, 0]

# show the image
plt.figure("img")
plt.imshow(img)
plt.show()

4.2 Code Walkthrough (Selected Parts)

Reference link 7: I discovered it right after reading through the watershed algorithm alongside the book!
Everything quoted there is spot on; I only wish I had found the link earlier.

4.2.1 Thresholding the grayscale image gray

Otsu's method computes the threshold automatically from the histogram; with THRESH_BINARY_INV, pixels above that threshold are set to 0 and all others to 255 (the maxval argument), giving ret.

_, ret = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
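A numpy-only sketch of what THRESH_BINARY_INV does once Otsu has picked a threshold (the value 100 below is made up; Otsu derives the real one from the histogram):

```python
import numpy as np

gray = np.array([[30, 120], [200, 80]], dtype=np.uint8)
thresh = 100  # stand-in for the value Otsu would pick

# THRESH_BINARY_INV: above the threshold -> 0, otherwise -> maxval (255)
ret = np.where(gray > thresh, 0, 255).astype(np.uint8)
print(ret)
# [[255   0]
#  [  0 255]]
```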

4.2.2 Creating the markers and printing the number of regions (including the background)

Create the markers array and label regions inside it: regions we are certain about (foreground or background) get distinct positive integers, while regions we are not sure about stay 0. cv2.connectedComponents() does this for us: it labels the image background 0 and the other components with integers starting from 1 (1, 2, 3, ...).

# marker labelling: get markers and the number of regions (including the background)
num, markers = cv2.connectedComponents(sure_fg)
print(num)

These labelled components are in effect our seeds: when the flooding starts, the water rises from here.
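To illustrate the labelling convention (background 0, components numbered from 1, return value equal to the number of labels), here is a minimal pure-Python stand-in for cv2.connectedComponents using 4-connectivity; it is a sketch of the idea, not the real implementation:

```python
import numpy as np
from collections import deque

def connected_components(binary):
    """Label 4-connected nonzero regions: background stays 0, components get 1, 2, ..."""
    labels = np.zeros(binary.shape, dtype=np.int32)
    current = 0
    for sy, sx in zip(*np.nonzero(binary)):
        if labels[sy, sx]:
            continue  # pixel already belongs to a labelled component
        current += 1
        queue = deque([(sy, sx)])
        labels[sy, sx] = current
        while queue:  # breadth-first flood of this component
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < binary.shape[0] and 0 <= nx < binary.shape[1]
                        and binary[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = current
                    queue.append((ny, nx))
    return current + 1, labels  # count includes the background label 0

sure_fg = np.array([[255, 255, 0],
                    [0,   0,   0],
                    [0,   255, 255]], dtype=np.uint8)
num, markers = connected_components(sure_fg)
print(num)      # 3 (background + two components)
print(markers)
# [[1 1 0]
#  [0 0 0]
#  [0 2 2]]
```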

4.2.3 Setting up the barriers

Use the unknown region to set up barriers in markers, and add the background region to the seeds so that it floods along with them.

The watershed algorithm requires the barrier region to be labelled 0, so the background in markers (originally 0, which would interfere with the algorithm) has to be relabelled with some other integer. The fix: add 1 to the whole markers array.

If the background label were left at 0, watershed would treat the background as unknown region.

markers = markers + 1
markers[unknown == 255] = 0
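A tiny numpy sketch of the two lines above (the sample values are made up): after adding 1, the background becomes 1 and the seeds shift to 2, 3, ...; then the unknown pixels are forced to 0.

```python
import numpy as np

markers = np.array([[0, 0, 1],
                    [0, 2, 2]], dtype=np.int32)    # connectedComponents-style labels
unknown = np.array([[255, 0, 0],
                    [255, 0, 0]], dtype=np.uint8)  # unknown-region mask

markers = markers + 1        # background 0 -> 1, seeds shift to 2, 3, ...
markers[unknown == 255] = 0  # unknown pixels get the special label 0
print(markers)
# [[0 1 2]
#  [0 3 3]]
```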

4.2.4 Flooding and finding the barriers

Flooding starts from the seeds: the water rises until it reaches the overflow points (the barrier boundaries), beyond which the water in the different valleys would begin to merge. Note that watershed marks the barriers it finds with -1 in markers.
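A numpy-only sketch of the final painting step; the markers array here is hand-made, standing in for cv2.watershed output, where -1 marks the barriers:

```python
import numpy as np

img = np.zeros((2, 3, 3), dtype=np.uint8)        # dummy 3-channel image
markers = np.array([[1, -1, 2],
                    [1, -1, 2]], dtype=np.int32)  # -1 = watershed barrier

img[markers == -1] = [255, 0, 0]  # paint only the barrier pixels
print(img[0, 1])  # [255   0   0]
print(img[0, 0])  # [0 0 0]
```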

5. Closing Remarks

This post works through Chapter 4, Depth Estimation and Segmentation, of OpenCV 3 Computer Vision with Python, sorting out the ideas and writing down my own understanding as a set of notes.
While writing it I gained something new as well: I solved a problem that had existed before I started the post. Lucky!
The watershed algorithm gave me some trouble; before writing I thought I understood it, but during the writing I found my original understanding was still off in places.

Maybe that is a big part of why I blog! Looking back on the journey, I feel very fortunate.
(Mostly it is to record the past, plus it lets me show off, hahaha.)
Still, I am curious even now: why does the background take part in the flooding too? And how does that final flooding operation actually produce the watershed lines?
I will stop here and not dig deeper. These ready-made functions really are powerful.

Oh, right: happy Qixi Festival, everyone~
This half (program) ape, half (siege) lion scrub wishes all the lovers in the world a happy ending!

References

  1. Binocular stereo matching and disparity computation
    https://blog.csdn.net/pinbodexiaozhu/article/details/45585361
  2. OpenCV3-Python depth estimation from images
    http://www.yyearth.com/article/18-06/227.html
    This article also provides a StereoBM() implementation, which I have not tested.
  3. cv::StereoSGBM Class Reference
    https://docs.opencv.org/master/d2/d85/classcv_1_1StereoSGBM.html
    The OpenCV documentation's definition of StereoSGBM.
  4. The convertTo function in OpenCV
    https://blog.csdn.net/liuhuicsu/article/details/70994840?fps=1&locationNum=8
  5. OpenCV (EmguCV) 2.1 new feature: image segmentation with GrabCut (GrabCut Of OpenCV 2.1)
    https://www.cnblogs.com/xrwang/archive/2010/04/27/GrabCut.html
  6. Structural Analysis and Shape Descriptors
    https://docs.opencv.org/3.1.0/d3/dc0/group__imgproc__shape.html#gac2718a64ade63475425558aa669a943a
    The OpenCV documentation's definition of connectedComponents.
  7. OpenCV: the watershed algorithm
    https://www.cnblogs.com/ssyfj/p/9278815.html
    This link collects many useful links; I recommend opening all of them.

Reproduced from blog.csdn.net/qq_34122861/article/details/98721969