V4L2 DQBUF blocking issue

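While debugging a USB camera earlier, with the application going through the RK USB camera HAL, I ran into a problem: the USB camera did not support hot-unplugging. If the camera was unplugged while the apk was previewing, the apk hung and never returned, so after the camera was plugged back in the apk preview could not be opened again. Tracing the problem to its root showed that the HAL was blocked in DQBUF and never returned. This post briefly walks through how the problem was investigated and resolved.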

(1) Problem description

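With the USB camera plugged in and the apk previewing, unplugging the camera (which cuts the data stream) leaves the application unresponsive, and the process has to be killed.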

(2) Cause

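During preview, unplugging the USB camera breaks the data stream while the upper-layer application keeps fetching frames. Its DQBUF call blocks and never returns, which causes the hang.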

(3) Tracing the code

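Calling ioctl with the VIDIOC_DQBUF command is how the application fetches data from the driver node. The call eventually reaches vb2_dqbuf, which the kernel uses to hand a buffer filled with data from the driver back to the application.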

The call flow is shown below:

vb2_dqbuf 

/**
* vb2_dqbuf() - Dequeue a buffer to the userspace
* @q:           videobuf2 queue
* @b:           buffer structure passed from userspace to VIDIOC_DQBUF() handler
*               in driver
* @nonblocking: if true, this call will not sleep waiting for a buffer if no
*               buffers ready for dequeuing are present. Normally the driver
*               would be passing (file->f_flags & O_NONBLOCK) here
*
* Should be called from VIDIOC_DQBUF() ioctl handler of a driver.
*
* This function:
*
* #) verifies the passed buffer,
* #) calls buf_finish callback in the driver (if provided), in which
*    driver can perform any additional operations that may be required before
*    returning the buffer to userspace, such as cache sync,
* #) the buffer struct members are filled with relevant information for
*    the userspace.
*
* The return values from this function are intended to be directly returned
* from VIDIOC_DQBUF() handler in driver.
*/
int vb2_dqbuf(struct vb2_queue *q, struct v4l2_buffer *b, bool nonblocking)
{
    int ret;

    if (vb2_fileio_is_active(q)) {
        dprintk(1, "file io in progress\n");
        return -EBUSY;
    }

    if (b->type != q->type) {
        dprintk(1, "invalid buffer type\n");
        return -EINVAL;
    }

    ret = vb2_core_dqbuf(q, NULL, b, nonblocking);

    /*
     *  After calling the VIDIOC_DQBUF V4L2_BUF_FLAG_DONE must be
     *  cleared.
     */
    b->flags &= ~V4L2_BUF_FLAG_DONE;

    return ret;
}

The intermediate call chain is omitted here. The place where the call finally ends up is __vb2_wait_for_done_vb:

/**
* __vb2_wait_for_done_vb() - wait for a buffer to become available
* for dequeuing
*
* Will sleep if required for nonblocking == false.
*/
static int __vb2_wait_for_done_vb(struct vb2_queue *q, int nonblocking)
{

    for (;;) {
        int ret;
        /*
         * After STREAMON has been called, streaming is 1 here.
         * If it is 0, STREAMOFF has been executed and no more data
         * will be produced.
         */
        if (!q->streaming) {
            dprintk(1, "streaming off, will not wait for buffers\n");
            return -EINVAL;
        }

        if (q->error) {
            dprintk(1, "Queue in error state, will not wait for buffers\n");
            return -EIO;
        }

        if (q->last_buffer_dequeued) {
            dprintk(3, "last buffer dequeued already, will not wait for buffers\n");
            return -EPIPE;
        }
        /*
         * If done_list holds a buffer, break out of the loop and carry on.
         * When select is used, the call should return here.
         */
        if (!list_empty(&q->done_list)) {
            /*
             * Found a buffer that we were waiting for.
             */
            break;
        }

        if (nonblocking) {
            dprintk(3, "nonblocking and no buffers to dequeue, will not wait\n");
            return -EAGAIN;
        }

        /*
         * We are streaming and blocking, wait for another buffer to
         * become ready or for streamoff. Driver's lock is released to
         * allow streamoff or qbuf to be called while waiting.
         */
        call_void_qop(q, wait_prepare, q);

        /*
         * All locks have been released, it is safe to sleep now.
         */
        dprintk(3, "will sleep waiting for buffers\n");
        /*
         * wait_event_interruptible(wq, condition):
         * condition == 0: sleep
         * condition != 0: wake up
         * A sleeping task only re-checks condition after
         * wake_up_interruptible() has woken it.
         */
        ret = wait_event_interruptible(q->done_wq,
                !list_empty(&q->done_list) || !q->streaming ||
                q->error);

        /*
         * We need to reevaluate both conditions again after reacquiring
         * the locks or return an error if one occurred.
         */

        call_void_qop(q, wait_finish, q);
        if (ret) {
            dprintk(1, "sleep was interrupted\n");
            return ret;
        }
    }
    return 0;
}
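
The early-return branches above reach user space as errno values from ioctl(VIDIOC_DQBUF). The sketch below shows one way a capture loop might react to them. The enum and the function name are hypothetical, and the chosen actions are an assumption about a reasonable policy, not something the kernel prescribes.

```c
#include <errno.h>

/* Hypothetical action codes for a capture loop; not part of V4L2. */
enum dqbuf_action {
    DQBUF_RETRY,          /* nothing to dequeue yet, try again later */
    DQBUF_NOT_STREAMING,  /* queue is not streaming */
    DQBUF_END_OF_STREAM,  /* last buffer was already dequeued */
    DQBUF_FATAL           /* queue error or anything unexpected */
};

/* Map the errno left by a failed VIDIOC_DQBUF to an action. */
enum dqbuf_action classify_dqbuf_errno(int err)
{
    switch (err) {
    case EAGAIN:  /* nonblocking call and done_list was empty */
    case EINTR:   /* the sleep was interrupted by a signal */
        return DQBUF_RETRY;
    case EINVAL:  /* q->streaming == 0 (also used for a bad buffer type) */
        return DQBUF_NOT_STREAMING;
    case EPIPE:   /* q->last_buffer_dequeued was set */
        return DQBUF_END_OF_STREAM;
    case EIO:     /* q->error was set */
    default:
        return DQBUF_FATAL;
    }
}
```

None of these codes is produced while the call is still asleep, so a blocking caller only sees them once one of the wake-up conditions becomes true.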

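When the hang occurs, the call is stuck in wait_event_interruptible and never returns. In the actual failure scenario, the lower layer hit an error while the upper layer was fetching data, so the buffer queue had no data (done_list stayed empty) and the call stayed stuck at this point. The fundamental fix is to resolve why the lower layer stops producing data.

The application flow, however, should set a timeout and return an error when it gets stuck instead of hanging. In the end, select was used to monitor the file descriptor.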

(4) Solution

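Use select to monitor the fd before dequeuing. If no buffer becomes ready within the timeout, DQBUF is skipped, which effectively avoids the hang described above.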

@@ -3916,11 +3949,28 @@ sp<V4L2Frame> ExternalCameraDeviceSession::dequeueV4l2FrameLocked(/*out*/nsecs_t
         buffer.m.planes = planes;
         buffer.length = PLANES_NUM;
     }
-
+    ALOGE("@%s(%d) VIDIOC_DQBUF begin",__FUNCTION__,__LINE__);
+    int ts;
+    fd_set fds;
+    struct timeval tv;
+
+    FD_ZERO(&fds);
+    FD_SET(mV4l2Fd.get(), &fds);
+    tv.tv_sec = 2;
+    tv.tv_usec = 0;
+
+    ts = select(mV4l2Fd.get() + 1, &fds, NULL, NULL, &tv);
+    ALOGE("@%s(%d) select time",__FUNCTION__,__LINE__);
+    if (ts <= 0)
+    {
+        ALOGE("@%s(%d) select timed out or failed",__FUNCTION__,__LINE__);
+        return ret;
+    }
     if (TEMP_FAILURE_RETRY(ioctl(mV4l2Fd.get(), VIDIOC_DQBUF, &buffer)) < 0) {
         ALOGE("%s: DQBUF fails: %s", __FUNCTION__, strerror(errno));
         return ret;
     }
+    ALOGE("@%s(%d) VIDIOC_DQBUF done",__FUNCTION__,__LINE__);
 #endif
     ATRACE_END();

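① If the timeout argument is NULL, select() blocks indefinitely until a descriptor's state changes.

② If the timeout value is 0, select() does not block and returns immediately.

③ If timeout is a specific value, select() blocks for up to that long waiting for a descriptor's state to change. If no descriptor changes state within that time, it times out and returns 0.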
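
The three timeout modes above can be wrapped in one small helper. This is a sketch with a hypothetical name, wait_readable_ms, and the millisecond convention (negative, zero, positive) is an assumption made for this example rather than anything select itself defines.

```c
#include <stddef.h>
#include <sys/select.h>

/*
 * Wait for fd to become readable (hypothetical helper).
 *   timeout_ms < 0  : pass NULL, block until the fd state changes
 *   timeout_ms == 0 : zeroed timeval, return immediately
 *   timeout_ms > 0  : block for at most timeout_ms milliseconds
 * Returns select()'s result: 1 if ready, 0 on timeout, -1 on error.
 */
int wait_readable_ms(int fd, long timeout_ms)
{
    fd_set fds;
    struct timeval tv;
    struct timeval *tvp = NULL;

    FD_ZERO(&fds);
    FD_SET(fd, &fds);

    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }

    return select(fd + 1, &fds, NULL, NULL, tvp);
}
```

On Linux, select() updates the timeval to the time left, so the structure is filled in again on every call instead of being reused.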

Reposted from blog.csdn.net/qq_34341546/article/details/129039725