Preface

This series aims to work through the NVDLA kernel-mode driver; if any of the analysis is wrong, corrections are welcome. This chapter is the second part of the nvdla_gem.c walkthrough.
Link to part one: NVDLA Kernel Driver Code Notes (Part One)
A recap of the functions and structures organized in part one:
| Function prototype | Purpose |
|---|---|
| `static int32_t nvdla_fill_task_desc(struct nvdla_ioctl_submit_task *local_task, struct nvdla_task *task)` | Copies `local_task`'s address count `num_addresses` and its `handles` pointer to the task contents into the kernel-side task; `local_task->num_addresses * sizeof(struct nvdla_mem_handle)` is the size of the buffer allocated for all the per-task address entries |
| `static int32_t nvdla_submit(struct drm_device *drm, void *arg, struct drm_file *file)` | Receives `arg` (really an `nvdla_submit_args` structure holding the tasks, the task count, etc.), converts the tasks passed through `arg` into `nvdla_ioctl_submit_task` structures, and calls `nvdla_fill_task_desc` to bring the task data down from user space into kernel space. It also uses the `drm_device` pointer `drm` with `dev_get_drvdata` to obtain the current driver data shared with other subsystems, which yields the other key variable `task` needed by `nvdla_fill_task_desc`, and hands the `drm_file` structure (which holds per-file-descriptor state for this file) to `task`. Finally it calls `nvdla_task_submit` to submit the NVDLA task and wait for it to complete |
| `static int32_t nvdla_gem_alloc(struct nvdla_gem_object *nobj)` | Takes the NVDLA storage-management structure `nvdla_gem_object`, which, as introduced earlier, has three important parts: the `drm_gem_object` responsible for DRM memory allocation and management, the kernel virtual address `kvaddr`, and the DMA-related fields. The function performs the DMA address allocation |
| `static void nvdla_gem_free(struct nvdla_gem_object *nobj)` | Frees the device DMA buffer obtained by `nvdla_gem_alloc` |
| `static struct nvdla_gem_object *nvdla_gem_create_object(struct drm_device *drm, uint32_t size)` | Creates an NVDLA GEM object, i.e. the kernel object that allocates and manages a DMA buffer. The first half of the creation goes through the kernel API `drm_gem_private_object_init`; the second half calls `nvdla_gem_alloc` |
| `static void nvdla_gem_free_object(struct drm_gem_object *dobj)` | Frees an NVDLA GEM object, destroying and releasing the previously allocated DMA-buffer kernel object |
| `static struct nvdla_gem_object *nvdla_gem_create_with_handle(struct drm_file *file_priv, struct drm_device *drm, uint32_t size, uint32_t *handle)` | Creates an NVDLA GEM object with a handle, letting a user-space application create a GEM object and get a handle back |
| `static int32_t nvdla_gem_create(struct drm_device *drm, void *data, struct drm_file *file)` | A thin wrapper doing the same work as `nvdla_gem_create_with_handle` |
| Structure | Purpose |
|---|---|
| `nvdla_gem_object` | Holds the important members: first `drm_gem_object`, the structure DRM uses for memory management and allocation; then `*kvaddr`, a pointer member that stores a kernel virtual address pointing at the in-kernel data buffer (graphics- or DMA-related data), enabling fast access without a physical-address translation; and finally the DMA-related address and attribute fields |
| `nvdla_mem_handle` | The intermediary linking the user-space task structure `nvdla_ioctl_submit_task` and the kernel-space task structure `nvdla_task` |
| `nvdla_ioctl_submit_task` | User-space task structure |
| `nvdla_task` | Kernel-space task structure |
| `nvdla_device` | Holds common device information: interrupts, the platform device, the DRM device, etc. |
| `nvdla_submit_args` | Carries task information; it is the parameter through which user space passes in task-related data, interacting with `nvdla_ioctl_submit_task`. Overall, its task granularity is coarser than `nvdla_ioctl_submit_task` |
| `drm_file` | Holds per-file-descriptor state for this open file |
| `drm_gem_object` | Describes a DRM memory-allocation object, including the owning `drm_device` and the object's `size` |
| `drm_device` | Describes a DRM device, including the data structures for the bus device |
I. Reading nvdla_gem.c, Part Two
1. The nvdla_drm_gem_object_mmap function, and clarifying physical, virtual, and bus addresses
Continuing with the code, the nvdla_drm_gem_object_mmap
function is as follows:
static int32_t nvdla_drm_gem_object_mmap(struct drm_gem_object *dobj,
struct vm_area_struct *vma)
/*
Takes two arguments: dobj, a pointer to the struct drm_gem_object
being memory-mapped, and vma, a pointer to the struct vm_area_struct
describing the virtual memory area being mapped.
*/
{
int32_t ret;
struct nvdla_gem_object *nobj = to_nvdla_obj(dobj);
struct drm_device *drm = dobj->dev;
vma->vm_flags &= ~VM_PFNMAP; // Clear the VM_PFNMAP bit in vma->vm_flags. VM_PFNMAP means the area is mapped by raw page frame number (PFN); clearing it means PFN-based mapping is not used here.
vma->vm_pgoff = 0; // Set vma->vm_pgoff to zero, so the mapping starts at page offset zero.
/*
Call dma_mmap_attrs to perform the mapping. This helper from the DMA
API maps a DMA buffer into a virtual memory area. Its arguments:
drm->dev: the DRM device.
vma: the virtual memory area to map into.
nobj->kvaddr: the GEM object's kernel virtual address.
nobj->dma_addr: the GEM object's DMA address.
dobj->size: the GEM object's size.
nobj->dma_attrs: the DMA attributes.
Returns zero on success, or an error code on failure.
*/
ret = dma_mmap_attrs(drm->dev, vma, nobj->kvaddr, nobj->dma_addr,
dobj->size, nobj->dma_attrs);
if (ret)
drm_gem_vm_close(vma);
// If the mapping failed, call drm_gem_vm_close(vma) to close the virtual memory area and release the mapping's resources, then return the error code.
return ret;
}
A word on what mmap buys us. A graphics card, say, has a large amount of video memory that the driver maps into kernel address space. Without mmap, a user program that wants to draw must allocate an equally sized buffer in user space, fill it with image data, then issue a write system call to copy the data into the kernel-side video memory. That copy carries a substantial performance cost, so letting the application access the video memory directly is a big win. A character device driver therefore exposes an mmap interface that maps the physical address space backing the kernel memory into user space a second time.
With that, the intent of this function is clear: it implements the memory-mapping (mmap) operation for NVDLA GEM objects. Memory mapping lets a user-space application map a kernel GEM object into its own address space so it can access the object's data directly.
Since the function takes a drm_gem_object *dobj, we again get the classic trio — nvdla_gem_object *nobj and drm_device *drm — linked up through pointers. What remains is the heart of the function.
1. The dma_mmap_attrs function, whose prototype is:
/**
* dma_mmap_attrs - map a coherent DMA allocation into user space
* @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
* @vma: vm_area_struct describing requested user mapping
* @cpu_addr: kernel CPU-view address returned from dma_alloc_attrs
* @handle: device-view address returned from dma_alloc_attrs
* @size: size of memory originally requested in dma_alloc_attrs
* @attrs: attributes of mapping properties requested in dma_alloc_attrs
*
* Map a coherent DMA buffer previously allocated by dma_alloc_attrs
* into user space. The coherent DMA buffer must not be freed by the
* driver until the user space mapping has been released.
*/
static inline int
dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void *cpu_addr,
dma_addr_t dma_addr, size_t size, unsigned long attrs)
{
const struct dma_map_ops *ops = get_dma_ops(dev);
BUG_ON(!ops);
if (ops->mmap)
return ops->mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size);
}
As the comment says, this maps a coherent DMA buffer previously allocated by dma_alloc_attrs() into user space, and the driver must not free the coherent DMA buffer until the user-space mapping has been released. Here vma is the vm_area_struct describing the requested user mapping.
Important: two concepts need clarifying here.
1) void *cpu_addr is the kernel CPU-view address returned from dma_alloc_attrs — in plain terms, the virtual address as seen by the CPU. Modern CPUs fetch instructions and data through virtual addresses; on the way to memory, the MMU translates each virtual address to a physical address, since memory itself is managed by physical address.
2) dma_addr_t dma_addr is the device-view address returned from dma_alloc_attrs — the address as seen by the device, which can be understood as the bus address.
A few lines paraphrased from Linux Device Driver Development (《Linux设备驱动开发详解》):
DMA-capable hardware uses bus addresses rather than physical addresses. A bus address is the memory address as seen from the device's side; a physical address is the memory address as seen from outside the CPU's MMU (the CPU core itself sees virtual addresses).
On a PC, for ISA and PCI, the bus address happens to equal the physical address, but that is not true on every platform:
the peripheral bus may be attached through a bridge, and the bridge can remap I/O addresses to different physical addresses.
It is worth a quick look at the vm_area_struct structure, which defines a VMM memory area. There is one per VM area per task; a VM area is any part of the process's virtual memory space with special rules for the page-fault handler (a shared library, the executable area, and so on).
/*
* This struct defines a memory VMM memory area. There is one of these
* per VM-area/task. A VM area is any part of the process virtual memory
* space that has a special rule for the page-fault handlers (ie a shared
* library, the executable area etc).
*/
struct vm_area_struct {
/* The first cache line has the info for VMA tree walking. */
unsigned long vm_start; /* Our start address within vm_mm. */
unsigned long vm_end; /* The first byte after our end address
within vm_mm. */
/* linked list of VM areas per task, sorted by address */
struct vm_area_struct *vm_next, *vm_prev;
struct rb_node vm_rb;
/*
* Largest free memory gap in bytes to the left of this VMA.
* Either between this VMA and vma->vm_prev, or between one of the
* VMAs below us in the VMA rbtree and its ->vm_prev. This helps
* get_unmapped_area find a free area of the right size.
*/
unsigned long rb_subtree_gap;
/* Second cache line starts here. */
struct mm_struct *vm_mm; /* The address space we belong to. */
pgprot_t vm_page_prot; /* Access permissions of this VMA. */
unsigned long vm_flags; /* Flags, see mm.h. */
/*
* For areas with an address space and backing store,
* linkage into the address_space->i_mmap interval tree.
*/
struct {
struct rb_node rb;
unsigned long rb_subtree_last;
} shared;
/*
* A file's MAP_PRIVATE vma can be in both i_mmap tree and anon_vma
* list, after a COW of one of the file pages. A MAP_SHARED vma
* can only be in the i_mmap tree. An anonymous MAP_PRIVATE, stack
* or brk vma (with NULL file) can only be in an anon_vma list.
*/
struct list_head anon_vma_chain; /* Serialized by mmap_sem &
* page_table_lock */
struct anon_vma *anon_vma; /* Serialized by page_table_lock */
/* Function pointers to deal with this struct. */
const struct vm_operations_struct *vm_ops;
/* Information about our backing store: */
unsigned long vm_pgoff; /* Offset (within vm_file) in PAGE_SIZE
units */
struct file * vm_file; /* File we map to (can be NULL). */
void * vm_private_data; /* was vm_pte (shared mem) */
atomic_long_t swap_readahead_info;
#ifndef CONFIG_MMU
struct vm_region *vm_region; /* NOMMU mapping region */
#endif
#ifdef CONFIG_NUMA
struct mempolicy *vm_policy; /* NUMA policy for the VMA */
#endif
struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
} __randomize_layout;
2. The drm_gem_vm_close function, whose prototype is:
/**
* drm_gem_vm_close - vma->ops->close implementation for GEM
* @vma: VM area structure
*
* This function implements the #vm_operations_struct close() callback for GEM
* drivers. This must be used together with drm_gem_vm_open().
*/
void drm_gem_vm_close(struct vm_area_struct *vma)
{
struct drm_gem_object *obj = vma->vm_private_data;
drm_gem_object_put_unlocked(obj);
}
It implements the close() callback for GEM VMAs: it drops the reference that was taken on the GEM object when the mapping was created.
2. The nvdla_drm_gem_mmap_buf function
Continuing with the code, nvdla_drm_gem_mmap_buf
is as follows:
static int32_t nvdla_drm_gem_mmap_buf(struct drm_gem_object *obj,
struct vm_area_struct *vma)
{
int32_t ret;
ret = drm_gem_mmap_obj(obj, obj->size, vma);
if (ret)
return ret;
return nvdla_drm_gem_object_mmap(obj, vma);
}
Reading the ret handling: if drm_gem_mmap_obj fails, the function returns its error immediately; only on success does it go on to call nvdla_drm_gem_object_mmap. So the two calls are complementary rather than redundant — drm_gem_mmap_obj prepares the VMA (flags, vm_ops, page protection, and a reference on the object), and nvdla_drm_gem_object_mmap then performs the actual DMA mapping into it. The key piece of this function is drm_gem_mmap_obj(), whose prototype is:
/**
* drm_gem_mmap_obj - memory map a GEM object
* @obj: the GEM object to map
* @obj_size: the object size to be mapped, in bytes
* @vma: VMA for the area to be mapped
*
* Set up the VMA to prepare mapping of the GEM object using the gem_vm_ops
* provided by the driver. Depending on their requirements, drivers can either
* provide a fault handler in their gem_vm_ops (in which case any accesses to
* the object will be trapped, to perform migration, GTT binding, surface
* register allocation, or performance monitoring), or mmap the buffer memory
* synchronously after calling drm_gem_mmap_obj.
*
* This function is mainly intended to implement the DMABUF mmap operation, when
* the GEM object is not looked up based on its fake offset. To implement the
* DRM mmap operation, drivers should use the drm_gem_mmap() function.
*
* drm_gem_mmap_obj() assumes the user is granted access to the buffer while
* drm_gem_mmap() prevents unprivileged users from mapping random objects. So
* callers must verify access restrictions before calling this helper.
*
* Return 0 or success or -EINVAL if the object size is smaller than the VMA
* size, or if no gem_vm_ops are provided.
*/
int drm_gem_mmap_obj(struct drm_gem_object *obj, unsigned long obj_size,
struct vm_area_struct *vma)
{
struct drm_device *dev = obj->dev;
/* Check for valid size. */
if (obj_size < vma->vm_end - vma->vm_start)
return -EINVAL;
if (!dev->driver->gem_vm_ops)
return -EINVAL;
vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
vma->vm_ops = dev->driver->gem_vm_ops;
vma->vm_private_data = obj;
vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
/* Take a ref for this mapping of the object, so that the fault
* handler can dereference the mmap offset's pointer to the object.
* This reference is cleaned up by the corresponding vm_close
* (which should happen whether the vma was created by this call, or
* by a vm_open due to mremap or partial unmap or whatever).
*/
drm_gem_object_get(obj);
return 0;
}
3. The nvdla_drm_gem_mmap function
Continuing with the code, the prototype of nvdla_drm_gem_mmap
is:
static int32_t nvdla_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
{
int32_t ret;
struct drm_gem_object *obj;
ret = drm_gem_mmap(filp, vma);
if (ret)
return ret;
obj = vma->vm_private_data;
return nvdla_drm_gem_object_mmap(obj, vma);
}
Not much to add — like the previous function, it sets up the VMA first and then lets nvdla_drm_gem_object_mmap
perform the actual mapping.
4. The nvdla_drm_gem_prime_get_sg_table function and the scatter-gather table
Continuing with the code, the prototype of nvdla_drm_gem_prime_get_sg_table
is:
static struct sg_table
*nvdla_drm_gem_prime_get_sg_table(struct drm_gem_object *dobj)
{
int32_t ret;
struct sg_table *sgt;
struct drm_device *drm = dobj->dev;
struct nvdla_gem_object *nobj = to_nvdla_obj(dobj);
sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
if (!sgt)
return ERR_PTR(-ENOMEM);
/*
ret = dma_get_sgtable_attrs(drm->dev, sgt, nobj->kvaddr, nobj->dma_addr, dobj->size, nobj->dma_attrs); fills in the SG table. Its arguments:
drm->dev: the DRM device.
sgt: pointer to the SG table that receives the result.
nobj->kvaddr: the GEM object's kernel virtual address.
nobj->dma_addr: the GEM object's DMA address.
dobj->size: the GEM object's size.
nobj->dma_attrs: DMA attributes — flags and parameters controlling the behavior of the DMA operation.
*/
ret = dma_get_sgtable_attrs(drm->dev, sgt, nobj->kvaddr,
nobj->dma_addr, dobj->size,
nobj->dma_attrs);
if (ret) {
DRM_ERROR("failed to allocate sgt, %d\n", ret);
kfree(sgt);
return ERR_PTR(ret);
}
return sgt;
}
This function implements the operation that obtains a scatter-gather table (SG table) for a GEM object. An SG table is a data structure describing the locations and sizes of data chunks scattered across physical memory; it is typically used in DMA so that scattered chunks can be transferred efficiently.
Two items matter here:
1. The sg_table
structure
struct sg_table {
struct scatterlist *sgl; /* the list */
unsigned int nents; /* number of mapped entries */
unsigned int orig_nents; /* original size of list */
};
/*
* Notes on SG table design.
*
* We use the unsigned long page_link field in the scatterlist struct to place
* the page pointer AND encode information about the sg table as well. The two
* lower bits are reserved for this information.
*
* If bit 0 is set, then the page_link contains a pointer to the next sg
* table list. Otherwise the next entry is at sg + 1.
*
* If bit 1 is set, then this sg entry is the last element in a list.
*
* See sg_next().
*
*/
The unsigned long page_link
field in the scatterlist
structure holds the page pointer and also encodes information about the SG table; the two low bits are reserved for this. If bit 0 is set, page_link
points to the next SG table list; otherwise the next entry is at sg + 1
. If bit 1 is set, this sg
entry is the last element in the list.
2. The dma_get_sgtable_attrs function, as found in the kernel:
static inline int
dma_get_sgtable_attrs(struct device *dev, struct sg_table *sgt, void *cpu_addr,
dma_addr_t dma_addr, size_t size,
unsigned long attrs)
{
const struct dma_map_ops *ops = get_dma_ops(dev);
BUG_ON(!ops);
if (ops->get_sgtable)
return ops->get_sgtable(dev, sgt, cpu_addr, dma_addr, size,
attrs);
return dma_common_get_sgtable(dev, sgt, cpu_addr, dma_addr, size);
}
Note that dma_get_sgtable_attrs
likewise distinguishes the CPU (kernel virtual) address from the bus address.
5. The nvdla_drm_gem_prime_vmap function
Continuing with the code, the prototype of nvdla_drm_gem_prime_vmap
is:
static void *nvdla_drm_gem_prime_vmap(struct drm_gem_object *obj)
{
struct nvdla_gem_object *nobj = to_nvdla_obj(obj);
return nobj->kvaddr;
}
The function simply returns the GEM object's kernel virtual address kvaddr.
6. The nvdla_gem_dma_addr function
Continuing with the code, the prototype of nvdla_gem_dma_addr
is:
int32_t nvdla_gem_dma_addr(struct drm_device *dev, struct drm_file *file,
uint32_t fd, dma_addr_t *addr)
{
int32_t ret;
uint32_t handle;
struct nvdla_gem_object *nobj;
struct drm_gem_object *dobj;
ret = drm_gem_prime_fd_to_handle(dev, file, fd, &handle);
if (ret)
return ret;
dobj = drm_gem_object_lookup(file, handle);
if (!dobj)
return -EINVAL;
nobj = to_nvdla_obj(dobj);
*addr = nobj->dma_addr;
//drm_gem_object_put_unlocked(dobj);
drm_gem_object_unreference_unlocked(dobj);
return 0;
}
The purpose of this function is to obtain the DMA address of the GEM object corresponding to a given file descriptor (fd). First, drm_gem_prime_fd_to_handle converts the file descriptor into a GEM object handle. Then drm_gem_object_lookup finds the GEM object with that handle, and the result is cast to the NVDLA-specific GEM object type. Finally, the object's DMA address (dma_addr) is written to the addr parameter and the reference on the GEM object is dropped.
Two functions here deserve attention.
1. drm_gem_prime_fd_to_handle
, whose prototype is:
/**
* drm_gem_prime_fd_to_handle - PRIME import function for GEM drivers
* @dev: dev to export the buffer from
* @file_priv: drm file-private structure
* @prime_fd: fd id of the dma-buf which should be imported
* @handle: pointer to storage for the handle of the imported buffer object
*
* This is the PRIME import function which must be used mandatorily by GEM
* drivers to ensure correct lifetime management of the underlying GEM object.
* The actual importing of GEM object from the dma-buf is done through the
* gem_import_export driver callback.
*/
int drm_gem_prime_fd_to_handle(struct drm_device *dev,
struct drm_file *file_priv, int prime_fd,
uint32_t *handle)
{
struct dma_buf *dma_buf;
struct drm_gem_object *obj;
int ret;
dma_buf = dma_buf_get(prime_fd);
if (IS_ERR(dma_buf))
return PTR_ERR(dma_buf);
mutex_lock(&file_priv->prime.lock);
ret = drm_prime_lookup_buf_handle(&file_priv->prime,
dma_buf, handle);
if (ret == 0)
goto out_put;
/* never seen this one, need to import */
mutex_lock(&dev->object_name_lock);
obj = dev->driver->gem_prime_import(dev, dma_buf);
if (IS_ERR(obj)) {
ret = PTR_ERR(obj);
goto out_unlock;
}
if (obj->dma_buf) {
WARN_ON(obj->dma_buf != dma_buf);
} else {
obj->dma_buf = dma_buf;
get_dma_buf(dma_buf);
}
/* _handle_create_tail unconditionally unlocks dev->object_name_lock. */
ret = drm_gem_handle_create_tail(file_priv, obj, handle);
drm_gem_object_put_unlocked(obj);
if (ret)
goto out_put;
ret = drm_prime_add_buf_handle(&file_priv->prime,
dma_buf, *handle);
mutex_unlock(&file_priv->prime.lock);
if (ret)
goto fail;
dma_buf_put(dma_buf);
return 0;
fail:
/* hmm, if driver attached, we are relying on the free-object path
* to detach.. which seems ok..
*/
drm_gem_handle_delete(file_priv, *handle);
dma_buf_put(dma_buf);
return ret;
out_unlock:
mutex_unlock(&dev->object_name_lock);
out_put:
mutex_unlock(&file_priv->prime.lock);
dma_buf_put(dma_buf);
return ret;
}
This is the PRIME import function that GEM drivers must use to ensure correct lifetime management of the underlying GEM object; the actual import of the GEM object from the dma-buf is done through the driver's gem_prime_import callback.
2. drm_gem_object_lookup
, whose prototype is:
/**
* drm_gem_object_lookup - look up a GEM object from it's handle
* @filp: DRM file private date
* @handle: userspace handle
*
* Returns:
*
* A reference to the object named by the handle if such exists on @filp, NULL
* otherwise.
*/
struct drm_gem_object *
drm_gem_object_lookup(struct drm_file *filp, u32 handle)
{
struct drm_gem_object *obj;
spin_lock(&filp->table_lock);
/* Check if we currently have a reference on the object */
obj = idr_find(&filp->object_idr, handle);
if (obj)
drm_gem_object_get(obj);
spin_unlock(&filp->table_lock);
return obj;
}
Pay particular attention to the two parameters, filp
and handle
: the former is the DRM file's private data, the latter a handle coming from user space.
7. The nvdla_gem_destroy function
Continuing with the code, the prototype of nvdla_gem_destroy
is:
/*
static int32_t nvdla_gem_destroy(struct drm_device *drm, void *data, struct drm_file *file):
Destroys the GEM object corresponding to the given handle,
via drm_gem_dumb_destroy.
*/
static int32_t nvdla_gem_destroy(struct drm_device *drm, void *data,
struct drm_file *file)
{
struct nvdla_gem_destroy_args *args = data;
return drm_gem_dumb_destroy(file, drm, args->handle);
}
8. The nvdla_drm_fops structure
Continuing with the code, the nvdla_drm_fops
file-operations table is registered as follows:
static const struct file_operations nvdla_drm_fops = {
.owner = THIS_MODULE,
.open = drm_open,
.release = drm_release,
.unlocked_ioctl = drm_ioctl,
.mmap = nvdla_drm_gem_mmap,
.poll = drm_poll,
.read = drm_read,
#ifdef CONFIG_COMPAT
.compat_ioctl = drm_compat_ioctl,
#endif
.llseek = noop_llseek,
};
These are standard operations, so no elaboration is needed. Looking back over the code, it may seem surprising that the only driver-specific entry in the file-operations table is nvdla_drm_gem_mmap
— but it makes sense: there has to be one operation that re-maps the DMA data, already mapped into kernel memory, out to user-space memory.
9. The drm_ioctl_desc structure
Continuing with the code, the drm_ioctl_desc
table is registered as follows:
static const struct drm_ioctl_desc nvdla_drm_ioctls[] = {
DRM_IOCTL_DEF_DRV(NVDLA_SUBMIT, nvdla_submit, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(NVDLA_GEM_CREATE, nvdla_gem_create, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(NVDLA_GEM_MMAP, nvdla_gem_map_offset, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(NVDLA_GEM_DESTROY, nvdla_gem_destroy, DRM_RENDER_ALLOW),
};
static struct drm_driver nvdla_drm_driver = {
// reflects the device/driver separation design
.driver_features = DRIVER_GEM | DRIVER_PRIME | DRIVER_RENDER,
.gem_vm_ops = &drm_gem_cma_vm_ops,
.gem_free_object_unlocked = nvdla_gem_free_object,
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_export = drm_gem_prime_export,
.gem_prime_import = drm_gem_prime_import,
.gem_prime_get_sg_table = nvdla_drm_gem_prime_get_sg_table,
.gem_prime_vmap = nvdla_drm_gem_prime_vmap,
.gem_prime_vunmap = nvdla_drm_gem_prime_vunmap,
.gem_prime_mmap = nvdla_drm_gem_mmap_buf,
.ioctls = nvdla_drm_ioctls,
.num_ioctls = ARRAY_SIZE(nvdla_drm_ioctls),
.fops = &nvdla_drm_fops,
.name = "nvdla",
.desc = "NVDLA driver",
.date = "20171017",
.major = 0,
.minor = 0,
.patchlevel = 0,
};
This again reflects the device/driver separation design. The functions referenced here were covered in this chapter and the previous one, so they are not repeated.
Only two structures need introducing.
1. The drm_ioctl_desc
structure; the DRM core's own table looks like this:
/* Ioctl table */
static const struct drm_ioctl_desc drm_ioctls[] = {
DRM_IOCTL_DEF(DRM_IOCTL_VERSION, drm_version,
DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_GET_UNIQUE, drm_getunique, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_GET_MAGIC, drm_getmagic, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_IRQ_BUSID, drm_irq_by_busid, DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_GET_MAP, drm_legacy_getmap_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_GET_STATS, drm_getstats, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_GET_CAP, drm_getcap, DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SET_CLIENT_CAP, drm_setclientcap, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_SET_VERSION, drm_setversion, DRM_UNLOCKED | DRM_MASTER),
DRM_IOCTL_DEF(DRM_IOCTL_SET_UNIQUE, drm_invalid_op, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_BLOCK, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_UNBLOCK, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_AUTH_MAGIC, drm_authmagic, DRM_AUTH|DRM_UNLOCKED|DRM_MASTER),
DRM_IOCTL_DEF(DRM_IOCTL_ADD_MAP, drm_legacy_addmap_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_RM_MAP, drm_legacy_rmmap_ioctl, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_SET_SAREA_CTX, drm_legacy_setsareactx, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_GET_SAREA_CTX, drm_legacy_getsareactx, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_SET_MASTER, drm_setmaster_ioctl, DRM_UNLOCKED|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_DROP_MASTER, drm_dropmaster_ioctl, DRM_UNLOCKED|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_ADD_CTX, drm_legacy_addctx, DRM_AUTH|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_RM_CTX, drm_legacy_rmctx, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_MOD_CTX, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_GET_CTX, drm_legacy_getctx, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_SWITCH_CTX, drm_legacy_switchctx, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_NEW_CTX, drm_legacy_newctx, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_RES_CTX, drm_legacy_resctx, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_ADD_DRAW, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_RM_DRAW, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_LOCK, drm_legacy_lock, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_UNLOCK, drm_legacy_unlock, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_FINISH, drm_noop, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_ADD_BUFS, drm_legacy_addbufs, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_MARK_BUFS, drm_legacy_markbufs, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_INFO_BUFS, drm_legacy_infobufs, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_MAP_BUFS, drm_legacy_mapbufs, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_FREE_BUFS, drm_legacy_freebufs, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_DMA, drm_legacy_dma_ioctl, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_CONTROL, drm_legacy_irq_control, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
#if IS_ENABLED(CONFIG_AGP)
DRM_IOCTL_DEF(DRM_IOCTL_AGP_ACQUIRE, drm_agp_acquire_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_AGP_RELEASE, drm_agp_release_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_AGP_ENABLE, drm_agp_enable_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_AGP_INFO, drm_agp_info_ioctl, DRM_AUTH),
DRM_IOCTL_DEF(DRM_IOCTL_AGP_ALLOC, drm_agp_alloc_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_AGP_FREE, drm_agp_free_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_AGP_BIND, drm_agp_bind_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_AGP_UNBIND, drm_agp_unbind_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
#endif
DRM_IOCTL_DEF(DRM_IOCTL_SG_ALLOC, drm_legacy_sg_alloc, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_SG_FREE, drm_legacy_sg_free, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_WAIT_VBLANK, drm_wait_vblank_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODESET_CTL, drm_legacy_modeset_ctl_ioctl, 0),
DRM_IOCTL_DEF(DRM_IOCTL_UPDATE_DRAW, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_GEM_CLOSE, drm_gem_close_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_GEM_FLINK, drm_gem_flink_ioctl, DRM_AUTH|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_GEM_OPEN, drm_gem_open_ioctl, DRM_AUTH|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_PRIME_HANDLE_TO_FD, drm_prime_handle_to_fd_ioctl, DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_PRIME_FD_TO_HANDLE, drm_prime_fd_to_handle_ioctl, DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANERESOURCES, drm_mode_getplane_res, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCRTC, drm_mode_getcrtc, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETCRTC, drm_mode_setcrtc, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANE, drm_mode_getplane, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETPLANE, drm_mode_setplane, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR, drm_mode_cursor_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETGAMMA, drm_mode_gamma_get_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETGAMMA, drm_mode_gamma_set_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETENCODER, drm_mode_getencoder, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCONNECTOR, drm_mode_getconnector, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_ATTACHMODE, drm_noop, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_DETACHMODE, drm_noop, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPROPERTY, drm_mode_getproperty_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_SETPROPERTY, drm_connector_property_set_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPROPBLOB, drm_mode_getblob_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETFB, drm_mode_getfb, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB, drm_mode_addfb_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_ADDFB2, drm_mode_addfb2, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_RMFB, drm_mode_rmfb_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_PAGE_FLIP, drm_mode_page_flip_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_DIRTYFB, drm_mode_dirtyfb_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_DUMB, drm_mode_create_dumb_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_MAP_DUMB, drm_mode_mmap_dumb_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROY_DUMB, drm_mode_destroy_dumb_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_GETPROPERTIES, drm_mode_obj_get_properties_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_OBJ_SETPROPERTY, drm_mode_obj_set_property_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_CURSOR2, drm_mode_cursor2_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_ATOMIC, drm_mode_atomic_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATEPROPBLOB, drm_mode_createblob_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_DESTROYPROPBLOB, drm_mode_destroyblob_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_CREATE, drm_syncobj_create_ioctl,
DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_DESTROY, drm_syncobj_destroy_ioctl,
DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, drm_syncobj_handle_to_fd_ioctl,
DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, drm_syncobj_fd_to_handle_ioctl,
DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl,
DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_LIST_LESSEES, drm_mode_list_lessees_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_GET_LEASE, drm_mode_get_lease_ioctl, DRM_MASTER|DRM_UNLOCKED),
DRM_IOCTL_DEF(DRM_IOCTL_MODE_REVOKE_LEASE, drm_mode_revoke_lease_ioctl, DRM_MASTER|DRM_UNLOCKED),
};
#define DRM_CORE_IOCTL_COUNT ARRAY_SIZE( drm_ioctls )
/**
* DOC: driver specific ioctls
*
* First things first, driver private IOCTLs should only be needed for drivers
* supporting rendering. Kernel modesetting is all standardized, and extended
* through properties. There are a few exceptions in some existing drivers,
* which define IOCTL for use by the display DRM master, but they all predate
* properties.
*
* Now if you do have a render driver you always have to support it through
* driver private properties. There's a few steps needed to wire all the things
* up.
*
* First you need to define the structure for your IOCTL in your driver private
* UAPI header in ``include/uapi/drm/my_driver_drm.h``::
*
* struct my_driver_operation {
* u32 some_thing;
* u32 another_thing;
* };
*
* Please make sure that you follow all the best practices from
* ``Documentation/ioctl/botching-up-ioctls.txt``. Note that drm_ioctl()
* automatically zero-extends structures, hence make sure you can add more stuff
* at the end, i.e. don't put a variable sized array there.
*
* Then you need to define your IOCTL number, using one of DRM_IO(), DRM_IOR(),
* DRM_IOW() or DRM_IOWR(). It must start with the DRM_IOCTL\_ prefix::
*
* ##define DRM_IOCTL_MY_DRIVER_OPERATION \
* DRM_IOW(DRM_COMMAND_BASE, struct my_driver_operation)
*
* DRM driver private IOCTL must be in the range from DRM_COMMAND_BASE to
* DRM_COMMAND_END. Finally you need an array of &struct drm_ioctl_desc to wire
* up the handlers and set the access rights::
*
* static const struct drm_ioctl_desc my_driver_ioctls[] = {
* DRM_IOCTL_DEF_DRV(MY_DRIVER_OPERATION, my_driver_operation,
* DRM_AUTH|DRM_RENDER_ALLOW),
* };
*
* And then assign this to the &drm_driver.ioctls field in your driver
* structure.
*
* See the separate chapter on :ref:`file operations<drm_driver_fops>` for how
* the driver-specific IOCTLs are wired up.
*/
The DRM_IOCTL_DEF_DRV
entries in our code clearly add the NVDLA-specific ioctls on top of this. As for how the ioctl flag is decided to be DRM_RENDER_ALLOW
, see:
/**
* enum drm_ioctl_flags - DRM ioctl flags
*
* Various flags that can be set in &drm_ioctl_desc.flags to control how
* userspace can use a given ioctl.
*/
enum drm_ioctl_flags {
/**
* @DRM_AUTH:
*
* This is for ioctl which are used for rendering, and require that the
* file descriptor is either for a render node, or if it's a
* legacy/primary node, then it must be authenticated.
*/
DRM_AUTH = BIT(0),
/**
* @DRM_MASTER:
*
* This must be set for any ioctl which can change the modeset or
* display state. Userspace must call the ioctl through a primary node,
* while it is the active master.
*
* Note that read-only modeset ioctl can also be called by
* unauthenticated clients, or when a master is not the currently active
* one.
*/
DRM_MASTER = BIT(1),
/**
* @DRM_ROOT_ONLY:
*
* Anything that could potentially wreak a master file descriptor needs
* to have this flag set. Current that's only for the SETMASTER and
* DROPMASTER ioctl, which e.g. logind can call to force a non-behaving
* master (display compositor) into compliance.
*
* This is equivalent to callers with the SYSADMIN capability.
*/
DRM_ROOT_ONLY = BIT(2),
/**
* @DRM_UNLOCKED:
*
* Whether &drm_ioctl_desc.func should be called with the DRM BKL held
* or not. Enforced as the default for all modern drivers, hence there
* should never be a need to set this flag.
*/
DRM_UNLOCKED = BIT(4),
/**
* @DRM_RENDER_ALLOW:
*
* This is used for all ioctl needed for rendering only, for drivers
* which support render nodes. This should be all new render drivers,
* and hence it should be always set for any ioctl with DRM_AUTH set.
* Note though that read-only query ioctl might have this set, but have
* not set DRM_AUTH because they do not require authentication.
*/
DRM_RENDER_ALLOW = BIT(5),
};
DRM_RENDER_ALLOW
covers the basic needs of all of these operations, which is why the custom NVDLA ioctls add no other flags.
2. The `drm_driver` structure, whose prototype is:
/**
* struct drm_driver - DRM driver structure
*
* This structure represent the common code for a family of cards. There will
* one drm_device for each card present in this family. It contains lots of
* vfunc entries, and a pile of those probably should be moved to more
* appropriate places like &drm_mode_config_funcs or into a new operations
* structure for GEM drivers.
*/
struct drm_driver {
...}
The full definition runs to more than 550 lines and is not reproduced here. The `nvdla_drm_driver` structure simply assigns its fields.
10. Registering and unregistering nvdla_drm
Continuing through the code, the registration and unregistration of nvdla_drm look like this:
int32_t nvdla_drm_probe(struct nvdla_device *nvdla_dev)
{
	int32_t dma;
	int32_t err;
	struct drm_device *drm;
	struct drm_driver *driver = &nvdla_drm_driver;

	drm = drm_dev_alloc(driver, &nvdla_dev->pdev->dev);
	if (IS_ERR(drm))
		return PTR_ERR(drm);

	nvdla_dev->drm = drm;

	err = drm_dev_register(drm, 0);
	if (err < 0)
		goto unref;

	/**
	 * TODO Register separate driver for memory and use DT node to
	 * read memory range
	 */
	//dma = dma_declare_coherent_memory(drm->dev, 0xC0000000, 0xC0000000, 0x40000000, DMA_MEMORY_MAP | DMA_MEMORY_EXCLUSIVE);
	printk("This is before-dma_declare!");
	dma = dma_declare_coherent_memory(drm->dev, 0x40000000, 0x40000000, 0x40000000, DMA_MEMORY_EXCLUSIVE);
	if (!(dma)) {
		err = -ENOMEM;
		printk("This is err signal!");
		printk("err: %d\n", err);
		goto unref;
	}

	return 0;

unref:
	drm_dev_unref(drm);
	return err;
}

void nvdla_drm_remove(struct nvdla_device *nvdla_dev)
{
	drm_dev_unregister(nvdla_dev->drm);
	dma_release_declared_memory(&nvdla_dev->pdev->dev);
	drm_dev_unref(nvdla_dev->drm);
}
Several parts deserve attention:
1. The `drm_dev_alloc` function, whose prototype is:
/**
* drm_dev_alloc - Allocate new DRM device
* @driver: DRM driver to allocate device for
* @parent: Parent device object
*
* Allocate and initialize a new DRM device. No device registration is done.
* Call drm_dev_register() to advertice the device to user space and register it
* with other core subsystems. This should be done last in the device
* initialization sequence to make sure userspace can't access an inconsistent
* state.
*
* The initial ref-count of the object is 1. Use drm_dev_get() and
* drm_dev_put() to take and drop further ref-counts.
*
* Note that for purely virtual devices @parent can be NULL.
*
* Drivers that wish to subclass or embed &struct drm_device into their
* own struct should look at using drm_dev_init() instead.
*
* RETURNS:
* Pointer to new DRM device, or ERR_PTR on failure.
*/
struct drm_device *drm_dev_alloc(struct drm_driver *driver,
				 struct device *parent)
{
	struct drm_device *dev;
	int ret;

	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
	if (!dev)
		return ERR_PTR(-ENOMEM);

	ret = drm_dev_init(dev, driver, parent);
	if (ret) {
		kfree(dev);
		return ERR_PTR(ret);
	}

	return dev;
}
The instantiation is `drm = drm_dev_alloc(driver, &nvdla_dev->pdev->dev);`. It allocates and initializes a new DRM device without performing any registration; `drm_dev_register()` is then called to advertise the device to user space and register it with the other core subsystems. That should be the last step of the device initialization sequence, so that user space can never observe an inconsistent state. The return value is a pointer to the new DRM device, or an ERR_PTR on failure.
2. The `drm_dev_register` function, whose prototype is:
/**
* drm_dev_register - Register DRM device
* @dev: Device to register
* @flags: Flags passed to the driver's .load() function
*
* Register the DRM device @dev with the system, advertise device to user-space
* and start normal device operation. @dev must be allocated via drm_dev_alloc()
* previously.
*
* Never call this twice on any device!
*
* NOTE: To ensure backward compatibility with existing drivers method this
* function calls the &drm_driver.load method after registering the device
* nodes, creating race conditions. Usage of the &drm_driver.load methods is
* therefore deprecated, drivers must perform all initialization before calling
* drm_dev_register().
*
* RETURNS:
* 0 on success, negative error code on failure.
*/
int drm_dev_register(struct drm_device *dev, unsigned long flags)
{
	struct drm_driver *driver = dev->driver;
	int ret;

	mutex_lock(&drm_global_mutex);

	ret = drm_minor_register(dev, DRM_MINOR_RENDER);
	if (ret)
		goto err_minors;

	ret = drm_minor_register(dev, DRM_MINOR_PRIMARY);
	if (ret)
		goto err_minors;

	ret = create_compat_control_link(dev);
	if (ret)
		goto err_minors;

	dev->registered = true;

	if (dev->driver->load) {
		ret = dev->driver->load(dev, flags);
		if (ret)
			goto err_minors;
	}

	if (drm_core_check_feature(dev, DRIVER_MODESET))
		drm_modeset_register_all(dev);

	ret = 0;

	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
		 driver->name, driver->major, driver->minor,
		 driver->patchlevel, driver->date,
		 dev->dev ? dev_name(dev->dev) : "virtual device",
		 dev->primary->index);

	goto out_unlock;

err_minors:
	remove_compat_control_link(dev);
	drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
	drm_minor_unregister(dev, DRM_MINOR_RENDER);
out_unlock:
	mutex_unlock(&drm_global_mutex);
	return ret;
}
The instantiation is `err = drm_dev_register(drm, 0);`, which registers the DRM device `dev` with the system, advertises it to user space, and starts normal device operation.
3. The `dma_declare_coherent_memory` function, whose prototype is:
static inline int
dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
			    dma_addr_t device_addr, size_t size, int flags)
{
	return -ENOSYS;
}
Note the addresses here: the pairing is now a CPU physical address with a bus address, no longer a virtual address with a bus address. (The prototype quoted above is the inline fallback stub that simply returns -ENOSYS when the kernel is built without coherent-memory support; the real implementation lives elsewhere.)
The description in `DMA-API.txt` reads:
int
dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
			    dma_addr_t device_addr, size_t size, int flags)

Declare region of memory to be handed out by dma_alloc_coherent() when
it's asked for coherent memory for this device.

phys_addr is the CPU physical address to which the memory is currently
assigned (this will be ioremapped so the CPU can access the region).

device_addr is the DMA address the device needs to be programmed
with to actually address this memory (this will be handed out as the
dma_addr_t in dma_alloc_coherent()).

size is the size of the area (must be multiples of PAGE_SIZE).

flags can be ORed together and are:

- DMA_MEMORY_EXCLUSIVE - only allocate memory from the declared regions.
  Do not allow dma_alloc_coherent() to fall back to system memory when
  it's out of memory in the declared region.
The actual instantiation is:
dma = dma_declare_coherent_memory(drm->dev, 0x40000000, 0x40000000, 0x40000000, DMA_MEMORY_EXCLUSIVE);
Why the physical address and the bus address carry the same numeric value is not explained again here; the answer was in fact already given earlier. Let us change perspective instead and look directly at how the kernel source itself instantiates this function.
Found in setup.c:
	dma_declare_coherent_memory(&kfr2r09_ceu_device.dev,
				    ceu_dma_membase, ceu_dma_membase,
				    ceu_dma_membase + CEU_BUFFER_MEMORY_SIZE - 1,
				    DMA_MEMORY_EXCLUSIVE);
Found in another setup.c:
	dma_declare_coherent_memory(&ecovec_ceu_devices[1]->dev,
				    ceu1_dma_membase, ceu1_dma_membase,
				    ceu1_dma_membase + CEU_BUFFER_MEMORY_SIZE - 1,
				    DMA_MEMORY_EXCLUSIVE);
Found in another setup.c:
	dma_declare_coherent_memory(&ms7724se_ceu_devices[0]->dev,
				    ceu0_dma_membase, ceu0_dma_membase,
				    ceu0_dma_membase + CEU_BUFFER_MEMORY_SIZE - 1,
				    DMA_MEMORY_EXCLUSIVE);
We can simply follow the same pattern.
II. Summary of functions in nvdla_gem.c (part 2)
Function prototype | Description
---|---
`static int32_t nvdla_drm_gem_object_mmap(struct drm_gem_object *dobj, struct vm_area_struct *vma)` | Implements the memory-mapping (mmap) operation for NVDLA GEM objects. Memory mapping lets a user-space application map a kernel GEM object into its own address space so that it can access the object's data directly.
`static int32_t nvdla_drm_gem_mmap_buf(struct drm_gem_object *obj, struct vm_area_struct *vma)` | Same functionality as `nvdla_drm_gem_object_mmap`.
`static int32_t nvdla_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)` | Same functionality as `nvdla_drm_gem_object_mmap`.
`static struct sg_table *nvdla_drm_gem_prime_get_sg_table(struct drm_gem_object *dobj)` | Implements fetching the scatter-gather (SG) table of a GEM object. An SG table describes the locations and sizes of data blocks scattered across physical memory; it is commonly used in DMA operations so that scattered blocks can be transferred efficiently.
`static void *nvdla_drm_gem_prime_vmap(struct drm_gem_object *obj)` | Returns the kernel virtual address of the object.
`int32_t nvdla_gem_dma_addr(struct drm_device *dev, struct drm_file *file, uint32_t fd, dma_addr_t *addr)` | Obtains the DMA address of the GEM object behind a given file descriptor (`fd`). It first converts the fd into a GEM handle with `drm_gem_prime_fd_to_handle`, looks up the GEM object for that handle with `drm_gem_object_lookup`, casts the result to the driver-specific GEM object type, stores the object's DMA address (`dma_addr`) into `addr`, and drops the reference on the GEM object. In short, the function lets user space drive the DRM device through handles.
`static int32_t nvdla_gem_destroy(struct drm_device *drm, void *data, struct drm_file *file)` | Destroys the GEM object corresponding to the given handle.
III. Summary of structures involved in nvdla_gem.c (part 2)
Structure | Description
---|---
`sg_table` | Scatter-gather table; describes the locations and sizes of data blocks scattered across physical memory.
`drm_ioctl_desc` | Defines a DRM ioctl operation; custom ioctls can be added, but the ioctl flags must be chosen with care.
`drm_ioctl_flags` | Documents the ioctl flags.
`drm_driver` | Holds the common definitions of the driver.
IV. Summary
The above completes the read-through of nvdla_gem.c. The functions and structures are consolidated below.
Function prototype | Description
---|---
`static int32_t nvdla_fill_task_desc(struct nvdla_ioctl_submit_task *local_task, struct nvdla_task *task)` | Copies the address count `num_addresses` and the task-data pointer `handles` from `local_task` into `task`; the allocation of `local_task->num_addresses * sizeof(struct nvdla_mem_handle)` bytes reserves address space for all of the task-related data.
`static int32_t nvdla_submit(struct drm_device *drm, void *arg, struct drm_file *file)` | `nvdla_submit` receives the parameter `arg` (in essence a `nvdla_submit_args` structure carrying the tasks, the task count, and so on) and converts the tasks passed in through `arg` into `nvdla_ioctl_submit_task` tasks, then calls `nvdla_fill_task_desc` to bring the task data down from user space into kernel space. At the same time it uses the `drm_device` pointer `drm` with `dev_get_drvdata` to fetch the current driver data shared with other subsystems, which yields `task`, the other key variable `nvdla_fill_task_desc` needs, and hands the `drm_file` structure (which holds the per-file-descriptor state of the file) to `task`. Finally it calls `nvdla_task_submit` to submit the NVDLA task and wait for its completion.
`static int32_t nvdla_gem_alloc(struct nvdla_gem_object *nobj)` | Takes NVDLA's memory-management structure `nvdla_gem_object`, which, as introduced earlier, holds three important members: the `drm_gem_object` responsible for memory allocation and management under `drm`, the kernel virtual address `kvaddr`, and the `dma`-related fields. The function performs the DMA address allocation.
`static void nvdla_gem_free(struct nvdla_gem_object *nobj)` | Releases the device DMA buffer obtained by `nvdla_gem_alloc`.
`static struct nvdla_gem_object *nvdla_gem_create_object(struct drm_device *drm, uint32_t size)` | Creates an NVDLA GEM object, the kernel object that then allocates and manages a DMA buffer. The first half of the creation goes through the kernel API `drm_gem_private_object_init`; the second half calls `nvdla_gem_alloc`.
`static void nvdla_gem_free_object(struct drm_gem_object *dobj)` | Frees an NVDLA GEM object, destroying and releasing the previously allocated DMA-buffer kernel object.
`static struct nvdla_gem_object *nvdla_gem_create_with_handle(struct drm_file *file_priv, struct drm_device *drm, uint32_t size, uint32_t *handle)` | Creates an NVDLA GEM object with a handle, allowing a user-space application to create a GEM object and get a handle back.
`static int32_t nvdla_gem_create(struct drm_device *drm, void *data, struct drm_file *file)` | Functionally identical to `nvdla_gem_create_with_handle(struct drm_file *file_priv, struct drm_device *drm, uint32_t size, uint32_t *handle)`.
`static int32_t nvdla_drm_gem_object_mmap(struct drm_gem_object *dobj, struct vm_area_struct *vma)` | Implements the memory-mapping (mmap) operation for NVDLA GEM objects. Memory mapping lets a user-space application map a kernel GEM object into its own address space so that it can access the object's data directly.
`static int32_t nvdla_drm_gem_mmap_buf(struct drm_gem_object *obj, struct vm_area_struct *vma)` | Same functionality as `nvdla_drm_gem_object_mmap`.
`static int32_t nvdla_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)` | Same functionality as `nvdla_drm_gem_object_mmap`.
`static struct sg_table *nvdla_drm_gem_prime_get_sg_table(struct drm_gem_object *dobj)` | Implements fetching the scatter-gather (SG) table of a GEM object. An SG table describes the locations and sizes of data blocks scattered across physical memory; it is commonly used in DMA operations so that scattered blocks can be transferred efficiently.
`static void *nvdla_drm_gem_prime_vmap(struct drm_gem_object *obj)` | Returns the kernel virtual address of the object.
`int32_t nvdla_gem_dma_addr(struct drm_device *dev, struct drm_file *file, uint32_t fd, dma_addr_t *addr)` | Obtains the DMA address of the GEM object behind a given file descriptor (`fd`). It first converts the fd into a GEM handle with `drm_gem_prime_fd_to_handle`, looks up the GEM object for that handle with `drm_gem_object_lookup`, casts the result to the driver-specific GEM object type, stores the object's DMA address (`dma_addr`) into `addr`, and drops the reference on the GEM object. In short, the function lets user space drive the DRM device through handles.
`static int32_t nvdla_gem_destroy(struct drm_device *drm, void *data, struct drm_file *file)` | Destroys the GEM object corresponding to the given handle.
Structure | Description
---|---
`nvdla_gem_object` | Holds the important members: first `drm_gem_object`, the structure used for memory management and allocation under `drm`; then `*kvaddr`, a pointer member that stores a kernel virtual address pointing at a data buffer in the kernel (typically graphics- or DMA-related data), used for fast access without physical-address translation; and finally the `dma`-related address and attribute fields.
`nvdla_mem_handle` | The intermediary that connects the user-space task structure `nvdla_ioctl_submit_task` with the kernel-space task structure `nvdla_task`.
`nvdla_ioctl_submit_task` | The user-space task structure.
`nvdla_task` | The kernel-space task structure.
`nvdla_device` | Holds common device information, such as the interrupt, the platform device, and the DRM device.
`nvdla_submit_args` | Carries the task information; it is the parameter through which user space passes in task-related data, interacting with `nvdla_ioctl_submit_task`. Overall its task granularity is coarser than `nvdla_ioctl_submit_task`.
`drm_file` | Holds the per-file-descriptor state of the file.
`drm_gem_object` | Describes a `drm` memory-allocation object, including the `drm_device` the object belongs to and the object's `size`.
`drm_device` | Describes the `drm` device structure, containing the data structures of the bus device.
`sg_table` | Scatter-gather table; describes the locations and sizes of data blocks scattered across physical memory.
`drm_ioctl_desc` | Defines a DRM ioctl operation; custom ioctls can be added, but the ioctl flags must be chosen with care.
`drm_ioctl_flags` | Documents the ioctl flags.
`drm_driver` | Holds the common definitions of the driver.