diff options
author | Andrey Konovalov <andrey.konovalov@linaro.org> | 2012-05-21 15:20:33 +0400 |
---|---|---|
committer | Andrey Konovalov <andrey.konovalov@linaro.org> | 2012-05-21 15:20:33 +0400 |
commit | 8e7922bcbd2d1e87abace7c4e130ac93b1ed54a8 (patch) | |
tree | 852c33c95478cbc7ec90a95d79066a88bfc6f357 /Documentation | |
parent | d51d01166ef5be6f048d6f121b48224fcf1241e5 (diff) | |
parent | 4cd7db43d454d815c57d2cc171758c1caf540dd9 (diff) | |
download | panda-8e7922bcbd2d1e87abace7c4e130ac93b1ed54a8.tar.gz |
Automatically merging rebase-umm-wip into merge-linux-linaro-core-tracking
Conflicting files:
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/DocBook/media/v4l/compat.xml | 4 | ||||
-rw-r--r-- | Documentation/DocBook/media/v4l/io.xml | 179 | ||||
-rw-r--r-- | Documentation/DocBook/media/v4l/vidioc-create-bufs.xml | 1 | ||||
-rw-r--r-- | Documentation/DocBook/media/v4l/vidioc-qbuf.xml | 15 | ||||
-rw-r--r-- | Documentation/DocBook/media/v4l/vidioc-reqbufs.xml | 47 | ||||
-rw-r--r-- | Documentation/dma-buf-sharing.txt | 98 | ||||
-rw-r--r-- | Documentation/kernel-parameters.txt | 9 |
7 files changed, 324 insertions, 29 deletions
diff --git a/Documentation/DocBook/media/v4l/compat.xml b/Documentation/DocBook/media/v4l/compat.xml index bce97c50391..2a2083d5caa 100644 --- a/Documentation/DocBook/media/v4l/compat.xml +++ b/Documentation/DocBook/media/v4l/compat.xml @@ -2523,6 +2523,10 @@ ioctls.</para> <listitem> <para>Selection API. <xref linkend="selection-api" /></para> </listitem> + <listitem> + <para>Importing DMABUF file descriptors as a new IO method described + in <xref linkend="dmabuf" />.</para> + </listitem> </itemizedlist> </section> diff --git a/Documentation/DocBook/media/v4l/io.xml b/Documentation/DocBook/media/v4l/io.xml index b815929b5bb..f8fb5590368 100644 --- a/Documentation/DocBook/media/v4l/io.xml +++ b/Documentation/DocBook/media/v4l/io.xml @@ -472,6 +472,162 @@ rest should be evident.</para> </footnote></para> </section> + <section id="dmabuf"> + <title>Streaming I/O (DMA buffer importing)</title> + + <note> + <title>Experimental</title> + <para>This is an <link linkend="experimental"> experimental </link> + interface and may change in the future.</para> + </note> + +<para>The DMABUF framework provides a generic mean for sharing buffers between + multiple devices. Device drivers that support DMABUF can export a DMA buffer +to userspace as a file descriptor (known as the exporter role), import a DMA +buffer from userspace using a file descriptor previously exported for a +different or the same device (known as the importer role), or both. This +section describes the DMABUF importer role API in V4L2.</para> + +<para>Input and output devices support the streaming I/O method when the +<constant>V4L2_CAP_STREAMING</constant> flag in the +<structfield>capabilities</structfield> field of &v4l2-capability; returned by +the &VIDIOC-QUERYCAP; ioctl is set. Whether importing DMA buffers through +DMABUF file descriptors is supported is determined by calling the +&VIDIOC-REQBUFS; ioctl with the memory type set to +<constant>V4L2_MEMORY_DMABUF</constant>.</para> + + <para>This I/O method is dedicated for sharing DMA buffers between V4L and +other APIs. Buffers (planes) are allocated by a driver on behalf of the +application, and exported to the application as file descriptors using an API +specific to the allocator driver. Only those file descriptor are exchanged, +these files and meta-information are passed in &v4l2-buffer; (or in +&v4l2-plane; in the multi-planar API case). The driver must be switched into +DMABUF I/O mode by calling the &VIDIOC-REQBUFS; with the desired buffer type. +No buffers (planes) are allocated beforehand, consequently they are not indexed +and cannot be queried like mapped buffers with the +<constant>VIDIOC_QUERYBUF</constant> ioctl.</para> + + <example> + <title>Initiating streaming I/O with DMABUF file descriptors</title> + + <programlisting> +&v4l2-requestbuffers; reqbuf; + +memset (&reqbuf, 0, sizeof (reqbuf)); +reqbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; +reqbuf.memory = V4L2_MEMORY_DMABUF; + +if (ioctl (fd, &VIDIOC-REQBUFS;, &reqbuf) == -1) { + if (errno == EINVAL) + printf ("Video capturing or DMABUF streaming is not supported\n"); + else + perror ("VIDIOC_REQBUFS"); + + exit (EXIT_FAILURE); +} + </programlisting> + </example> + + <para>Buffer (plane) file is passed on the fly with the &VIDIOC-QBUF; +ioctl. In case of multiplanar buffers, every plane can be associated with a +different DMABUF descriptor.Although buffers are commonly cycled, applications +can pass different DMABUF descriptor at each <constant>VIDIOC_QBUF</constant> +call.</para> + + <example> + <title>Queueing DMABUF using single plane API</title> + + <programlisting> +int buffer_queue(int v4lfd, int index, int dmafd) +{ + &v4l2-buffer; buf; + + memset(&buf, 0, sizeof buf); + buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; + buf.memory = V4L2_MEMORY_DMABUF; + buf.index = index; + buf.m.fd = dmafd; + + if (ioctl (v4lfd, &VIDIOC-QBUF;, &buf) == -1) { + perror ("VIDIOC_QBUF"); + return -1; + } + + return 0; +} + </programlisting> + </example> + + <example> + <title>Queueing DMABUF using multi plane API</title> + + <programlisting> +int buffer_queue_mp(int v4lfd, int index, int dmafd[], int n_planes) +{ + &v4l2-buffer; buf; + &v4l2-plane; planes[VIDEO_MAX_PLANES]; + int i; + + memset(&buf, 0, sizeof buf); + buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE; + buf.memory = V4L2_MEMORY_DMABUF; + buf.index = index; + buf.m.planes = planes; + buf.length = n_planes; + + memset(&planes, 0, sizeof planes); + + for (i = 0; i < n_planes; ++i) + buf.m.planes[i].m.fd = dmafd[i]; + + if (ioctl (v4lfd, &VIDIOC-QBUF;, &buf) == -1) { + perror ("VIDIOC_QBUF"); + return -1; + } + + return 0; +} + </programlisting> + </example> + + <para>Filled or displayed buffers are dequeued with the +&VIDIOC-DQBUF; ioctl. The driver can unlock the buffer at any +time between the completion of the DMA and this ioctl. The memory is +also unlocked when &VIDIOC-STREAMOFF; is called, &VIDIOC-REQBUFS;, or +when the device is closed.</para> + + <para>For capturing applications it is customary to enqueue a +number of empty buffers, to start capturing and enter the read loop. +Here the application waits until a filled buffer can be dequeued, and +re-enqueues the buffer when the data is no longer needed. Output +applications fill and enqueue buffers, when enough buffers are stacked +up output is started. In the write loop, when the application +runs out of free buffers it must wait until an empty buffer can be +dequeued and reused. Two methods exist to suspend execution of the +application until one or more buffers can be dequeued. By default +<constant>VIDIOC_DQBUF</constant> blocks when no buffer is in the +outgoing queue. When the <constant>O_NONBLOCK</constant> flag was +given to the &func-open; function, <constant>VIDIOC_DQBUF</constant> +returns immediately with an &EAGAIN; when no buffer is available. The +&func-select; or &func-poll; function are always available.</para> + + <para>To start and stop capturing or output applications call the +&VIDIOC-STREAMON; and &VIDIOC-STREAMOFF; ioctls. Note that +<constant>VIDIOC_STREAMOFF</constant> removes all buffers from both queues and +unlocks all buffers as a side effect. Since there is no notion of doing +anything "now" on a multitasking system, if an application needs to synchronize +with another event it should examine the &v4l2-buffer; +<structfield>timestamp</structfield> of captured buffers, or set the field +before enqueuing buffers for output.</para> + + <para>Drivers implementing DMABUF importing I/O must support the +<constant>VIDIOC_REQBUFS</constant>, <constant>VIDIOC_QBUF</constant>, +<constant>VIDIOC_DQBUF</constant>, <constant>VIDIOC_STREAMON</constant> and +<constant>VIDIOC_STREAMOFF</constant> ioctl, the <function>select()</function> +and <function>poll()</function> function.</para> + + </section> + <section id="async"> <title>Asynchronous I/O</title> @@ -671,6 +827,14 @@ memory, set by the application. See <xref linkend="userp" /> for details. <structname>v4l2_buffer</structname> structure.</entry> </row> <row> + <entry></entry> + <entry>int</entry> + <entry><structfield>fd</structfield></entry> + <entry>For the single-plane API and when +<structfield>memory</structfield> is <constant>V4L2_MEMORY_DMABUF</constant> this +is the file descriptor associated with a DMABUF buffer.</entry> + </row> + <row> <entry>__u32</entry> <entry><structfield>length</structfield></entry> <entry></entry> @@ -746,6 +910,15 @@ should set this to 0.</entry> </entry> </row> <row> + <entry></entry> + <entry>int</entry> + <entry><structfield>fd</structfield></entry> + <entry>When the memory type in the containing &v4l2-buffer; is + <constant>V4L2_MEMORY_DMABUF</constant>, this is a file + descriptor associated with a DMABUF buffer, similar to the + <structfield>fd</structfield> field in &v4l2-buffer;.</entry> + </row> + <row> <entry>__u32</entry> <entry><structfield>data_offset</structfield></entry> <entry></entry> @@ -980,6 +1153,12 @@ pointer</link> I/O.</entry> <entry>3</entry> <entry>[to do]</entry> </row> + <row> + <entry><constant>V4L2_MEMORY_DMABUF</constant></entry> + <entry>2</entry> + <entry>The buffer is used for <link linkend="dmabuf">DMA shared +buffer</link> I/O.</entry> + </row> </tbody> </tgroup> </table> diff --git a/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml b/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml index 73ae8a6cd00..adc92be1719 100644 --- a/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml +++ b/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml @@ -98,6 +98,7 @@ information.</para> <entry><structfield>memory</structfield></entry> <entry>Applications set this field to <constant>V4L2_MEMORY_MMAP</constant> or +<constant>V4L2_MEMORY_DMABUF</constant> or <constant>V4L2_MEMORY_USERPTR</constant>.</entry> </row> <row> diff --git a/Documentation/DocBook/media/v4l/vidioc-qbuf.xml b/Documentation/DocBook/media/v4l/vidioc-qbuf.xml index 9caa49af580..cb5f5ff8076 100644 --- a/Documentation/DocBook/media/v4l/vidioc-qbuf.xml +++ b/Documentation/DocBook/media/v4l/vidioc-qbuf.xml @@ -112,6 +112,21 @@ they cannot be swapped out to disk. Buffers remain locked until dequeued, until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl is called, or until the device is closed.</para> + <para>To enqueue a <link linkend="dmabuf">DMABUF</link> buffer applications +set the <structfield>memory</structfield> field to +<constant>V4L2_MEMORY_DMABUF</constant> and the <structfield>m.fd</structfield> +to a file descriptor associated with a DMABUF buffer. When the multi-planar API is +used and <structfield>m.fd</structfield> of the passed array of &v4l2-plane; +have to be used instead. When <constant>VIDIOC_QBUF</constant> is called with a +pointer to this structure the driver sets the +<constant>V4L2_BUF_FLAG_QUEUED</constant> flag and clears the +<constant>V4L2_BUF_FLAG_MAPPED</constant> and +<constant>V4L2_BUF_FLAG_DONE</constant> flags in the +<structfield>flags</structfield> field, or it returns an error code. This +ioctl locks the buffer. Buffers remain locked until dequeued, +until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl is called, or until the +device is closed.</para> + <para>Applications call the <constant>VIDIOC_DQBUF</constant> ioctl to dequeue a filled (capturing) or displayed (output) buffer from the driver's outgoing queue. They just set the diff --git a/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml b/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml index 7be4b1d29b9..e3e709be7de 100644 --- a/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml +++ b/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml @@ -48,28 +48,30 @@ <refsect1> <title>Description</title> - <para>This ioctl is used to initiate <link linkend="mmap">memory -mapped</link> or <link linkend="userp">user pointer</link> -I/O. Memory mapped buffers are located in device memory and must be -allocated with this ioctl before they can be mapped into the -application's address space. User buffers are allocated by -applications themselves, and this ioctl is merely used to switch the -driver into user pointer I/O mode and to setup some internal structures.</para> +<para>This ioctl is used to initiate <link linkend="mmap">memory mapped</link>, +<link linkend="userp">user pointer</link> or <link +linkend="dmabuf">DMABUF</link> based I/O. Memory mapped buffers are located in +device memory and must be allocated with this ioctl before they can be mapped +into the application's address space. User buffers are allocated by +applications themselves, and this ioctl is merely used to switch the driver +into user pointer I/O mode and to setup some internal structures. +Similarly, DMABUF buffers are allocated by applications through a device +driver, and this ioctl only configures the driver into DMABUF I/O mode without +performing any direct allocation.</para> - <para>To allocate device buffers applications initialize all -fields of the <structname>v4l2_requestbuffers</structname> structure. -They set the <structfield>type</structfield> field to the respective -stream or buffer type, the <structfield>count</structfield> field to -the desired number of buffers, <structfield>memory</structfield> -must be set to the requested I/O method and the <structfield>reserved</structfield> array -must be zeroed. When the ioctl -is called with a pointer to this structure the driver will attempt to allocate -the requested number of buffers and it stores the actual number -allocated in the <structfield>count</structfield> field. It can be -smaller than the number requested, even zero, when the driver runs out -of free memory. A larger number is also possible when the driver requires -more buffers to function correctly. For example video output requires at least two buffers, -one displayed and one filled by the application.</para> + <para>To allocate device buffers applications initialize all fields of the +<structname>v4l2_requestbuffers</structname> structure. They set the +<structfield>type</structfield> field to the respective stream or buffer type, +the <structfield>count</structfield> field to the desired number of buffers, +<structfield>memory</structfield> must be set to the requested I/O method and +the <structfield>reserved</structfield> array must be zeroed. When the ioctl is +called with a pointer to this structure the driver will attempt to allocate the +requested number of buffers and it stores the actual number allocated in the +<structfield>count</structfield> field. It can be smaller than the number +requested, even zero, when the driver runs out of free memory. A larger number +is also possible when the driver requires more buffers to function correctly. +For example video output requires at least two buffers, one displayed and one +filled by the application.</para> <para>When the I/O method is not supported the ioctl returns an &EINVAL;.</para> @@ -102,7 +104,8 @@ as the &v4l2-format; <structfield>type</structfield> field. See <xref <entry>&v4l2-memory;</entry> <entry><structfield>memory</structfield></entry> <entry>Applications set this field to -<constant>V4L2_MEMORY_MMAP</constant> or +<constant>V4L2_MEMORY_MMAP</constant>, +<constant>V4L2_MEMORY_DMABUF</constant> or <constant>V4L2_MEMORY_USERPTR</constant>.</entry> </row> <row> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt index 3bbd5c51605..5ff4d2b84f7 100644 --- a/Documentation/dma-buf-sharing.txt +++ b/Documentation/dma-buf-sharing.txt @@ -29,13 +29,6 @@ The buffer-user in memory, mapped into its own address space, so it can access the same area of memory. -*IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details] -For this first version, A buffer shared using the dma_buf sharing API: -- *may* be exported to user space using "mmap" *ONLY* by exporter, outside of - this framework. -- with this new iteration of the dma-buf api cpu access from the kernel has been - enable, see below for the details. - dma-buf operations for device dma only -------------------------------------- @@ -313,6 +306,83 @@ Access to a dma_buf from the kernel context involves three steps: enum dma_data_direction dir); +Direct Userspace Access/mmap Support +------------------------------------ + +Being able to mmap an export dma-buf buffer object has 2 main use-cases: +- CPU fallback processing in a pipeline and +- supporting existing mmap interfaces in importers. + +1. CPU fallback processing in a pipeline + + In many processing pipelines it is sometimes required that the cpu can access + the data in a dma-buf (e.g. for thumbnail creation, snapshots, ...). To avoid + the need to handle this specially in userspace frameworks for buffer sharing + it's ideal if the dma_buf fd itself can be used to access the backing storage + from userspace using mmap. + + Furthermore Android's ION framework already supports this (and is otherwise + rather similar to dma-buf from a userspace consumer side with using fds as + handles, too). So it's beneficial to support this in a similar fashion on + dma-buf to have a good transition path for existing Android userspace. + + No special interfaces, userspace simply calls mmap on the dma-buf fd. + +2. Supporting existing mmap interfaces in exporters + + Similar to the motivation for kernel cpu access it is again important that + the userspace code of a given importing subsystem can use the same interfaces + with a imported dma-buf buffer object as with a native buffer object. This is + especially important for drm where the userspace part of contemporary OpenGL, + X, and other drivers is huge, and reworking them to use a different way to + mmap a buffer rather invasive. + + The assumption in the current dma-buf interfaces is that redirecting the + initial mmap is all that's needed. A survey of some of the existing + subsystems shows that no driver seems to do any nefarious thing like syncing + up with outstanding asynchronous processing on the device or allocating + special resources at fault time. So hopefully this is good enough, since + adding interfaces to intercept pagefaults and allow pte shootdowns would + increase the complexity quite a bit. + + Interface: + int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *, + unsigned long); + + If the importing subsystem simply provides a special-purpose mmap call to set + up a mapping in userspace, calling do_mmap with dma_buf->file will equally + achieve that for a dma-buf object. + +3. Implementation notes for exporters + + Because dma-buf buffers have invariant size over their lifetime, the dma-buf + core checks whether a vma is too large and rejects such mappings. The + exporter hence does not need to duplicate this check. + + Because existing importing subsystems might presume coherent mappings for + userspace, the exporter needs to set up a coherent mapping. If that's not + possible, it needs to fake coherency by manually shooting down ptes when + leaving the cpu domain and flushing caches at fault time. Note that all the + dma_buf files share the same anon inode, hence the exporter needs to replace + the dma_buf file stored in vma->vm_file with it's own if pte shootdown is + requred. This is because the kernel uses the underlying inode's address_space + for vma tracking (and hence pte tracking at shootdown time with + unmap_mapping_range). + + If the above shootdown dance turns out to be too expensive in certain + scenarios, we can extend dma-buf with a more explicit cache tracking scheme + for userspace mappings. But the current assumption is that using mmap is + always a slower path, so some inefficiencies should be acceptable. + + Exporters that shoot down mappings (for any reasons) shall not do any + synchronization at fault time with outstanding device operations. + Synchronization is an orthogonal issue to sharing the backing storage of a + buffer and hence should not be handled by dma-buf itself. This is explictly + mentioned here because many people seem to want something like this, but if + different exporters handle this differently, buffer sharing can fail in + interesting ways depending upong the exporter (if userspace starts depending + upon this implicit synchronization). + Miscellaneous notes ------------------- @@ -336,6 +406,20 @@ Miscellaneous notes the exporting driver to create a dmabuf fd must provide a way to let userspace control setting of O_CLOEXEC flag passed in to dma_buf_fd(). +- If an exporter needs to manually flush caches and hence needs to fake + coherency for mmap support, it needs to be able to zap all the ptes pointing + at the backing storage. Now linux mm needs a struct address_space associated + with the struct file stored in vma->vm_file to do that with the function + unmap_mapping_range. But the dma_buf framework only backs every dma_buf fd + with the anon_file struct file, i.e. all dma_bufs share the same file. + + Hence exporters need to setup their own file (and address_space) association + by setting vma->vm_file and adjusting vma->vm_pgoff in the dma_buf mmap + callback. In the specific case of a gem driver the exporter could use the + shmem file already provided by gem (and set vm_pgoff = 0). Exporters can then + zap ptes by unmapping the corresponding range of the struct address_space + associated with their own file. + References: [1] struct dma_buf_ops in include/linux/dma-buf.h [2] All interfaces mentioned above defined in include/linux/dma-buf.h diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index c1601e5a8b7..41996c68a5c 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -508,6 +508,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Also note the kernel might malfunction if you disable some critical bits. + cma=nn[MG] [ARM,KNL] + Sets the size of kernel global memory area for contiguous + memory allocations. For more information, see + include/linux/dma-contiguous.h + cmo_free_hint= [PPC] Format: { yes | no } Specify whether pages are marked as being inactive when they are freed. This is used in CMO environments @@ -515,6 +520,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. a hypervisor. Default: yes + coherent_pool=nn[KMG] [ARM,KNL] + Sets the size of memory pool for coherent, atomic dma + allocations if Contiguous Memory Allocator (CMA) is used. + code_bytes [X86] How many bytes of object code to print in an oops report. Range: 0 - 8192 |