From 757c264bc10ebc71074ee3f5fb66d670667a09bc Mon Sep 17 00:00:00 2001 From: Kevin Cheng Date: Thu, 18 Apr 2019 11:31:16 -0700 Subject: Add in cffi 1.12.2 (e0c7666) Since this is a mercurial repo, d/led zip of src: https://bitbucket.org/cffi/cffi/get/v1.12.2.zip Also add in misc METADATA/NOTICE/Android.bp/etc files. Bug: 122778810 Test: None Change-Id: I36c58ed07a2cdd4d9d11831908175a5c988f33c1 --- doc/source/ref.rst | 1007 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1007 insertions(+) create mode 100644 doc/source/ref.rst (limited to 'doc/source/ref.rst') diff --git a/doc/source/ref.rst b/doc/source/ref.rst new file mode 100644 index 0000000..3dc8e4b --- /dev/null +++ b/doc/source/ref.rst @@ -0,0 +1,1007 @@ +================================ +CFFI Reference +================================ + +.. contents:: + + +FFI Interface +------------- + +*This page documents the runtime interface of the two types "FFI" and +"CompiledFFI". These two types are very similar to each other. You get +a CompiledFFI object if you import an out-of-line module. You get a FFI +object from explicitly writing cffi.FFI(). Unlike CompiledFFI, the type +FFI has also got additional methods documented on the* `next page`__. + +.. __: cdef.html + + +ffi.NULL +++++++++ + +**ffi.NULL**: a constant NULL of type ````. + + +ffi.error ++++++++++ + +**ffi.error**: the Python exception raised in various cases. (Don't +confuse it with ``ffi.errno``.) + + +ffi.new() ++++++++++ + +**ffi.new(cdecl, init=None)**: +allocate an instance according to the specified C type and return a +pointer to it. The specified C type must be either a pointer or an +array: ``new('X *')`` allocates an X and returns a pointer to it, +whereas ``new('X[n]')`` allocates an array of n X'es and returns an +array referencing it (which works mostly like a pointer, like in C). +You can also use ``new('X[]', n)`` to allocate an array of a +non-constant length n. See the `detailed documentation`__ for other +valid initializers. + +.. __: using.html#working + +When the returned ```` object goes out of scope, the memory is +freed. In other words the returned ```` object has ownership of +the value of type ``cdecl`` that it points to. This means that the raw +data can be used as long as this object is kept alive, but must not be +used for a longer time. Be careful about that when copying the +pointer to the memory somewhere else, e.g. into another structure. +Also, this means that a line like ``x = ffi.new(...)[0]`` is *always +wrong:* the newly allocated object goes out of scope instantly, and so +is freed immediately, and ``x`` is garbage. + +The returned memory is initially cleared (filled with zeroes), before +the optional initializer is applied. For performance, see +`ffi.new_allocator()`_ for a way to allocate non-zero-initialized +memory. + +*New in version 1.12:* see also ``ffi.release()``. + + +ffi.cast() +++++++++++ + +**ffi.cast("C type", value)**: similar to a C cast: returns an +instance of the named C type initialized with the given value. The +value is casted between integers or pointers of any type. + + +.. _ffi-errno: +.. _ffi-getwinerror: + +ffi.errno, ffi.getwinerror() +++++++++++++++++++++++++++++ + +**ffi.errno**: the value of ``errno`` received from the most recent C call +in this thread, and passed to the following C call. (This is a thread-local +read-write property.) + +**ffi.getwinerror(code=-1)**: on Windows, in addition to ``errno`` we +also save and restore the ``GetLastError()`` value across function +calls. This function returns this error code as a tuple ``(code, +message)``, adding a readable message like Python does when raising +WindowsError. If the argument ``code`` is given, format that code into +a message instead of using ``GetLastError()``. +(Note that it is also possible to declare and call the ``GetLastError()`` +function as usual.) + + +.. _ffi-string: +.. _ffi-unpack: + +ffi.string(), ffi.unpack() +++++++++++++++++++++++++++ + +**ffi.string(cdata, [maxlen])**: return a Python string (or unicode +string) from the 'cdata'. + +- If 'cdata' is a pointer or array of characters or bytes, returns the + null-terminated string. The returned string extends until the first + null character. The 'maxlen' argument limits how far we look for a + null character. If 'cdata' is an + array then 'maxlen' defaults to its length. See ``ffi.unpack()`` below + for a way to continue past the first null character. *Python 3:* this + returns a ``bytes``, not a ``str``. + +- If 'cdata' is a pointer or array of wchar_t, returns a unicode string + following the same rules. *New in version 1.11:* can also be + char16_t or char32_t. + +- If 'cdata' is a single character or byte or a wchar_t or charN_t, + returns it as a byte string or unicode string. (Note that in some + situation a single wchar_t or char32_t may require a Python unicode + string of length 2.) + +- If 'cdata' is an enum, returns the value of the enumerator as a string. + If the value is out of range, it is simply returned as the stringified + integer. + +**ffi.unpack(cdata, length)**: unpacks an array of C data of the given +length, returning a Python string/unicode/list. The 'cdata' should be +a pointer; if it is an array it is first converted to the pointer +type. *New in version 1.6.* + +- If 'cdata' is a pointer to 'char', returns a byte string. It does + not stop at the first null. (An equivalent way to do that is + ``ffi.buffer(cdata, length)[:]``.) + +- If 'cdata' is a pointer to 'wchar_t', returns a unicode string. + ('length' is measured in number of wchar_t; it is not the size in + bytes.) *New in version 1.11:* can also be char16_t or char32_t. + +- If 'cdata' is a pointer to anything else, returns a list, of the + given 'length'. (A slower way to do that is ``[cdata[i] for i in + range(length)]``.) + + +.. _ffi-buffer: +.. _ffi-from-buffer: + +ffi.buffer(), ffi.from_buffer() ++++++++++++++++++++++++++++++++ + +**ffi.buffer(cdata, [size])**: return a buffer object that references +the raw C data pointed to by the given 'cdata', of 'size' bytes. What +Python calls "a buffer", or more precisely "an object supporting the +buffer interface", is an object that represents some raw memory and +that can be passed around to various built-in or extension functions; +these built-in functions read from or write to the raw memory directly, +without needing an extra copy. + +The 'cdata' argument +must be a pointer or an array. If unspecified, the size of the +buffer is either the size of what ``cdata`` points to, or the whole size +of the array. + +Here are a few examples of where buffer() would be useful: + +- use ``file.write()`` and ``file.readinto()`` with + such a buffer (for files opened in binary mode) + +- overwrite the content of a struct: if ``p`` is a cdata pointing to + it, use ``ffi.buffer(p)[:] = newcontent``, where ``newcontent`` is + a bytes object (``str`` in Python 2). + +Remember that like in C, you can use ``array + index`` to get the pointer +to the index'th item of an array. (In C you might more naturally write +``&array[index]``, but that is equivalent.) + +The returned object's type is not the builtin ``buffer`` nor ``memoryview`` +types, because these types' API changes too much across Python versions. +Instead it has the following Python API (a subset of Python 2's ``buffer``) +in addition to supporting the buffer interface: + +- ``buf[:]`` or ``bytes(buf)``: copy data out of the buffer, returning a + regular byte string (or ``buf[start:end]`` for a part) + +- ``buf[:] = newstr``: copy data into the buffer (or ``buf[start:end] + = newstr``) + +- ``len(buf)``, ``buf[index]``, ``buf[index] = newchar``: access as a sequence + of characters. + +The buffer object returned by ``ffi.buffer(cdata)`` keeps alive the +``cdata`` object: if it was originally an owning cdata, then its +owned memory will not be freed as long as the buffer is alive. + +Python 2/3 compatibility note: you should avoid using ``str(buf)``, +because it gives inconsistent results between Python 2 and Python 3. +(This is similar to how ``str()`` gives inconsistent results on regular +byte strings). Use ``buf[:]`` instead. + +*New in version 1.10:* ``ffi.buffer`` is now the type of the returned +buffer objects; ``ffi.buffer()`` actually calls the constructor. + +**ffi.from_buffer([cdecl,] python_buffer, require_writable=False)**: +return an array cdata (by default a ````) that +points to the data of the given Python object, which must support the +buffer interface. Note that ``ffi.from_buffer()`` turns a generic +Python buffer object into a cdata object, whereas ``ffi.buffer()`` does +the opposite conversion. Both calls don't actually copy any data. + +``ffi.from_buffer()`` is meant to be used on objects +containing large quantities of raw data, like bytearrays +or ``array.array`` or numpy +arrays. It supports both the old *buffer* API (in Python 2.x) and the +new *memoryview* API. Note that if you pass a read-only buffer object, +you still get a regular ````; it is your responsibility +not to write there if the original buffer doesn't expect you to. +*In particular, never modify byte strings!* + +The original object is kept alive (and, in case +of memoryview, locked) as long as the cdata object returned by +``ffi.from_buffer()`` is alive. + +A common use case is calling a C function with some ``char *`` that +points to the internal buffer of a Python object; for this case you +can directly pass ``ffi.from_buffer(python_buffer)`` as argument to +the call. + +*New in version 1.10:* the ``python_buffer`` can be anything supporting +the buffer/memoryview interface (except unicode strings). Previously, +bytearray objects were supported in version 1.7 onwards (careful, if you +resize the bytearray, the ```` object will point to freed +memory); and byte strings were supported in version 1.8 onwards. + +*New in version 1.12:* added the optional *first* argument ``cdecl``, and +the keyword argument ``require_writable``: + +* ``cdecl`` defaults to ``"char[]"``, but a different array type can be + specified for the result. A value like ``"int[]"`` will return an array of + ints instead of chars, and its length will be set to the number of ints + that fit in the buffer (rounded down if the division is not exact). Values + like ``"int[42]"`` or ``"int[2][3]"`` will return an array of exactly 42 + (resp. 2-by-3) ints, raising a ValueError if the buffer is too small. The + difference between specifying ``"int[]"`` and using the older code ``p1 = + ffi.from_buffer(x); p2 = ffi.cast("int *", p1)`` is that the older code + needs to keep ``p1`` alive as long as ``p2`` is in use, because only ``p1`` + keeps the underlying Python object alive and locked. (In addition, + ``ffi.from_buffer("int[]", x)`` gives better array bound checking.) + +* if ``require_writable`` is set to True, the function fails if the buffer + obtained from ``python_buffer`` is read-only (e.g. if ``python_buffer`` is + a byte string). The exact exception is raised by the object itself, and + for things like bytes it varies with the Python version, so don't rely on + it. (Before version 1.12, the same effect can be achieved with a hack: + call ``ffi.memmove(python_buffer, b"", 0)``. This has no effect if the + object is writable, but fails if it is read-only.) Please keep in mind + that CFFI does not implement the C keyword ``const``: even if you set + ``require_writable`` to False explicitly, you still get a regular + read-write cdata pointer. + +*New in version 1.12:* see also ``ffi.release()``. + + +ffi.memmove() ++++++++++++++ + +**ffi.memmove(dest, src, n)**: copy ``n`` bytes from memory area +``src`` to memory area ``dest``. See examples below. Inspired by the +C functions ``memcpy()`` and ``memmove()``---like the latter, the +areas can overlap. Each of ``dest`` and ``src`` can be either a cdata +pointer or a Python object supporting the buffer/memoryview interface. +In the case of ``dest``, the buffer/memoryview must be writable. +*New in version 1.3.* Examples: + +* ``ffi.memmove(myptr, b"hello", 5)`` copies the 5 bytes of + ``b"hello"`` to the area that ``myptr`` points to. + +* ``ba = bytearray(100); ffi.memmove(ba, myptr, 100)`` copies 100 + bytes from ``myptr`` into the bytearray ``ba``. + +* ``ffi.memmove(myptr + 1, myptr, 100)`` shifts 100 bytes from + the memory at ``myptr`` to the memory at ``myptr + 1``. + +In versions before 1.10, ``ffi.from_buffer()`` had restrictions on the +type of buffer, which made ``ffi.memmove()`` more general. + +.. _ffi-typeof: +.. _ffi-sizeof: +.. _ffi-alignof: + +ffi.typeof(), ffi.sizeof(), ffi.alignof() ++++++++++++++++++++++++++++++++++++++++++ + +**ffi.typeof("C type" or cdata object)**: return an object of type +```` corresponding to the parsed string, or to the C type of the +cdata instance. Usually you don't need to call this function or to +explicitly manipulate ```` objects in your code: any place that +accepts a C type can receive either a string or a pre-parsed ``ctype`` +object (and because of caching of the string, there is no real +performance difference). It can still be useful in writing typechecks, +e.g.: + +.. code-block:: python + + def myfunction(ptr): + assert ffi.typeof(ptr) is ffi.typeof("foo_t*") + ... + +Note also that the mapping from strings like ``"foo_t*"`` to the +```` objects is stored in some internal dictionary. This +guarantees that there is only one ```` object, so you +can use the ``is`` operator to compare it. The downside is that the +dictionary entries are immortal for now. In the future, we may add +transparent reclamation of old, unused entries. In the meantime, note +that using strings like ``"int[%d]" % length`` to name a type will +create many immortal cached entries if called with many different +lengths. + +**ffi.sizeof("C type" or cdata object)**: return the size of the +argument in bytes. The argument can be either a C type, or a cdata object, +like in the equivalent ``sizeof`` operator in C. + +For ``array = ffi.new("T[]", n)``, then ``ffi.sizeof(array)`` returns +``n * ffi.sizeof("T")``. *New in version 1.9:* Similar rules apply for +structures with a variable-sized array at the end. More precisely, if +``p`` was returned by ``ffi.new("struct foo *", ...)``, then +``ffi.sizeof(p[0])`` now returns the total allocated size. In previous +versions, it used to just return ``ffi.sizeof(ffi.typeof(p[0]))``, which +is the size of the structure ignoring the variable-sized part. (Note +that due to alignment, it is possible for ``ffi.sizeof(p[0])`` to return +a value smaller than ``ffi.sizeof(ffi.typeof(p[0]))``.) + +**ffi.alignof("C type")**: return the natural alignment size in bytes of +the argument. Corresponds to the ``__alignof__`` operator in GCC. + + +.. _ffi-offsetof: +.. _ffi-addressof: + +ffi.offsetof(), ffi.addressof() ++++++++++++++++++++++++++++++++ + +**ffi.offsetof("C struct or array type", \*fields_or_indexes)**: return the +offset within the struct of the given field. Corresponds to ``offsetof()`` +in C. + +You can give several field names in case of nested structures. You +can also give numeric values which correspond to array items, in case +of a pointer or array type. For example, ``ffi.offsetof("int[5]", 2)`` +is equal to the size of two integers, as is ``ffi.offsetof("int *", 2)``. + + +**ffi.addressof(cdata, \*fields_or_indexes)**: limited equivalent to +the '&' operator in C: + +1. ``ffi.addressof()`` returns a cdata that +is a pointer to this struct or union. The returned pointer is only +valid as long as the original ``cdata`` object is; be sure to keep it +alive if it was obtained directly from ``ffi.new()``. + +2. ``ffi.addressof(, field-or-index...)`` returns the address +of a field or array item inside the given structure or array. In case +of nested structures or arrays, you can give more than one field or +index to look recursively. Note that ``ffi.addressof(array, index)`` +can also be expressed as ``array + index``: this is true both in CFFI +and in C, where ``&array[index]`` is just ``array + index``. + +3. ``ffi.addressof(, "name")`` returns the address of the +named function or global variable from the given library object. +For functions, it returns a regular cdata +object containing a pointer to the function. + +Note that the case 1. cannot be used to take the address of a +primitive or pointer, but only a struct or union. It would be +difficult to implement because only structs and unions are internally +stored as an indirect pointer to the data. If you need a C int whose +address can be taken, use ``ffi.new("int[1]")`` in the first place; +similarly, for a pointer, use ``ffi.new("foo_t *[1]")``. + + +.. _ffi-cdata: +.. _ffi-ctype: + +ffi.CData, ffi.CType +++++++++++++++++++++ + +**ffi.CData, ffi.CType**: the Python type of the objects referred to +as ```` and ```` in the rest of this document. Note +that some cdata objects may be actually of a subclass of +``ffi.CData``, and similarly with ctype, so you should check with +``if isinstance(x, ffi.CData)``. Also, ```` objects have +a number of attributes for introspection: ``kind`` and ``cname`` are +always present, and depending on the kind they may also have +``item``, ``length``, ``fields``, ``args``, ``result``, ``ellipsis``, +``abi``, ``elements`` and ``relements``. + +*New in version 1.10:* ``ffi.buffer`` is now `a type`__ as well. + +.. __: #ffi-buffer + + +.. _ffi-gc: + +ffi.gc() +++++++++ + +**ffi.gc(cdata, destructor, size=0)**: +return a new cdata object that points to the +same data. Later, when this new cdata object is garbage-collected, +``destructor(old_cdata_object)`` will be called. Example of usage: +``ptr = ffi.gc(lib.custom_malloc(42), lib.custom_free)``. +Note that like objects +returned by ``ffi.new()``, the returned pointer objects have *ownership*, +which means the destructor is called as soon as *this* exact returned +object is garbage-collected. + +*New in version 1.12:* see also ``ffi.release()``. + +**ffi.gc(ptr, None, size=0)**: +removes the ownership on a object returned by a +regular call to ``ffi.gc``, and no destructor will be called when it +is garbage-collected. The object is modified in-place, and the +function returns ``None``. *New in version 1.7: ffi.gc(ptr, None)* + +Note that ``ffi.gc()`` should be avoided for limited resources, or (with +cffi below 1.11) for large memory allocations. This is particularly +true on PyPy: its GC does not know how much memory or how many resources +the returned ``ptr`` holds. It will only run its GC when enough memory +it knows about has been allocated (and thus run the destructor possibly +later than you would expect). Moreover, the destructor is called in +whatever thread PyPy is at that moment, which might be a problem for +some C libraries. In these cases, consider writing a wrapper class with +custom ``__enter__()`` and ``__exit__()`` methods, allocating and +freeing the C data at known points in time, and using it in a ``with`` +statement. In cffi 1.12, see also ``ffi.release()``. + +*New in version 1.11:* the ``size`` argument. If given, this should be +an estimate of the size (in bytes) that ``ptr`` keeps alive. This +information is passed on to the garbage collector, fixing part of the +problem described above. The ``size`` argument is most important on +PyPy; on CPython, it is ignored so far, but in the future it could be +used to trigger more eagerly the cyclic reference GC, too (see CPython +`issue 31105`__). + +The form ``ffi.gc(ptr, None, size=0)`` can be called with a negative +``size``, to cancel the estimate. It is not mandatory, though: +nothing gets out of sync if the size estimates do not match. It only +makes the next GC start more or less early. + +Note that if you have several ``ffi.gc()`` objects, the corresponding +destructors will be called in a random order. If you need a particular +order, see the discussion in `issue 340`__. + +.. __: http://bugs.python.org/issue31105 +.. __: https://bitbucket.org/cffi/cffi/issues/340/resources-release-issues + + +.. _ffi-new-handle: +.. _ffi-from-handle: + +ffi.new_handle(), ffi.from_handle() ++++++++++++++++++++++++++++++++++++ + +**ffi.new_handle(python_object)**: return a non-NULL cdata of type +``void *`` that contains an opaque reference to ``python_object``. You +can pass it around to C functions or store it into C structures. Later, +you can use **ffi.from_handle(p)** to retrieve the original +``python_object`` from a value with the same ``void *`` pointer. +*Calling ffi.from_handle(p) is invalid and will likely crash if +the cdata object returned by new_handle() is not kept alive!* + +See a `typical usage example`_ below. + +(In case you are wondering, this ``void *`` is not the ``PyObject *`` +pointer. This wouldn't make sense on PyPy anyway.) + +The ``ffi.new_handle()/from_handle()`` functions *conceptually* work +like this: + +* ``new_handle()`` returns cdata objects that contains references to + the Python objects; we call them collectively the "handle" cdata + objects. The ``void *`` value in these handle cdata objects are + random but unique. + +* ``from_handle(p)`` searches all live "handle" cdata objects for the + one that has the same value ``p`` as its ``void *`` value. It then + returns the Python object referenced by that handle cdata object. + If none is found, you get "undefined behavior" (i.e. crashes). + +The "handle" cdata object keeps the Python object alive, similar to +how ``ffi.new()`` returns a cdata object that keeps a piece of memory +alive. If the handle cdata object *itself* is not alive any more, +then the association ``void * -> python_object`` is dead and +``from_handle()`` will crash. + +*New in version 1.4:* two calls to ``new_handle(x)`` are guaranteed to +return cdata objects with different ``void *`` values, even with the +same ``x``. This is a useful feature that avoids issues with unexpected +duplicates in the following trick: if you need to keep alive the +"handle" until explicitly asked to free it, but don't have a natural +Python-side place to attach it to, then the easiest is to ``add()`` it +to a global set. It can later be removed from the set by +``global_set.discard(p)``, with ``p`` any cdata object whose ``void *`` +value compares equal. + +.. _`typical usage example`: + +Usage example: suppose you have a C library where you must call a +``lib.process_document()`` function which invokes some callback. The +``process_document()`` function receives a pointer to a callback and a +``void *`` argument. The callback is then invoked with the ``void +*data`` argument that is equal to the provided value. In this typical +case, you can implement it like this (out-of-line API mode):: + + class MyDocument: + ... + + def process(self): + h = ffi.new_handle(self) + lib.process_document(lib.my_callback, # the callback + h, # 'void *data' + args...) + # 'h' stays alive until here, which means that the + # ffi.from_handle() done in my_callback() during + # the call to process_document() is safe + + def callback(self, arg1, arg2): + ... + + # the actual callback is this one-liner global function: + @ffi.def_extern() + def my_callback(arg1, arg2, data): + return ffi.from_handle(data).callback(arg1, arg2) + + +.. _ffi-dlopen: +.. _ffi-dlclose: + +ffi.dlopen(), ffi.dlclose() ++++++++++++++++++++++++++++ + +**ffi.dlopen(libpath, [flags])**: opens and returns a "handle" to a +dynamic library, as a ```` object. See `Preparing and +Distributing modules`_. + +**ffi.dlclose(lib)**: explicitly closes a ```` object returned +by ``ffi.dlopen()``. + +**ffi.RLTD_...**: constants: flags for ``ffi.dlopen()``. + + +ffi.new_allocator() ++++++++++++++++++++ + +**ffi.new_allocator(alloc=None, free=None, should_clear_after_alloc=True)**: +returns a new allocator. An "allocator" is a callable that behaves like +``ffi.new()`` but uses the provided low-level ``alloc`` and ``free`` +functions. *New in version 1.2.* + +``alloc()`` is invoked with the size as sole argument. If it returns +NULL, a MemoryError is raised. Later, if ``free`` is not None, it will +be called with the result of ``alloc()`` as argument. Both can be either +Python function or directly C functions. If only ``free`` is None, then no +free function is called. If both ``alloc`` and ``free`` are None, the +default alloc/free combination is used. (In other words, the call +``ffi.new(*args)`` is equivalent to ``ffi.new_allocator()(*args)``.) + +If ``should_clear_after_alloc`` is set to False, then the memory +returned by ``alloc()`` is assumed to be already cleared (or you are +fine with garbage); otherwise CFFI will clear it. Example: for +performance, if you are using ``ffi.new()`` to allocate large chunks of +memory where the initial content can be left uninitialized, you can do:: + + # at module level + new_nonzero = ffi.new_allocator(should_clear_after_alloc=False) + + # then replace `p = ffi.new("char[]", bigsize)` with: + p = new_nonzero("char[]", bigsize) + +**NOTE:** the following is a general warning that applies particularly +(but not only) to PyPy versions 5.6 or older (PyPy > 5.6 attempts to +account for the memory returned by ``ffi.new()`` or a custom allocator; +and CPython uses reference counting). If you do large allocations, then +there is no hard guarantee about when the memory will be freed. You +should avoid both ``new()`` and ``new_allocator()()`` if you want to be +sure that the memory is promptly released, e.g. before you allocate more +of it. + +An alternative is to declare and call the C ``malloc()`` and ``free()`` +functions, or some variant like ``mmap()`` and ``munmap()``. Then you +control exactly when the memory is allocated and freed. For example, +add these two lines to your existing ``ffibuilder.cdef()``:: + + void *malloc(size_t size); + void free(void *ptr); + +and then call these two functions manually:: + + p = lib.malloc(n * ffi.sizeof("int")) + try: + my_array = ffi.cast("int *", p) + ... + finally: + lib.free(p) + +In cffi version 1.12 you can indeed use ``ffi.new_allocator()`` but use the +``with`` statement (see ``ffi.release()``) to force the free function to be +called at a known point. The above is equivalent to this code:: + + my_new = ffi.new_allocator(lib.malloc, lib.free) # at global level + ... + with my_new("int[]", n) as my_array: + ... + + +.. _ffi-release: + +ffi.release() and the context manager ++++++++++++++++++++++++++++++++++++++ + +**ffi.release(cdata)**: release the resources held by a cdata object from +``ffi.new()``, ``ffi.gc()``, ``ffi.from_buffer()`` or +``ffi.new_allocator()()``. The cdata object must not be used afterwards. +The normal Python destructor of the cdata object releases the same resources, +but this allows the releasing to occur at a known time, as opposed as at an +unspecified point in the future. +*New in version 1.12.* + +``ffi.release(cdata)`` is equivalent to ``cdata.__exit__()``, which means that +you can use the ``with`` statement to ensure that the cdata is released at the +end of a block (in version 1.12 and above):: + + with ffi.from_buffer(...) as p: + do something with p + +The effect is more precisely as follows: + +* on an object returned from ``ffi.gc(destructor)``, ``ffi.release()`` will + cause the ``destructor`` to be called immediately. + +* on an object returned from a custom allocator, the custom free function + is called immediately. + +* on CPython, ``ffi.from_buffer(buf)`` locks the buffer, so ``ffi.release()`` + can be used to unlock it at a known time. On PyPy, there is no locking + (so far); the effect of ``ffi.release()`` is limited to removing the link, + allowing the original buffer object to be garbage-collected even if the + cdata object stays alive. + +* on CPython this method has no effect (so far) on objects returned by + ``ffi.new()``, because the memory is allocated inline with the cdata object + and cannot be freed independently. It might be fixed in future releases of + cffi. + +* on PyPy, ``ffi.release()`` frees the ``ffi.new()`` memory immediately. It is + useful because otherwise the memory is kept alive until the next GC occurs. + If you allocate large amounts of memory with ``ffi.new()`` and don't free + them with ``ffi.release()``, PyPy (>= 5.7) runs its GC more often to + compensate, so the total memory allocated should be kept within bounds + anyway; but calling ``ffi.release()`` explicitly should improve performance + by reducing the frequency of GC runs. + +After ``ffi.release(x)``, do not use anything pointed to by ``x`` any longer. +As an exception to this rule, you can call ``ffi.release(x)`` several times +for the exact same cdata object ``x``; the calls after the first one are +ignored. + + +ffi.init_once() ++++++++++++++++ + +**ffi.init_once(function, tag)**: run ``function()`` once. The +``tag`` should be a primitive object, like a string, that identifies +the function: ``function()`` is only called the first time we see the +``tag``. The return value of ``function()`` is remembered and +returned by the current and all future ``init_once()`` with the same +tag. If ``init_once()`` is called from multiple threads in parallel, +all calls block until the execution of ``function()`` is done. If +``function()`` raises an exception, it is propagated and nothing is +cached (i.e. ``function()`` will be called again, in case we catch the +exception and try ``init_once()`` again). *New in version 1.4.* + +Example:: + + from _xyz_cffi import ffi, lib + + def initlib(): + lib.init_my_library() + + def make_new_foo(): + ffi.init_once(initlib, "init") + return lib.make_foo() + +``init_once()`` is optimized to run very quickly if ``function()`` has +already been called. (On PyPy, the cost is zero---the JIT usually +removes everything in the machine code it produces.) + +*Note:* one motivation__ for ``init_once()`` is the CPython notion of +"subinterpreters" in the embedded case. If you are using the +out-of-line API mode, ``function()`` is called only once even in the +presence of multiple subinterpreters, and its return value is shared +among all subinterpreters. The goal is to mimic the way traditional +CPython C extension modules have their init code executed only once in +total even if there are subinterpreters. In the example above, the C +function ``init_my_library()`` is called once in total, not once per +subinterpreter. For this reason, avoid Python-level side-effects in +``function()`` (as they will only be applied in the first +subinterpreter to run); instead, return a value, as in the following +example:: + + def init_get_max(): + return lib.initialize_once_and_get_some_maximum_number() + + def process(i): + if i > ffi.init_once(init_get_max, "max"): + raise IndexError("index too large!") + ... + +.. __: https://bitbucket.org/cffi/cffi/issues/233/ + + +.. _ffi-getctype: +.. _ffi-list-types: + +ffi.getctype(), ffi.list_types() +++++++++++++++++++++++++++++++++ + +**ffi.getctype("C type" or , extra="")**: return the string +representation of the given C type. If non-empty, the "extra" string is +appended (or inserted at the right place in more complicated cases); it +can be the name of a variable to declare, or an extra part of the type +like ``"*"`` or ``"[5]"``. For example +``ffi.getctype(ffi.typeof(x), "*")`` returns the string representation +of the C type "pointer to the same type than x"; and +``ffi.getctype("char[80]", "a") == "char a[80]"``. + +**ffi.list_types()**: Returns the user type names known to this FFI +instance. This returns a tuple containing three lists of names: +``(typedef_names, names_of_structs, names_of_unions)``. *New in +version 1.6.* + + +.. _`Preparing and Distributing modules`: cdef.html#loading-libraries + + +Conversions +----------- + +This section documents all the conversions that are allowed when +*writing into* a C data structure (or passing arguments to a function +call), and *reading from* a C data structure (or getting the result of a +function call). The last column gives the type-specific operations +allowed. + ++---------------+------------------------+------------------+----------------+ +| C type | writing into | reading from |other operations| ++===============+========================+==================+================+ +| integers | an integer or anything | a Python int or | int(), bool() | +| and enums | on which int() works | long, depending | `[6]`, | +| `[5]` | (but not a float!). | on the type | ``<`` | +| | Must be within range. | (ver. 1.10: or a | | +| | | bool) | | ++---------------+------------------------+------------------+----------------+ +| ``char`` | a string of length 1 | a string of | int(), bool(), | +| | or another | length 1 | ``<`` | ++---------------+------------------------+------------------+----------------+ +| ``wchar_t``, | a unicode of length 1 | a unicode of | | +| ``char16_t``, | (or maybe 2 if | length 1 | int(), | +| ``char32_t`` | surrogates) or | (or maybe 2 if | bool(), ``<`` | +| `[8]` | another similar | surrogates) | | ++---------------+------------------------+------------------+----------------+ +| ``float``, | a float or anything on | a Python float | float(), int(),| +| ``double`` | which float() works | | bool(), ``<`` | ++---------------+------------------------+------------------+----------------+ +|``long double``| another with | a , to | float(), int(),| +| | a ``long double``, or | avoid loosing | bool() | +| | anything on which | precision `[3]` | | +| | float() works | | | ++---------------+------------------------+------------------+----------------+ +| ``float`` | a complex number | a Python complex | complex(), | +| ``_Complex``, | or anything on which | number | bool() | +| ``double`` | complex() works | | `[7]` | +| ``_Complex`` | | | | ++---------------+------------------------+------------------+----------------+ +| pointers | another with | a |``[]`` `[4]`, | +| | a compatible type (i.e.| |``+``, ``-``, | +| | same type | |bool() | +| | or ``void*``, or as an | | | +| | array instead) `[1]` | | | ++---------------+------------------------+ | | +| ``void *`` | another with | | | +| | any pointer or array | | | +| | type | | | ++---------------+------------------------+ +----------------+ +| pointers to | same as pointers | | ``[]``, ``+``, | +| structure or | | | ``-``, bool(), | +| union | | | and read/write | +| | | | struct fields | ++---------------+------------------------+ +----------------+ +| function | same as pointers | | bool(), | +| pointers | | | call `[2]` | ++---------------+------------------------+------------------+----------------+ +| arrays | a list or tuple of | a |len(), iter(), | +| | items | |``[]`` `[4]`, | +| | | |``+``, ``-`` | ++---------------+------------------------+ +----------------+ +| ``char[]``, | same as arrays, or a | | len(), iter(), | +| ``un/signed`` | Python byte string | | ``[]``, ``+``, | +| ``char[]``, | | | ``-`` | +| ``_Bool[]`` | | | | ++---------------+------------------------+ +----------------+ +|``wchar_t[]``, | same as arrays, or a | | len(), iter(), | +|``char16_t[]``,| Python unicode string | | ``[]``, | +|``char32_t[]`` | | | ``+``, ``-`` | +| | | | | ++---------------+------------------------+------------------+----------------+ +| structure | a list or tuple or | a | read/write | +| | dict of the field | | fields | +| | values, or a same-type | | | +| | | | | ++---------------+------------------------+ +----------------+ +| union | same as struct, but | | read/write | +| | with at most one field | | fields | ++---------------+------------------------+------------------+----------------+ + +`[1]` ``item *`` is ``item[]`` in function arguments: + + In a function declaration, as per the C standard, a ``item *`` + argument is identical to a ``item[]`` argument (and ``ffi.cdef()`` + doesn't record the difference). So when you call such a function, + you can pass an argument that is accepted by either C type, like + for example passing a Python string to a ``char *`` argument + (because it works for ``char[]`` arguments) or a list of integers + to a ``int *`` argument (it works for ``int[]`` arguments). Note + that even if you want to pass a single ``item``, you need to + specify it in a list of length 1; for example, a ``struct point_s + *`` argument might be passed as ``[[x, y]]`` or ``[{'x': 5, 'y': + 10}]``. + + As an optimization, CFFI assumes that a + function with a ``char *`` argument to which you pass a Python + string will not actually modify the array of characters passed in, + and so passes directly a pointer inside the Python string object. + (On PyPy, this optimization is only available since PyPy 5.4 + with CFFI 1.8.) + +`[2]` C function calls are done with the GIL released. + + Note that we assume that the called functions are *not* using the + Python API from Python.h. For example, we don't check afterwards + if they set a Python exception. You may work around it, but mixing + CFFI with ``Python.h`` is not recommended. (If you do that, on + PyPy and on some platforms like Windows, you may need to explicitly + link to ``libpypy-c.dll`` to access the CPython C API compatibility + layer; indeed, CFFI-generated modules on PyPy don't link to + ``libpypy-c.dll`` on their own. But really, don't do that in the + first place.) + +`[3]` ``long double`` support: + + We keep ``long double`` values inside a cdata object to avoid + loosing precision. Normal Python floating-point numbers only + contain enough precision for a ``double``. If you really want to + convert such an object to a regular Python float (i.e. a C + ``double``), call ``float()``. If you need to do arithmetic on + such numbers without any precision loss, you need instead to define + and use a family of C functions like ``long double add(long double + a, long double b);``. + +`[4]` Slicing with ``x[start:stop]``: + + Slicing is allowed, as long as you specify explicitly both ``start`` + and ``stop`` (and don't give any ``step``). It gives a cdata + object that is a "view" of all items from ``start`` to ``stop``. + It is a cdata of type "array" (so e.g. passing it as an argument to a + C function would just convert it to a pointer to the ``start`` item). + As with indexing, negative bounds mean really negative indices, like in + C. As for slice assignment, it accepts any iterable, including a list + of items or another array-like cdata object, but the length must match. + (Note that this behavior differs from initialization: e.g. you can + say ``chararray[10:15] = "hello"``, but the assigned string must be of + exactly the correct length; no implicit null character is added.) + +`[5]` Enums are handled like ints: + + Like C, enum types are mostly int types (unsigned or signed, int or + long; note that GCC's first choice is unsigned). Reading an enum + field of a structure, for example, returns you an integer. To + compare their value symbolically, use code like ``if x.field == + lib.FOO``. If you really want to get their value as a string, use + ``ffi.string(ffi.cast("the_enum_type", x.field))``. + +`[6]` bool() on a primitive cdata: + + *New in version 1.7.* In previous versions, it only worked on + pointers; for primitives it always returned True. + + *New in version 1.10:* The C type ``_Bool`` or ``bool`` converts to + Python booleans now. You get an exception if a C ``_Bool`` happens + to contain a value different from 0 and 1 (this case triggers + undefined behavior in C; if you really have to interface with a + library relying on this, don't use ``_Bool`` in the CFFI side). + Also, when converting from a byte string to a ``_Bool[]``, only the + bytes ``\x00`` and ``\x01`` are accepted. + +`[7]` libffi does not support complex numbers: + + *New in version 1.11:* CFFI now supports complex numbers directly. + Note however that libffi does not. This means that C functions that + take directly as argument types or return type a complex type cannot + be called by CFFI, unless they are directly using the API mode. + +`[8]` ``wchar_t``, ``char16_t`` and ``char32_t`` + + See `Unicode character types`_ below. + + +.. _file: + +Support for FILE +++++++++++++++++ + +You can declare C functions taking a ``FILE *`` argument and +call them with a Python file object. If needed, you can also do ``c_f += ffi.cast("FILE *", fileobj)`` and then pass around ``c_f``. + +Note, however, that CFFI does this by a best-effort approach. If you +need finer control over buffering, flushing, and timely closing of the +``FILE *``, then you should not use this special support for ``FILE *``. +Instead, you can handle regular ``FILE *`` cdata objects that you +explicitly make using fdopen(), like this: + +.. code-block:: python + + ffi.cdef(''' + FILE *fdopen(int, const char *); // from the C + int fclose(FILE *); + ''') + + myfile.flush() # make sure the file is flushed + newfd = os.dup(myfile.fileno()) # make a copy of the file descriptor + fp = lib.fdopen(newfd, "w") # make a cdata 'FILE *' around newfd + lib.write_stuff_to_file(fp) # invoke the external function + lib.fclose(fp) # when you're done, close fp (and newfd) + +The special support for ``FILE *`` is anyway implemented in a similar manner +on CPython 3.x and on PyPy, because these Python implementations' files are +not natively based on ``FILE *``. Doing it explicity offers more control. + + +.. _unichar: + +Unicode character types ++++++++++++++++++++++++ + +The ``wchar_t`` type has the same signedness as the underlying +platform's. For example, on Linux, it is a signed 32-bit integer. +However, the types ``char16_t`` and ``char32_t`` (*new in version 1.11*) +are always unsigned. + +Note that CFFI assumes that these types are meant to contain UTF-16 or +UTF-32 characters in the native endianness. More precisely: + +* ``char32_t`` is assumed to contain UTF-32, or UCS4, which is just the + unicode codepoint; + +* ``char16_t`` is assumed to contain UTF-16, i.e. UCS2 plus surrogates; + +* ``wchar_t`` is assumed to contain either UTF-32 or UTF-16 based on its + actual platform-defined size of 4 or 2 bytes. + +Whether this assumption is true or not is unspecified by the C language. +In theory, the C library you are interfacing with could use one of these +types with a different meaning. You would then need to handle it +yourself---for example, by using ``uint32_t`` instead of ``char32_t`` in +the ``cdef()``, and building the expected arrays of ``uint32_t`` +manually. + +Python itself can be compiled with ``sys.maxunicode == 65535`` or +``sys.maxunicode == 1114111`` (Python >= 3.3 is always 1114111). This +changes the handling of surrogates (which are pairs of 16-bit +"characters" which actually stand for a single codepoint whose value is +greater than 65535). If your Python is ``sys.maxunicode == 1114111``, +then it can store arbitrary unicode codepoints; surrogates are +automatically inserted when converting from Python unicodes to UTF-16, +and automatically removed when converting back. On the other hand, if +your Python is ``sys.maxunicode == 65535``, then it is the other way +around: surrogates are removed when converting from Python unicodes +to UTF-32, and added when converting back. In other words, surrogate +conversion is done only when there is a size mismatch. + +Note that Python's internal representations is not specified. For +example, on CPython >= 3.3, it will use 1- or 2- or 4-bytes arrays +depending on what the string actually contains. With CFFI, when you +pass a Python byte string to a C function expecting a ``char*``, then +we pass directly a pointer to the existing data without needing a +temporary buffer; however, the same cannot cleanly be done with +*unicode* string arguments and the ``wchar_t*`` / ``char16_t*`` / +``char32_t*`` types, because of the changing internal +representation. As a result, and for consistency, CFFI always allocates +a temporary buffer for unicode strings. + +**Warning:** for now, if you use ``char16_t`` and ``char32_t`` with +``set_source()``, you have to make sure yourself that the types are +declared by the C source you provide to ``set_source()``. They would be +declared if you ``#include`` a library that explicitly uses them, for +example, or when using C++11. Otherwise, you need ``#include +`` on Linux, or more generally something like ``typedef +uint16_t char16_t;``. This is not done automatically by CFFI because +``uchar.h`` is not standard across platforms, and writing a ``typedef`` +like above would crash if the type happens to be already defined. -- cgit v1.2.3