This document roughly describes the high-level architecture of PyO3. If you want to become familiar with the codebase you are in the right place!
PyO3 provides a bridge between Rust and Python, based on the Python/C API.
Thus, PyO3 has low-level bindings of these APIs as its core.
On top of that, we have higher-level bindings to operate on Python objects safely.
Also, to define Python classes and functions in Rust code, we have the PyClass trait and a set of protocol traits (e.g., PyIterProtocol) for supporting object protocols (i.e., __dunder__ methods).
Since implementing PyClass requires lots of boilerplate, we have a proc-macro #[pyclass].
To summarize, there are six main parts to the PyO3 codebase.
- Low-level bindings of Python/C API.
- Bindings to Python objects.
- PyClass and related functionalities: src/pycell.rs, src/pyclass.rs, and more.
- Protocol methods like __getitem__.
- Procedural macros to simplify usage for users.
- build.rs
src/ffi contains wrappers of the Python/C API.
We aim to provide straightforward Rust wrappers resembling the file structure of cpython/Include.
However, we still lack some APIs and are continuously updating the module to match the file contents upstream in CPython. The tracking issue is #1289, and contributions are welcome.
In the src/ffi module, there is lots of conditional compilation such as #[cfg(Py_LIMITED_API)], #[cfg(Py_37)], and #[cfg(PyPy)].
Py_LIMITED_API corresponds to the #define Py_LIMITED_API macro in the Python/C API.
With Py_LIMITED_API, we can build a Python-version-agnostic binary called an abi3 wheel.
Py_37 means that the API is available from Python >= 3.7.
There are also Py_38, Py_39, and so on.
PyPy means that the API definition is for PyPy.
Those flags are set in build.rs.
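As a minimal sketch of how such flags gate code, the example below compiles one of two function bodies depending on whether Py_LIMITED_API is set. The function names api_flavor and has_py37_api are hypothetical and do not exist in src/ffi; only the flag names come from the text above.

```rust
// Hypothetical stand-ins for FFI declarations; the function names are
// illustrative only and are not part of PyO3.

/// Compiled only when the build passes `--cfg Py_LIMITED_API`.
#[cfg(Py_LIMITED_API)]
pub fn api_flavor() -> &'static str {
    "limited (abi3)"
}

/// Fallback compiled when `Py_LIMITED_API` is not set.
#[cfg(not(Py_LIMITED_API))]
pub fn api_flavor() -> &'static str {
    "full"
}

/// `cfg!` exposes the same kind of flag as a compile-time boolean.
pub fn has_py37_api() -> bool {
    cfg!(Py_37)
}
```

Built without any custom flags, api_flavor() returns "full" and has_py37_api() returns false. Recent toolchains may warn about cfg names they do not know unless the build script declares them.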
src/types contains bindings to built-in types of Python, such as dict and list.
For historical reasons, Python's object is called PyAny in PyO3 and located in src/types/any.rs.
Currently, PyAny is a straightforward wrapper of ffi::PyObject, defined as:

    #[repr(transparent)]
    pub struct PyAny(UnsafeCell<ffi::PyObject>);

All built-in types are defined as a C struct.
For example, dict is defined as:

    typedef struct {
        /* Base object */
        PyObject ob_base;
        /* Number of items in the dictionary */
        Py_ssize_t ma_used;
        /* Dictionary version */
        uint64_t ma_version_tag;
        PyDictKeysObject *ma_keys;
        PyObject **ma_values;
    } PyDictObject;

However, we cannot access such a specific data structure with #[cfg(Py_LIMITED_API)] set.
Thus, all built-in objects are implemented as opaque types by wrapping PyAny, e.g.:

    #[repr(transparent)]
    pub struct PyDict(PyAny);

Note that PyAny is not itself a pointer. It is usually used behind a reference, &PyAny, which points to the object in the Python heap.
This design choice may change (see the discussion in #1056).
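The following is a minimal sketch of why the transparent wrappers work, assuming a hypothetical RawObject in place of ffi::PyObject. The methods from_ptr, as_ptr, and as_any, and the helper round_trip, are illustrative and are not claimed to match PyO3's actual signatures.

```rust
use std::cell::UnsafeCell;

/// Hypothetical stand-in for `ffi::PyObject`.
#[repr(C)]
pub struct RawObject {
    pub ob_refcnt: usize,
    pub ob_type: *mut u8,
}

#[repr(transparent)]
pub struct PyAny(UnsafeCell<RawObject>);

#[repr(transparent)]
pub struct PyDict(PyAny);

impl PyAny {
    /// Raw pointer to the underlying object, as an FFI call would need.
    pub fn as_ptr(&self) -> *mut RawObject {
        self.0.get()
    }

    /// Reinterprets a raw object pointer as a borrowed `&PyAny`.
    ///
    /// # Safety
    /// `ptr` must be non-null, aligned, and valid for the lifetime `'a`.
    pub unsafe fn from_ptr<'a>(ptr: *mut RawObject) -> &'a PyAny {
        // Sound because `PyAny` has the same layout as `RawObject`.
        unsafe { &*(ptr as *const PyAny) }
    }
}

impl PyDict {
    /// Upcast: every dict is also an object.
    pub fn as_any(&self) -> &PyAny {
        &self.0
    }
}

/// Demo helper: round-trips a pointer through `&PyAny`.
pub fn round_trip() -> bool {
    let mut raw = RawObject { ob_refcnt: 1, ob_type: std::ptr::null_mut() };
    let ptr: *mut RawObject = &mut raw;
    let any: &PyAny = unsafe { PyAny::from_ptr(ptr) };
    any.as_ptr() == ptr
}
```

Because every layer is #[repr(transparent)], the wrapper types have the same size and layout as the raw object, so a reference to the wrapper and a pointer to the raw object are interchangeable.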
Since we need lots of boilerplate for implementing common traits for these types (e.g., AsPyPointer, AsRef<PyAny>, and Debug), we have some macros in src/types/mod.rs.
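To illustrate the idea, here is a sketch of a declarative macro that generates AsRef<PyAny> and Debug for several wrapper types at once. The macro name impl_common_traits, the type_name field, and the helper demo are assumptions made for this example and are not the ones in src/types/mod.rs.

```rust
use std::fmt;

// Hypothetical, simplified types used only for this example.
pub struct PyAny {
    type_name: &'static str,
}

#[repr(transparent)]
pub struct PyDict(PyAny);

#[repr(transparent)]
pub struct PyList(PyAny);

/// Implements the traits every wrapper type needs, once per listed type.
macro_rules! impl_common_traits {
    ($($name:ident),* $(,)?) => {
        $(
            impl AsRef<PyAny> for $name {
                fn as_ref(&self) -> &PyAny {
                    &self.0
                }
            }

            impl fmt::Debug for $name {
                fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                    write!(f, "<{} object>", self.0.type_name)
                }
            }
        )*
    };
}

impl_common_traits!(PyDict, PyList);

/// Demo helper: exercises both generated impls.
pub fn demo() -> (String, &'static str) {
    let d = PyDict(PyAny { type_name: "dict" });
    let l = PyList(PyAny { type_name: "list" });
    let any: &PyAny = l.as_ref();
    (format!("{:?}", d), any.type_name)
}
```

Adding another wrapper type then costs one extra identifier in the macro invocation rather than a new set of hand-written impls.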
src/pycell.rs, src/pyclass.rs, and src/type_object.rs contain types and traits to make #[pyclass] work.
Also, src/pyclass_init.rs and src/pyclass_slots.rs have related functionalities.
To realize object-oriented programming in C, all Python objects must have the following two fields at the beginning.

    #[repr(C)]
    pub struct PyObject {
        pub ob_refcnt: usize,
        pub ob_type: *mut PyTypeObject,
        ...
    }

Thanks to this guarantee, casting *mut A to *mut PyObject is valid if A is a Python object.
To ensure this guarantee, we have a wrapper struct PyCell<T> in src/pycell.rs which is roughly:

    #[repr(C)]
    pub struct PyCell<T: PyClass> {
        object: crate::ffi::PyObject,
        inner: T,
    }

Thus, when copying a Rust struct to a Python object, we first allocate PyCell on the Python heap and then move T into it.
Also, PyCell provides RefCell-like methods to enforce Rust's borrow rules.
See the documentation for more.
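The layout guarantee and the RefCell-like checking can be sketched together as follows. This is a simplified model under stated assumptions, not PyCell's implementation: RawObject stands in for ffi::PyObject, and the borrow counter field and the closure-based with_ref and with_mut methods are hypothetical.

```rust
use std::cell::{Cell, UnsafeCell};

/// Hypothetical stand-in for `ffi::PyObject`.
#[repr(C)]
pub struct RawObject {
    pub ob_refcnt: usize,
    pub ob_type: *mut u8,
}

/// Object header first, then a borrow counter and the Rust value.
#[repr(C)]
pub struct SketchCell<T> {
    object: RawObject,
    borrow: Cell<isize>, // -1 = mutably borrowed, n >= 0 = n shared borrows
    inner: UnsafeCell<T>,
}

impl<T> SketchCell<T> {
    pub fn new(value: T) -> Self {
        SketchCell {
            object: RawObject { ob_refcnt: 1, ob_type: std::ptr::null_mut() },
            borrow: Cell::new(0),
            inner: UnsafeCell::new(value),
        }
    }

    /// Valid because `object` is the first field of a `repr(C)` struct.
    pub fn as_object_ptr(&self) -> *const RawObject {
        self as *const Self as *const RawObject
    }

    /// Runs `f` with shared access, or returns `None` if mutably borrowed.
    pub fn with_ref<R>(&self, f: impl FnOnce(&T) -> R) -> Option<R> {
        let n = self.borrow.get();
        if n < 0 {
            return None;
        }
        self.borrow.set(n + 1);
        let out = f(unsafe { &*self.inner.get() });
        self.borrow.set(self.borrow.get() - 1);
        Some(out)
    }

    /// Runs `f` with exclusive access, or returns `None` if already borrowed.
    pub fn with_mut<R>(&self, f: impl FnOnce(&mut T) -> R) -> Option<R> {
        if self.borrow.get() != 0 {
            return None;
        }
        self.borrow.set(-1);
        let out = f(unsafe { &mut *self.inner.get() });
        self.borrow.set(0);
        Some(out)
    }
}
```

The closure-based API keeps the sketch short by avoiding guard types. If the closure panics, the counter is never reset, so the cell simply stays marked as borrowed.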
PyCell<T> requires that T implements PyClass.
This trait is somewhat complex and depends on many other traits, but the most important one is PyTypeObject in src/type_object.rs.
PyTypeObject is also implemented for built-in types.
In Python, all objects have types, and types are themselves objects whose type is type.
For example, in the Python REPL, type({}) shows dict and type(type({})) shows type.
T: PyTypeObject implies that T has a corresponding type object.
Python has some built-in special methods called dunder methods, such as __iter__.
In the Python/C API, they belong to what is called the abstract objects layer.
We provide a way to implement those protocols by using #[pyproto] and specific traits, such as PyIterProtocol.
src/class defines these traits.
Each protocol method has a corresponding FFI function.
For example, PyIterProtocol::__iter__ has pub unsafe extern "C" fn iter<T>(slf: *mut PyObject) -> *mut PyObject.
When #[pyproto] finds that T implements PyIterProtocol::__iter__, it automatically sets iter<T> on the type object of T.
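A reduced sketch of this mechanism is shown below. It assumes a hypothetical one-slot TypeSlots struct in place of the real type object, a hypothetical IterProtocol trait in place of PyIterProtocol, and an install_iter function that plays the role the text attributes to #[pyproto].

```rust
/// Hypothetical stand-in for `ffi::PyObject`.
#[repr(C)]
pub struct RawObject {
    pub ob_refcnt: usize,
}

/// Signature shared by unary slot functions.
pub type UnaryFunc = unsafe extern "C" fn(*mut RawObject) -> *mut RawObject;

/// Hypothetical, heavily reduced type object holding a single slot.
#[derive(Default)]
pub struct TypeSlots {
    pub tp_iter: Option<UnaryFunc>,
}

/// Illustrative protocol trait (not PyO3's `PyIterProtocol`).
pub trait IterProtocol {
    fn iter(slf: *mut RawObject) -> *mut RawObject;
}

/// Generic C-ABI trampoline; each `T` gets its own monomorphized copy.
pub unsafe extern "C" fn iter<T: IterProtocol>(slf: *mut RawObject) -> *mut RawObject {
    T::iter(slf)
}

/// Stores the trampoline for `T` in the slot table.
pub fn install_iter<T: IterProtocol>(slots: &mut TypeSlots) {
    slots.tp_iter = Some(iter::<T> as UnaryFunc);
}

pub struct Counter;

impl IterProtocol for Counter {
    // An iterator conventionally returns itself from `__iter__`.
    fn iter(slf: *mut RawObject) -> *mut RawObject {
        slf
    }
}
```

The key point is that the generic function becomes an ordinary C function pointer once T is fixed, so the interpreter can call it through the slot without knowing anything about Rust traits.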
Also, src/class/methods.rs has utilities for #[pyfunction] and src/class/impl_.rs has some internal tricks for making #[pyproto] flexible.
pyo3-macros provides six proc-macro APIs: pymodule, pyproto, pyfunction, pyclass, pymethods, and #[derive(FromPyObject)].
pyo3-macros-backend has the actual implementations of these APIs.
src/derive_utils.rs contains some utilities used in code generated by these proc-macros, such as parsing function arguments.
PyO3's build.rs is relatively long (about 900 lines) to support multiple architectures, interpreters, and usages.
Below is a non-exhaustive list of its functionality:
- Cross-compiling support.
  - If TARGET architecture and HOST architecture differ, we find cross compile information from environment variables (PYO3_CROSS_LIB_DIR and PYO3_CROSS_PYTHON_VERSION) or system files.
- Find the interpreter for build and detect the Python version.
  - We have to set some version flags like Py_37.
  - If the interpreter is PyPy, we set PyPy.
  - If PYO3_NO_PYTHON environment variable is set then the interpreter detection is bypassed entirely and only abi3 extensions can be built.
- Check if we are building a Python extension.
  - If we are building an extension (e.g., Python library installable by pip), we don't link libpython. Currently we use the extension-module feature for this purpose. This may change in the future. See #1123.
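The decisions in the list above can be sketched as pure functions that a build script might call before printing directives for Cargo. All function names here are hypothetical, and the assumption that version flags start at Py_37 is taken only from the flag names mentioned in this document. The cargo:rustc-cfg=... directive itself is Cargo's standard way for a build script to set a cfg flag.

```rust
/// Returns the `cargo:rustc-cfg=...` lines for a detected interpreter.
/// `minor` is the minor version of Python 3 (e.g. 9 for Python 3.9).
pub fn cfg_lines(minor: u32, is_pypy: bool, abi3: bool) -> Vec<String> {
    let mut lines = Vec::new();
    // One flag per minor version up to the detected one, so that
    // `#[cfg(Py_37)]` means "available from Python >= 3.7".
    for m in 7..=minor {
        lines.push(format!("cargo:rustc-cfg=Py_3{}", m));
    }
    if is_pypy {
        lines.push("cargo:rustc-cfg=PyPy".to_string());
    }
    if abi3 {
        lines.push("cargo:rustc-cfg=Py_LIMITED_API".to_string());
    }
    lines
}

/// Cross-compilation is assumed whenever the two target triples differ.
pub fn is_cross_compiling(target: &str, host: &str) -> bool {
    target != host
}

/// Extension modules are loaded by an interpreter that already provides
/// the Python symbols, so they do not link libpython themselves.
pub fn should_link_libpython(extension_module: bool) -> bool {
    !extension_module
}
```

In an actual build script, each returned line would be written to standard output with println!, which is how Cargo picks up the flags consumed by #[cfg(...)] in src/ffi.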