Interfacing Python with C or C++

Originally posted on JaggedVerge: http://www.jaggedverge.com/2017/08/interfacing-python-with-c-or-c/ (please ask questions or leave comments over there)

One of the main downsides of writing applications in Python is that they can run slowly. When you run into an insurmountable runtime speed issue, the commonly accepted wisdom is to either use a more efficient Python implementation such as PyPy or to write the performance-critical code in a more efficient compiled language (usually C or C++). In this article I'm going to focus on ways to integrate fast compiled code for use by Python code. That said, I usually make a point of considering the feasibility of PyPy first, weighed against the costs of switching to a different implementation, as it can give you great performance increases with less added complexity. If you want to use C or C++ compiled code in your Python project, the sections below give a quick summary of the options available to you.

If there’s an important option missing please leave a comment over on the post at JaggedVerge with information on it and I'll update this!

Examples

A good way to demonstrate some of these options is via a simple example. Let's say we want to compute a factorial using a fairly naive recursive approach. (Note that function calls in Python are slow, which is a deliberate part of this example. Sometimes we could choose a better algorithm, but for the purposes of an example this is a fairly simple function where choosing a better language implementation would result in a win.) Here's the equivalent Python code:

def factorial(n):
    """Compute a factorial of n"""
    if n <= 0:
        return 1
    else:
        return n * factorial(n - 1)
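Before reaching for compiled code it helps to have a baseline measurement. The following is a minimal sketch using the standard library's timeit module; the helper name time_factorial and the default arguments are assumptions made for illustration, and the numbers you get will depend entirely on your machine and Python implementation.

```python
import timeit


def factorial(n):
    """Compute a factorial of n recursively (same function as above)."""
    if n <= 0:
        return 1
    return n * factorial(n - 1)


def time_factorial(n=20, repeats=10_000):
    """Return the average number of seconds taken by one factorial(n) call.

    Hypothetical helper for illustration; not part of the original article.
    """
    total = timeit.timeit(lambda: factorial(n), number=repeats)
    return total / repeats


if __name__ == "__main__":
    per_call = time_factorial()
    print(f"factorial(20): {per_call * 1e6:.2f} microseconds per call")
```

Running the same script under CPython and PyPy is a cheap way to see whether switching implementation is enough before writing any C.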

Use the Python C interface

If you are using the CPython implementation you can use the C interface to call out to C code. Here’s an example:

#include <Python.h>

/* Computes the factorial value we need for the example */
int
compute_it(int x){
    if(x <= 0){
        return 1;
    }else{
        return compute_it(x - 1) * x;
    }
}

/* Python binding for our compute_it function */
static PyObject *
expensive_computations_compute_it(PyObject *self, PyObject *args){
    int input, result;

    if(!PyArg_ParseTuple(args, "i", &input)){
        return NULL;
    }
    result = compute_it(input);
    return Py_BuildValue("i", result);
}

/* Methods for our expensive_computations object*/
static PyMethodDef ExpensiveComputationMethods[] = {
    {"compute_it", expensive_computations_compute_it, METH_VARARGS, "Computes our test factorial function"},
    {NULL, NULL, 0, NULL} /* End of methods sentinel value */
};


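/* Define the module */
static struct PyModuleDef expensive_computations_module = {
    PyModuleDef_HEAD_INIT,
    "demo_module",               /* name of the module */
    NULL,                        /* module docstring (none) */
    -1,                          /* module keeps its state in global variables */
    ExpensiveComputationMethods
};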

/* Init code */
PyMODINIT_FUNC
PyInit_demo_module(void){
    return PyModule_Create(&expensive_computations_module);
}

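/* Only needed when embedding the interpreter in a standalone program;
   an extension module that Python imports does not need a main(). */
int main(int argc, char *argv[]){
    /* Init the Python interpreter */
    Py_Initialize();

    /* Shut the interpreter down again before exiting */
    Py_Finalize();
    return 0;
}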

More can be found about this way of extending Python in the docs. Note that if your use case is just calling C library functions or system calls, this is a poor choice: if you aren't writing any custom code, CFFI is a lot better (not to mention more portable). One other thing worth mentioning is that there's a lot of boilerplate here to convert between Python and C types. This is handled better by some of the other libraries.
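One consequence of converting between Python and C types deserves a quick check. Python integers are arbitrary precision, but the C int used above has a fixed width (32 bits on most current platforms, which is an assumption here), so the C version can only represent fairly small factorials. The sketch below uses a hypothetical helper, largest_safe_n, to find the largest input whose factorial still fits in a signed integer of a given width.

```python
import math


def largest_safe_n(bits=32):
    """Return the largest n such that n! fits in a signed integer of `bits` bits.

    Hypothetical helper for illustration; not part of the original article.
    """
    limit = 2 ** (bits - 1) - 1  # largest value a signed integer of this width holds
    n = 0
    while math.factorial(n + 1) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    print("32-bit signed limit reached after n =", largest_safe_n(32))
    print("64-bit signed limit reached after n =", largest_safe_n(64))
```

For a 32-bit int the answer is 12, so calling the C function with 13 or more overflows, whereas the pure Python version keeps returning exact results. Using a wider C type moves the limit but does not remove it.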

Interface Python with C++

Previously the easiest way to do this was with Boost.Python, as the library greatly reduces the boilerplate. However, Boost is a huge dependency to introduce to a project and can be a massive pain to compile. (This is a fairly substantial downside if you weren’t already using Boost with your C++ code.) Thankfully there’s another option now that lets you get interoperability without introducing a heavy dependency on Boost: pybind11. This library is header-only, which makes it much easier to compile as part of your C++ project. See the pybind11 tutorial for more about using this option. Roughly it will look like this:

#include <vector>
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
namespace py = pybind11;
/**
 * computes factorials
 * @param n   The input number
 */
int compute_factorial(int n){
    if(n <= 0){
        return 1;
    }else{
        return compute_factorial(n - 1) * n;
    }
}

/**
 * Generate the python bindings for this C++ function.
 * The first argument must match the name of the compiled module file.
 */
PYBIND11_MODULE(factorial_example, m) {
    m.doc() = "Computing factorials in c++ with python bindings using pybind11";
    m.def("factorial_from_cpp", &compute_factorial, "A function which computes factorials in C++");
}
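On the Python side, a common pattern is to treat the compiled module as an optional accelerator. The sketch below assumes the extension has been built and is importable under the name factorial_example, exposing factorial_from_cpp as in the binding code; the fallback function and the USING_CPP flag are hypothetical names added for illustration.

```python
def _factorial_py(n):
    """Pure-Python fallback used when the compiled module is unavailable."""
    if n <= 0:
        return 1
    return n * _factorial_py(n - 1)


try:
    # This import only succeeds once the C++ extension has been compiled
    # and placed somewhere on the import path.
    from factorial_example import factorial_from_cpp as factorial
    USING_CPP = True
except ImportError:
    factorial = _factorial_py
    USING_CPP = False


if __name__ == "__main__":
    backend = "C++" if USING_CPP else "pure Python"
    print(f"factorial(10) = {factorial(10)} ({backend})")
```

The rest of the code base just calls factorial() and does not need to care which version it got, which keeps the project usable on machines where the extension has not been built.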

Use CFFI

This is one of the biggest contributions from the PyPy ecosystem to the broader Python universe, and it works on both PyPy and CPython. The main idea behind CFFI is to generate an API to the C code without needing to worry about ABI issues. This makes writing the code a lot more efficient because many of the nasty details are hidden away from you. If you only need to call functions in an existing compiled C library, you don’t even need a C compiler as part of the process, because CFFI can load the library directly (its ABI mode). If you are writing your own C, you can write it out from within Python, and CFFI will then compile the code and automatically create an API for you to use it from within Python. This API mode does need a C compiler.

Honestly the whole idea is brilliant, as it cuts down the time and effort required to consume existing C code. It also makes porting your code between different Python implementations substantially easier because you don’t end up tied to the implementation-specific way of creating extensions. There are a few limitations, but overall this approach tends to work well. Going with the factorial example again, it looks something like this:

# file "cffi_factorials.py"

from cffi import FFI
ffibuilder = FFI()

ffibuilder.cdef("int factorial_fast(int);")

ffibuilder.set_source("_example",
r"""
    static int factorial_fast(int n){
        if(n <= 0){
            return 1;
        }else{
            return factorial_fast(n - 1) * n;
        }
    }
""")

if __name__ == "__main__":
    ffibuilder.compile(verbose=True)

We then need to build this as part of our build script, because the C compiler has to build our code before we can call it from Python.

# file "setup.py"
from setuptools import setup

setup(
    ...
    setup_requires=["cffi>=1.0.0"],
    cffi_modules=["cffi_factorials.py:ffibuilder"],
    install_requires=["cffi>=1.0.0"],
)

As you can see, this is much less verbose than using the Python C interface.
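For the case mentioned earlier, where you only need to call existing C and no compiler is involved, a minimal sketch might look like the following. It assumes the third-party cffi package is installed and a Unix-like system where the standard C library can be loaded with dlopen(None); the wrapper name c_strlen is hypothetical, and it returns None when either assumption does not hold.

```python
def c_strlen(text):
    """Call the C library's strlen() through CFFI, or return None if unavailable."""
    try:
        from cffi import FFI  # third-party package: pip install cffi
    except ImportError:
        return None

    ffi = FFI()
    # Declare the function signature exactly as it would appear in C.
    ffi.cdef("size_t strlen(const char *);")
    try:
        # dlopen(None) loads the standard C library namespace on Unix-like
        # systems; it is not expected to work on Windows.
        libc = ffi.dlopen(None)
    except OSError:
        return None

    # A bytes object can be passed where C expects a const char *.
    return libc.strlen(text)


if __name__ == "__main__":
    print(c_strlen(b"hello"))
```

Nothing is compiled here: CFFI reads the declaration, loads the already-compiled library, and handles the call. The trade-off is that the declaration has to match the real C function, since there is no compiler to check it for you.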

Embedding Python in another language

If the majority of your project is in another language but you need some sort of scripting ability, embedding Python into your project can be a good option. This can be substantially easier overall than writing an interpreter in the main language. This is more of a high-level decision, but it might be the right way for your project depending on what you are doing.
