SoFunction
Updated on 2024-11-13

Accelerating Python to "take off" with Cython (Recommended)

For the record, "Cython" in the title is not a typo for "Python": this article really is about a tool called Cython.

Cython is a compiler that lets Python code use C extensions. It translates .pyx files, which mix Python and C-like syntax, into C code, and is mainly used to speed up Python scripts or to let Python call C libraries. Because pure Python is comparatively slow, extending it with C has become a common way to improve performance, and Cython is one of the more popular ways to do so.

We can compare several of the industry's leading solutions for extending Python to support the C language:


ctypes is part of the Python standard library: a Python script loads a C .so library directly and calls into it, which is simple and direct. SWIG is a general-purpose tool for letting high-level scripting languages call C, so naturally it supports Python too. I have not used ctypes, so I will not comment on it. Taking a native C program as the performance baseline, the Cython wrapper is about 20% slower and the SWIG wrapper about 70% slower. In terms of functionality, SWIG needs hand-written typemap conversion rules for structures and callback functions; typemap rules are somewhat complex to write, and the experience is not great. Cython also needs hand-written handling for structures and callbacks, but it is comparatively simple.
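To make the ctypes description concrete, here is a minimal sketch of loading a C library from a Python script and calling one of its functions. It is not part of the original comparison: it assumes a POSIX system, and it calls the standard C function "strlen" rather than any function from this article.

```python
import ctypes
import ctypes.util

# Locate the C runtime. On POSIX, a result of None makes CDLL fall back to
# the symbols already loaded into the current process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello world...")
print(length)
```

Declaring "argtypes" and "restype" is optional for simple cases, but it lets ctypes reject arguments of the wrong type instead of passing garbage to C.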

Cython Simple Examples

Let's get familiar with Cython by having a Python script call a function written in C that prints "Hello World". Note: The full code for all the examples in this article can be found on GitHub >>> cython_tutorials

/*filename: hello_world.h */
void print_hello_world();

/*filename: hello_world.c */
#include <stdio.h>
#include "hello_world.h"

void print_hello_world()
{
 printf("hello world...\n");
}

int main(int argc, char *argv[])
{
 print_hello_world();
 return (0);
}

#file: cython_hello_world.pyx

cdef extern from "hello_world.h":
 void print_hello_world()

def cython_print_hello_world():
 print_hello_world()

#filename: Makefile
# Assumes python2.7-config is on PATH; adjust for your Python installation.
PY_INCLUDES := $(shell python2.7-config --includes)

.PHONY: all cython clean

all: hello_world cython_hello_world

hello_world:
	gcc -fPIC -c hello_world.c
	gcc hello_world.o -o hello_world

cython:
	cython cython_hello_world.pyx

cython_hello_world: hello_world cython
	gcc $(PY_INCLUDES) -fPIC -c cython_hello_world.c
	gcc -shared -o cython_hello_world.so hello_world.o cython_hello_world.o -lpython2.7

clean:
	rm -rf hello_world hello_world.o cython_hello_world.so cython_hello_world.c cython_hello_world.o

The core of wrapping C with Cython is writing the .pyx file. A .pyx file is the bridge between Python and C, and it can be written in Python syntax, in C-like syntax, or in a mix of both.

$ make all # The detailed compilation process can be seen in the Makefile with the relevant commands
$ python
>>> import cython_hello_world
>>> cython_hello_world.cython_print_hello_world()
hello world...
>>>

As you can see, we successfully called the C implementation of the function in the Python interpreter.

Cython's Do's and Don'ts

Every tool or language looks pleasantly simple at first, but once you dig into the details there are "hidden killers" everywhere. A recent project needed to expose an underlying C library to Python, so I brought in Cython, stepped into a lot of pitfalls along the way, and stayed up late many nights T_T. The following points need special attention:

  1. Things defined in .pyx with cdef are invisible to .py, except for classes;
  2. C types cannot be manipulated in .py. To work with them from .py, either convert from Python objects to C types inside .pyx, or wrap the C type in a class that provides set/get methods;
  3. Although Cython converts automatically between Python's str and C's "char *", it cannot do so for fixed-length strings like "char a[n]". Those need an explicit copy in Cython;
  4. A Python callback has to go through a wrapper function defined in .pyx, with the Python callable itself cast to C's "void *" so that it can be passed into the C function.

1. Things defined with cdef in .pyx are invisible to .py except for classes.

Let's look at an example:

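#file: invisible.pyx
cdef inline cdef_function():
 print('cdef_function')

def def_function():
 print('def_function')

cdef int cdef_value

def_value = 999

cdef class cdef_class:
 # cdef classes need their attributes declared. "value" is a placeholder
 # name: the original attribute name is missing from this listing.
 cdef public int value
 def __init__(self):
  self.value = 1

class def_class:
 def __init__(self):
  self.value = 1  # placeholder attribute name, as above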

#file: test_visible.py
import invisible

if __name__ == '__main__':
 print('invisible.__dict__', invisible.__dict__)

Running it shows that the invisible module has the following members:

$ python test_visible.py
{
'__builtins__': <module '__builtin__' (built-in)>, 
'def_class': <class invisible.def_class at 0x10feed1f0>, 
'__file__': '/git/EasonCodeShare/cython_tutorials/invisible-for-py/', 
'call_all_in_pyx': <built-in function call_all_in_pyx>, 
'__pyx_unpickle_cdef_class': <built-in function __pyx_unpickle_cdef_class>, 
'__package__': None, 
'__test__': {}, 
'cdef_class': <type 'invisible.cdef_class'>, 
'__name__': 'invisible', 
'def_value': 999, 
'def_function': <built-in function def_function>, 
'__doc__': None}

The function cdef_function and the variable cdef_value that we defined with cdef in .pyx are not visible, only the class cdef_class is visible. So, be aware of the visibility issue during use and don't make the mistake of trying to use invisible module members in .py.

2. Passing C structure types from .py

Cython's ability to work with C is limited to .pyx files; .py scripts can still only use pure Python. If you define a structure in C, the only ways to pass it in from a Python script are to convert it by hand inside .pyx or to pass it in through a wrapper class. Let's look at an example:

/*file: person_info.h */
typedef struct person_info_t
{
 int age;
 char *gender;
}person_info;

void print_person_info(char *name, person_info *info);

/*file: person_info.c */
#include <stdio.h>
#include "person_info.h"

void print_person_info(char *name, person_info *info)
{
 printf("name: %s, age: %d, gender: %s\n",
   name, info->age, info->gender);
}

#file: cython_person_info.pyx
cdef extern from "person_info.h":
 struct person_info_t:
  int age
  char *gender
 ctypedef person_info_t person_info

 void print_person_info(char *name, person_info *info)

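def cyprint_person_info(name, info):
 cdef person_info pinfo
 # Copy each field from the Python object into the C struct by hand
 pinfo.age = info.age
 pinfo.gender = info.gender
 print_person_info(name, &pinfo)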

Since the argument to "cyprint_person_info" can only be a python object, we have to hand-code the type conversion in the function before calling the C function.

#file: test_person_info.py
from cython_person_info import cyprint_person_info

class person_info(object):
 age = None
 gender = None

if __name__ == '__main__':
 info = person_info()
 info.age = 18
 info.gender = 'male'
 
 cyprint_person_info('handsome', info)

$ python test_person_info.py
name: handsome, age: 18, gender: male

The C function is called correctly. However, there is a problem: if the C structure has many fields, hand-coding the type conversion every time a .py script calls a C function becomes painful. A better approach is to provide a wrapper class for the C structure.

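#file: cython_person_info.pyx
from libc.stdlib cimport malloc, free
cdef extern from "person_info.h":
 struct person_info_t:
  int age
  char *gender
 ctypedef person_info_t person_info

 void print_person_info(char *name, person_info *info)

def cyprint_person_info(name, person_info_wrap info):
 print_person_info(name, info.ptr)


cdef class person_info_wrap(object):
 cdef person_info *ptr
 
 # __cinit__/__dealloc__ are the cdef-class hooks for allocating and
 # releasing C memory.
 def __cinit__(self):
  self.ptr = <person_info *>malloc(sizeof(person_info))
  if self.ptr == NULL:
   raise MemoryError()
 
 def __dealloc__(self):
  free(self.ptr)
 
 @property
 def age(self):
  return self.ptr.age
 @age.setter
 def age(self, value):
  self.ptr.age = value
 
 @property
 def gender(self):
  return self.ptr.gender
 @gender.setter
 def gender(self, value):
  # Stores a pointer into value's buffer, so value must outlive its use in C
  self.ptr.gender = value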

We define a wrapper class "person_info_wrap" for the "person_info" structure and provide member set/get methods so that you can assign values directly in .py. This reduces the need to convert data types in .pyx and improves performance.

#file: test_person_info.py
from cython_person_info import cyprint_person_info, person_info_wrap

if __name__ == '__main__':
 info_wrap = person_info_wrap()
 info_wrap.age = 88
 info_wrap.gender = 'mmmale'
 
 cyprint_person_info('hhhandsome', info_wrap)

$ python test_person_info.py 
name: hhhandsome, age: 88, gender: mmmale

3. Passing Python's str to C fixed-length strings using strcpy

Just as C strings cannot be copied by plain assignment and need strcpy, a Python str cannot be assigned directly to a fixed-length C string; it has to be copied with a C library function exposed through Cython. Let's modify the previous example slightly so that the gender member of the person_info structure is a 16-byte character array:

/*file: person_info.h */
typedef struct person_info_t
{
 int age;
 char gender[16];
}person_info;

#file: cython_person_info.pyx
cdef extern from "person_info.h":
  struct person_info_t:
    int age
    char gender[16]
  ctypedef person_info_t person_info

#file: test_person_info.py
from cython_person_info import cyprint_person_info, person_info_wrap

if __name__ == '__main__':
  info_wrap = person_info_wrap()
  info_wrap.age = 88
  info_wrap.gender = 'mmmale'
  
  cyprint_person_info('hhhandsome', info_wrap)

$ make
$ python test_person_info.py 
Traceback (most recent call last):
 File "test_person_info.py", line 7, in <module>
  info_wrap.gender = 'mmmale'
 File "cython_person_info.pyx", line 39, in cython_person_info.person_info_wrap.gender.__set__
  self.ptr.gender = value
 File "stringsource", line 93, in carray.from_py.__Pyx_carray_from_py_char
IndexError: not enough values found during array assignment, expected 16, got 6

Cython translates the .pyx file and make succeeds without errors, but at run time we get "IndexError: not enough values found during array assignment, expected 16, got 6". The cause is that the 6-byte string "mmmale" is being assigned to the 16-byte "char gender[16]" member of the person_info structure. Copying the string with strcpy instead solves the problem.

#file: cython_person_info.pyx
from libc.string cimport strcpy
…… ……
cdef class person_info_wrap(object):
  cdef person_info *ptr
  …… ……
  @property
  def gender(self):
    return self.ptr.gender
  @gender.setter
  def gender(self, value):
    # strcpy does not check length: value must be shorter than 16 bytes
    strcpy(self.ptr.gender, value)

$ make
$ python test_person_info.py 
name: hhhandsome, age: 88, gender: mmmale

The assignment copy works fine, and the "mmmale" is successfully copied to the gender member of the structure.
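One caveat is worth spelling out: strcpy does not check the size of the destination, so a value of 16 bytes or more would write past the end of "char gender[16]". The sketch below models the length check one might add before the copy. It is an illustration, not the article's code: it uses the standard ctypes module to mirror the struct layout so that it runs without compiling anything, and the names "PersonInfo", "GENDER_SIZE" and "set_gender" are hypothetical.

```python
import ctypes

GENDER_SIZE = 16  # matches "char gender[16]" in person_info.h


class PersonInfo(ctypes.Structure):
    # Mirrors the C layout: int age; char gender[16];
    _fields_ = [("age", ctypes.c_int), ("gender", ctypes.c_char * GENDER_SIZE)]


def set_gender(info, text):
    """Copy text into the fixed-size field, rejecting input that will not fit."""
    data = text.encode("utf-8")
    # Keep one byte free for the terminating NUL that C string functions expect.
    if len(data) >= GENDER_SIZE:
        raise ValueError("gender needs %d bytes, only %d available"
                         % (len(data) + 1, GENDER_SIZE))
    info.gender = data


info = PersonInfo()
info.age = 88
set_gender(info, "mmmale")
print(info.age, info.gender.decode("utf-8"))
```

In the .pyx setter, the same idea would be a length check on the value before the strcpy call.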

4. Wrapping C functions with callback functions as arguments

Callback functions in C are special in that the caller passes in a function to customize how the data is processed. Cython's official demos include an example of wrapping a C function that takes a callback argument:

//file: cheesefinder.h
typedef void (*cheesefunc)(char *name, void *user_data);
void find_cheeses(cheesefunc user_func, void *user_data);

//file: cheesefinder.c
#include "cheesefinder.h"

static char *cheeses[] = {
 "cheddar",
 "camembert",
 "that runny one",
 0
};

void find_cheeses(cheesefunc user_func, void *user_data) {
 char **p = cheeses;
 while (*p) {
  user_func(*p, user_data);
  ++p;
 }
}

#file: cheese.pyx
cdef extern from "cheesefinder.h":
  ctypedef void (*cheesefunc)(char *name, void *user_data)
  void find_cheeses(cheesefunc user_func, void *user_data)

def find(f):
  find_cheeses(callback, <void*>f)

cdef void callback(char *name, void *f):
  (<object>f)(name.decode('utf-8'))

#file: run_cheese.py
import cheese

def report_cheese(name):
  print("Found cheese: " + name)

cheese.find(report_cheese)

The key step is to define, in the .pyx file, a wrapper function whose signature matches the C callback type, such as "cdef void callback(char *name, void *f)" above. The Python function from .py is cast to "void *" and passed along as user_data; inside the wrapper it is cast back to a Python object and called.
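The control flow is easier to follow when modeled in plain Python. The sketch below is only a model, with no C involved: the function names are taken from the example above, the "void *" casts are represented by simply passing the Python object through, and the "CHEESES" list is a stand-in for the C array.

```python
CHEESES = [b"cheddar", b"camembert", b"that runny one"]


def find_cheeses(user_func, user_data):
    """Stand-in for the C function: calls user_func(name, user_data) per cheese."""
    for name in CHEESES:
        user_func(name, user_data)


def callback(name, f):
    """Stand-in for the cdef wrapper: user_data is really the Python callable."""
    f(name.decode("utf-8"))


def find(f):
    """Stand-in for cheese.find: pass the fixed wrapper, with f as user_data."""
    find_cheeses(callback, f)


found = []
find(found.append)
print(found)
```

The C side only ever sees one fixed callback; which Python function actually runs is decided by the user_data that travels with it.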

Further Reading

For further study of Cython you can refer to the official documentation and related books:

Cython 0.28a0 documentation

Cython A Guide for Python Programmers

Learning Cython Programming
