Updated on 2024-11-10

Learning the source code of Python's built-in float type

This section of "Getting to Know Python's Built-In Types" introduces you to a variety of common built-in types in Python from a source code perspective.

1 Review the basics of float

1.1 PyFloatObject

1.2 PyFloat_Type

C source code (only some fields are listed):

PyTypeObject PyFloat_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "float",
    sizeof(PyFloatObject),
    0,
    (destructor)float_dealloc,                  /* tp_dealloc */
    0,                                          /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_reserved */
    (reprfunc)float_repr,                       /* tp_repr */
    &float_as_number,                           /* tp_as_number */
    0,                                          /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    (hashfunc)float_hash,                       /* tp_hash */
    0,                                          /* tp_call */
    (reprfunc)float_repr,                       /* tp_str */
    // ...
    0,                                          /* tp_init */
    0,                                          /* tp_alloc */
    float_new,                                  /* tp_new */
};

PyFloat_Type holds the metadata for float objects. Key fields include:

tp_name: stores the type name, the constant string "float"

tp_dealloc, tp_init, tp_alloc, and tp_new: functions for object creation and destruction

tp_repr: function that generates the syntax-level string representation (what repr() returns)

tp_str: function that generates the ordinary string representation (what str() returns)

tp_as_number: set of numeric operations

tp_hash: hash generation function
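
As a rough illustration, the effect of these slots can be observed from Python code. This is a sketch run against CPython; the variable name x is only for illustration.

```python
x = 1.5

# tp_name: the type name stored in the type object.
print(float.__name__)            # float

# tp_repr and tp_str both point at float_repr in the listing above,
# so repr() and str() return the same text for a float.
print(repr(x), str(x))           # 1.5 1.5

# tp_hash: numbers that compare equal must hash equally.
print(hash(1.0) == hash(1))      # True

# tp_as_number: arithmetic operators are dispatched through this table.
print(x + 2.0)                   # 3.5
```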

1.3 Object Creation

Create instance objects from type objects:
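
A minimal sketch of this first route: calling the float type object produces an instance. The values used here are illustrative.

```python
# Calling the type object goes through tp_new (float_new in the listing above).
pi = float("3.14")        # built from a string
two = float(2)            # built from another number
print(pi, two)            # 3.14 2.0
print(type(pi) is float)  # True
```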

Create instance objects via the C API:

PyObject *
PyFloat_FromDouble(double fval);
PyObject *
PyFloat_FromString(PyObject *v);
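
These two functions are normally called from C. As a hedged sketch, they can also be reached from Python through ctypes.pythonapi, which exposes the C API of the running interpreter. This assumes CPython 3; the names from_double and from_string are local aliases chosen for illustration.

```python
import ctypes

# Declare the C signatures so ctypes converts arguments and results correctly.
from_double = ctypes.pythonapi.PyFloat_FromDouble
from_double.argtypes = [ctypes.c_double]
from_double.restype = ctypes.py_object

from_string = ctypes.pythonapi.PyFloat_FromString
from_string.argtypes = [ctypes.py_object]
from_string.restype = ctypes.py_object

a = from_double(3.14)      # C double -> float object
b = from_string("2.71")    # str object -> float object
print(a, b)                # 3.14 2.71
```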

1.4 Destruction of objects

Python decrements the reference count with the Py_DECREF or Py_XDECREF macro when a reference to the object is no longer needed.

When the reference count drops to 0, Python reclaims the object through the _Py_Dealloc macro. (More on reference counting later.)

The _Py_Dealloc macro calls (*Py_TYPE(op)->tp_dealloc), which is float_dealloc for floats:

#define _Py_Dealloc(op) (                               \
    _Py_INC_TPFREES(op) _Py_COUNT_ALLOCS_COMMA          \
    (*Py_TYPE(op)->tp_dealloc)((PyObject *)(op)))
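
The moment of destruction can be made visible from Python. The sketch below assumes CPython's reference counting and uses a hypothetical float subclass, TracedFloat, whose __del__ records when the object is torn down. Note that a subclass instance is released through tp_free rather than cached, so this only illustrates the timing, not the float free list.

```python
class TracedFloat(float):
    """Hypothetical helper: logs its value when the object is destroyed."""
    destroyed = []

    def __del__(self):
        TracedFloat.destroyed.append(float(self))

x = TracedFloat(1.5)
y = x                                  # second reference to the same object

del x                                  # one reference remains
print(TracedFloat.destroyed)           # []

del y                                  # count reaches 0, deallocation runs
print(TracedFloat.destroyed)           # [1.5]
```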

1.5 Summary

Finally, the key functions, macros and call relationships involved in the entire life cycle of the object from creation to destruction are organized as follows:

2 Free Object Cache Pool

Problem: floating-point arithmetic creates and destroys many temporary objects behind the scenes.

area = pi * r ** 2

This statement first computes the square of the radius r, holding the intermediate result in a temporary object, call it t. It then computes the product of pi and t and assigns the final result to the variable area.

Finally, the temporary object t is destroyed. Creating an object requires allocating memory, and destroying it requires reclaiming that memory, so creating and destroying many temporary objects means many allocation and deallocation operations, which is expensive.

Therefore, when a float object is destroyed, Python does not rush to return its memory. Instead it puts the object on a free list; the next time a float object needs to be created, Python takes one from the free list first, saving part of the allocation overhead.
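
To see the two arithmetic steps in the statement above, the bytecode can be inspected with the dis module. This is a sketch for CPython only; opcode names differ between versions, so the filter below just matches the "BINARY_" prefix.

```python
import dis

code = compile("area = pi * r ** 2", "<example>", "exec")

# Keep only the arithmetic instructions; each one produces a result object.
binary_ops = [ins for ins in dis.get_instructions(code)
              if ins.opname.startswith("BINARY_")]

print(len(binary_ops))  # 2: one for r ** 2, one for pi * t
```

The result of the first instruction is the temporary t: it exists only long enough to feed the second instruction.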

2.1 Free List for Floating-Point Objects

C source code:

#ifndef PyFloat_MAXFREELIST
#define PyFloat_MAXFREELIST    100
#endif
static int numfree = 0;
static PyFloatObject *free_list = NULL;

Source Code Interpretation:

free_list variable: a pointer to the head node of the free list.

numfree variable: tracks the current length of the free list.

PyFloat_MAXFREELIST macro: limits the maximum length of the free list so that it does not take up too much memory.

To keep things simple, Python reuses the ob_type field as a next pointer, stringing the free objects into a linked list. The float free list is illustrated below:
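
The layout can also be modeled in a few lines of Python. This is only a sketch: Slot is a hypothetical stand-in for a freed PyFloatObject, and None plays the role of NULL.

```python
class Slot:
    """Hypothetical stand-in for a freed PyFloatObject."""
    def __init__(self, fval, next_free):
        self.ob_fval = fval
        self.ob_type = next_free   # reused as the "next" link while free

# Push three freed slots; each new one becomes the head.
free_list = None
for value in (1.0, 2.0, 3.0):
    free_list = Slot(value, free_list)

def describe(head):
    """Walk the chain through ob_type and render it as text."""
    parts = ["free_list"]
    node = head
    while node is not None:
        parts.append(f"[{node.ob_fval}]")
        node = node.ob_type
    parts.append("NULL")
    return " -> ".join(parts)

print(describe(free_list))  # free_list -> [3.0] -> [2.0] -> [1.0] -> NULL
```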

Personal takeaways:

Pooling techniques like this are used in many places in Python and are widely used in real-world projects, so they are worth getting hands-on experience with.

"Using the ob_type field as a next pointer" is a trick worth learning, but apply it with context in mind: weigh readability, whether the memory saving is actually needed, and so on.

2.2 Using the Free List

With a free list, creating a float object can reuse a free object taken from the list, saving the overhead of requesting memory. Take PyFloat_FromDouble() as an example (only part of the code is listed):

PyObject *
PyFloat_FromDouble(double fval)
{
    PyFloatObject *op = free_list;
    if (op != NULL) {
        free_list = (PyFloatObject *) Py_TYPE(op);
        numfree--;
    } else {
        op = (PyFloatObject*) PyObject_MALLOC(sizeof(PyFloatObject));
        if (!op)
            return PyErr_NoMemory();
    }
    /* Inline PyObject_New */
    (void)PyObject_INIT(op, &PyFloat_Type);
    op->ob_fval = fval;
    return (PyObject *) op;
}

Check whether free_list is empty.

If free_list is non-empty, take the head node as op and point free_list at the second node (the code calls Py_TYPE(op), which reads op's ob_type field, and that field currently points to the second node), then decrement numfree by 1.

If free_list is empty, call PyObject_MALLOC to allocate memory.

Finally, initialize op via PyObject_INIT (which includes setting ob_type to the type object), and then set ob_fval to fval.

The diagram is as follows (compare it with the diagram in 2.1: free_list now points to the second node, and the ob_type field of the first node no longer points to the second node but to the float type object):

When an object is destroyed, Python caches it in the free list for later use. The source code of float_dealloc is as follows:

static void
float_dealloc(PyFloatObject *op)
{
    if (PyFloat_CheckExact(op)) {
        if (numfree >= PyFloat_MAXFREELIST)  {
            PyObject_FREE(op);
            return;
        }
        numfree++;
        Py_TYPE(op) = (struct _typeobject *)free_list;
        free_list = op;
    }
    else
        Py_TYPE(op)->tp_free((PyObject *)op);
}

If the free list has already reached its length limit, call PyObject_FREE to reclaim the object's memory.

If the free list has not reached the limit, insert the object at the head of the list (a good chance to review head insertion in linked lists).

Only exact float objects are cached this way; instances of float subclasses fail PyFloat_CheckExact and are released through tp_free instead.
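
The allocation and deallocation paths can be put together in a small Python model. This is a sketch of the logic only, not the real implementation: Slot, FLOAT_TYPE, float_from_double and this float_dealloc are hypothetical stand-ins that mirror the C names, and the subclass branch is left out.

```python
MAX_FREELIST = 100                  # mirrors PyFloat_MAXFREELIST
FLOAT_TYPE = "float type object"    # stand-in for &PyFloat_Type

class Slot:
    """Stand-in for a PyFloatObject; ob_type doubles as the next link."""
    __slots__ = ("ob_type", "ob_fval")

free_list = None
numfree = 0

def float_from_double(fval):
    global free_list, numfree
    op = free_list
    if op is not None:
        free_list = op.ob_type      # second node becomes the new head
        numfree -= 1
    else:
        op = Slot()                 # stands in for PyObject_MALLOC
    op.ob_type = FLOAT_TYPE         # the part PyObject_INIT takes care of
    op.ob_fval = fval
    return op

def float_dealloc(op):
    global free_list, numfree
    if numfree >= MAX_FREELIST:
        return                      # the C code calls PyObject_FREE here
    numfree += 1
    op.ob_type = free_list          # head insertion
    free_list = op

a = float_from_double(3.14)
float_dealloc(a)                    # a is cached, not released
b = float_from_double(2.71)         # the cached slot is handed out again
print(b is a, b.ob_fval, numfree)   # True 2.71 0
```

In the model, the second allocation returns the very slot that was just freed, with only ob_fval overwritten.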

3 Other

Question: In the following example, why is the id of variable e the same as that of the destroyed variable pi?

>>> pi = 3.14
>>> id(pi)
4565221808
>>> del pi
>>> e = 2.71
>>> id(e)
4565221808

A: When the float object 3.14 is destroyed, its memory is not reclaimed directly. The object is cached in the free list, where it becomes the head node.

When the float object 2.71 is created, the free list is non-empty, so the head node is taken out and its ob_fval is set to 2.71. Both objects therefore occupy the same memory, and in CPython id() returns the object's memory address, so the two ids are equal.
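
The experiment can be repeated in a script with a small change. Float literals in a script are stored as constants of the compiled code and stay alive, so this sketch builds the values at run time with float(). The function name reuse_demo is illustrative, and the result is an implementation detail of CPython that commonly comes out True but is not guaranteed by the language.

```python
def reuse_demo():
    pi = float("3.14")       # created at run time, referenced only by pi
    old_id = id(pi)
    del pi                   # last reference gone, object is deallocated
    e = float("2.71")        # may reuse the slot that was just freed
    return old_id == id(e)

print(reuse_demo())
```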
