A Deeper Understanding of the Use of Magic Methods in the Python Virtual Machine

In this post, we look at several magic methods that CPython supports. Some of them let us build more sophisticated behavior into our own classes, and many are useful in everyday code.

Analyzing the hash method in depth

In Python, the __hash__() method is a special method (also known as a magic method or double underscore method) that returns the hash value of an object. The hash value is an integer that dictionaries (dict), sets (set), and similar data structures use for fast lookups and comparisons. The __hash__() method is useful when creating custom hashable objects, such as instances of custom classes, so that they can be used as dictionary keys or as elements of sets.

Here are some points to keep in mind, followed by examples that illustrate how __hash__() behaves:

If two objects are equal (as defined by __eq__()), their hash values should be equal. That is, if a == b is true, then hash(a) == hash(b) should also be true. This matters because sets and dictionaries must hold only one copy of each distinct element or key; if equal objects hash differently, the container may end up holding several objects that compare equal.
Overriding __hash__() usually means overriding __eq__() as well, so that object equality and hashing stay consistent.
If a class does not define __eq__(), don't define __hash__() either. When two objects have equal hash values, the container still has to compare them for equality, and without a suitable __eq__() that comparison falls back to identity, so it is easy to end up with several objects that look identical in the same container.

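import random
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def __eq__(self, other):
        return self.name == other.name and self.age == other.age
    def __hash__(self):
        # Deliberately broken: a random offset makes the hash unstable
        return hash((self.name, self.age)) + random.randint(0, 1024)
    def __repr__(self):
        return f"[name={self.name}, age={self.age}]"
person1 = Person("Alice", 25)
person2 = Person("Alice", 25)
print(hash(person1))
print(hash(person2))
container = set()
container.add(person1)
container.add(person2)
print(container)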

In the code above we override __hash__(), but a random number is added to the hash value on every call. Even when name and age are equal, the two hash values will most likely differ, so the container can end up holding more than one object that compares equal. One possible output of the program is shown below:

1930083569156318318
1930083569156318292
{[name=Alice, age=25], [name=Alice, age=25]}

If we rewrite the class above so that the hash no longer includes a random number:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def __eq__(self, other):
        return self.name == other.name and self.age == other.age
    def __hash__(self):
        return hash((self.name, self.age))
    def __repr__(self):
        return f"[name={self.name}, age={self.age}]"

Then the container will hold only one object.
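As a quick check of the class with a deterministic __hash__(), the following minimal sketch puts two equal Person objects into a set and then uses one of them as a dictionary key. The names `people` and `ages` are introduced here only for illustration.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __eq__(self, other):
        return self.name == other.name and self.age == other.age

    def __hash__(self):
        # Deterministic: equal objects always produce equal hash values.
        return hash((self.name, self.age))

    def __repr__(self):
        return f"[name={self.name}, age={self.age}]"


person1 = Person("Alice", 25)
person2 = Person("Alice", 25)

people = {person1, person2}      # hypothetical name; person2 is treated as a duplicate
print(len(people))               # 1

ages = {person1: "first entry"}  # hypothetical name
print(ages[person2])             # an equal key finds the same entry: first entry
```

Because the hash depends only on name and age, any two objects that compare equal land in the same place in the container.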

If we override only __hash__() and not __eq__(), the container will again hold multiple objects that look identical:

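class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    # def __eq__(self, other):
    #     return self.name == other.name and self.age == other.age
    def __hash__(self):
        return hash((self.name, self.age)) # + random.randint(0, 1024)
    def __repr__(self):
        return f"[name={self.name}, age={self.age}]"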

This is because when two hash values are equal, the container also has to check whether the two objects are equal. If they are, the new object is not stored; if they are not, it is added to the container. Without __eq__(), the default comparison is by identity, so person1 and person2 count as different objects.
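The hash-then-compare behavior can be observed with a small probe class. This is only a sketch: `Probe`, `calls`, and `items` are hypothetical names, and the constant hash is chosen on purpose so that every instance collides.

```python
calls = []  # records which special methods the set invokes


class Probe:
    """Hypothetical helper whose hash is constant, so every instance collides."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        calls.append("hash")
        return 1

    def __eq__(self, other):
        calls.append("eq")
        return self.value == other.value


items = set()
items.add(Probe("a"))
items.add(Probe("b"))  # same hash as "a" but not equal: kept as a second element
items.add(Probe("a"))  # same hash and equal to an existing element: not added

print(len(items))      # 2
print("eq" in calls)   # True: equality was consulted after the hashes matched
```

A constant hash is legal but slow, since every lookup degenerates into a series of equality checks.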

The bool method

In Python, the object.__bool__() method is a special method that defines the truth value of an object. It is called automatically whenever an object is tested for truth, for example in an if statement, in logical operations such as and, or, and not, or when bool() is applied to the object. The __bool__() method should return a boolean value that represents the truth value of the object. If __bool__() is not defined, Python tries to call the __len__() method instead. If __len__() returns zero, the object is considered false; otherwise, the object is considered true.

Here are some things to keep in mind to help understand the __bool__() method:

The __bool__() method is called automatically when an object is tested for truth. For example, in an if statement, the truth value of the object is determined by __bool__().
The __bool__() method should return a boolean value (True or False).
If __bool__() is not defined, Python tries to call the __len__() method to determine the object's truth value.
When the length of the object is zero, i.e. __len__() returns zero, the object is considered false; otherwise, the object is considered true.
If neither __bool__() nor __len__() is defined, the object is true by default.

The following is an example showing how to use the __bool__() method in a custom class:

class NonEmptyList:
    def __init__(self, items):
        self.items = items
    def __bool__(self):
        return len(self.items) > 0
my_list = NonEmptyList([1, 2, 3])
if my_list:
    print("The list is not empty.")
else:
    print("The list is empty.")
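The fallback rules described earlier can be sketched as follows. `Queue` and `Plain` are hypothetical classes introduced only for this illustration: the first defines __len__() but not __bool__(), the second defines neither.

```python
class Queue:
    """Hypothetical class that defines __len__ but not __bool__."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Plain:
    """Hypothetical class that defines neither __bool__ nor __len__."""


print(bool(Queue([])))   # False: __len__() returns 0
print(bool(Queue([1])))  # True: __len__() returns a non-zero value
print(bool(Plain()))     # True: objects are true by default
```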

Attribute Access to Objects

In Python, we can customize how the items and attributes of an object are accessed through a number of special methods. This section covers __getitem__(), __setitem__(), and __delitem__() for subscript access (obj[key]), and __getattr__(), __setattr__(), and __getattribute__() for attribute access (obj.name), to help clarify how each mechanism works and where it applies.

The __getitem__() method is the special method behind indexing operations. When we access an object by index or key, as in obj[index], Python automatically calls this method and passes the index value as an argument. Inside the method we implement the lookup and return the corresponding value.

class MyList:
    def __init__(self):
        self.data = []
    def __getitem__(self, index):
        return self.data[index]
my_list = MyList()
my_list.data = [1, 2, 3]
print(my_list[1])  # Output: 2

In the above example, we defined a class named MyList that has a property data which is a list. By overriding the __getitem__() method, we make it possible to access the data property of the MyList object by index. When we perform an indexing operation using the form my_list[1], Python automatically calls the __getitem__() method and passes the index value 1 as an argument.
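The index passed to __getitem__() is not always an integer: a slice expression such as my_list[0:2] arrives as a slice object. The sketch below is a variation on the MyList class, assuming a constructor that takes the initial data (which the original class does not have), and returns a new MyList for slices.

```python
class MyList:
    def __init__(self, data):
        # Assumption for this sketch: the data is passed to the constructor.
        self.data = list(data)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # my_list[0:2] arrives here as slice(0, 2, None)
            return MyList(self.data[index])
        return self.data[index]


my_list = MyList([1, 2, 3])
print(my_list[1])         # 2
print(my_list[-1])        # 3
print(my_list[0:2].data)  # [1, 2]
```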

The __setitem__() method handles item assignment, i.e., assigning a value to an object by index or key. When we write obj[index] = value, Python automatically calls __setitem__() and passes the index and the assigned value as arguments.

class MyList:
    def __init__(self):
        self.data = [0 for i in range(2)]
    def __setitem__(self, index, value):
        self.data[index] = value
my_list = MyList()
my_list[0] = 1
my_list[1] = 2
print(my_list.data)  # Output: [1, 2]

In the above example, we have overridden the __setitem__() method to implement item assignment. When we perform the assignments my_list[0] = 1 and my_list[1] = 2, Python automatically calls the __setitem__() method and passes the index and the assigned value to it. In the __setitem__() method, we store the value at the corresponding index of the object's data attribute.

The __delitem__() method is the special method for deleting items. When we delete an item with the del statement, as in del obj[key], Python automatically calls __delitem__() and passes in the index or key of the item to be deleted as an argument.

class MyDict:
    def __init__(self):
        self.data = dict()
    def __delitem__(self, key):
        print("In __delitem__")
        del self.data[key]
obj = MyDict()
obj.data["key"] = "val"
del obj["key"] # Output: In __delitem__
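The three item methods are usually implemented together. The following minimal sketch combines them in one class; `Registry` and its `_data` attribute are hypothetical names, and __len__() is added only so the result of the deletion can be checked.

```python
class Registry:
    """Hypothetical dict-backed container supporting obj[key] syntax."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __len__(self):
        return len(self._data)


registry = Registry()
registry["key"] = "val"  # calls __setitem__
print(registry["key"])   # calls __getitem__, prints: val
del registry["key"]      # calls __delitem__
print(len(registry))     # 0
```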

__getattr__() is a special method that is called automatically when accessing a non-existent attribute. It takes one argument, the name of the attribute, and either returns the corresponding value or raises an AttributeError exception.

class MyClass:
    def __getattr__(self, name):
        if name == 'color':
            return 'blue'
        else:
            raise AttributeError(f"'MyClass' object has no attribute '{name}'")
my_obj = MyClass()
print(my_obj.color)  # Output: blue
print(my_obj.size)   # raises AttributeError: 'MyClass' object has no attribute 'size'

In the above example, when accessing my_obj.color, since the color attribute does not exist, Python automatically calls the __getattr__() method and returns the predefined value 'blue'. When accessing my_obj.size, the __getattr__() method raises an AttributeError exception because the attribute does not exist either.
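A common use of __getattr__() is delegation: attributes the wrapper does not have are looked up on another object. The sketch below assumes a hypothetical `Wrapper` class; it relies on the fact that __getattr__() is only reached when normal lookup fails, so `_wrapped` itself is found without calling it.

```python
class Wrapper:
    """Hypothetical wrapper that forwards unknown attributes to another object."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Reached only when normal lookup on the Wrapper instance fails.
        return getattr(self._wrapped, name)


numbers = Wrapper([3, 1, 2])
numbers.sort()            # forwarded to list.sort
print(numbers._wrapped)   # [1, 2, 3]
print(numbers.count(1))   # forwarded to list.count, prints: 1
```

If the wrapped object does not have the attribute either, getattr() raises AttributeError, which is the behavior callers expect from a missing attribute.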

__setattr__() is a special method to be called automatically when setting an attribute value. It takes two arguments, the attribute name and the attribute value. We can process, validate or log the attribute in this method.

class MyClass:
    def __init__(self):
        self.color = 'red' # Output: Setting attribute 'color' to 'red'
    def __setattr__(self, name, value):
        print(f"Setting attribute '{name}' to '{value}'")
        super().__setattr__(name, value)
my_obj = MyClass()
my_obj.color = 'blue'  # Output: Setting attribute 'color' to 'blue'
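Validation is one of the uses mentioned above. As a minimal sketch, the hypothetical `Temperature` class below rejects values below absolute zero before storing them; the class name and the limit check are assumptions made for this example.

```python
class Temperature:
    """Hypothetical class that validates an attribute in __setattr__."""

    def __init__(self, celsius):
        self.celsius = celsius  # this assignment also goes through __setattr__

    def __setattr__(self, name, value):
        if name == "celsius" and value < -273.15:
            raise ValueError("temperature below absolute zero")
        # Delegate to the default implementation to actually store the value.
        super().__setattr__(name, value)


t = Temperature(20.0)
t.celsius = 25.5
print(t.celsius)      # 25.5

try:
    t.celsius = -300  # rejected by the check above
except ValueError as exc:
    print(exc)        # temperature below absolute zero

print(t.celsius)      # still 25.5
```

Storing the value through super().__setattr__() rather than self.celsius = value avoids calling __setattr__() recursively.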

When we access an object's attribute with the dot syntax (obj.name), the object's __getattribute__ method is called first. Only if __getattribute__ cannot find the attribute, i.e. it raises AttributeError, does Python call __getattr__. The following example class, CustomClass, defines both methods:

class CustomClass:
    def __init__(self):
        self.attribute = "Hello, world!"
    def __getattribute__(self, name):
        print(f"Accessing attribute: {name}")
        return super().__getattribute__(name)
    def __getattr__(self, name):
        print(f"Attribute {name} not found")
        return None

In this example, CustomClass defines both __getattribute__ and __getattr__. When __getattribute__ fails to find the requested attribute, Python automatically calls __getattr__, which prints a message containing the attribute name and saying that the attribute was not found.

We execute the following code:

obj = CustomClass()
print(obj.attribute)
print(obj.nonexistent_attribute)

The output is shown below:

Accessing attribute: attribute
Hello, world!
Accessing attribute: nonexistent_attribute
Attribute nonexistent_attribute not found
None

First, we access the existent attribute attribute, at which point the __getattribute__ method is called and prints out the attribute name "attribute" and then returns the actual value of the attribute "Hello, world!". Next, we try to access the nonexistent attribute nonexistent_attribute, and since the __getattribute__ method is unable to find the attribute, the __getattr__ method is called and prints the attribute name "nonexistent_attribute" along with the message that the attribute was not found, and then returns None.

Context managers

A context manager is a very useful tool when we need to do something before or after the execution of a specific block of code. A context manager ensures that resources are allocated and released correctly, regardless of whether or not the code block throws an exception. In Python, we can create custom context managers by implementing the __enter__ and __exit__ methods.

The following is a simple context manager example that shows how to use the object.__enter__ and object.__exit__ methods to create a context manager for file operations:

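class FileContextManager:
    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode
        self.file = None
    def __enter__(self):
        self.file = open(self.filename, self.mode)
        return self.file
    def __exit__(self, exc_type, exc_value, traceback):
        self.file.close()
# 'example.txt' is a placeholder file name
with FileContextManager('example.txt', 'w') as file:
    file.write('Hello, world!')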

In the above example, the FileContextManager class implements the __enter__ and __exit__ methods. In the __enter__ method, we open the file and return the file object so that we can use it in the with statement block. In the __exit__ method, we close the file.

The __exit__ method is called regardless of whether the block throws an exception or not to ensure that the file is closed correctly. This avoids problems such as resource leaks and file locking. Using a context manager simplifies code and provides a consistent approach to resource management, especially in situations where resources need to be opened and closed, such as file operations, database connections, and so on.

The __exit__ method of the above context manager has three parameters: exc_type, exc_value, and traceback, which are described in detail below:

exc_type (exception type): this parameter indicates the type of the exception raised. If no exception is raised in the with block, it will be None. If an exception is raised, exc_type will be the type of that exception.
exc_value (exception value): this parameter is the instance of the exception raised. It contains detailed information about the exception, such as an error message. If no exception was raised, its value will also be None.
traceback: this parameter is a traceback object that contains stack trace information about the exception. It provides the code path and call chain that led to the exception. If no exception was raised, its value will be None.
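To see these parameters in action, the sketch below uses a hypothetical `SuppressZeroDivision` class that records the exception type it receives. It also relies on a documented detail not covered above: if __exit__ returns a true value, the exception is suppressed and execution continues after the with block.

```python
class SuppressZeroDivision:
    """Hypothetical context manager that records and suppresses ZeroDivisionError."""

    def __init__(self):
        self.seen = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.seen = exc_type
        # Returning True suppresses the exception; False lets it propagate.
        return exc_type is not None and issubclass(exc_type, ZeroDivisionError)


with SuppressZeroDivision() as guard:
    1 / 0               # raises ZeroDivisionError inside the block
print(guard.seen)       # <class 'ZeroDivisionError'>

with SuppressZeroDivision() as clean:
    pass                # no exception raised
print(clean.seen)       # None
```

Any other exception type would make __exit__ return False here, so it would propagate to the caller as usual.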

That concludes this in-depth look at how magic methods are used in the Python virtual machine. For more on Python magic methods, see my other related articles.