SoFunction
Updated on 2025-05-15

Python converts special invisible character Unicode encoding into visible string

When an immutable object "changes" its value, the new value is actually a new object at a different memory address. The variable name stays the same, but it is rebound to the new object, and the old object is released once nothing refers to it. A mutable object such as a list is modified in place: its identity does not change, and id(list) returns the same value before and after the change. So when a list is used as the default value of a function parameter, later calls do not receive a fresh default list. They reuse the list created earlier, which previous calls may already have modified, and this causes bugs.

Unless otherwise noted, everything below applies to Python 3.

1. Default parameters

To simplify function calls, Python provides default parameters:

def pow(x, n=2):
    r = 1
    while n > 0:
        r *= x
        n -= 1
    return r

With this definition, the last argument can be omitted when calling pow:

print(pow(5)) # output: 25

When defining a function with default parameters, keep the following in mind:

  • Required parameters must come first, and default parameters must come after them;
  • Which parameters should get default values? Generally, those whose values rarely change from call to call.

For example, the Python built-in function print:

print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)

As the signature shows, even a simple call such as print('hello python') actually relies on several default values. Default parameters are what make the call so short.
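As a small sketch of these rules, the snippet below overrides defaults by keyword and shows that a default parameter placed before a required one is rejected. The greet function and the variable names are hypothetical, introduced only for illustration.

```python
import io


def greet(name, greeting="Hello", punctuation="!"):
    """Hypothetical helper: `name` is required, the other two have defaults."""
    return f"{greeting}, {name}{punctuation}"


plain = greet("Python")                     # both defaults are used
custom = greet("Python", punctuation="?")   # one default overridden by keyword

# print() works the same way: sep and end keep their defaults unless overridden.
buffer = io.StringIO()
print("a", "b", sep="-", end="", file=buffer)
joined = buffer.getvalue()

# A default parameter placed before a required one is a syntax error.
try:
    compile("def broken(a=1, b): pass", "<example>", "exec")
    ordering_rejected = False
except SyntaxError:
    ordering_rejected = True
```

Passing the overrides by keyword keeps the call readable even when several defaults sit between the required arguments and the one being changed.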

2. The pitfall of using a list as a default parameter

Here is a classic example:

def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list

print(bad_append('1'))
print(bad_append('2'))

This example does not print what you might expect:

['1']
['2']

Instead, it printed:

['1']
['1', '2']

The problem does not lie in default parameters themselves, but in a misunderstanding of when default parameters are initialized.

3. Function initialization

A core idea in Python is that everything is an object.

A function is also an object, as shown in the following example:

import types

def test():
    pass

print(type(test))  # <class 'function'>
print(isinstance(test, types.FunctionType))  # True

So a function is an instance of a class, and like any object it has to be created and initialized. The interpreter does this when it executes the def statement, that is, once it has read the whole function definition. From then on the function name is bound to the function object, so the object can be reached through its name, and all of the function's attributes are fixed, including its required parameters and the values of its default parameters. As a result, every call that relies on a default parameter receives the very same default object.
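One way to see that the default value is stored on the function object is to look at the function's __defaults__ attribute. The sketch below reuses bad_append from section 2; the other variable names are only illustrative.

```python
def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list


# The default list already exists right after the def statement has run.
default_list = bad_append.__defaults__[0]
defaults_at_definition = list(default_list)   # snapshot copy: []

bad_append('1')
bad_append('2')

# The calls mutated the one list stored on the function object.
defaults_after_calls = bad_append.__defaults__
same_object = default_list is bad_append.__defaults__[0]
```

After the two calls, __defaults__ holds the list with both items, and it is still the same list object that was created when the function was defined.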

Let's use an intuitive example to illustrate:

import datetime as dt
from time import sleep


def log_time(msg, time=dt.datetime.now()):
    sleep(1)  # pause the thread for one second
    print("%s: %s" % (time.isoformat(), msg))

log_time('msg 1')
log_time('msg 2')
log_time('msg 3')

Run this program and the output is:

2017-05-17T12:23:46.327258: msg 1
2017-05-17T12:23:46.327258: msg 2
2017-05-17T12:23:46.327258: msg 3

The sleep(1) call pauses the thread for one second on every call, which rules out the explanation that the program simply ran too fast for the clock to advance. The three printed timestamps are still identical, so the default parameter time has the same value in all three calls.
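The same effect can be shown without depending on the clock. In this sketch the default expression is a call to a hypothetical helper, make_default, that records each time it runs; show and calls are likewise illustrative names.

```python
calls = []


def make_default():
    """Hypothetical helper that records every evaluation of the default."""
    calls.append("evaluated")
    return len(calls)


# make_default() runs exactly once, when this def statement is executed.
def show(value=make_default()):
    return value


results = [show(), show(), show()]
evaluations = len(calls)
```

The helper runs once at definition time, and all three calls return the value it produced then.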

That example may not settle the question completely, so next we look at the memory address of the default parameter.

First, you need to understand the built-in function id(object):

id(object)
 Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.
 CPython implementation detail: This is the address of the object in memory.

That is, id(object) returns the identity of an object: an integer that is guaranteed to be unique and constant for the object during its lifetime. Two objects whose lifetimes do not overlap may have the same id value.
In the CPython interpreter, the value of id(object) is the memory address of the object.
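As a brief sketch of what id() reports, the snippet below compares an in-place change with a rebinding. The variable names are only illustrative.

```python
items = [1, 2]
original = items                 # keep a second reference to the first list
id_before = id(items)

items.append(3)                  # in-place mutation: still the same object
id_after_append = id(items)

items = items + [4]              # builds a new list and rebinds the name
id_after_rebind = id(items)

mutation_keeps_identity = id_before == id_after_append
rebinding_changes_identity = id_before != id_after_rebind
```

Because original keeps the first list alive, the two lists have overlapping lifetimes, so their id values are guaranteed to differ.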

The following example uses id(object) to make the problem clear:

def bad_append(new_item, a_list=[]):
    print('address of a_list:', id(a_list))
    a_list.append(new_item)
    return a_list

print(bad_append('1'))
print(bad_append('2'))

Output:

address of a_list: 31128072
['1']
address of a_list: 31128072
['1', '2']

In both calls to bad_append, the address of the default parameter a_list is the same.

Moreover, a_list is a mutable object. Adding an element with append does not create a new list or assign a new address; it only changes the object at the address the default parameter points to. The next call that uses the default sees the result of the previous modification.

The output above is therefore no surprise: both calls worked on the same object at the same memory address.
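A consequence worth spelling out is that the values returned by earlier calls are affected too, because they are the same object. A minimal sketch, with illustrative variable names:

```python
def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list


first = bad_append('1')     # relies on the shared default list
second = bad_append('2')    # relies on the same default list again

# Both names refer to one list, so `first` has grown since it was returned.
shared = first is second

own_list = ['x']
third = bad_append('3', own_list)   # an explicit argument bypasses the default
```

Passing a list explicitly leaves the stored default untouched, but the caller's own list is still modified in place.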

4. Mutable and immutable

Default parameters behave differently depending on whether they refer to mutable or immutable objects.

Mutable default parameters behave as in the example above.

Immutable default parameters

Let’s first look at an example:

def immutable_test(i=1):
    print('before operation, address of i', id(i))
    i += 1
    print('after operation, address of i', id(i))
    return i

print(immutable_test())
print(immutable_test())

Output:

before operation, address of i 1470514832
after operation, address of i 1470514848
2
before operation, address of i 1470514832
after operation, address of i 1470514848
2

Clearly, the value of the default parameter i in the second call is not affected by the first call. Because i refers to an immutable object, i += 1 cannot modify that object. It creates a new int object and rebinds the local name i to it, so i then points to a different address. By the rule for default parameters, on the next call i again starts out referring to the object assigned when the function was defined, and the value 1 stored there has never changed.
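The contrast can be sketched with the same += operator applied to both kinds of default. The function mutable_test is a hypothetical counterpart added for comparison; for a list, += extends the existing object in place instead of creating a new one.

```python
def immutable_test(i=1):
    i += 1          # creates a new int and rebinds the local name
    return i


def mutable_test(items=[]):
    """Hypothetical counterpart: += on a list extends it in place."""
    items += [1]
    return items


int_results = [immutable_test(), immutable_test()]
list_results = [list(mutable_test()), list(mutable_test())]   # snapshot copies

stored_int_default = immutable_test.__defaults__
stored_list_default = mutable_test.__defaults__
```

The int default stored on the function is still 1 after both calls, while the list default has accumulated one element per call.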

In fact, the distinction between mutable and immutable objects is not specific to default parameters, so there is little point in discussing it at length here. Much like pass-by-value versus pass-by-reference in other languages, it affects far more than default parameters.

5. Best practices

Repeated calls do not affect an immutable default parameter, but with a mutable default parameter the results of repeated calls will not match expectations. A mutable default must therefore not be initialized just once at function definition time; it should be initialized on every call.

The best practice is to use None as the default value in the function definition and rebind the parameter to a fresh object inside the function body. Here are corrected versions of the two problematic examples above:

def good_append(new_item, a_list=None):
    if a_list is None:
        a_list = []  # a fresh list on every call

    a_list.append(new_item)
    return a_list

print(good_append('1'))
print(good_append('2'))
print(good_append('c', ['a', 'b']))

import datetime as dt
from time import sleep

def log_time(msg, time=None):
    if time is None:
        time = dt.datetime.now()  # evaluated on every call

    sleep(1)
    print("%s: %s" % (time.isoformat(), msg))

log_time('msg 1')
log_time('msg 2')
log_time('msg 3')
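As a quick check of the None pattern, the sketch below confirms that each call relying on the default gets its own list. The variable names are only illustrative.

```python
def good_append(new_item, a_list=None):
    if a_list is None:
        a_list = []          # a new list is created on each call
    a_list.append(new_item)
    return a_list


first = good_append('1')
second = good_append('2')
independent = first is not second

mine = ['a', 'b']
result = good_append('c', mine)   # a caller-supplied list is used as-is

stored_default = good_append.__defaults__
```

Note that a list passed in by the caller is still modified in place. The pattern only guarantees that calls relying on the default do not share state.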
