SoFunction
Updated on 2024-11-14

Reasons why Python multithreading is less efficient than single-threading, and its solutions

Reasons why Python multithreading is less efficient than single-threading

The standard implementation of the Python language is called CPython, and it runs Python programs in two steps:

Step 1: Parsing source code text and compiling it into bytecode

  • Bytecode is a low-level representation of a program as a sequence of 8-bit instructions
  • Starting with Python 3.6, each instruction actually occupies 16 bits (two bytes), a format sometimes called "wordcode"
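A quick way to see this representation is the standard `dis` module. Since every instruction occupies two bytes (opcode plus argument) from Python 3.6 onward, the raw bytecode of any function always has an even length:

```python
import dis

def add(a, b):
    return a + b

# Disassemble the function into human-readable bytecode
dis.dis(add)

# The raw bytecode is a bytes object on the code object; since 3.6 each
# instruction is 2 bytes (opcode + argument), so its length is always even.
print(len(add.__code__.co_code) % 2 == 0)  # True
```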

Step 2: CPython uses a stack-based interpreter to run bytecode.

  • The bytecode interpreter must ensure that the relevant state is not disturbed during the execution of a Python program.
  • CPython uses a mechanism called the global interpreter lock (GIL) to keep the state of a running Python program from being disturbed.

GIL

The GIL is actually a mutual-exclusion lock (mutex) that prevents CPython's internal state from being disturbed under preemptive multithreading, where one thread can suddenly interrupt another to seize control of the program. If this preemption comes at a bad time, the state of the interpreter (e.g., the reference counts used for garbage collection) can be corrupted.

CPython uses the GIL to prevent such interruptions, ensuring that it and its C extension modules execute every bytecode instruction against consistent interpreter state.

The GIL can have a bad effect. In languages such as C++ and Java, a program with multiple threads can split its work across them and fully utilize the CPU's cores. Python also supports multiple threads, but they are all constrained by the GIL, so only one thread can make progress at a time, rather than several.

So developers who want to do parallel computing or speed up their programs through multithreading will be disappointed.

  • concurrency : the ability of a computer to seemingly do many different things at the same time, by rapidly interleaving them
  • parallelism : actually doing many different things at the same time, e.g., on multiple CPU cores

Thread execution under multithreading

  • Acquire the GIL
  • Execute code until the thread sleeps or the Python virtual machine suspends it
  • Release the GIL

Reasons why multi-threading is less efficient than single-threading

As we can see above, for a thread to execute in Python it must first acquire the GIL; there is only one GIL per interpreter process, and only the thread holding it can run on the CPU. The GIL is released when the thread performs an I/O operation. If the program is purely computational, with no I/O, the interpreter still forces a periodic release so that other threads get a chance to run (in CPython 2 this happened every 100 interpreter "ticks"; in CPython 3 it is time-based and adjustable). So although CPython's threading library wraps the operating system's native threads, a CPython process as a whole only ever has one thread running at a time (the one that has acquired the GIL), while the other threads wait for the GIL to be released.

Every time the GIL is released, the threads compete for the lock, and switching between threads consumes resources. And because of the GIL, a Python process can only ever execute one thread at a time (the one holding the GIL), which is why Python's multithreading cannot take advantage of multi-core CPUs.
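The forced-release behavior can be observed directly: in CPython 3 the interval after which a running thread is asked to give up the GIL is time-based and exposed through the `sys` module:

```python
import sys

# The interpreter asks the running thread to release the GIL after
# this many seconds (the default is 0.005, i.e. 5 ms).
print(sys.getswitchinterval())

# A longer interval means fewer forced switches (less switching overhead,
# but worse responsiveness for the other threads).
sys.setswitchinterval(0.01)
print(sys.getswitchinterval())
```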

Why multi-threading is sometimes less, sometimes more efficient than single-threading

Why is multi-threading sometimes slower than single-threading and sometimes faster than single-threading for the same code? This is mainly related to the code being run:

For CPU-intensive code (loops, counting, and other pure computation), the computation quickly hits the interpreter's switch threshold, triggering a release of the GIL and a new round of competition (and switching back and forth between threads of course consumes resources). So when Python multithreading meets CPU-intensive code, single-threading is more efficient than multithreading.

For IO-intensive code (file processing, web crawlers, etc.), multithreading can effectively improve efficiency: single-threaded IO blocks while waiting, wasting time unnecessarily. With multithreading, the interpreter can switch to thread B while thread A is waiting, so CPU resources are not wasted and program execution efficiency improves. For IO-intensive operations, this time-sharing switching makes multithreading faster than single-threading.
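A minimal sketch of that time-sharing effect, using `time.sleep` as a stand-in for a blocking I/O call (sleeping also releases the GIL): four 0.2-second "waits" complete in roughly 0.2 seconds instead of 0.8.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_io(n):
    time.sleep(0.2)  # stands in for blocking I/O; the GIL is released here
    return n * n

start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(slow_io, range(4)))
print(results)                     # [0, 1, 4, 9]
print(time.time() - start < 0.5)  # True: the four waits overlapped
```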

If Python wants to take full advantage of multi-core CPUs, it can use multiprocessing.

Each process has its own interpreter and its own GIL, and the processes do not interfere with each other, so execution can be truly parallel.

In Python, multi-process execution is more efficient than multi-threaded execution (on multi-core CPUs). So if you want parallelism on a multi-core CPU, the common approach is to use multiprocessing, which can effectively improve execution efficiency.

Code Example:

# Multi-threaded
# Elapsed time (until the last thread completes)
# [TIME MEASURE] execute function: gene_1000_field took 3840.604ms
import threading

@time_measure
def mult_thread(rows):
    # Number of worker threads
    batch_size = 4
    # Rows to generate per thread
    cell = rows // batch_size
    print('Data generation in progress, number of threads: ' + str(batch_size))
    threads = []
    for i in range(batch_size):
        starts = i * cell
        ends = (i + 1) * cell
        file = f"my_data_{str(i)}.csv"
        # gene_1000_field(starts, ends, file) is the data-generation
        # function being timed (defined elsewhere in the original article)
        t = threading.Thread(target=gene_1000_field, args=(starts, ends, file))
        t.start()
        threads.append(t)
    # Wait for all worker threads to finish
    for t in threads:
        t.join()
# Multi-process
# [TIME MEASURE] execute function: gene_1000_field took 1094.776ms
# Execution time is about the same as for a single thread
from multiprocessing import Process

@time_measure
def mult_process(rows):
    # Number of worker processes
    batch_size = 4
    # Rows to generate per process
    cell = rows // batch_size
    print('Data generation in progress, number of processes: ' + str(batch_size))
    processes = []
    for i in range(batch_size):
        starts = i * cell
        ends = (i + 1) * cell
        file = f"my_data_{str(i)}.csv"
        p = Process(target=gene_1000_field, args=(starts, ends, file))
        p.start()
        processes.append(p)
    # Wait for all worker processes to finish
    for p in processes:
        p.join()

Multi-threading vs. single-threading in Python

# A simple crawler:
import threading
import time
import json
import functools
from urllib.request import urlopen

# A decorator that measures a function's running time
def timeit(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        start_time = time.time()
        res = f(*args, **kwargs)
        end_time = time.time()
        print("%s function runtime:%.2f" % (f.__name__, end_time - start_time))
        return res
    return wrapper

def get_addr(ip):
    # NOTE: the host part of this URL was lost from the original article;
    # only the "/json/%s" path remains.
    url = "/json/%s" % (ip)
    urlobj = urlopen(url)
    # The page content returned by the server is a string
    pagecontent = urlobj.read().decode('utf-8')
    # Decode the JSON text into a Python object
    dict_data = json.loads(pagecontent)
    print("""
    ip : %s
    City: %s
    Country: %s
    """ % (ip, dict_data['city'], dict_data['country']))

# No multithreading
@timeit
def main1():
    ips = ['12.13.14.%s' % (i + 1) for i in range(10)]
    for ip in ips:
        get_addr(ip)

# Multi-threaded approach I
@timeit
def main2():
    ips = ['12.13.14.%s' % (i + 1) for i in range(10)]
    threads = []
    for ip in ips:
        t = threading.Thread(target=get_addr, args=(ip,))
        threads.append(t)
        t.start()
    [thread.join() for thread in threads]
# Multi-threaded approach II
class MyThread(threading.Thread):
    def __init__(self, ip):
        super(MyThread, self).__init__()
        self.ip = ip

    def run(self):
        url = "/json/%s" % (self.ip)
        urlObj = urlopen(url)
        # The page content returned by the server is a string
        pageContent = urlObj.read().decode('utf-8')
        # Decode the JSON text into a Python object
        import json
        dict_data = json.loads(pageContent)
        print("""
        ip : %s
        City: %s
        Country: %s
        """ % (self.ip, dict_data['city'], dict_data['country']))

@timeit
def main3():
    ips = ['12.13.14.%s' % (i + 1) for i in range(10)]
    threads = []
    for ip in ips:
        t = MyThread(ip)
        threads.append(t)
        t.start()
    [thread.join() for thread in threads]

if __name__ == '__main__':
    main1()
    main2()
    main3()

----> output:
# main1 function runtime:55.06
# main2 function runtime:5.64
# main3 function runtime:11.06

As you can see from the output above, multithreading is indeed much faster. However, it is only suitable for I/O-intensive work; for CPU-intensive work, where the CPU is busy computing the whole time, multithreading is actually slower.

Here's an example.

import threading
import time

def my_counter():
    i = 1
    for count in range(200000000):
        i = i + 2 * count
    return True

# Single-threaded: run the two counters one after another
# (timeit is the timing decorator defined in the previous example)
@timeit
def main1():
    for tid in range(2):
        t = threading.Thread(target=my_counter)
        t.start()
        t.join()

# Multi-threaded: start both threads, then wait for both
@timeit
def main2():
    thread_array = {}
    for tid in range(2):
        t = threading.Thread(target=my_counter)
        t.start()
        thread_array[tid] = t
    for i in range(2):
        thread_array[i].join()

if __name__ == '__main__':
    main1()
    main2()

-----> output:
# main1 function runtime:27.57
# main2 function runtime:28.19

This shows the scenarios multithreading is suited to: it helps with I/O-intensive work, but brings no benefit for CPU-intensive work.

Summary

The above is my personal experience. I hope it can serve as a useful reference for you.