SoFunction
Updated on 2024-11-12

Python Learning Notes on Threads

1. Custom processes

Create a custom process class by inheriting from the Process class and overriding its run() method.

from multiprocessing import Process
import time
import os

class MyProcess(Process):
    def __init__(self, name):
        # When overriding __init__ (here to add a new parameter), Process.__init__(self)
        # cannot be omitted; otherwise an error is reported:
        # AttributeError: 'XXXX' object has no attribute '_closed'
        Process.__init__(self)
        self.name = name

    def run(self):
        print("Subprocess (%s-%s) started" % (self.name, os.getpid()))
        time.sleep(3)
        print("Subprocess (%s-%s) finished" % (self.name, os.getpid()))

if __name__ == '__main__':
    print("Parent process started")
    p = MyProcess("Ail")
    # start() automatically calls MyProcess's run() method in the new process
    p.start()
    p.join()
    print("Parent process finished")

# Output
Parent process started
Subprocess (Ail-38512) started
Subprocess (Ail-38512) finished
Parent process finished

2. Processes and Threads

Multiprocessing is suitable for CPU-intensive operations (many CPU instructions, such as scientific computing and heavy floating-point calculations);

Multi-threading is suitable for I/O-intensive operations (many data reads and writes, such as crawlers and file uploads or downloads).

Threads run concurrently, while processes run in parallel. Processes are independent of each other and are the smallest unit of resource allocation in the system; all threads in the same process share that process's resources.
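As a rough illustration of the I/O-intensive case, the sketch below uses time.sleep() as a stand-in for waiting on a network or disk operation (an assumption; no real I/O is performed) and runs four such waits in four threads. The names fake_io and run_in_threads are hypothetical. A sleeping thread does not need the CPU, so the waits overlap and the total time should be close to one wait rather than four.

```python
import threading
import time

def fake_io(results, index, delay):
    # time.sleep() stands in for waiting on a network or disk operation
    time.sleep(delay)
    results[index] = index * 2

def run_in_threads(count=4, delay=0.2):
    results = [None] * count
    threads = [
        threading.Thread(target=fake_io, args=(results, i, delay))
        for i in range(count)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return results, elapsed

if __name__ == '__main__':
    results, elapsed = run_in_threads()
    print('results:', results)
    print('elapsed: %.2f s' % elapsed)
```

The exact elapsed time depends on the machine, so no expected output is shown.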

Process: a program that is running is a process; code that is not running is just a program. A process is the smallest unit of resource allocation in the system. Each process has its own memory space, so processes do not share data with one another, and creating and switching between them has a high overhead.

A process is a dynamic execution of a program. Each process has its own address space, memory, data stack, and other auxiliary data used to track execution. The operating system is responsible for the execution of all processes on it, and the operating system allocates execution time appropriately for these processes.

Thread: the smallest unit of scheduling and execution, also called an execution path. A thread cannot exist independently; it depends on a process. A process has at least one thread, called the main thread. Multiple threads in a process share memory (shared data and global variables), which can improve the efficiency of the program.

A thread is the smallest unit of computing scheduling that an operating system can perform, and it is contained within a process, the actual unit of operation within the process. A thread is a single sequential flow of control within a process. A process can have multiple threads concurrently, each performing a different task in parallel. A thread is an execution context, i.e., a sequence of instructions that a CPU needs to execute.

Main thread: the first thread created when a process starts, which is the thread corresponding to the main function.
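The shared-memory point can be sketched as follows. This is a minimal example, not taken from the original notes: shared_items, append_item and run_workers are hypothetical names. Three worker threads append to one module-level list, and the main thread then sees all of their entries because every thread works on the same object.

```python
import threading

shared_items = []  # module-level object, visible to every thread in the process

def append_item(value):
    # Each worker appends to the same list object; no copy is made per thread
    shared_items.append((threading.current_thread().name, value))

def run_workers(count=3):
    shared_items.clear()
    threads = [
        threading.Thread(target=append_item, args=(i,), name='worker-%d' % i)
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Sort because the order in which the threads ran is not guaranteed
    return sorted(shared_items)

if __name__ == '__main__':
    print('main thread:', threading.main_thread().name)
    print('shared list after the workers finish:', run_workers())
```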

Coroutine: a lightweight user-space thread whose scheduling is controlled by the user program. Each coroutine has its own register context and stack, and switching between coroutines involves essentially no kernel overhead, which makes switching flexible.
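Python's standard library offers this style of user-scheduled, lightweight task through coroutines in asyncio. The original notes do not name a library, so the choice of asyncio here is an assumption, and fetch and main are hypothetical names. In this sketch both coroutines run in a single thread, and the event loop switches between them at each await.

```python
import asyncio

async def fetch(label, delay):
    # await hands control back to the event loop instead of blocking the thread
    await asyncio.sleep(delay)
    return label

async def main():
    # Both coroutines run in one thread; gather() returns results in argument order
    return await asyncio.gather(fetch('a', 0.2), fetch('b', 0.1))

if __name__ == '__main__':
    print(asyncio.run(main()))
```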

The relationship between processes and threads

3. Multi-threading

The operating system schedules threads by assigning time slices (CPU run time) to them. When a thread's time slice is used up, the CPU quickly switches to the next thread. The time slices are so short and the switching so fast that the user does not notice it. On a single core, multiple threads are executed in turn according to their allocated time slices. Most computers today have multi-core CPUs, so multiple threads can be executed by multiple cores in parallel under the scheduling of the operating system, which greatly improves execution speed and CPU utilization. The vast majority of mainstream programming languages support multithreading well. Python (specifically CPython), however, cannot run threads truly in parallel because of the GIL (Global Interpreter Lock): only one thread executes Python bytecode at a time.

Threads in memory

4. Thread class methods

(1) start() -- Starts execution of the thread;

(2) run() -- The method that defines what the thread does (developers can override it in subclasses). The standard run() method invokes the callable object passed to the constructor as the target argument, if any, with positional and keyword arguments taken from the args and kwargs arguments, respectively;

(3) join(timeout=None) -- Blocks the calling thread until the thread whose join() is called terminates, or until the optional timeout (in seconds) elapses. Since join() always returns None, you have to call is_alive() after join() to determine whether a timeout occurred: if the thread is still alive, the join() call timed out. A thread can be joined many times. join() raises a RuntimeError if an attempt is made to join the current thread, because that would cause a deadlock. The same exception is raised if you try to join() a thread that has not yet been started;

(4) is_alive() -- Returns a Boolean indicating whether the thread is alive; it returns True from just before the run() method starts until just after the run() method terminates;

(5) threading.current_thread() -- Returns the Thread object corresponding to the caller's thread of control. For example, the name of the current thread can be obtained with threading.current_thread().name.
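The methods listed above can be tried together in a short sketch. The names slow_task and join_with_timeout are hypothetical, and the delays are arbitrary values chosen so that the first join() times out before the thread finishes.

```python
import threading
import time

def slow_task(delay, seen_names):
    # Runs inside the new thread, so current_thread() is the worker thread
    seen_names.append(threading.current_thread().name)
    time.sleep(delay)

def join_with_timeout(delay=1.0, timeout=0.1):
    seen_names = []
    t = threading.Thread(target=slow_task, args=(delay, seen_names), name='demo-worker')
    t.start()
    t.join(timeout)            # returns None whether or not the thread finished
    timed_out = t.is_alive()   # still alive means the join() call timed out
    t.join()                   # a thread can be joined again; this waits to the end
    return seen_names, timed_out, t.is_alive()

if __name__ == '__main__':
    print(join_with_timeout())
```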

5. A small multi-threading and multi-process example

from threading import Thread
from multiprocessing import Process
import os

def work():
    print('hello,', os.getpid())

if __name__ == '__main__':
    # Start multiple threads under the main process; each has the same pid as the main process
    t1 = Thread(target=work)  # Create the first thread
    t2 = Thread(target=work)  # Create the second thread
    # start() arranges for the object's run() method to be invoked in a separate
    # thread of control. It must be called at most once per thread object and
    # raises a RuntimeError if called more than once on the same thread object.
    t1.start()
    t2.start()
    print('Main thread/main process pid', os.getpid())
    # Start multiple processes; each has a different pid
    p1 = Process(target=work)
    p2 = Process(target=work)
    p1.start()
    p2.start()
    print('Main thread/main process pid', os.getpid())

6. Thread life cycle

The states of a thread include: created, ready, running, blocked, and finished.

(1) Created: when the Thread object is created, it is initialized internally;

(2) Ready, running, blocked: after start() is called, the thread enters the queue of threads waiting to run. Until it obtains CPU and memory resources it is in the ready state; once it is scheduled and obtains those resources it enters the running state; if it encounters sleep() or a similar wait, it enters the blocked state;

(3) Finished: the thread terminates when its code finishes running normally or when an unhandled exception is raised.
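Python does not expose these states by name, but is_alive() gives a rough view of them. The following sketch (observe_life_cycle is a hypothetical name) records is_alive() before start(), while the thread is blocked in sleep(), and after it has finished. It also shows that a finished thread cannot be started a second time.

```python
import threading
import time

def observe_life_cycle(delay=0.3):
    t = threading.Thread(target=time.sleep, args=(delay,))
    states = [t.is_alive()]      # created but not started
    t.start()
    states.append(t.is_alive())  # started; the thread is blocked in sleep()
    t.join()
    states.append(t.is_alive())  # finished
    try:
        t.start()                # a finished thread cannot be started again
        restarted = True
    except RuntimeError:
        restarted = False
    return states, restarted

if __name__ == '__main__':
    print(observe_life_cycle())
```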

7. Custom threads

(1) Define a class that inherits Thread;

(2) Override __init__() and run();

(3) Create the thread class object;

(4) Start the thread.

import time
import threading

class MyThread(threading.Thread):
    def __init__(self, num):
        super().__init__()  # or threading.Thread.__init__(self)
        self.num = num

    def run(self):
        print('Thread name:', threading.current_thread().name, 'Parameter:', self.num, 'Start time:', time.strftime('%Y-%m-%d %H:%M:%S'))

if __name__ == '__main__':
    print('Main thread started:', time.strftime('%Y-%m-%d %H:%M:%S'))
    t1 = MyThread(1)
    t2 = MyThread(2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print('Main thread finished:', time.strftime('%Y-%m-%d %H:%M:%S'))

8. Thread shared data and GIL (Global Interpreter Lock)

A global variable is shared by every thread in the process.

The GIL can be pictured with a basketball analogy: think of a basketball court as a CPU core and a basketball game as a thread. If there is only one court, multiple games have to queue up, which resembles a simple single-core multi-threaded program. If there are several courts, multiple games can be played at the same time, which resembles a simple multi-core multi-threaded program. However, Python has a special rule: each game must be supervised by a referee, and there is only one referee. So no matter how many courts there are, only one court can be in play at a time; all other courts are idle, and all other games have to wait.
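One rough way to observe this effect is to time the same CPU-bound loop run twice in sequence and then in two threads. This is only a sketch with hypothetical names (count_down, timed_sequential, timed_threaded). On a standard CPython build with the GIL the two timings are usually similar, but the numbers depend on the machine and the Python version, so no expected output is shown.

```python
import threading
import time

def count_down(n, results, index):
    # Pure-Python loop: CPU-bound, so the thread needs the GIL to make progress
    while n > 0:
        n -= 1
    results[index] = n

def timed_sequential(n, workers=2):
    results = [None] * workers
    start = time.perf_counter()
    for i in range(workers):
        count_down(n, results, i)
    return results, time.perf_counter() - start

def timed_threaded(n, workers=2):
    results = [None] * workers
    threads = [
        threading.Thread(target=count_down, args=(n, results, i))
        for i in range(workers)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start

if __name__ == '__main__':
    print('sequential: %.3f s' % timed_sequential(2_000_000)[1])
    print('threaded:   %.3f s' % timed_threaded(2_000_000)[1])
```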

9. GIL and Lock

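The GIL allows a process to have multiple threads at the same time but guarantees that only one of them is executing Python bytecode at any moment. A Lock serves a different purpose: it protects shared data so that only one thread can modify that data at a time. The GIL does not make a compound operation such as number += 1 atomic, so shared data still needs a Lock.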

Lock class

It has two basic methods, acquire() and release().

When the state is unlocked, acquire() changes the state to locked and returns immediately. When the state is locked, acquire() blocks until another thread calls release() to change the state to unlocked; acquire() then resets it to locked and returns.

release() should only be called in the locked state; it changes the state to unlocked and returns immediately. If an attempt is made to release an unlocked lock, a RuntimeError exception is raised.
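These state changes can be checked in a single thread. In the sketch below (lock_states is a hypothetical name), the second acquire() is made with blocking=False, because a blocking call on a lock that is already held would wait forever.

```python
import threading

def lock_states():
    lock = threading.Lock()
    observed = []

    first = lock.acquire()                 # unlocked -> locked, returns True at once
    observed.append((first, lock.locked()))

    # A second blocking acquire() here would wait forever, so ask without blocking
    second = lock.acquire(blocking=False)  # already locked -> returns False
    observed.append((second, lock.locked()))

    lock.release()                         # locked -> unlocked
    observed.append(lock.locked())

    try:
        lock.release()                     # releasing an unlocked lock
        error = None
    except RuntimeError as exc:
        error = type(exc).__name__
    return observed, error

if __name__ == '__main__':
    print(lock_states())
```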

Example:

from threading import Thread
from threading import Lock

number = 0

def task(lock):
    global number
    lock.acquire()  # Acquire the lock
    for i in range(100000):
        number += 1
    lock.release()  # Release the lock

if __name__ == '__main__':
    lock = Lock()
    t1 = Thread(target=task, args=(lock,))
    t2 = Thread(target=task, args=(lock,))
    t3 = Thread(target=task, args=(lock,))
    t1.start()
    t2.start()
    t3.start()
    t1.join()
    t2.join()
    t3.join()
    print('number:', number)

10. Thread semaphores

class threading.Semaphore(value=1)

value is the initial value of the internal counter and defaults to 1. If it is less than 0, a ValueError exception is raised. A semaphore can be used to control the number of threads running concurrently.

Semaphore implementation:

s=Semaphore(?)

Internally there is a counter whose value is the number of threads that may hold the semaphore at the same time. Every call to s.acquire() decrements the counter by 1, and every call to s.release() increments it by 1. When the counter is 0, other threads that call acquire() wait.

Adding such a counter (a semaphore) to a program limits the number of threads running at any point in time, which helps prevent crashes or other exceptions caused by too many concurrent threads.
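The limiting effect can be checked by recording how many workers hold the semaphore at once. This is a minimal sketch with hypothetical names (peak_concurrency, negative_value_rejected). The extra Lock only protects the bookkeeping counters and is not part of the semaphore itself.

```python
import threading
import time

def peak_concurrency(limit=2, workers=6, delay=0.05):
    sem = threading.Semaphore(limit)
    guard = threading.Lock()   # protects the two counters below
    state = {'running': 0, 'peak': 0}

    def task():
        with sem:              # counter -1 on entry, +1 on exit
            with guard:
                state['running'] += 1
                state['peak'] = max(state['peak'], state['running'])
            time.sleep(delay)  # simulated work while holding a semaphore slot
            with guard:
                state['running'] -= 1

    threads = [threading.Thread(target=task) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state['peak']

def negative_value_rejected():
    try:
        threading.Semaphore(-1)  # an initial value below 0 is not allowed
    except ValueError:
        return True
    return False

if __name__ == '__main__':
    print('peak concurrent workers:', peak_concurrency())
    print('negative value rejected:', negative_value_rejected())
```

The peak never exceeds the limit passed to Semaphore(), however many threads are started.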

Example:

import time
import threading

s = threading.Semaphore(5)    # Create a semaphore with a counter of 5

def task():
    s.acquire()    # Decrement the counter (blocks while it is 0)
    time.sleep(2)    # Sleep for 2 seconds
    print("The task ran at", time.ctime())
    s.release()    # Increment the counter

for i in range(40):
    t1 = threading.Thread(target=task, args=())    # Create a thread
    t1.start()    # Start the thread

It is also possible to use the with statement instead of acquire() and release(), and the above code can be adapted as follows.

import time
import threading

s = threading.Semaphore(5)    # Create a semaphore with a counter of 5

def task():
    # The with statement, as when opening files, calls s.acquire() on entry
    # and s.release() on exit
    with s:
        time.sleep(2)    # Sleep for 2 seconds
        print("The task ran at", time.ctime())

for i in range(40):
    t1 = threading.Thread(target=task, args=())    # Create a thread
    t1.start()    # Start the thread

Using the with statement is recommended, because the semaphore is released even if the task raises an exception.

Summary

That's all for this post, I hope it helped you and I hope you'll check back for more from me!