1. Customized processes
Create a custom process by subclassing Process and overriding its run() method.
    from multiprocessing import Process
    import time
    import os


    class MyProcess(Process):
        def __init__(self, name):
            # __init__ is overridden here to accept a new parameter.
            # Process.__init__(self) cannot be omitted; otherwise an error is raised:
            # AttributeError: 'XXXX' object has no attribute '_closed'
            Process.__init__(self)
            self.name = name

        def run(self):
            print("subprocess(%s-%s) started" % (self.name, os.getpid()))
            time.sleep(3)
            print("subprocess(%s-%s) finished" % (self.name, os.getpid()))


    if __name__ == '__main__':
        print("Parent process started")
        p = MyProcess("Ail")
        # start() automatically calls MyProcess's run() method
        p.start()
        p.join()
        print("Parent process terminated.")

# Output results
Parent process started
subprocess(Ail-38512) started
subprocess(Ail-38512) finished
Parent process terminated.
2. Processes and Threads
Multiprocessing is suitable for CPU-intensive operations (work dominated by computation, such as scientific computing and heavy floating-point calculations);
Multi-threading is suitable for I/O-intensive operations (work dominated by reading and writing data, such as crawlers, file uploads, and downloads).
Threads run concurrently, while processes can run in parallel. Processes are independent of each other, and the process is the smallest unit to which the system allocates resources; all threads in the same process share that process's resources.
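To illustrate why threads suit I/O-intensive work, here is a minimal sketch. It assumes time.sleep() is a fair stand-in for a blocking I/O call; fake_io, run_sequential, and run_threaded are hypothetical names made up for this example, and the timings depend on your machine.

```python
import threading
import time


def fake_io(delay):
    """Stand-in for a blocking I/O call (hypothetical helper)."""
    time.sleep(delay)


def run_sequential(n, delay):
    """Run n fake I/O calls one after another and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(n):
        fake_io(delay)
    return time.perf_counter() - start


def run_threaded(n, delay):
    """Run n fake I/O calls in n threads and return the elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("sequential: %.2fs" % run_sequential(4, 0.2))
    print("threaded:   %.2fs" % run_threaded(4, 0.2))
```

Because the threads spend their time waiting rather than computing, the waits overlap and the threaded run should finish in roughly the time of a single call.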
Process: A program or piece of code that is running is a process; code that is not running is just a program. A process is the smallest unit of resource allocation in the system. Each process has its own memory space, so processes do not share data with each other, and creating and switching between them is relatively expensive.
A process is a dynamic execution of a program. Each process has its own address space, memory, data stack, and other auxiliary data used to track execution. The operating system is responsible for the execution of all processes on it, and the operating system allocates execution time appropriately for these processes.
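The separate address spaces described above can be observed with a small experiment. This is only a sketch: counter, child_increment, and run_demo are hypothetical names, and the Queue is used here just to report what the child saw. It is meant to be run as a script.

```python
import multiprocessing

counter = 0  # module-level variable; each process works on its own copy


def child_increment(queue):
    """Runs in the child process: changes the child's copy and reports it."""
    global counter
    counter += 1
    queue.put(counter)


def run_demo():
    """Start one child and return (value in parent, value seen in child)."""
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=child_increment, args=(queue,))
    p.start()
    child_value = queue.get(timeout=10)  # value of counter inside the child
    p.join()
    return counter, child_value  # the parent's copy was never modified


if __name__ == "__main__":
    parent_value, child_value = run_demo()
    print("parent sees:", parent_value)
    print("child saw:", child_value)
```

The child's increment is not visible in the parent, which is why processes need explicit channels such as a Queue to exchange data.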
Thread: the smallest unit of scheduled execution, also called an execution path. A thread cannot exist on its own; it depends on a process. Every process has at least one thread, called the main thread. Multiple threads in a process share its memory (shared data and global variables), which can improve the efficiency of the program.
A thread is the smallest unit of computing scheduling that an operating system can perform, and it is contained within a process, the actual unit of operation within the process. A thread is a single sequential flow of control within a process. A process can have multiple threads concurrently, each performing a different task in parallel. A thread is an execution context, i.e., a sequence of instructions that a CPU needs to execute.
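By contrast with processes, threads in one process see the same objects. A minimal sketch (results, worker, and run_demo are hypothetical names chosen for illustration):

```python
import threading

results = []  # one list object, visible to every thread in this process


def worker(tag):
    # In CPython a single list.append call is atomic, so this sketch uses no lock
    results.append(tag)


def run_demo():
    """Start three threads that all write to the same list."""
    results.clear()
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_demo())
```

All three values end up in the one list, with no queue or pipe needed. Anything more complex than a single append should be protected with a lock.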
Main thread: the first thread created when a process starts; it is the thread that runs the program's main function.
Coroutine: a user-space lightweight thread whose scheduling is controlled by the user program. Each coroutine has its own register context and stack, and switching between coroutines involves almost no kernel overhead, which makes switching flexible.
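In Python, this kind of user-controlled switching is available through asyncio. The sketch below is an illustration under that assumption, not something taken from the original post; worker and main are hypothetical names. Each await hands control back to the event loop, so the switch happens inside the program rather than in the kernel.

```python
import asyncio


async def worker(name, log):
    for step in range(2):
        log.append((name, step))
        # Explicitly give control back to the event loop (cooperative switch)
        await asyncio.sleep(0)


async def main():
    log = []
    # Both coroutines run in the same thread and take turns at each await
    await asyncio.gather(worker("A", log), worker("B", log))
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Printing the log shows how the steps of the two workers are ordered on your interpreter.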
The relationship between processes and threads
3. Multi-threading
The operating system schedules threads by assigning time slices (CPU run time) to them. When a thread's time slice is used up, the CPU quickly switches to the next thread; the slices are so short and the switching so fast that the user does not notice it. Multiple threads are thus executed by the CPU in turn according to their allocated time slices. Most computers today have multi-core CPUs, so the operating system can schedule multiple threads onto multiple cores in parallel, which greatly improves execution speed and CPU utilization. Most mainstream programming languages support multithreading well. In CPython, however, the GIL (Global Interpreter Lock) lets only one thread execute Python code at a time, so threads cannot run Python code truly in parallel.
Threads in memory
4. Thread class methods
(1) start()
-- Starts execution of the thread.
(2) run()
-- Defines what the thread does (developers can override it in subclasses). The standard run() method calls the callable object passed to the constructor as the target argument (if any), with positional and keyword arguments taken from the args and kwargs arguments, respectively.
(3) join(timeout=None)
-- Blocks the calling thread until the thread whose join() is called terminates, or until the optional timeout (in seconds) elapses. Because join() always returns None, call is_alive() after join() to tell whether a timeout occurred: if the thread is still alive, the join() call timed out. A thread can be joined many times. join() raises RuntimeError if a thread tries to join itself, because that would cause a deadlock. The same exception is raised if you try to join() a thread that has not yet been started.
(4) is_alive()
-- Returns a Boolean indicating whether the thread is alive. It returns True from just before the run() method starts until just after the run() method finishes.
(5) threading.current_thread()
-- Returns the Thread object corresponding to the caller's thread of control. For example, the name of the current thread is current_thread().name.
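The join() and is_alive() behaviour described above can be checked with a short sketch. The Event used to keep the thread running and the name run_demo are assumptions made for this example.

```python
import threading


def run_demo():
    stop = threading.Event()
    t = threading.Thread(target=stop.wait)  # runs until stop is set

    # Joining a thread that has not been started raises RuntimeError
    try:
        t.join()
        join_before_start_failed = False
    except RuntimeError:
        join_before_start_failed = True

    t.start()
    t.join(timeout=0.1)                  # returns None whether or not it timed out
    alive_after_timeout = t.is_alive()   # still alive, so the join() timed out

    stop.set()                           # let the thread finish
    t.join()                             # now blocks until run() has ended
    alive_after_join = t.is_alive()
    return join_before_start_failed, alive_after_timeout, alive_after_join


if __name__ == "__main__":
    print(run_demo())
```

The only way to tell a timed-out join() from a completed one is the is_alive() call that follows it.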
5. A small multi-threading and multi-processing example
    from threading import Thread
    from multiprocessing import Process
    import os


    def work():
        print('hello,', os.getpid())


    if __name__ == '__main__':
        # Start multiple threads in the main process; each has the same pid as the main process
        t1 = Thread(target=work)  # create the first thread
        t2 = Thread(target=work)  # create the second thread
        t1.start()
        # start() must be called at most once per thread object. It arranges for the
        # object's run() method to be invoked in a separate thread of control, and it
        # raises a RuntimeError if called more than once on the same thread object.
        t2.start()
        print('Main thread/main process pid', os.getpid())

        # Start multiple processes; each has a different pid
        p1 = Process(target=work)
        p2 = Process(target=work)
        p1.start()
        p2.start()
        print('Main thread/main process pid', os.getpid())
6. Thread life cycle
The states of a thread include: created, ready, running, blocked, and finished.
(1) Creating the Thread object initializes the thread internally (the created state);
(2) After start() is called, the thread joins the queue of threads waiting to run; until it obtains CPU and memory resources it is in the ready state. Once scheduling gives it those resources it enters the running state, and if it calls sleep() it enters the blocked state;
(3) The thread finishes when its code completes normally or when an exception is raised.
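Python does not expose these states by name, but is_alive() lets you see the created, running, and finished stages from the outside. A minimal sketch, assuming two Event objects to hold the thread in place while it is inspected (observe_lifecycle and body are hypothetical names):

```python
import threading


def observe_lifecycle():
    started = threading.Event()
    release = threading.Event()
    states = []

    def body():
        started.set()    # signal that run() is executing
        release.wait()   # blocked until the main thread lets us finish

    t = threading.Thread(target=body)
    states.append(("created", t.is_alive()))   # object exists, thread not started

    t.start()
    started.wait()
    states.append(("running", t.is_alive()))   # run() is in progress

    release.set()
    t.join()
    states.append(("finished", t.is_alive()))  # run() has returned
    return states


if __name__ == "__main__":
    for label, alive in observe_lifecycle():
        print(label, alive)
```

Note that is_alive() cannot distinguish ready, running, and blocked; it only reports whether run() is between its start and its end.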
7. Customized threads
(1) Define a class that inherits from Thread;
(2) Override __init__ and run();
(3) Create an object of the thread class;
(4) Start the thread.
    import time
    import threading


    class MyThread(threading.Thread):
        def __init__(self, num):
            super().__init__()  # or threading.Thread.__init__(self)
            self.num = num

        def run(self):
            print('Thread name:', threading.current_thread().name,
                  'Parameters:', self.num,
                  'Start time:', time.strftime('%Y-%m-%d %H:%M:%S'))


    if __name__ == '__main__':
        print('Main thread started:', time.strftime('%Y-%m-%d %H:%M:%S'))
        t1 = MyThread(1)
        t2 = MyThread(2)
        t1.start()
        t2.start()
        t1.join()
        t2.join()
        print('End of main thread:', time.strftime('%Y-%m-%d %H:%M:%S'))
8. Thread shared data and GIL (Global Interpreter Lock)
Global variables are shared by every thread in the process.
GIL: a basketball analogy helps here. Think of a basketball court as a CPU core and a basketball game as a thread. If there is only one court, the games have to queue up, which is like a simple multi-threaded program on a single core. If there are several courts, several games can be played at the same time, which is like a simple multi-threaded program on multiple cores. Python, however, has a special rule: every game must be supervised by a referee, and there is only one referee. So no matter how many courts you have, only one court can host a game at any moment; all the other courts sit idle and all the other games have to wait.
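One way to see the "single referee" effect is to time a CPU-bound loop run sequentially and then in threads. This is a rough sketch, not a benchmark: count_down, time_sequential, and time_threaded are hypothetical names, and the numbers vary by machine. On a standard CPython build with the GIL the two timings are usually close; other interpreters or free-threaded builds may behave differently.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop."""
    while n > 0:
        n -= 1


def time_sequential(n, workers):
    """Run the loop `workers` times in the calling thread."""
    start = time.perf_counter()
    for _ in range(workers):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, workers):
    """Run the loop once in each of `workers` threads."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 1_000_000
    print("sequential: %.2fs" % time_sequential(n, 4))
    print("4 threads:  %.2fs" % time_threaded(n, 4))
```

If the threaded run is not noticeably faster on your machine, that is the referee at work; for CPU-bound work, multiprocessing is the usual alternative.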
9. GIL and Lock
The GIL guarantees that, although a process may contain multiple threads, only one of them is executing at any moment. A Lock serves a different purpose: it protects shared data so that only one thread can modify that data at a time.
class threading.Lock
It has two basic methods, acquire() and release().
When the state is unlocked, acquire() changes the state to locked and returns immediately. When the state is locked, acquire() blocks until another thread calls release() to change it to unlocked; acquire() then sets it back to locked and returns.
release() should be called only in the locked state; it changes the state to unlocked and returns immediately. If an attempt is made to release an unlocked lock, a RuntimeError is raised.
Example:
    from threading import Thread
    from threading import Lock

    number = 0


    def task(lock):
        global number
        lock.acquire()  # hold the lock
        for i in range(100000):
            number += 1
        lock.release()  # release the lock


    if __name__ == '__main__':
        lock = Lock()
        t1 = Thread(target=task, args=(lock,))
        t2 = Thread(target=task, args=(lock,))
        t3 = Thread(target=task, args=(lock,))
        t1.start()
        t2.start()
        t3.start()
        t1.join()
        t2.join()
        t3.join()
        print('number:', number)
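The acquire() and release() state changes described in this section can be checked from a single thread. The sketch below is only an illustration (run_demo and observed are hypothetical names). It uses acquire(blocking=False) so that the second acquire returns instead of waiting forever, and it ends with the with form, which acquires on entry and releases on exit.

```python
import threading


def run_demo():
    lock = threading.Lock()
    observed = []

    observed.append(lock.acquire())                # unlocked -> locked, returns True
    observed.append(lock.locked())                 # the lock is now held
    observed.append(lock.acquire(blocking=False))  # already held: returns False, no wait
    lock.release()                                 # locked -> unlocked
    observed.append(lock.locked())

    try:
        lock.release()                             # releasing an unlocked lock
        observed.append("no error")
    except RuntimeError:
        observed.append("RuntimeError")

    with lock:                                     # acquire on entry, release on exit
        observed.append(lock.locked())
    observed.append(lock.locked())
    return observed


if __name__ == "__main__":
    print(run_demo())
```

The with form is usually the safer choice, because the lock is released even if the protected code raises an exception.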
10. Thread semaphores
class threading.Semaphore(value=1)
value is the initial value of the internal counter and defaults to 1; a value less than 0 raises a ValueError. A semaphore can be used to limit the number of threads running concurrently.
How a semaphore works:
s = Semaphore(?)
Internally the semaphore keeps a counter whose value is the number of threads that may hold it at the same time. Each call to acquire() decrements the counter by 1, and each call to release() increments it by 1. When the counter is 0, other threads that call acquire() wait.
Adding a semaphore to a program limits the number of threads running at any one time, which helps prevent crashes or other failures caused by too many concurrent threads.
Example:
    import time
    import threading

    s = threading.Semaphore(5)  # add a counter


    def task():
        s.acquire()  # take one slot from the counter
        time.sleep(2)  # sleep for 2 seconds
        print("The task run at ", time.ctime())
        s.release()  # give the slot back


    for i in range(40):
        t1 = threading.Thread(target=task, args=())  # create a thread
        t1.start()  # start the thread
You can also use a with statement instead of calling acquire() and release() explicitly; the code above can be rewritten as follows.
    import time
    import threading

    s = threading.Semaphore(5)  # add a counter


    def task():
        with s:  # similar to using with when opening a file
            # s.acquire()  # no longer needed: with acquires the slot
            time.sleep(2)  # sleep for 2 seconds
            print("The task run at ", time.ctime())
            # s.release()  # no longer needed: with releases the slot


    for i in range(40):
        t1 = threading.Thread(target=task, args=())  # create a thread
        t1.start()  # start the thread
Using the with statement is recommended.
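To check that a semaphore really caps concurrency, you can record how many threads are inside the guarded block at once. This is a sketch under stated assumptions: run_demo, active, and peak are hypothetical names, a separate Lock protects the two counters, and time.sleep() stands in for real work.

```python
import threading
import time


def run_demo(limit=3, workers=10):
    sem = threading.Semaphore(limit)
    guard = threading.Lock()  # protects the two counters below
    active = 0
    peak = 0

    def task():
        nonlocal active, peak
        with sem:  # at most `limit` threads get past this line at once
            with guard:
                active += 1
                peak = max(peak, active)
            time.sleep(0.05)  # simulated work
            with guard:
                active -= 1

    threads = [threading.Thread(target=task) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


if __name__ == "__main__":
    print("peak concurrency:", run_demo())
```

The reported peak should never exceed the limit passed to Semaphore, however many threads are started.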
Summary
That's all for this post, I hope it helped you and I hope you'll check back for more from me!