Python multithreading principles and usage: analysis with examples

This article analyzes the principles and usage of Python multithreading with examples, shared here for your reference. The details are as follows:

Let's start with an example of I/O-intensive threading: a crawler. The following code downloads images with 4 threads and writes them to files.

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import re
import urllib
import threading
import Queue
import timeit
def getHtml(url):
  html_page = urllib.urlopen(url).read()
  return html_page
# Extract URLs of images from the web page
def getUrl(html):
  pattern = r'src="(http://img.*?)"' # Regular expression
  imgre = re.compile(pattern)
  imglist = re.findall(imgre, html) # re.findall(pattern, string) returns all matches in string as a list
  return imglist
class getImg(threading.Thread):
  def __init__(self, queue, thread_name=0): # Threads share one queue
    threading.Thread.__init__(self)
    self.queue = queue
    self.thread_name = thread_name
    self.start() # Start the thread
  # Use the queue for inter-thread communication
  def run(self):
    global count
    while True:
      imgurl = self.queue.get() # get() removes and returns an item from the head of the queue
      urllib.urlretrieve(imgurl, 'E:\mnt\girls\%s.jpg' % count)
      count += 1
      self.queue.task_done() # task_done() tells the queue the retrieved item has been fully processed, so the number of unfinished tasks drops
      if self.queue.empty():
        break
imglist = []
def main():
  global imglist
  url = "/favorite/beauty/" # The address of the page to be crawled
  html = getHtml(url)
  imglist = getUrl(html)
def main_1():
  global count
  threads = []
  count = 0
  queue = Queue.Queue()
  # Add all tasks to the queue
  for img in imglist:
    queue.put(img)
  # Crawl the pictures with multiple threads
  for i in range(4):
    thread = getImg(queue, i)
    threads.append(thread)
  # Block until every thread has finished
  for thread in threads:
    thread.join()
if __name__ == '__main__':
  main()
  t = timeit.Timer(main_1)
  print t.timeit(1)

The execution time for 4 threads is: 0.421320716723 seconds

Now modify main_1 to use a single thread:

def main_1():
  global count
  threads = []
  count = 0
  queue = Queue.Queue()
  # Add all tasks to the queue
  for img in imglist:
    queue.put(img)
  # Crawl the pictures with a single thread
  for i in range(1):
    thread = getImg(queue, i)
    threads.append(thread)
  # Block until the thread has finished
  for thread in threads:
    thread.join()

The execution time for a single thread is: 1.35626623274 seconds
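
As an aside: the worker in the crawler above exits when queue.empty() returns True, which can race when several threads pull from the same queue (one thread may take the last item while another is already blocked in get()). A more robust pattern pairs every get() with task_done() and waits with queue.join(); the sketch below is my own illustration of that pattern, not part of the original article, and the worker/run names and the None sentinel are assumptions:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import threading
import Queue
def worker(queue):
  while True:
    item = queue.get()   # blocks until an item is available
    if item is None:     # a sentinel tells this worker to exit
      queue.task_done()
      break
    # ... process the item here, e.g. download one image URL ...
    queue.task_done()    # mark the item as fully processed
def run(items, num_threads=4):
  queue = Queue.Queue()
  for item in items:
    queue.put(item)
  threads = []
  for _ in range(num_threads):
    t = threading.Thread(target=worker, args=(queue,))
    t.start()
    threads.append(t)
  queue.join()           # returns only after every queued item has been task_done()'d
  for _ in range(num_threads):
    queue.put(None)      # one sentinel per worker
  for t in threads:
    t.join()
if __name__ == '__main__':
  run(['url-%d' % i for i in range(10)])

Because queue.join() only returns once every real item has been marked done, the sentinels can be added safely afterwards to shut the workers down.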

Now take a look at another example:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import threading
import timeit
def countdown(n):
  while n > 0:
    n -= 1
def task1():
  COUNT = 100000000
  thread1 = threading.Thread(target=countdown, args=(COUNT,))
  thread1.start()
  thread1.join()
def task2():
  COUNT = 100000000
  thread1 = threading.Thread(target=countdown, args=(COUNT // 2,))
  thread2 = threading.Thread(target=countdown, args=(COUNT // 2,))
  thread1.start()
  thread2.start()
  thread1.join()
  thread2.join()
if __name__ == '__main__':
  t1 = timeit.Timer(task1)
  print "countdown in one thread ", t1.timeit(1)
  t2 = timeit.Timer(task2)
  print "countdown in two thread ", t2.timeit(1)

task1 uses a single thread and task2 uses two threads. The result of running this on my 4-core machine:

countdown in one thread  3.59939150155

countdown in two thread  9.87704289712

Surprisingly, the two-thread version is more than twice as slow as the single-thread version. Why? Because countdown is a CPU-intensive (computational) task.

I/O-intensive tasks: when a thread performs an I/O operation it releases the GIL, another thread acquires it, and when that thread in turn blocks on I/O it releases the GIL again, and so on.
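
To make this concrete, here is a minimal sketch (my own illustration, not from the original article) that stands in for blocking I/O with time.sleep(), which also releases the GIL while it waits. With 4 threads the waits overlap, so the threaded run finishes in roughly the time of a single wait instead of four:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import threading
import time
import timeit
def fake_io():
  time.sleep(0.5)   # sleeping releases the GIL, just like a blocking read would
def sequential():
  for _ in range(4):
    fake_io()       # four 0.5s "I/O" waits back to back: about 2 seconds
def threaded():
  threads = [threading.Thread(target=fake_io) for _ in range(4)]
  for t in threads:
    t.start()
  for t in threads:
    t.join()        # the four waits overlap: about 0.5 seconds in total
if __name__ == '__main__':
  print "sequential I/O ", timeit.Timer(sequential).timeit(1)
  print "threaded I/O   ", timeit.Timer(threaded).timeit(1)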

CPU-intensive tasks: multithreading on a multi-core machine can be even worse than on a single core. On a single core, whenever the GIL is released, whichever thread wakes up acquires it and runs, so execution proceeds essentially without gaps (single-core multithreading is, in essence, sequential execution). On multiple cores, when the thread on CPU 0 releases the GIL, threads on the other CPUs compete for it, but the GIL is often immediately re-acquired by a thread on CPU 0 (CPU 0 may be running more than one thread). The threads woken up on the other cores then find the GIL already taken and must wait until the next switch interval to be scheduled again, which causes thread thrashing and makes efficiency even lower.
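
One way to see this on your own machine is to split the same total countdown work over 1, 2 and 4 threads and time each run. The snippet below is a small extension of the countdown example above (not part of the original article; run_in_threads is an assumed helper name); for this CPU-bound loop, adding threads gives no speed-up and on a multi-core machine typically makes it slower:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import threading
import timeit
COUNT = 100000000
def countdown(n):
  while n > 0:
    n -= 1
def run_in_threads(num_threads):
  # Split the same total amount of work across num_threads threads
  threads = [threading.Thread(target=countdown, args=(COUNT // num_threads,))
             for _ in range(num_threads)]
  for t in threads:
    t.start()
  for t in threads:
    t.join()
if __name__ == '__main__':
  for n in (1, 2, 4):
    print "countdown in %d thread(s) " % n, timeit.Timer(lambda: run_in_threads(n)).timeit(1)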

Readers interested in more Python-related content can check out this site's topics: "Summary of Python process and thread manipulation techniques", "Python Data Structures and Algorithms Tutorial", "Summary of Python function usage tips", "Summary of Python string manipulation techniques", "Python introductory and advanced classic tutorials", "Python + MySQL Database Programming Tutorial for Beginners", and "Summary of common database manipulation techniques in Python".

I hope this article helps you with your Python programming.