Multi-threaded processing of OpenCV video streams in Python: a detailed explanation

Preface

In computer vision projects, it is often necessary to use OpenCV in Python to read a video stream from a local file or a webcam and then pass the frames to various models, such as object classification, object detection, or face recognition. With single-threaded processing there is often serious delay. If computing power is tight and model inference takes longer, the delay becomes more obvious and the video stutters. In this case, the code is usually changed from single-threaded to multi-threaded: a child thread captures video frames in real time, and the main thread copies the most recent frame from the child thread whenever it needs one.

When a single thread handles the whole video stream, a large object detection model or a complex task slows down every step, including capture. With multi-threading, video capture and object detection run in their own threads. Capture spends most of its time waiting for the camera, so it can overlap with processing instead of blocking it, which improves overall throughput and real-time performance.

In real-time video processing, especially with compute-intensive tasks such as deep learning model inference, multi-threading can bring a significant performance improvement. Because capture is separated from processing, the main thread always works on the most recent frame, which avoids the accumulated delay and stutter caused by long processing times.

1. Multi-threading

In Python, you can use the threading module to implement multithreading. The steps below show how to create and use a thread:

1. Import threading module

First, import Python's threading module, which provides the functions required for multithreading programming.

import threading

2. Create thread execution function

Define a function as the execution body of the thread. Put the code that you want the thread to execute inside this function; it runs once in each thread that uses it as its target.

def my_function(arg1, arg2):
    # Your code here
    pass

3. Create thread object

Use threading.Thread() to create a thread object, set the target function to the function you just defined, and pass in the required arguments.

my_thread = threading.Thread(target=my_function, args=(arg1, arg2)) # args holds the arguments passed to my_function

4. Start the thread

Use the start() method to start the thread.

my_thread.start()

5. Wait for the thread to complete execution

Use the join() method to wait for the thread to complete execution. The main thread blocks at this call until the child thread ends.

my_thread.join()

6. Example

Here is a simple example that demonstrates how to use multithreading:

import threading
import time

# thread execution body function
def print_numbers():
    for i in range(5):
        print(f"Child Thread: {i}")
        time.sleep(1)

# Create thread object
thread = threading.Thread(target=print_numbers)

# Start the thread
thread.start()

# Other operations in the main thread
for i in range(5):
    print(f"Main Thread: {i}")
    time.sleep(0.5)

# Wait for the child thread to finish
thread.join()

print("Main Thread exiting...")

The print_numbers() function is the execution body of the child thread and prints numbers from that thread. After the main thread starts the child thread, it carries on with its own work at the same time. Finally, the main thread calls join() to wait for the child thread to end.
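Step 3 mentions passing arguments through args, which the example above does not use. The following is a minimal sketch of that, assuming a hypothetical add_to_results function and a shared dictionary for collecting results; neither name comes from the original article.

```python
import threading

def add_to_results(name, count, results):
    """Thread body: store the sum 0 + 1 + ... + (count - 1) under `name`."""
    results[name] = sum(range(count))

results = {}  # shared dictionary; each thread writes to its own key
threads = [
    threading.Thread(target=add_to_results, args=("a", 5, results)),
    threading.Thread(target=add_to_results, args=("b", 10, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # once join() returns, that thread's result is in the dictionary

print(results["a"], results["b"])
```

A thread's target cannot return a value to the caller directly, so writing into a shared container and reading it after join() is one simple way to get results back.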

2. Video processing

Typical video processing code has two parts: reading the next available frame from the camera, and processing that frame, for example by sending the image to a YOLOv5 object detection model.

In a program without multithreading, frames are read and processed strictly in sequence: the program waits for the next frame to become available and then processes it. The time it takes to read a frame is mainly the time needed to request the frame, wait for it, and transfer it from the camera to memory. Whether it runs on the CPU or the GPU, the computation on each frame usually accounts for most of the total video processing time.

In a program with multithreading, however, reading the next frame and processing the current one do not have to happen one after the other. While one thread reads the next frame, the main thread can use the CPU or GPU to process the frame that was read last. The two tasks overlap, which reduces the total time needed to read and process the frames.
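The effect of overlapping the two tasks can be checked without a camera. The sketch below is an illustration under assumed numbers: READ_TIME and PROCESS_TIME are made-up durations, and read_frame and process_frame are hypothetical stand-ins that only sleep.

```python
import threading
import time

READ_TIME = 0.02     # assumed time to fetch one frame, in seconds
PROCESS_TIME = 0.02  # assumed time to process one frame, in seconds
NUM_FRAMES = 10

def read_frame(i):
    time.sleep(READ_TIME)     # stands in for waiting on the camera
    return i

def process_frame(frame):
    time.sleep(PROCESS_TIME)  # stands in for model inference

def run_sequential():
    start = time.perf_counter()
    for i in range(NUM_FRAMES):
        process_frame(read_frame(i))
    return time.perf_counter() - start

def run_overlapped():
    start = time.perf_counter()
    frame = read_frame(0)             # nothing to overlap with yet
    for i in range(1, NUM_FRAMES):
        next_frame = []
        reader = threading.Thread(
            target=lambda: next_frame.append(read_frame(i)))
        reader.start()                # read frame i in the background...
        process_frame(frame)          # ...while processing frame i - 1
        reader.join()
        frame = next_frame[0]
    process_frame(frame)              # process the final frame
    return time.perf_counter() - start

sequential = run_sequential()
overlapped = run_overlapped()
print(f"sequential: {sequential:.2f} s, overlapped: {overlapped:.2f} s")
```

With these assumed durations the sequential loop costs roughly the sum of read and process time per frame, while the overlapped loop costs roughly the larger of the two, so the saving is greatest when both take about the same time.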

1. Single-threaded video processing

# importing required libraries
import cv2
import time

# opening video capture stream
vcap = cv2.VideoCapture(0)
if vcap.isOpened() is False:
    print("[Exiting]: Error accessing webcam stream.")
    exit(0)
fps_input_stream = int(vcap.get(cv2.CAP_PROP_FPS))  # CAP_PROP_FPS has the numeric value 5
print("FPS of webcam hardware/input stream: {}".format(fps_input_stream))
grabbed, frame = vcap.read()  # reading single frame for initialization/hardware warm-up

# processing frames in input stream
num_frames_processed = 0
start = time.time()
while True:
    grabbed, frame = vcap.read()
    if grabbed is False:
        print('[Exiting] No more frames to read')
        break

    # adding a delay for simulating time taken for processing a frame
    delay = 0.03  # delay value in seconds, so delay=1 is equivalent to 1 second
    time.sleep(delay)
    num_frames_processed += 1
    cv2.imshow('frame', frame)
    key = cv2.waitKey(1)
    if key == ord('q'):
        break
end = time.time()

# printing time elapsed and fps
elapsed = end - start
fps = num_frames_processed / elapsed
print("FPS: {} , Elapsed Time: {} , Frames Processed: {}".format(fps, elapsed, num_frames_processed))

# releasing input stream, closing all windows
vcap.release()
cv2.destroyAllWindows()

2. Video multi-threading

# importing required libraries
import cv2
import time
from threading import Thread  # library for implementing multi-threaded processing

# defining a helper class for implementing multi-threaded processing
class WebcamStream:
    def __init__(self, stream_id=0):
        self.stream_id = stream_id  # default is 0 for primary camera

        # opening video capture stream
        self.vcap = cv2.VideoCapture(self.stream_id)
        if self.vcap.isOpened() is False:
            print("[Exiting]: Error accessing webcam stream.")
            exit(0)
        fps_input_stream = int(self.vcap.get(cv2.CAP_PROP_FPS))
        print("FPS of webcam hardware/input stream: {}".format(fps_input_stream))

        # reading a single frame from vcap stream for initializing
        self.grabbed, self.frame = self.vcap.read()
        if self.grabbed is False:
            print('[Exiting] No more frames to read')
            exit(0)

        # self.stopped is set to False while frames are being read from self.vcap stream
        self.stopped = True

        # reference to the thread for reading next available frame from input stream
        self.t = Thread(target=self.update, args=())
        self.t.daemon = True  # daemon threads do not keep the program alive when the main thread exits

    # method for starting the thread for grabbing next available frame in input stream
    def start(self):
        self.stopped = False
        self.t.start()

    # method for reading next frame
    def update(self):
        while True:
            if self.stopped is True:
                break
            grabbed, frame = self.vcap.read()
            if grabbed is False:
                print('[Exiting] No more frames to read')
                self.stopped = True
                break
            # only replace the stored frame after a successful read
            self.grabbed, self.frame = grabbed, frame
        self.vcap.release()

    # method for returning latest read frame
    def read(self):
        return self.frame

    # method called to stop reading frames
    def stop(self):
        self.stopped = True

# initializing and starting multi-threaded webcam capture input stream
webcam_stream = WebcamStream(stream_id=0)  # stream_id = 0 is for primary camera
webcam_stream.start()

# processing frames in input stream
num_frames_processed = 0
start = time.time()
while True:
    if webcam_stream.stopped is True:
        break
    else:
        frame = webcam_stream.read()

    # adding a delay for simulating time taken for processing a frame
    delay = 0.03  # delay value in seconds, so delay=1 is equivalent to 1 second
    time.sleep(delay)
    num_frames_processed += 1
    cv2.imshow('frame', frame)
    key = cv2.waitKey(1)
    if key == ord('q'):
        break
end = time.time()
webcam_stream.stop()  # stop the webcam stream

# printing time elapsed and fps
elapsed = end - start
fps = num_frames_processed / elapsed
print("FPS: {} , Elapsed Time: {} , Frames Processed: {}".format(fps, elapsed, num_frames_processed))

# closing all windows
cv2.destroyAllWindows()

The above code creates a WebcamStream class that contains the logic for reading camera frames in a separate thread. The main loop still processes the frames one after another, but the thread that reads them runs in the background.
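The class shares the latest frame between two threads through a plain attribute. If you prefer to make that hand-off explicit, one option is to guard it with a lock. The following is a minimal sketch, assuming a hypothetical LatestFrame holder and a fake capture function that produces strings instead of real images; it is not part of the original code.

```python
import threading
import time

class LatestFrame:
    """Hypothetical holder for the most recent frame, guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None
        self._count = 0

    def put(self, frame):
        # called by the capture thread for every new frame
        with self._lock:
            self._frame = frame
            self._count += 1

    def get(self):
        # called by the main thread; returns the frame and how many were stored
        # (with a real NumPy frame, one might return self._frame.copy() here)
        with self._lock:
            return self._frame, self._count

def fake_capture(holder, num_frames):
    # stands in for a loop around a real capture call
    for i in range(num_frames):
        holder.put(f"frame-{i}")
        time.sleep(0.001)

holder = LatestFrame()
t = threading.Thread(target=fake_capture, args=(holder, 50))
t.start()
t.join()
frame, count = holder.get()
print(frame, count)
```

The counter also lets the main thread notice when no new frame has arrived since its last read, so it can skip processing the same frame twice.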

However, although the WebcamStream class above improves speed, there is still room for improvement in the following areas:

Logic for processing multiple frames: The main loop applies a fixed delay for every frame, which does not truly simulate the time spent on frame processing. Record a timestamp for each frame and calculate the actual processing time after the frame has been processed.
Frame processing under multithreading: Although the video stream is read in a separate thread, the main loop still runs sequentially and waits for each frame to be processed. In a multithreaded program, it may be worth processing frames in a separate thread as well.
Memory and resource management: Make sure that all resources are released when the program exits. In a multithreaded program in particular, take care that the threads exit safely.
Code structure and comments: For better readability and maintenance, add comments that explain the role of each function and method, as well as the intent of each code block.
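The first point, measuring the actual processing time instead of assuming a fixed delay, could look roughly like the sketch below. The process_frame function is a hypothetical stand-in for real inference and simply sleeps, and the frames are plain integers so that the example runs without a camera.

```python
import time

def process_frame(frame):
    """Hypothetical stand-in for model inference on one frame."""
    time.sleep(0.01)
    return frame

def timed_processing(frames):
    """Process each frame and return the measured time spent on each one."""
    durations = []
    for frame in frames:
        t0 = time.perf_counter()          # timestamp before processing
        process_frame(frame)
        durations.append(time.perf_counter() - t0)
    return durations

durations = timed_processing(range(5))
avg = sum(durations) / len(durations)
print(f"frames: {len(durations)}, "
      f"avg processing time: {avg * 1000:.1f} ms, "
      f"processing-bound FPS: {1 / avg:.1f}")
```

time.perf_counter() is used here rather than time.time() because it is intended for measuring short intervals.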

3. Multi-threaded code optimization

Remove the fixed delay: The code applies a fixed delay when processing each frame. Consider using the actual frame processing time instead, which can be obtained by recording a timestamp for each frame.

Process video frames in parallel: The main thread currently processes each frame in sequence. In a multithreaded program, several threads could process video frames in parallel to speed things up.

Release resources: When the program ends, make sure that all resources are released. This includes closing the video stream and stopping the threads at the appropriate time.

import cv2 
import time 
from threading import Thread

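class WebcamStream:
    def __init__(self, stream_id=0):
        self.stream_id = stream_id
        # open the video capture stream
        self.vcap = cv2.VideoCapture(self.stream_id)
        if not self.vcap.isOpened():
            print("[Exiting]: Error accessing webcam stream.")
            exit(0)
        self.fps_input_stream = int(self.vcap.get(cv2.CAP_PROP_FPS))
        print("FPS of webcam hardware/input stream: {}".format(self.fps_input_stream))
        # read one frame so that read() always has something to return
        self.grabbed, self.frame = self.vcap.read()
        if not self.grabbed:
            print('[Exiting] No more frames to read')
            exit(0)
        self.stopped = False
        # start the background thread that keeps self.frame up to date
        self.t = Thread(target=self.update, args=())
        self.t.daemon = True
        self.t.start()

    def update(self):
        # runs in the background thread until stop() is called or reading fails
        while not self.stopped:
            grabbed, frame = self.vcap.read()
            if not grabbed:
                print('[Exiting] No more frames to read')
                self.stopped = True
                break
            self.frame = frame

    def read(self):
        # return the most recently captured frame
        return self.frame

    def stop(self):
        # signal the thread to finish, wait for it, then release the camera
        self.stopped = True
        self.t.join()
        self.vcap.release()

webcam_stream = WebcamStream(stream_id=0)
num_frames_processed = 0
start = time.time()
while True:
    frame = webcam_stream.read()
    if webcam_stream.stopped:
        break
    delay = 0.03  # simulated processing time in seconds
    time.sleep(delay)
    num_frames_processed += 1
    cv2.imshow('frame', frame)
    key = cv2.waitKey(1)
    if key == ord('q'):
        break
end = time.time()
webcam_stream.stop()
elapsed = end - start
fps = num_frames_processed / elapsed
print("FPS: {} , Elapsed Time: {} , Frames Processed: {}".format(fps, elapsed, num_frames_processed))
cv2.destroyAllWindows()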
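The listing above still processes frames in the main thread. The suggestions about processing frames in a separate thread and shutting threads down safely could be sketched as follows. This is an illustration only: the capture and worker functions are hypothetical, the frames are plain integers instead of images, and a bounded queue.Queue replaces the single shared frame attribute.

```python
import queue
import threading

def capture(frames_out, stop_event, num_frames):
    """Producer: push fake frames; a real version would read from the camera."""
    for i in range(num_frames):
        if stop_event.is_set():
            break
        frames_out.put(i)
    frames_out.put(None)  # sentinel: tells the worker no more frames are coming

def worker(frames_in, results):
    """Consumer: process frames until the sentinel arrives."""
    while True:
        frame = frames_in.get()
        if frame is None:
            break
        results.append(frame * 2)  # stands in for real frame processing

frames = queue.Queue(maxsize=4)   # bounded, so capture cannot run far ahead
stop_event = threading.Event()    # set this to ask the capture thread to stop early
results = []
threads = [
    threading.Thread(target=capture, args=(frames, stop_event, 20)),
    threading.Thread(target=worker, args=(frames, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # both threads have exited cleanly once join() returns

print(len(results), results[:3])
```

Note the design difference: a bounded queue makes the capture thread wait when processing falls behind, so no frame is skipped, whereas the WebcamStream class always hands out the newest frame and silently drops older ones. Which behaviour is preferable depends on whether every frame matters or only the latest.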

That concludes this detailed explanation of multi-threaded processing of OpenCV video streams in Python. For more on video stream processing with Python and OpenCV, see my other related articles.