SoFunction
Updated on 2024-11-19

Writing a CAPTCHA image annotation GUI program in Python (with source code)

To do CAPTCHA recognition, whether with traditional OCR techniques, statistical machine learning, or deep neural networks, you first need to collect a large number of CAPTCHA images from the web and turn them into a labeled training dataset.

CAPTCHA images can be downloaded in bulk with Python, but after downloading they still need to be labeled. By convention, the filename of a labeled CAPTCHA image is the actual string shown in the image.

Without a tool, labeling a CAPTCHA image in this way takes the following steps:

1. Open the folder that contains the images;
2. Select an image;
3. Right-click it and choose Rename;
4. Type the correct string;
5. Save.

In Mr. State's personal experience, labeling a single CAPTCHA this way takes about 10 to 20 seconds, and much of that time is wasted on the repetitive right-click and rename operation. So, using PyQt5, the Python binding for Qt, I wrote a small tool that makes labeling CAPTCHA images easier and saves time.

The finished program in action is shown in the following animation:

Let's find out how to write this CAPTCHA image data labeling program.

I. Building the graphical interface

The interface contains an image display control, a text input control, and four buttons, arranged with three layouts. The central widget of the window is a QWidget() whose layout is a grid layout, QGridLayout(). It holds three controls: an image display container QWidget(), a text input QLineEdit(), and a button group QWidget() that holds the four buttons.

The image display container uses a horizontal layout, QHBoxLayout(), and contains a QLabel() that serves as a placeholder for the image. The button group uses a vertical layout, QVBoxLayout(), and contains the four QPushButton() controls. The code is shown below:

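import os
import sys
import traceback

from PyQt5 import QtCore, QtGui, QtWidgets


class ImgTag(QtWidgets.QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("Captcha image labeling, Mr. State.")
        # Main widget and its grid layout
        self.main_widget = QtWidgets.QWidget()
        self.main_layout = QtWidgets.QGridLayout()
        self.main_widget.setLayout(self.main_layout)

        # Image display container
        self.img_widget = QtWidgets.QWidget()
        self.img_layout = QtWidgets.QHBoxLayout()
        self.img_widget.setLayout(self.img_layout)
        # Placeholder label that will show the image
        self.img_view = QtWidgets.QLabel("Please select a folder!")
        self.img_view.setAlignment(QtCore.Qt.AlignCenter)
        self.img_layout.addWidget(self.img_view)

        # Text input for the label of the image
        self.img_input = QtWidgets.QLineEdit()

        # Button group container
        self.opera_widget = QtWidgets.QWidget()
        self.opera_layout = QtWidgets.QVBoxLayout()
        self.opera_widget.setLayout(self.opera_layout)
        # The buttons (all but the first stay disabled until a folder is chosen)
        self.select_img_btn = QtWidgets.QPushButton("Select Directory")
        self.previous_img_btn = QtWidgets.QPushButton("Previous")
        self.previous_img_btn.setEnabled(False)
        self.next_img_btn = QtWidgets.QPushButton("Next")
        self.next_img_btn.setEnabled(False)
        self.save_img_btn = QtWidgets.QPushButton("Save")
        self.save_img_btn.setEnabled(False)
        # Add the buttons to the vertical layout
        self.opera_layout.addWidget(self.select_img_btn)
        self.opera_layout.addWidget(self.previous_img_btn)
        self.opera_layout.addWidget(self.next_img_btn)
        self.opera_layout.addWidget(self.save_img_btn)

        # Add the controls to the main grid: (widget, row, column, row span, column span)
        self.main_layout.addWidget(self.img_widget, 0, 0, 4, 4)
        self.main_layout.addWidget(self.opera_widget, 0, 4, 5, 1)
        self.main_layout.addWidget(self.img_input, 4, 0, 1, 4)

        # Status bar: current image number and total number of images
        self.img_total_current_label = QtWidgets.QLabel()
        self.img_total_label = QtWidgets.QLabel()
        self.statusBar().addPermanentWidget(self.img_total_current_label)
        self.statusBar().addPermanentWidget(self.img_total_label, stretch=0)  # Permanent widgets sit at the right of the status bar

        # Set the central widget of the main window
        self.setCentralWidget(self.main_widget)


# Standard PyQt5 entry point, added so the class can be run; keep it at the end of the file
if __name__ == '__main__':
    app = QtWidgets.QApplication(sys.argv)
    gui = ImgTag()
    gui.show()
    sys.exit(app.exec_())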

Running the above code gives the following graphical interface:

Below, we add event responses to this static GUI.

II. Selecting a directory and reading the files

First, let's implement the "Select Directory" button. Clicking it opens a folder selection dialog; once a folder is chosen, the program reads the image files in that folder and displays the first one in the image display control.

Here, the folder dialog is opened with QtWidgets.QFileDialog.getExistingDirectory(), which returns the path of the selected folder as a string. Then, with the listdir() function of the os module, we get every entry in the folder, iterate over them, pick out the image files, and add them to a new list. The code is shown below:

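# "Select Directory" button handler
def select_img_click(self):
    self.dir_path = QtWidgets.QFileDialog.getExistingDirectory(self, 'Select folder')
    # print(self.dir_path)
    if not self.dir_path:  # The dialog was cancelled; an empty string is returned
        return
    dir_list = os.listdir(self.dir_path)
    img_list = []
    suffix_list = ['jpg', 'png', 'jpeg', 'bmp']
    for name in dir_list:
        # Keep only files whose extension marks them as images
        if name.split('.')[-1].lower() in suffix_list:
            img_list.append(name)
    if not img_list:  # No image files in the folder, nothing to display
        return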
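The filtering step can also be tried outside the GUI. The following is a minimal sketch using only the standard library; the helper name list_image_files and the constant IMAGE_SUFFIXES are hypothetical and not part of the original program. It differs from the handler in two assumed details: it skips directories, and it sorts the names, because os.listdir() returns entries in arbitrary order.

```python
import os
import tempfile

# Hypothetical constant: the same four extensions the article filters on.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_image_files(dir_path):
    """Return the image file names in dir_path, sorted for a stable order."""
    names = []
    for name in os.listdir(dir_path):
        full_path = os.path.join(dir_path, name)
        suffix = os.path.splitext(name)[1].lower()  # e.g. ".png"
        if os.path.isfile(full_path) and suffix in IMAGE_SUFFIXES:
            names.append(name)
    return sorted(names)


# Usage: build a throwaway folder and list only the images in it.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.PNG", "a.jpg", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    print(list_image_files(tmp))
```

A stable order matters here because the "Previous" and "Next" buttons walk the list by position.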

Next, we walk through this list to build an index dictionary of the images. It records the position of each image, which makes the "Previous" and "Next" buttons easy to implement.

    # Image file index dictionary: position -> file name
    self.img_index_dict = dict()
    for i, d in enumerate(img_list):
        self.img_index_dict[i] = d
    self.current_index = 0  # Current image index
    # Path to the current image file
    self.current_filename = os.path.join(
        self.dir_path, self.img_index_dict[self.current_index]
    )

Then a Qt image is instantiated with the QImage() class, converted to a QPixmap, and shown on the placeholder label with setPixmap().

    # Instantiate an image
    image = QtGui.QImage(self.current_filename)
    self.img_width = image.width()  # Image width
    self.img_height = image.height()  # Image height
    self.img_scale = 1
    self.image = image.scaled(self.img_width * self.img_scale, self.img_height * self.img_scale)

    # Display the image in the img_view control
    self.img_view.setPixmap(QtGui.QPixmap.fromImage(self.image))

Then we set the content of the text input box, give it the focus, and select its content:

    # Current file name without the extension
    self.current_text = self.img_index_dict[self.current_index].split('.')[0]
    # Set the text content of the img_input control
    self.img_input.setText(self.current_text)
    self.img_input.setFocus()  # Give the input box the focus
    self.img_input.selectAll()  # Select all text

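Finally, we show the image count in the status bar: the number of the current image and the total number of images:

    # Show the image count in the status bar
    self.img_total_current_label.setText("{}".format(self.current_index + 1))
    self.img_total_label.setText("/{total}".format(total=len(img_list)))
    # Enable the buttons that were disabled in __init__
    # (assumed step; the excerpts in this article do not show it)
    self.previous_img_btn.setEnabled(True)
    self.next_img_btn.setEnabled(True)
    self.save_img_btn.setEnabled(True)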

All of the above code belongs to the select_img_click() method. Once the method is written, we connect it to the clicked signal of the "Select Directory" button (in __init__):

self.select_img_btn.clicked.connect(self.select_img_click)

With that, selecting a directory and displaying the first picture in it works. The effect is shown in the following animation:

Next, let's implement the button for the next image.

III. Switching to the next picture

To switch to the next image, we first need to rename the currently displayed image to the contents of the text input box:

# Next picture
def next_img_click(self):
    # Rename the current image file
    new_tag = self.img_input.text()  # Content of the input box
    current_img = self.img_index_dict[self.current_index]  # Name of the current picture
    try:
        os.rename(
            os.path.join(self.dir_path, current_img),
            os.path.join(self.dir_path, new_tag + '.' + current_img.split('.')[-1])
        )  # Rename the file, keeping its extension
        self.img_index_dict[self.current_index] = new_tag + '.' + current_img.split('.')[-1]
    except FileExistsError as e:  # A file with the same name exists (raised on Windows)
        print(repr(e))
        QtWidgets.QMessageBox.information(
            self, 'Hints', 'A file with the same name already exists!',
            QtWidgets.QMessageBox.Ok
        )
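One caveat about the rename step: os.rename() raises FileExistsError for an existing target on Windows, but on POSIX systems it silently replaces the target file. The sketch below shows one way to make the collision check explicit on every platform. The helper name rename_with_label is hypothetical and not part of the original program, and the check-then-rename sequence is only a reasonable assumption for a single-user tool, since another process could create the file in between.

```python
import os
import tempfile


def rename_with_label(dir_path, current_name, new_label):
    """Rename current_name to new_label, keeping the original extension.

    Hypothetical helper. Returns the new file name.
    """
    extension = os.path.splitext(current_name)[1]  # e.g. ".png"
    new_name = new_label + extension
    if new_name == current_name:
        return current_name  # Label unchanged, nothing to rename
    target = os.path.join(dir_path, new_name)
    # Explicit check, because os.rename() overwrites silently on POSIX.
    if os.path.exists(target):
        raise FileExistsError(target)
    os.rename(os.path.join(dir_path, current_name), target)
    return new_name


# Usage: label one file, then try to reuse a name that is already taken.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("0001.png", "x7k9.png"):
        open(os.path.join(tmp, name), "w").close()
    print(rename_with_label(tmp, "0001.png", "ab3d"))
    try:
        rename_with_label(tmp, "ab3d.png", "x7k9")
    except FileExistsError:
        print("name already taken")
```

Using os.path.splitext() instead of split('.') also keeps labels intact when a file name contains more than one dot.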

Next, we add 1 to the current image index, use the new index to look up the filename of the next image, load and display it on the placeholder label in the same way as before, and update the status bar:

    # (still inside next_img_click) Current image index plus 1
    self.current_index += 1
    if self.current_index in self.img_index_dict.keys():
        # Path to the current image file
        self.current_filename = os.path.join(
            self.dir_path, self.img_index_dict[self.current_index]
        )
        # Instantiate an image
        image = QtGui.QImage(self.current_filename)
        self.img_width = image.width()  # Image width
        self.img_height = image.height()  # Image height
        self.img_scale = 1
        self.image = image.scaled(self.img_width * self.img_scale, self.img_height * self.img_scale)

        # Display the image in the img_view control
        self.img_view.setPixmap(QtGui.QPixmap.fromImage(self.image))
        # Current file name without the extension
        self.current_text = self.img_index_dict[self.current_index].split('.')[0]
        # Set the text content of the img_input control
        self.img_input.setText(self.current_text)
        self.img_input.setFocus()  # Give the input box the focus
        self.img_input.selectAll()  # Select all text

        # Update the status bar
        self.img_total_current_label.setText(str(self.current_index + 1))
    else:
        # Past the last image: restore the index and tell the user
        self.current_index -= 1
        QtWidgets.QMessageBox.information(
            self, 'Hints', 'All images have been labeled!',
            QtWidgets.QMessageBox.Ok
        )

This way, calling the next_img_click() method switches to the next image. We connect it to the clicked signals of the "Next" and "Save" buttons and to the returnPressed signal of the text input box, so that after labeling an image, clicking "Next" or "Save" or simply pressing Enter moves on to the next picture:

self.next_img_btn.clicked.connect(self.next_img_click)
self.save_img_btn.clicked.connect(self.next_img_click)
self.img_input.returnPressed.connect(self.next_img_click)  # Enter key in the input box

With that, switching to the next picture works as well. The effect is shown in the following animation:

IV. Switching to the previous picture

Sometimes we need to go back to an image that has already been labeled, so a way to switch to the previous image is also necessary. The logic is basically the same as for the next image, except that the image index is decreased by one:

# Previous image
def previous_img_click(self):
    # Rename the current image file
    new_tag = self.img_input.text()  # Content of the input box
    current_img = self.img_index_dict[self.current_index]  # Name of the current picture
    try:
        os.rename(
            os.path.join(self.dir_path, current_img),
            os.path.join(self.dir_path, new_tag + '.' + current_img.split('.')[-1])
        )  # Rename the file, keeping its extension
        self.img_index_dict[self.current_index] = new_tag + '.' + current_img.split('.')[-1]
    except FileExistsError as e:  # A file with the same name exists (raised on Windows)
        print(repr(e))
        QtWidgets.QMessageBox.information(
            self, 'Hints', 'A file with the same name already exists!',
            QtWidgets.QMessageBox.Ok
        )

    # Current image index minus 1
    self.current_index -= 1
    if self.current_index in self.img_index_dict.keys():
        # Path to the current image file
        self.current_filename = os.path.join(
            self.dir_path, self.img_index_dict[self.current_index]
        )
        # Instantiate an image
        image = QtGui.QImage(self.current_filename)
        self.img_width = image.width()  # Image width
        self.img_height = image.height()  # Image height
        self.img_scale = 1
        self.image = image.scaled(self.img_width * self.img_scale, self.img_height * self.img_scale)

        # Display the image in the img_view control
        self.img_view.setPixmap(QtGui.QPixmap.fromImage(self.image))
        # Current file name without the extension
        self.current_text = self.img_index_dict[self.current_index].split('.')[0]
        # Set the text content of the img_input control
        self.img_input.setText(self.current_text)
        self.img_input.setFocus()  # Give the input box the focus
        self.img_input.selectAll()  # Select all text

        # Update the status bar
        self.img_total_current_label.setText(str(self.current_index + 1))
    else:
        # Before the first image: restore the index and tell the user
        self.current_index += 1
        QtWidgets.QMessageBox.information(
            self, 'Hints', 'This is already the first picture!',
            QtWidgets.QMessageBox.Ok
        )

As you can see, this is almost identical to the code for switching to the next picture, because the core logic is the same. Connecting the clicked signal of the "Previous" button to this method (in __init__) completes the function:

self.previous_img_btn.clicked.connect(self.previous_img_click)
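Since the two handlers differ only in the direction of the index change, the boundary check could be factored into one small function. The following is a sketch of that idea, not the author's code; the name step_index is hypothetical, and it assumes the images are numbered 0 to total-1 as in the index dictionary above.

```python
def step_index(current_index, delta, total):
    """Return (new_index, moved) for a move of delta within range(total).

    Hypothetical helper: when the move would leave the valid range, the
    index is returned unchanged and moved is False.
    """
    candidate = current_index + delta
    if 0 <= candidate < total:
        return candidate, True
    return current_index, False


# Usage with three images (indexes 0, 1, 2)
print(step_index(0, +1, 3))  # (1, True): moved to the second image
print(step_index(2, +1, 3))  # (2, False): already at the last image
print(step_index(0, -1, 3))  # (0, False): already at the first image
```

A handler would then show the image when moved is True and the message box otherwise, which removes the need to increment the index and undo it again.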

The effect is shown in the following animation:

V. Picture scaling

At this point the CAPTCHA annotation program is basically complete. However, some CAPTCHA images are so heavily obscured by interference lines and noise dots that it is hard to tell which characters they contain. In such cases it helps to zoom the image in or out, so the program also needs a zoom function. The behavior we want is this: while Ctrl is held down, scrolling the mouse wheel up enlarges the image and scrolling it down shrinks the image. This is done by overriding the mouse wheel event:

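# Override the mouse wheel event (uses the traceback module imported at the top of the file)
def wheelEvent(self, event):
    # Only react while Ctrl is held down
    if event.modifiers() == QtCore.Qt.ControlModifier:
        try:
            delta = event.angleDelta().y()
            if delta > 0:
                # Wheel up: enlarge by 25%
                self.img_scale += 0.25
                # QImage.scaled() expects integer sizes, so convert explicitly
                self.image_scaled = self.image.scaled(int(self.img_width * self.img_scale), int(self.img_height * self.img_scale))
                self.img_view.setPixmap(QtGui.QPixmap.fromImage(self.image_scaled))
                self.statusBar().showMessage("Current image scale: {}%".format(self.img_scale * 100))
            elif delta < 0:
                # Wheel down: shrink by 25%, but never below 25%
                if self.img_scale > 0.25:
                    self.img_scale -= 0.25
                    self.image_scaled = self.image.scaled(int(self.img_width * self.img_scale), int(self.img_height * self.img_scale))
                    self.img_view.setPixmap(QtGui.QPixmap.fromImage(self.image_scaled))
                    self.statusBar().showMessage("Current image scale: {}%".format(self.img_scale * 100))
        except Exception as e:
            # For example, no image has been loaded yet
            traceback.print_exc()
            print(repr(e))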
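The zoom arithmetic can be checked without Qt. Below is a minimal sketch of the same rules: a step of 0.25 and a lower bound of 0.25, both taken from the handler above. The function names next_scale and scaled_size are hypothetical, and the wheel delta of 120 per notch is only a typical value, since just the sign of the delta is used.

```python
SCALE_STEP = 0.25  # Zoom change per wheel notch, as in the handler above
MIN_SCALE = 0.25   # The image is never shrunk below 25%


def next_scale(scale, wheel_delta):
    """Return the scale after one wheel movement (hypothetical helper)."""
    if wheel_delta > 0:
        return scale + SCALE_STEP
    if wheel_delta < 0 and scale > MIN_SCALE:
        return scale - SCALE_STEP
    return scale


def scaled_size(width, height, scale):
    """Return an integer (width, height) suitable for an image resize call."""
    return max(1, round(width * scale)), max(1, round(height * scale))


# Usage: a 120x40 image, zoomed in one notch from 100%
scale = next_scale(1, 120)
print(scale, scaled_size(120, 40, scale))
```

Rounding to integers here mirrors the explicit int conversion needed when passing the size to QImage.scaled().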

With that, image scaling is implemented as well. The effect is shown below:

VI. Complete program code

That completes the CAPTCHA image annotation program. From here, we can use PyInstaller or a similar packaging tool to bundle it into an executable that is easy to share and use.

Source code download address: Link./s/1FadzPC2FoIJNPMCmpYBKRg Extract code: e4w4

Summary

This article showed how to write a CAPTCHA image annotation GUI program in Python with PyQt5, with source code. I hope it helps you; if you have any questions, please leave a message and I will reply as soon as I can.