CAPTCHA image recognition, whether it relies on traditional OCR techniques, statistical machine learning, or deep neural networks, requires collecting a large number of relevant CAPTCHA images from the web to build a training dataset.
CAPTCHA images can be downloaded in bulk with Python, but every downloaded image then has to be labeled. In general, the label is stored in the filename: each image file is named after the actual string shown in the image.
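As a rough illustration of the batch download step, here is a minimal sketch. It assumes the target site serves a fresh CAPTCHA image on every request to one URL and that the images are PNG files; the function name download_captchas, the fetch parameter, and the "unlabeled_" file prefix are hypothetical and not part of the original program.

```python
import os
import urllib.request


def download_captchas(url, count, out_dir, fetch=None):
    """Download `count` images from `url` into `out_dir` under placeholder names."""
    if fetch is None:
        # Default fetcher: a plain HTTP GET that returns the raw response bytes.
        def fetch(target):
            with urllib.request.urlopen(target, timeout=10) as resp:
                return resp.read()

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for i in range(count):
        data = fetch(url)
        # Placeholder name; the labeling tool later renames the file to its text.
        path = os.path.join(out_dir, "unlabeled_{:05d}.png".format(i))
        with open(path, "wb") as f:
            f.write(data)
        saved.append(path)
    return saved
```

Passing the fetcher in as a parameter keeps the file-naming logic separate from the network call, so a real crawler could swap in a function that adds headers, cookies, or delays between requests.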
Without the help of a tool, labeling a CAPTCHA image as described above goes like this:
1. Open the folder where the image is located;
2. Select an image;
3. Right-click it and choose Rename;
4. Enter the correct string;
5. Save.
In Mr. State's personal experience, labeling a single CAPTCHA this way takes about 10 to 20 seconds, and much of that time is wasted on the repetitive right-click and rename operation. So, using PyQt5, the Python bindings for Qt, I wrote a small tool that makes labeling CAPTCHA images easier, to save time and cherish life.
The program in action is shown in the following animation:
Let's see how to write this CAPTCHA image labeling program.
I. Building the graphical interface
First, let's build the graphical interface. It contains an image display control, a text input control, and four button controls, arranged with three layouts. The central control of the GUI window is a QWidget() whose layout is a grid layout, QGridLayout(). Three controls are placed in it: an image display control QWidget(), a text input control QLineEdit(), and a button group QWidget() that holds the four buttons.
The image display control QWidget() uses a horizontal layout QHBoxLayout() and contains a QLabel() that serves as the placeholder for the image; the button group QWidget() uses a vertical layout QVBoxLayout() to hold the four QPushButton() controls. The code is shown below:
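    import os
    import sys
    import traceback

    from PyQt5 import QtCore, QtGui, QtWidgets


    class ImgTag(QtWidgets.QMainWindow):
        def __init__(self):
            super().__init__()
            self.setWindowTitle("Captcha image labeling, Mr. State.")

            # Main widget and main layout
            self.main_widget = QtWidgets.QWidget()
            self.main_layout = QtWidgets.QGridLayout()
            self.main_widget.setLayout(self.main_layout)

            # Image display widget
            self.img_widget = QtWidgets.QWidget()
            self.img_layout = QtWidgets.QHBoxLayout()
            self.img_widget.setLayout(self.img_layout)
            # Placeholder label for the image
            self.img_view = QtWidgets.QLabel("Please select a folder!")
            # The alignment flag was lost in the source; centering is assumed
            self.img_view.setAlignment(QtCore.Qt.AlignCenter)
            self.img_layout.addWidget(self.img_view)

            # Text input for the image label
            self.img_input = QtWidgets.QLineEdit()

            # Button group widget
            self.opera_widget = QtWidgets.QWidget()
            self.opera_layout = QtWidgets.QVBoxLayout()
            self.opera_widget.setLayout(self.opera_layout)
            # The buttons
            self.select_img_btn = QtWidgets.QPushButton("Select Directory")
            self.previous_img_btn = QtWidgets.QPushButton("Previous")
            self.previous_img_btn.setEnabled(False)
            self.next_img_btn = QtWidgets.QPushButton("Next")
            self.next_img_btn.setEnabled(False)
            self.save_img_btn = QtWidgets.QPushButton("Save")
            self.save_img_btn.setEnabled(False)
            # Add the buttons to the button layout
            self.opera_layout.addWidget(self.select_img_btn)
            self.opera_layout.addWidget(self.previous_img_btn)
            self.opera_layout.addWidget(self.next_img_btn)
            self.opera_layout.addWidget(self.save_img_btn)

            # Add the widgets to the main grid layout
            # (arguments: widget, row, column, rowSpan, columnSpan)
            self.main_layout.addWidget(self.img_widget, 0, 0, 4, 4)
            self.main_layout.addWidget(self.opera_widget, 0, 4, 5, 1)
            self.main_layout.addWidget(self.img_input, 4, 0, 1, 4)

            # Status bar: permanent labels for "current" and "/total"
            self.img_total_current_label = QtWidgets.QLabel()
            self.img_total_label = QtWidgets.QLabel()
            self.statusBar().addPermanentWidget(self.img_total_current_label)
            self.statusBar().addPermanentWidget(self.img_total_label, stretch=0)

            # Set the central widget of the main window
            self.setCentralWidget(self.main_widget)


    # Entry point (not shown in the original listing; a conventional PyQt5 launcher)
    if __name__ == '__main__':
        app = QtWidgets.QApplication(sys.argv)
        gui = ImgTag()
        gui.show()
        sys.exit(app.exec_())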
Running the above code produces the graphical interface shown below:
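The three addWidget() calls on the grid layout take the arguments (widget, row, column, rowSpan, columnSpan). To make the resulting 5x5 grid easier to picture, the following small sketch, which does not use Qt at all, marks which cells each control occupies. The helper render_grid and the single-letter marks are purely illustrative names.

```python
def render_grid(placements, rows, cols):
    """Return the grid as text lines, one mark per occupied cell."""
    grid = [["." for _ in range(cols)] for _ in range(rows)]
    for mark, (row, col, row_span, col_span) in placements.items():
        for r in range(row, row + row_span):
            for c in range(col, col + col_span):
                if grid[r][c] != ".":
                    raise ValueError("cell ({}, {}) is already occupied".format(r, c))
                grid[r][c] = mark
    return ["".join(line) for line in grid]


# Same (row, column, rowSpan, columnSpan) values as in the layout code
placements = {
    "I": (0, 0, 4, 4),  # img_widget: image display area
    "B": (0, 4, 5, 1),  # opera_widget: button column
    "T": (4, 0, 1, 4),  # img_input: text input
}
layout_rows = render_grid(placements, rows=5, cols=5)
print("\n".join(layout_rows))
# IIIIB
# IIIIB
# IIIIB
# IIIIB
# TTTTB
```

The image area takes the top-left 4x4 block, the text input sits in the row beneath it, and the button column runs down the full height on the right.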
Next, we add event handlers to this static GUI.
II. Selecting a directory and reading the files
First, let's implement the "Select Directory" button. Clicking it opens a folder selection dialog; once a folder is chosen, the program reads the image files in that folder and shows the first one in the image display control.
Here, we open the folder dialog via QFileDialog.getExistingDirectory(), which returns the path of the selected folder as a string. Then, with the listdir() function of the os module, we get all the entries in the folder, iterate over them, pick out the image files, and add them to a new list. The code is shown below:
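        # "Select Directory" button
        def select_img_click(self):
            self.dir_path = QtWidgets.QFileDialog.getExistingDirectory(self, 'Select folder')
            # print(self.dir_path)
            # An empty string means the dialog was cancelled
            if not self.dir_path:
                return
            dir_list = os.listdir(self.dir_path)
            img_list = []
            suffix_list = ['jpg', 'png', 'jpeg', 'bmp']
            for name in dir_list:
                # Keep only files whose extension is a known image type
                if name.split('.')[-1].lower() in suffix_list:
                    img_list.append(name)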
Next, we iterate over this list to build an index dictionary that maps each position to an image filename. It records the order of the images so that the "Previous" and "Next" buttons can switch between them.
            # Image file index dictionary: position -> filename
            self.img_index_dict = dict()
            for i, d in enumerate(img_list):
                self.img_index_dict[i] = d
            self.current_index = 0  # Current image index
            # Path to the current image file
            self.current_filename = os.path.join(
                self.dir_path, self.img_index_dict[self.current_index]
            )
Then we instantiate a Qt image with the QImage() class and display it in the placeholder label with setPixmap():
            # Instantiate an image
            image = QtGui.QImage(self.current_filename)
            self.img_width = image.width()    # Image width
            self.img_height = image.height()  # Image height
            self.img_scale = 1
            self.image = image.scaled(self.img_width * self.img_scale, self.img_height * self.img_scale)
            # Display the image in the img_view control
            self.img_view.setPixmap(QtGui.QPixmap.fromImage(self.image))
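Then we put the current filename into the text input box, give the input box the focus, and select its contents so they can be overwritten right away:
            # Current file name without its extension (same expression as in next_img_click)
            self.current_text = self.img_index_dict[self.current_index].split('.')[0]
            # Set the text content of the img_input control
            self.img_input.setText(self.current_text)
            self.img_input.setFocus()   # Give the input box the focus
            self.img_input.selectAll()  # Select all text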
Finally, set the information about the number of images in the status bar, including the current image and the total number of images:
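            # Show the image count in the status bar
            self.img_total_current_label.setText("{}".format(self.current_index + 1))
            self.img_total_label.setText("/{total}".format(total=len(img_list)))
            # Enable the navigation buttons, which are created disabled
            # (this step is not shown in the original listing)
            self.previous_img_btn.setEnabled(True)
            self.next_img_btn.setEnabled(True)
            self.save_img_btn.setEnabled(True)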
All of the above code belongs to the select_img_click() method. Once the method is written, we connect it to the click signal of the "Select Directory" button:
            self.select_img_btn.clicked.connect(self.select_img_click)
With that, selecting a directory and displaying its first image works. The effect is shown in the following animation:
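The file filtering at the start of select_img_click() can also be written as a standalone function, which makes it easy to try out without the GUI. The sketch below is one possible variant rather than the original code: the name list_image_files and the IMAGE_SUFFIXES constant are hypothetical, and it additionally sorts the names and skips directories.

```python
import os

# Extensions treated as images (same set as in select_img_click)
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".bmp")


def list_image_files(folder):
    """Return the image filenames in `folder`, sorted by name."""
    names = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # Skip sub-directories, even if their name ends in an image suffix
        if os.path.isfile(full_path) and name.lower().endswith(IMAGE_SUFFIXES):
            names.append(name)
    return names
```

os.listdir() returns entries in arbitrary order, so sorting gives the "Previous" and "Next" buttons a stable, predictable sequence between runs.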
Next, let's implement the "Next" button.
III. Switching to the next image
Before switching to the next image, we first rename the currently displayed image to the contents of the text input box:
        # Next image
        def next_img_click(self):
            # Rename the current image file
            new_tag = self.img_input.text()  # Content of the input box
            current_img = self.img_index_dict[self.current_index]  # Name of the current image
            try:
                os.rename(
                    os.path.join(self.dir_path, current_img),
                    os.path.join(self.dir_path, new_tag + '.' + current_img.split('.')[-1])
                )  # Rename the file, keeping its extension
                self.img_index_dict[self.current_index] = new_tag + '.' + current_img.split('.')[-1]
            except FileExistsError as e:  # A file with the same name exists
                print(repr(e))
                # The dialog call was lost in the source; an information box is assumed
                QtWidgets.QMessageBox.information(
                    self, 'Hint', 'A file with the same name already exists!',
                )
Next, we add 1 to the current image index, use the new index to look up the filename of the next image, load and display it in the placeholder label in the same way as before, and update the status bar:
            # Current image index plus 1
            self.current_index += 1
            if self.current_index in self.img_index_dict.keys():
                # Path to the current image file
                self.current_filename = os.path.join(
                    self.dir_path, self.img_index_dict[self.current_index]
                )
                # Instantiate an image
                image = QtGui.QImage(self.current_filename)
                self.img_width = image.width()    # Image width
                self.img_height = image.height()  # Image height
                self.img_scale = 1
                self.image = image.scaled(self.img_width * self.img_scale, self.img_height * self.img_scale)
                # Display the image in the img_view control
                self.img_view.setPixmap(QtGui.QPixmap.fromImage(self.image))
                # Current file name without its extension
                self.current_text = self.img_index_dict[self.current_index].split('.')[0]
                # Set the text content of the img_input control
                self.img_input.setText(self.current_text)
                self.img_input.setFocus()   # Give the input box the focus
                self.img_input.selectAll()  # Select all text
                # Update the status bar
                self.img_total_current_label.setText(str(self.current_index + 1))
            else:
                # Past the last image: undo the increment and notify the user
                self.current_index -= 1
                QtWidgets.QMessageBox.information(
                    self, 'Hint', 'All images have been labeled!',
                )
Calling the next_img_click() method now switches to the next image. We connect it to the click signals of the "Next" and "Save" buttons and to the returnPressed signal of the text input box, so that after labeling an image, clicking "Save" or simply pressing Enter moves on to the next one:
            self.next_img_btn.clicked.connect(self.next_img_click)
            self.save_img_btn.clicked.connect(self.next_img_click)
            self.img_input.returnPressed.connect(self.next_img_click)  # Enter key binding
With that, switching to the next image works as well. The effect is shown in the following animation:
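One caveat about the rename step in next_img_click(): os.rename() raises FileExistsError for an existing target on Windows, but on Unix-like systems it silently replaces an existing file, so the except branch would never run there and a previously labeled image could be overwritten. A minimal sketch of a more defensive rename, assuming an explicit existence check is acceptable (the function name rename_with_label is hypothetical):

```python
import os


def rename_with_label(folder, current_name, new_label):
    """Rename `current_name` to `new_label` plus its extension; return the new name."""
    ext = os.path.splitext(current_name)[1]  # includes the leading dot, e.g. ".png"
    new_name = new_label + ext
    if new_name == current_name:
        return current_name  # label unchanged, nothing to do
    target = os.path.join(folder, new_name)
    # Check explicitly, because os.rename overwrites silently on Unix-like systems
    if os.path.exists(target):
        raise FileExistsError("a file named {!r} already exists".format(new_name))
    os.rename(os.path.join(folder, current_name), target)
    return new_name
```

The GUI handler could call this inside its existing try block and keep the same except FileExistsError branch for the message box.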
IV. Switching to the previous image
Sometimes we need to go back to an image we have already labeled, so a function for switching to the previous image is also necessary. The logic is basically the same as for switching to the next image, except that the image index is decreased by one:
        # Previous image
        def previous_img_click(self):
            # Rename the current image file
            new_tag = self.img_input.text()  # Content of the input box
            current_img = self.img_index_dict[self.current_index]  # Name of the current image
            try:
                os.rename(
                    os.path.join(self.dir_path, current_img),
                    os.path.join(self.dir_path, new_tag + '.' + current_img.split('.')[-1])
                )  # Rename the file, keeping its extension
                self.img_index_dict[self.current_index] = new_tag + '.' + current_img.split('.')[-1]
            except FileExistsError as e:  # A file with the same name exists
                print(repr(e))
                # The dialog call was lost in the source; an information box is assumed
                QtWidgets.QMessageBox.information(
                    self, 'Hint', 'A file with the same name already exists!',
                )

            # Current image index minus 1
            self.current_index -= 1
            if self.current_index in self.img_index_dict.keys():
                # Path to the current image file
                self.current_filename = os.path.join(
                    self.dir_path, self.img_index_dict[self.current_index]
                )
                # Instantiate an image
                image = QtGui.QImage(self.current_filename)
                self.img_width = image.width()    # Image width
                self.img_height = image.height()  # Image height
                self.img_scale = 1
                self.image = image.scaled(self.img_width * self.img_scale, self.img_height * self.img_scale)
                # Display the image in the img_view control
                self.img_view.setPixmap(QtGui.QPixmap.fromImage(self.image))
                # Current file name without its extension
                self.current_text = self.img_index_dict[self.current_index].split('.')[0]
                # Set the text content of the img_input control
                self.img_input.setText(self.current_text)
                self.img_input.setFocus()   # Give the input box the focus
                self.img_input.selectAll()  # Select all text
                # Update the status bar
                self.img_total_current_label.setText(str(self.current_index + 1))
            else:
                # Before the first image: undo the decrement and notify the user
                self.current_index += 1
                QtWidgets.QMessageBox.information(
                    self, 'Hint', 'Already at the first image!',
                )
As you can see, this code is almost identical to the code for switching to the next image, because the core logic is the same. Connecting the click signal of the "Previous" button to this method gives us the function of switching to the previous image:
            self.previous_img_btn.clicked.connect(self.previous_img_click)
The effect is shown in the following animation:
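Since next_img_click() and previous_img_click() differ only in the direction of the index change, the bounds check they share could be factored into one small helper. The following is a sketch of that idea, not the original code; the name step_index is hypothetical.

```python
def step_index(current_index, total, step):
    """Return the index after moving by `step`, or None if it would leave the list."""
    new_index = current_index + step
    if 0 <= new_index < total:
        return new_index
    return None


# Moving forward from the first of three images
print(step_index(0, 3, +1))   # 1
# Moving forward from the last image is rejected
print(step_index(2, 3, +1))   # None
# Moving backward from the first image is rejected
print(step_index(0, 3, -1))   # None
```

Each handler could then call the helper with step=+1 or step=-1, load the image when an index comes back, and show its own message box when the result is None.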
V. Picture scaling
At this point our CAPTCHA image labeling program is basically complete. However, some CAPTCHA images are so nasty that their interference lines and dots make it almost impossible to tell which characters they contain. In such cases it helps to zoom the image in or out a little to confirm what it says, so the program also needs an image zoom function. The behavior we want is: hold down Ctrl and turn the mouse wheel; scrolling up enlarges the image and scrolling down shrinks it. This is achieved by overriding the mouse wheel event:
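        # Override the mouse wheel event
        def wheelEvent(self, event):
            # Only react while Ctrl is held down
            if event.modifiers() == QtCore.Qt.ControlModifier:
                try:
                    delta = event.angleDelta().y()
                    if delta > 0:
                        # Wheel up: zoom in by 25%
                        self.img_scale += 0.25
                        # scaled() expects integer sizes, so convert explicitly
                        self.image_scaled = self.image.scaled(
                            int(self.img_width * self.img_scale),
                            int(self.img_height * self.img_scale)
                        )
                        self.img_view.setPixmap(QtGui.QPixmap.fromImage(self.image_scaled))
                        self.statusBar().showMessage("Current image scale: {}%".format(self.img_scale * 100))
                    elif delta < 0:
                        # Wheel down: zoom out by 25%, but never below 25%
                        if self.img_scale > 0.25:
                            self.img_scale -= 0.25
                            self.image_scaled = self.image.scaled(
                                int(self.img_width * self.img_scale),
                                int(self.img_height * self.img_scale)
                            )
                            self.img_view.setPixmap(QtGui.QPixmap.fromImage(self.image_scaled))
                            self.statusBar().showMessage("Current image scale: {}%".format(self.img_scale * 100))
                except Exception as e:
                    # print_exc() prints the traceback itself and returns None
                    traceback.print_exc()
                    print(repr(e))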
With that, image scaling is implemented as well. The effect is shown below:
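The zoom arithmetic in wheelEvent() can be isolated from Qt and checked on its own. The sketch below mirrors the rules described above (steps of 0.25, never below 0.25); the function names next_scale and scaled_size are hypothetical. With most mice, Qt reports angleDelta().y() in multiples of 120 per wheel notch, but only the sign is used here.

```python
def next_scale(scale, wheel_delta, step=0.25, minimum=0.25):
    """Return the new zoom factor after one wheel movement."""
    if wheel_delta > 0:
        return scale + step           # wheel up: zoom in
    if wheel_delta < 0 and scale > minimum:
        return scale - step           # wheel down: zoom out, respecting the floor
    return scale                      # no movement, or already at the minimum


def scaled_size(width, height, scale):
    """Return the integer pixel size for a given zoom factor."""
    return int(width * scale), int(height * scale)


print(next_scale(1, 120))             # 1.25
print(next_scale(0.25, -120))         # 0.25
print(scaled_size(120, 40, 1.25))     # (150, 50)
```

Converting to int matters because the zoom factor becomes a float after the first step, while the image scaling call works with whole pixel sizes.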
VI. Complete program code
With that, our CAPTCHA image labeling program is complete. From here, we can use PyInstaller or a similar packaging tool to bundle it into a standalone executable that is easy to distribute and use.
Source code download address: Link./s/1FadzPC2FoIJNPMCmpYBKRg Extract code: e4w4
Summary
This article walked through writing a CAPTCHA image labeling GUI program in Python with PyQt5, with source code attached. I hope it helps you; if you have any questions, please leave me a message and I will reply promptly.