SoFunction
Updated on 2024-11-12

Python universal CAPTCHA recognition with the ddddocr OCR library: installation and usage tutorial

Preface

In the use of automated login website, often after entering the user name and password will encounter a verification code. Today introduces a generic CAPTCHA recognition OCR library, the CAPTCHA recognition completely say bye-bye, its name is ddddocr (with a brother with OCR). Here mainly alphanumeric CAPTCHA to explain.

Project address: /sml2h3/ddddocr

I. Installation of ddddocr

The following command installs the latest ddddocr release that is compatible with your environment.

pip install ddddocr

If the download is slow, you can install from a mirror in mainland China with the following command:

pip install ddddocr -i /simple/

II. Use of ddddocr

1. Examples of use

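import ddddocr

# Create the recognizer
ocr = ddddocr.DdddOcr()
# 'captcha.png' is a placeholder (the original file name was lost); use the path to your own CAPTCHA image
with open('captcha.png', 'rb') as f:
    img_bytes = f.read()
# classification() takes the raw image bytes and returns the recognized text
res = ocr.classification(img_bytes)
print('The recognized CAPTCHA is: ' + res)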
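
OCR output is not always clean, so it can help to sanity-check the result before typing it into a form. The sketch below assumes a fixed-length CAPTCHA made only of ASCII letters and digits; the helper name clean_captcha_text and the default length of 4 are illustrative assumptions, not part of ddddocr.

```python
import re


def clean_captcha_text(text, expected_length=4):
    """Return the text without whitespace if it looks like a valid CAPTCHA, else None.

    Hypothetical helper: assumes the CAPTCHA contains only ASCII letters
    and digits and has exactly `expected_length` characters.
    """
    candidate = re.sub(r'\s+', '', text)
    if re.fullmatch(r'[A-Za-z0-9]{%d}' % expected_length, candidate):
        return candidate
    return None


print(clean_captcha_text(' a3Xk\n'))  # a3Xk
print(clean_captcha_text('a3X'))      # None (too short)
```

If the helper returns None, the result of ocr.classification() is probably wrong, and it is cheaper to refresh the CAPTCHA than to submit a guess that the site will reject.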

2. Full code

import os
import ddddocr
from time import sleep
from PIL import Image
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

# Placeholder file names: the original names were lost, so substitute your own.
SCREENSHOT_NAME = 'printscreen.png'
CAPTCHA_NAME = 'code.png'


class GetVerificationCode:
    def __init__(self):
        self.res = None  # Most recently recognized CAPTCHA text
        url = 'Address to log in to'
        # Assumption: Chrome. The browser used in the original was lost; any Selenium WebDriver works.
        self.driver = webdriver.Chrome()
        self.driver.maximize_window()  # Maximize the browser
        self.driver.get(url)

    # Get CAPTCHA information
    def getVerification(self):
        # Get the location of the current file and where to save the screenshot.
        current_location = os.path.dirname(os.path.abspath(__file__))
        screenshot_path = os.path.join(current_location, "..", "VerificationCode")
        os.makedirs(screenshot_path, exist_ok=True)  # Make sure the directory exists
        screenshot_file = os.path.join(screenshot_path, SCREENSHOT_NAME)
        captcha_file = os.path.join(screenshot_path, CAPTCHA_NAME)
        # Capture the current web page, which contains the CAPTCHA we need.
        sleep(1)
        self.driver.save_screenshot(screenshot_file)
        sleep(1)
        # Locate the CAPTCHA image
        imgelement = self.driver.find_element(By.XPATH, 'Xpath localization of captcha images')
        # Get the CAPTCHA's x, y coordinates
        location = imgelement.location
        # Get the width and height of the CAPTCHA
        size = imgelement.size
        # Crop box (left, top, right, bottom). The fixed offsets are specific to
        # the original page and screen; adjust them for your own setup.
        rangle = (int(location['x'] + 430),
                  int(location['y'] + 200),
                  int(location['x'] + size['width'] + 530),
                  int(location['y'] + size['height'] + 250))
        # Open the screenshot
        i = Image.open(screenshot_file)
        # Use Image's crop function to cut the CAPTCHA area out of the screenshot
        fimg = i.crop(rangle)
        fimg = fimg.convert('RGB')
        # Save the cropped CAPTCHA image and read the CAPTCHA content
        fimg.save(captcha_file)
        ocr = ddddocr.DdddOcr()
        with open(captcha_file, 'rb') as f:
            img_bytes = f.read()
        self.res = ocr.classification(img_bytes)
        print('The recognized CAPTCHA is: ' + self.res)

    # Determine whether the alert message shown for a wrong CAPTCHA exists
    def isElementPresent(self, by, value):
        try:
            self.driver.find_element(by=by, value=value)
        except NoSuchElementException:
            # The element was not found on the page
            return False
        else:
            # No exception occurred, so the element is on the page
            return True

    # Login
    def login(self):
        self.getVerification()
        self.driver.find_element(By.XPATH, 'Username input box Xpath positioning').send_keys('Username')
        self.driver.find_element(By.XPATH, 'Password input box Xpath positioning').send_keys('Password')
        self.driver.find_element(By.XPATH, 'Captcha input box Xpath positioning').send_keys(self.res)
        sleep(1)
        self.driver.find_element(By.XPATH, 'Login button Xpath positioning').click()
        sleep(2)
        isFlag = True
        while isFlag:
            try:
                isPresent = self.isElementPresent(By.XPATH, 'Prompt message on captcha error Xpath localization')
                if isPresent is True:
                    codeText = self.driver.find_element(By.XPATH, 'Prompt message on captcha error Xpath localization').text
                    if codeText == "Authentication code incorrect.":
                        # Wrong CAPTCHA: recognize the new one and submit again
                        self.getVerification()
                        sleep(2)
                        self.driver.find_element(By.XPATH, 'Captcha input box Xpath positioning').clear()
                        sleep(1)
                        self.driver.find_element(By.XPATH, 'Captcha input box Xpath positioning').send_keys(self.res)
                        sleep(1)
                        self.driver.find_element(By.XPATH, 'Login button Xpath positioning').click()
                        sleep(2)
                    tips = self.driver.find_element(By.XPATH,
                                                    'Prompt message Xpath location when captcha is not entered').text
                    if tips == "Please enter the verification code.":
                        # Empty CAPTCHA field: recognize again and fill it in
                        self.getVerification()
                        sleep(2)
                        self.driver.find_element(By.XPATH, 'Captcha input box Xpath positioning').click()
                        sleep(1)
                        self.driver.find_element(By.XPATH, 'Captcha input box Xpath positioning').send_keys(self.res)
                        sleep(1)
                        self.driver.find_element(By.XPATH, 'Login button Xpath positioning').click()
                        sleep(2)
                    continue
                else:
                    print("The verification code is correct, login successful!")
            except NoSuchElementException:
                # A prompt element is missing; check again on the next pass
                pass
            else:
                isFlag = False

        sleep(5)
        self.driver.quit()


if __name__ == '__main__':
    GetVerificationCode().login()
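
The fixed offsets in getVerification (430, 200, 530, 250) only fit one page layout and one screen. A more portable option, sketched here, is to derive the crop box from the element's own location and size and multiply by the display scale factor, because screenshots taken on high-DPI screens can be larger than the coordinates Selenium reports. The function name crop_box and the scale parameter are illustrative assumptions, and the sketch assumes the page has not been scrolled.

```python
def crop_box(location, size, scale=1.0):
    """Build a (left, top, right, bottom) box for PIL's Image.crop.

    `location` and `size` are dicts shaped like Selenium's
    WebElement.location and WebElement.size. `scale` is a hypothetical
    parameter for the ratio between screenshot pixels and page coordinates.
    """
    left = int(location['x'] * scale)
    top = int(location['y'] * scale)
    right = int((location['x'] + size['width']) * scale)
    bottom = int((location['y'] + size['height']) * scale)
    return (left, top, right, bottom)


box = crop_box({'x': 100, 'y': 50}, {'width': 120, 'height': 40}, scale=2.0)
print(box)  # (200, 100, 440, 180)
```

The returned tuple can be passed straight to Image.crop. Selenium's WebElement also provides a screenshot(filename) method that captures only the element, which may make cropping unnecessary altogether.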

3. Sample Captcha

4. Identification of results

The script achieves the following: when a CAPTCHA is recognized incorrectly, it recognizes the new CAPTCHA and tries again.
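
The retry loop in the full code keeps going until login succeeds, so it can spin forever if the site keeps rejecting the result. A bounded retry is a common safeguard. The sketch below is independent of Selenium: login_with_retries and its attempt_login callback are hypothetical names, and the callback is assumed to return True on success.

```python
def login_with_retries(attempt_login, max_attempts=5):
    """Call attempt_login() until it returns True or attempts run out.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if attempt_login():
            return attempt
    return None


# Simulated attempts: the first two fail, the third succeeds.
outcomes = iter([False, False, True])
print(login_with_retries(lambda: next(outcomes)))  # 3
```

In the real script, attempt_login would recognize the CAPTCHA, submit the form, and report whether the error prompt appeared.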

III. Notes on the code

The code in this article uses forced waits (sleep) wherever it needs to pause. If needed, you can modify it to use explicit waits instead. Selenium supports three ways of waiting (explicit wait, implicit wait, and forced wait); other articles cover them in detail.
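
To show what an explicit wait does differently from sleep, here is a minimal polling helper in plain Python. It is a sketch of the idea, not Selenium's implementation; the name wait_until and its parameters are assumptions made for illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll condition() until it returns a truthy value, then return that value.

    Raises TimeoutError if the condition is still falsy once `timeout`
    seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %.1f seconds' % timeout)
        time.sleep(poll)


# Demo: the condition becomes truthy after roughly 0.2 seconds.
start = time.monotonic()
print(wait_until(lambda: time.monotonic() - start > 0.2 and 'ready',
                 timeout=2.0, poll=0.05))
```

Unlike sleep(2), the helper returns as soon as the condition holds and fails loudly when it never does. Selenium provides this behavior through WebDriverWait(driver, timeout).until(...), typically combined with a condition from expected_conditions such as presence_of_element_located.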

Summary

ddddocr has some recognition ability for most kinds of CAPTCHA image in use today. In short, it makes CAPTCHA recognition simple and easy to use: it can quickly detect text, digits, or icons in an image, so more developers can quickly get past a website's login CAPTCHA.
