Preamble
In this article, we are going to crack a slider CAPTCHA using object detection with the PyTorch framework. We have chosen the YOLOv5 algorithm.
Example: input image
Output image
As you can see, after detection we can accurately locate the gap and obtain its coordinates, which makes cracking the slider CAPTCHA straightforward.
I. Preliminary work
The YOLO series are commonly used object detection algorithms. YOLOv5 is not only easy to configure, but also considerably faster, and it lets us easily train on our own dataset.
The PyTorch version of YOLOv5 is on GitHub; thanks to the author for the code.
After downloading, the directory has the following structure:
```
data/
  Annotations/   # annotation files (.xml) for the images
  images/        # images to be trained on
  ImageSets/     # files holding the dataset split
  labels/        # box information for the images
```
Only two of the folders, Annotations and images, need to be modified.
First we put the images to be trained on into the images folder.
Thanks to this expert for organizing the dataset (tzutalin/labelImg); on top of it I added 50 CAPTCHA images from Tencent.
The dataset has been uploaded to Baidu Cloud.
Link: ./s/1XS5KVoXqGHglfP0mZ3HJLQ
Extraction code: wqi8
Then we need to annotate the images to tell the computer what we want it to recognize. For that we use the Wizard Labeling software. It's free and powerful, five stars!
The first step is to select the images folder; the second step is to enter the categories (English names are recommended). Here there is only one category, the location of the missing block, named target. Note that when labeling, the box's left and right edges should fit the gap exactly, otherwise the coordinates obtained will not be accurate.
When the labeling is done, click Export. The file format does not need changing; just click OK, and the annotation files will be generated in the images/outputs folder. Copy them all to the Annotations folder.
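For reference, the exported annotations are Pascal VOC style .xml files. A minimal sketch of what one looks like (the file name and pixel values below are invented for illustration):

```xml
<annotation>
    <filename>captcha_001.jpg</filename>  <!-- hypothetical file name -->
    <size>
        <width>680</width>
        <height>390</height>
        <depth>3</depth>
    </size>
    <object>
        <name>target</name>  <!-- the single class we labeled -->
        <bndbox>  <!-- left/right edges flush with the gap -->
            <xmin>272</xmin>
            <ymin>142</ymin>
            <xmax>340</xmax>
            <ymax>210</ymax>
        </bndbox>
    </object>
</annotation>
```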
Back in the main directory, run makeTxt.py and voc_label.py. makeTxt.py can be run directly; in voc_label.py you need to change the value of classes, which this time contains only target.
```python
import xml.etree.ElementTree as ET
import pickle
import os
# os.listdir() returns a list of the names of the files or folders
# contained in the specified folder.
from os import listdir, getcwd
from os.path import join

sets = ['train', 'test', 'val']
classes = ['target']  # enter as many classes here as were labeled earlier

# ...
```
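For context, makeTxt.py only has to split the annotated images into train/val/test lists under ImageSets. A minimal sketch of that logic, assuming the folder layout above (the split ratios and paths here are assumptions, not necessarily the script's exact values):

```python
import os
import random

# Assumed paths and ratios; the real makeTxt.py may differ.
xml_dir, out_dir = 'data/Annotations', 'data/ImageSets'
trainval_ratio, train_ratio = 0.9, 0.9

names = [f[:-4] for f in os.listdir(xml_dir) if f.endswith('.xml')]
random.shuffle(names)
n_trainval = int(len(names) * trainval_ratio)
n_train = int(n_trainval * train_ratio)

os.makedirs(out_dir, exist_ok=True)
splits = {
    'trainval': names[:n_trainval],
    'train': names[:n_train],
    'val': names[n_train:n_trainval],
    'test': names[n_trainval:],
}
for split, items in splits.items():
    # One image name per line, as voc_label.py expects
    with open(os.path.join(out_dir, split + '.txt'), 'w') as f:
        f.write('\n'.join(items) + '\n')
```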
Next, go to the data folder and make the following changes to the dataset .yaml config:
```yaml
# COCO 2017 dataset
# Download command: bash yolov5/data/get_coco2017.sh
# Train command: python train.py --data ./data/coco.yaml
# Dataset should be placed next to yolov5 folder:
#   /parent_folder
#     /coco
#     /yolov5

# train and val datasets (image directory or *.txt file with image paths)
train: ../coco/  # 118k images
val: ../coco/    # 5k images
test: ../coco/   # 20k images for submission to /competitions/20794

# number of classes
nc: 1

# class names
names: ['target']

# Print classes
# with open('data/coco.yaml') as f:
#     d = yaml.load(f, Loader=yaml.FullLoader)  # dict
#     for i, x in enumerate(d['names']):
#         print(i, x)
```
Then go to the models folder and make the following modifications to the model's .yaml config:
```yaml
nc: 1  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# ...
```
At this point the configuration is finally done, and training can begin!
Open the training script; we usually just need to change a few settings: --weights, --cfg, --data, and --epochs.
```python
parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default='', help='initial weights path')
parser.add_argument('--cfg', type=str, default='models/', help='model.yaml path')
parser.add_argument('--data', type=str, default='data/', help='data.yaml path')
parser.add_argument('--hyp', type=str, default='data/', help='hyperparameters path')
parser.add_argument('--epochs', type=int, default=300)
parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs')
parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes')
parser.add_argument('--rect', action='store_true', help='rectangular training')
parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
parser.add_argument('--notest', action='store_true', help='only test final epoch')
parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')
parser.add_argument('--adam', action='store_true', help='use torch.optim.Adam() optimizer')
parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')
parser.add_argument('--log-imgs', type=int, default=16, help='number of images for W&B logging, max 100')
parser.add_argument('--log-artifacts', action='store_true', help='log artifacts, i.e. final trained model')
parser.add_argument('--workers', type=int, default=4, help='maximum number of dataloader workers')
parser.add_argument('--project', default='runs/train', help='save to project/name')
parser.add_argument('--name', default='exp', help='save to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
opt = parser.parse_args()
```
Run it straight away and start training!
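If you prefer not to edit the defaults, the same settings can be passed on the command line. A hypothetical invocation (the weights, config, and data file names here are assumptions, since the original edits the defaults instead):

```
python train.py --weights yolov5s.pt --cfg models/yolov5s.yaml --data data/captcha.yaml --epochs 300 --batch-size 16
```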
Once training is done, go to runs/train/exp/weights and copy the weights file to the main directory.
Finally, let's open the detection script and change a couple of settings:
```python
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default='', help='model.pt path(s)')
parser.add_argument('--source', type=str, default='', help='source')  # file/folder, 0 for webcam
parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS')
parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--view-img', action='store_true', help='display results')
parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--update', action='store_true', help='update all models')
parser.add_argument('--project', default='runs/detect', help='save results to project/name')
parser.add_argument('--name', default='exp', help='save results to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
opt = parser.parse_args()
```
The --source attribute can be changed to data/images to run detection on your own dataset and check whether everything is recognized properly.
A small tip: if the program runs without errors but no detection box appears, try changing --device to cpu. A CUDA version that is too low can result in no detection boxes when using the GPU (this little problem tormented me for quite a while --_-).
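As a quick sanity check of the environment (my own suggestion, not a step from the original), you can verify what PyTorch sees before blaming the model:

```python
import torch

print(torch.__version__)          # the installed PyTorch build
print(torch.cuda.is_available())  # False means GPU inference won't work; use --device cpu
```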
Finally, around line 112, add a print
At this point, the program will print the position information and confidence of the box when it runs.
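The exact print statement did not survive in this copy of the article. A plausible reconstruction, assuming it sits inside detect.py's per-detection loop where names, cls, xyxy (the box corner tensors), and conf are in scope:

```python
# Hypothetical reconstruction of the added print; the variable names are the
# ones available inside yolov5's detection loop, and the output format is assumed.
print('%s:%s %s' % (names[int(cls)], [xyxy], float(conf)))
# prints something like: target:[[tensor(56.), tensor(71.), tensor(180.), tensor(139.)]] 0.92
```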
Our preliminary work is finally done~
II. Writing a crawler
1. Finding the right website
After a lot of searching, I finally settled on this site.
That's because its page structure is easy for us to work with.
2. Import dependent libraries
Here we use selenium to simulate human actions.
How to install Selenium and its webdriver is beyond the scope of this article.
```python
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import requests
import re
import os
import time
```
3. Writing the cracking program
Visiting the site, we find that before we can reach the CAPTCHA we have to click on a couple of elements in turn.
Write the code:
```python
def run():
    driver = webdriver.Chrome()
    # Disguise the request headers
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.106 Safari/537.36"}
    driver.get('/')  # visit the site (the URL was left blank here)
    driver.find_element_by_xpath('/html/body/div[1]/section[1]/div/div/div/div[2]/div[1]/a[2]').click()
    driver.find_element_by_xpath('//*[@]').click()  # simulate the click action (the id selector was left blank here)
```
Continue with:
This next element is the image we want to identify, but it cannot be located directly, because this code is wrapped in an iframe; we need to locate the iframe first!
```python
time.sleep(2)  # sleep for 2 seconds to prevent errors
driver.switch_to_frame("tcaptcha_iframe")  # switch to the iframe by its id
# Get the original address of the image
target = driver.find_element_by_xpath("/html/body/div/div[3]/div[2]/div[1]/div[2]/img").get_attribute("src")
response = requests.get(target, headers=headers)  # request the image address
img = response.content
with open('', 'wb') as f:  # save the image to the main directory (the file name was left blank here)
    f.write(img)
```
Now that the picture is saved and the detection program is ready, let's get started!
```python
'''
os.popen() simply executes a cmd command and captures its output.
Here it runs the detection script.
'''
result = os.popen("python ").readlines()  # run the target-detection program (the script name was left blank here)
list = []
for line in result:
    list.append(line)  # store the cmd output lines in a list
print(list)
a = re.findall("(.*):(.*]).(.*)\n", list[-4])  # get the box's location information
print(a)
print(len(a))
if len(a) != 0:  # if a box was detected
    tensor = a[0][1]
    pro = a[0][2]
    list_ = tensor[2:-1].split(",")
    location = []
    for i in list_:
        print(i)
        b = re.findall("tensor(.*)", i)[0]
        location.append(b[1:-2])  # extract the top-left xy and bottom-right xy of the box
    drag1 = driver.find_element_by_xpath('/html/body/div/div[3]/div[2]/div[2]/div[2]/div[1]')  # locate the drag button
    action_chains = ActionChains(driver)  # instantiate the mouse-action class
    action_chains.drag_and_drop_by_offset(drag1, int(int(location[2])/2 - 85), 0).perform()  # hold and drag the slider X pixels, then release
    input("Waiting for operation")
    driver.quit()
else:
    driver.quit()
    print("Failure to recognize")
```
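To make the parsing above concrete, here is what the regular expression does to a made-up output line (the exact line format depends on the print added to the detection script earlier, so treat this as an assumption):

```python
import re

line = "target:[[tensor(56.), tensor(71.), tensor(180.), tensor(139.)]] 0.92\n"  # assumed format
a = re.findall("(.*):(.*]).(.*)\n", line)
# a == [('target', '[[tensor(56.), tensor(71.), tensor(180.), tensor(139.)]]', '0.92')]
# a[0][1] holds the box corners as a string and a[0][2] the confidence; the loop
# above then strips the tensor(...) wrappers, leaving the pixel coordinates as strings.
```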
Here's the key part:
action_chains.drag_and_drop_by_offset(drag1, int(int(location[2])/2-85), 0).perform()
Why do we drag by a distance of int(int(location[2])/2-85)? First, the location list has the format [top-left x, top-left y, bottom-right x, bottom-right y], so location[2] takes out the x-value of the bottom-right corner.
The CAPTCHA image we saved locally has a fixed resolution, but the image displayed on the website is exactly half the local image's size along the x-axis, so int(location[2]/2) gives the gap's x-coordinate as it appears on the page.
However, the square to be dragged itself starts some distance from the left. By analyzing the page, we found that the leftmost part of this small square is 26 pixels from the left edge of the picture (the red box), i.e. 26 + 68 - 10 = 84, where the 10 was found by trial and error; so let's make this distance 85. That explains the origin of int(int(location[2])/2-85).
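A quick worked example with assumed numbers, just to make the arithmetic concrete:

```python
# Assume the detected gap's bottom-right x is 340 px in the saved full-size
# image. On the page the image is half as wide, and the slider's own starting
# offset is the 85 px worked out above.
location = ['56', '71', '340', '139']    # [x1, y1, x2, y2], assumed values
offset = int(int(location[2]) / 2 - 85)  # 340 / 2 - 85 = 85
print(offset)                            # -> 85
```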
It's done, so let's watch the demo!
The full selenium code is as follows
```python
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import requests
import re
import os
import time


def run():
    driver = webdriver.Chrome()
    # Disguise the request headers
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.106 Safari/537.36"}
    driver.get('/')  # visit the site (the URL was left blank here)
    driver.find_element_by_xpath('/html/body/div[1]/section[1]/div/div/div/div[2]/div[1]/a[2]').click()
    driver.find_element_by_xpath('//*[@]').click()  # simulate the click action
    time.sleep(2)  # sleep for 2 seconds to prevent errors
    driver.switch_to_frame("tcaptcha_iframe")  # switch to the iframe by its id
    # Get the original address of the image
    target = driver.find_element_by_xpath("/html/body/div/div[3]/div[2]/div[1]/div[2]/img").get_attribute("src")
    response = requests.get(target, headers=headers)  # request the image address
    img = response.content
    with open('', 'wb') as f:  # save the image to the main directory (the file name was left blank here)
        f.write(img)
    '''
    os.popen() simply executes a cmd command and captures its output.
    Here it runs the detection script.
    '''
    result = os.popen("python ").readlines()  # run the target-detection program (the script name was left blank here)
    list = []
    for line in result:
        list.append(line)  # store the cmd output lines in a list
    print(list)
    a = re.findall("(.*):(.*]).(.*)\n", list[-4])  # get the box's location information
    print(a)
    print(len(a))
    if len(a) != 0:  # if a box was detected
        tensor = a[0][1]
        pro = a[0][2]
        list_ = tensor[2:-1].split(",")
        location = []
        for i in list_:
            print(i)
            b = re.findall("tensor(.*)", i)[0]
            location.append(b[1:-2])  # extract the top-left xy and bottom-right xy of the box
        drag1 = driver.find_element_by_xpath('/html/body/div/div[3]/div[2]/div[2]/div[2]/div[1]')  # locate the drag button
        action_chains = ActionChains(driver)  # instantiate the mouse-action class
        action_chains.drag_and_drop_by_offset(drag1, int(int(location[2])/2 - 85), 0).perform()  # hold and drag the slider X pixels, then release
        input("Waiting for operation")
        driver.quit()
    else:
        driver.quit()
        print("Failure to recognize")


while True:
    run()
```
This concludes this detailed look at slider CAPTCHA cracking ideas based on the PyTorch version of YOLOv5. For more on cracking slider CAPTCHAs with PyTorch, please search my previous articles or continue browsing the related articles below. I hope you will support me in the future!