Case Story.
Earlier when we were doing Android phone testing, the
The marketing department wants us, the testing department, to conduct compatibility testing of Top 1000 apps (top 1000 apps).
to make sure that our phones are capable of installing and running so many great apps properly.
And the marketing department provides the apk download address of the top 1000 on an app market.
How to achieve fast batch download of apk files?
preparatory phase
The url in the above excel clearly needs a secondary redirect because its not a .apk ending link.
We need to parse it and then redirect it. wget command doesn't support this url redirection parsing, so it can't be used.
So we still use the requests module to implement the download.
Python batch script form Single-threaded writeups
Remembering the essence of batch scripting: executing statements in batch order.
Since the batch script form can only realize a single apk download task, we use the requests module to realize the download.
Single-threaded efficiency is slower, you must wait for the previous apk to finish downloading before starting the next one.
# coding=utf-8 import os import requests import openpyxl curdir = () # Get current work directory header = { 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1 WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.93 Safari/537.36'} # Create a folder to store the downloaded apk if not ("downloaded_apk"): ("mkdir downloaded_apk") # read the download url from excel line by line excel = openpyxl.load_workbook('Top_1000_app.xlsx') # Read the contents of the excel inside table = rows = table.max_row for r in range(2, rows + 1): # Nothing to do with the first header row of excel, start with the second line of text content apk_name = (row=r, column=2).value # Get app name (Chinese) apk_url = (row=r, column=3).value # Get the download save_path = (curdir, "downloaded_apk", "%" % apk_name) if not (save_path): # Avoid secondary downloads print("Downloading the %sth apk and will save to %s" % (r, save_path)) try: r = (apk_url, headers=header, allow_redirects=True, timeout=720) # Initiate requests to download status_code = r.status_code if (status_code == 200 or status_code == 206): with open(save_path, "wb") as hf: () except: print("Error, can not download %" % apk_name) else: print("%s downloaded already!" % save_path) ("pause")
Python Object-Oriented Class Forms Writing Multi-Threaded Downloads
preparatory phase
Multithreading is generally much, much more efficient.
Multi-threaded task execution, generally put the apk download task into the Queue queue, FIFO.
Then as long as the queue is not empty, it takes the task (q_job) from the queue and has 10 threads going at the same time, the
Relatively speaking, it will be more difficult to understand some, but after mastering, you can quickly improve the efficiency of the download.
#coding=utf-8 import os import queue import threading import requests import openpyxl curdir = () # Get current work directory header = { 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1 WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.93 Safari/537.36'} # Create folders if not ("downloaded_apk"): ("mkdir downloaded_apk") def download_single_apk(apk_url_str): '''Download a single apk file''' apk_name, apk_url = apk_url_str.split(";") # print(apk_url) save_path = (curdir, "downloaded_apk", "%" % apk_name) if not (save_path): # Avoid secondary downloads print("Downloading %s" % (save_path)) try: r = (apk_url, headers=header, allow_redirects=True, timeout=720) # Initiate requests to download status_code = r.status_code if (status_code == 200 or status_code == 206): with open(save_path, "wb") as hf: () except: print("Error, can not download %" % apk_name) else: print("%s downloaded already!" % save_path) # Batch download threads class DownLoadThread(): def __init__(self, q_job): self._q_job = q_job .__init__(self) def run(self): while True: if self._q_job.qsize() > 0: download_single_apk(self._q_job.get()) # It's 10 threads all running this download function # else: break if __name__ == '__main__': # Initialize a queue q = (0) # Read the url in excel line by line excel = openpyxl.load_workbook('Top_1000_app.xlsx') # Read the contents of the excel inside table = rows = table.max_row for r in range(2, rows + 1): # Irrelevant to the first header row of excel, do the replacement from the second line of the text content apk_name = (row=r, column=2).value # Get app name (Chinese) apk_url = (row=r, column=3).value # Get the download temp_str = apk_name + ";" + apk_url # Can't put a list into a queue, can only try to put a string (temp_str) for i in range(10): # Open 10 threads DownLoadThread(q).start()
This case material download
Click to download
Mode of operation and effect
For example, save the above code as download_1000apk.py and put it on your desktop.
Suggest python download_1000apk.py to run it, or of course double-click to run it.
The running effect is as follows:
Above is the details of batch download apk with python, for more information about batch download apk with python, please pay attention to my other related articles!