SoFunction
Updated on 2024-11-10

Batch download apk files with Python

Case story

Earlier, when we were testing Android phones, the marketing department asked us, the testing department, to run compatibility tests against the top 1000 apps
to make sure that our phones could install and run these popular apps properly.
The marketing department provided the apk download addresses of the top 1000 apps on an app market, in an Excel file.

How can we batch download these apk files quickly?

Preparation

The URLs in the Excel file clearly need a second redirect, because they are not links ending in .apk.
Each link has to be resolved and then followed to the real file. We could not use the wget command to resolve these links,
so we use the requests module to implement the download.
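
To see what such a redirect looks like in practice, the sketch below lists every URL that requests passed through before reaching the final file. The function names `redirect_chain` and `resolve_final_url` are illustrative and do not come from the original script; `history` and `url` are standard attributes of a requests response.

```python
def redirect_chain(resp):
    """Return every URL visited, from the first request to the final one.

    `resp` is a requests.Response, or any object with the same
    `history` and `url` attributes.
    """
    return [hop.url for hop in resp.history] + [resp.url]


def resolve_final_url(url, timeout=30):
    """Follow the redirects for `url` and return the chain of URLs.

    Hypothetical helper: it needs network access and the requests package.
    """
    import requests  # imported here so redirect_chain() works without it

    # stream=True avoids downloading the apk body just to inspect the redirects
    with requests.get(url, allow_redirects=True, stream=True, timeout=timeout) as resp:
        return redirect_chain(resp)
```

If the last entry of the returned list ends in .apk, the link resolves to a direct file download.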

Python batch-script style: single-threaded version

Remember the essence of a batch script: statements are executed one after another, in order.
Because a script in this style handles only one apk download at a time, we use the requests module for each download in turn.
A single thread is slow: each apk must finish downloading before the next one starts.

# coding=utf-8

import os
import requests
import openpyxl

curdir = os.getcwd()  # Get the current working directory
header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.93 Safari/537.36'}

# Create a folder to store the downloaded apk files
if not os.path.exists("downloaded_apk"):
    os.mkdir("downloaded_apk")

# Read the download urls from the excel file row by row
excel = openpyxl.load_workbook('Top_1000_app.xlsx')  # Load the excel workbook
table = excel.active  # Assumption: the list is on the active (first) worksheet
rows = table.max_row
for r in range(2, rows + 1):  # Row 1 is the header, so start from row 2
    apk_name = table.cell(row=r, column=2).value  # Get the app name (Chinese)
    apk_url = table.cell(row=r, column=3).value  # Get the download url
    save_path = os.path.join(curdir, "downloaded_apk", "%s.apk" % apk_name)
    if not os.path.exists(save_path):  # Avoid downloading the same apk twice
        print("Downloading apk no. %s and saving it to %s" % (r - 1, save_path))
        try:
            # Send the request; redirects are followed automatically
            resp = requests.get(apk_url, headers=header, allow_redirects=True, timeout=720)
            status_code = resp.status_code
            if status_code == 200 or status_code == 206:
                with open(save_path, "wb") as hf:
                    hf.write(resp.content)
            else:
                print("Unexpected status %s for %s" % (status_code, apk_name))
        except Exception as e:
            print("Error, can not download %s: %s" % (apk_name, e))
    else:
        print("%s downloaded already!" % save_path)

os.system("pause")  # Windows only: keep the console window open
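
The script skips any apk whose file already exists, so a download that breaks off halfway could leave a truncated file that is never retried. The following is a minimal sketch of one way around this, not part of the original script: the body is streamed in chunks to a temporary ".part" file and renamed only once it is complete. The names `save_chunks` and `download_streamed`, the ".part" suffix, and the 64 KB chunk size are all assumptions made for illustration.

```python
import os


def save_chunks(chunks, save_path):
    """Write an iterable of byte chunks to save_path via a temporary file.

    The data goes to "<save_path>.part" first and is renamed only after
    every chunk has been written, so an interrupted download never leaves
    a truncated file under the final name. Returns the bytes written.
    """
    tmp_path = save_path + ".part"
    written = 0
    with open(tmp_path, "wb") as fh:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fh.write(chunk)
                written += len(chunk)
    os.replace(tmp_path, save_path)  # rename within the same folder
    return written


def download_streamed(url, save_path, headers=None, timeout=60):
    """Hypothetical helper: stream `url` to `save_path` with requests."""
    import requests  # needs the requests package and network access

    with requests.get(url, headers=headers, allow_redirects=True,
                      stream=True, timeout=timeout) as resp:
        resp.raise_for_status()  # treat HTTP error codes as failures
        return save_chunks(resp.iter_content(chunk_size=64 * 1024), save_path)
```

Streaming also keeps memory use flat, since only one chunk of a large apk is held in memory at a time.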


Python object-oriented class style: multi-threaded download

Preparation

Multithreading is generally much more efficient for this kind of task, because most of the time is spent waiting on the network.
For multi-threaded execution, the apk download tasks are usually put into a Queue, which is FIFO (first in, first out).
Then, as long as the queue is not empty, each of the 10 threads running at the same time takes a task (q_job) from the queue and downloads it.
This is somewhat harder to understand, but once you have mastered it you can quickly improve download efficiency.
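
Before looking at the full script, the queue-plus-threads pattern can be tried in isolation. The sketch below is only an illustration: `run_workers` is a made-up name, and a stand-in task (squaring a number) takes the place of a real download. It assumes every job is put into the queue before the threads start.

```python
import queue
import threading


def run_workers(jobs, handle, num_threads=10):
    """Run handle(job) for every job using a FIFO queue and worker threads."""
    q = queue.Queue()
    for job in jobs:
        q.put(job)

    results = []
    lock = threading.Lock()  # protects `results` across threads

    def worker():
        while True:
            try:
                job = q.get_nowait()  # raises queue.Empty when no jobs are left
            except queue.Empty:
                return
            outcome = handle(job)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every worker has finished
    return results


# Demo with a stand-in task instead of a real download
print(sorted(run_workers(range(5), lambda n: n * n, num_threads=3)))
# prints [0, 1, 4, 9, 16]
```

Taking jobs with `get_nowait()` rather than checking the queue size first means a thread cannot get stuck waiting for a job that another thread has just taken.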

# coding=utf-8

import os
import queue
import threading
import requests
import openpyxl

curdir = os.getcwd()  # Get the current working directory
header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.93 Safari/537.36'}

# Create the folder that stores the downloaded apk files
if not os.path.exists("downloaded_apk"):
    os.mkdir("downloaded_apk")


def download_single_apk(job):
    '''Download a single apk file; job is an (apk_name, apk_url) tuple'''
    apk_name, apk_url = job
    save_path = os.path.join(curdir, "downloaded_apk", "%s.apk" % apk_name)
    if not os.path.exists(save_path):  # Avoid downloading the same apk twice
        print("Downloading %s" % save_path)
        try:
            # Send the request; redirects are followed automatically
            resp = requests.get(apk_url, headers=header, allow_redirects=True, timeout=720)
            status_code = resp.status_code
            if status_code == 200 or status_code == 206:
                with open(save_path, "wb") as hf:
                    hf.write(resp.content)
            else:
                print("Unexpected status %s for %s" % (status_code, apk_name))
        except Exception as e:
            print("Error, can not download %s: %s" % (apk_name, e))
    else:
        print("%s downloaded already!" % save_path)


# Batch download thread
class DownLoadThread(threading.Thread):
    def __init__(self, q_job):
        threading.Thread.__init__(self)
        self._q_job = q_job

    def run(self):
        while True:
            try:
                # get_nowait() never blocks, even if another thread just took the last job
                job = self._q_job.get_nowait()
            except queue.Empty:
                break
            download_single_apk(job)  # All 10 threads run this download function


if __name__ == '__main__':
    # Initialize a queue (0 means no size limit)
    q = queue.Queue(0)

    # Read the urls from the excel file row by row
    excel = openpyxl.load_workbook('Top_1000_app.xlsx')  # Load the excel workbook
    table = excel.active  # Assumption: the list is on the active (first) worksheet
    rows = table.max_row
    for r in range(2, rows + 1):  # Row 1 is the header, so start from row 2
        apk_name = table.cell(row=r, column=2).value  # Get the app name (Chinese)
        apk_url = table.cell(row=r, column=3).value  # Get the download url
        q.put((apk_name, apk_url))  # A queue accepts any object, so a tuple works

    for i in range(10):  # Start 10 threads
        DownLoadThread(q).start()
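
As an alternative to a hand-written thread class, the same fan-out can be sketched with the standard library's `concurrent.futures.ThreadPoolExecutor`. This is a different technique from the one used in the script above, shown only for comparison. The name `download_all` and the `download_one(name, url)` callback are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(jobs, download_one, max_workers=10):
    """Run download_one(name, url) for each (name, url) pair in a thread pool.

    Returns (succeeded, failed): a list of names and a dict that maps
    each failed name to the exception it raised.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download_one, name, url): name for name, url in jobs}
        for fut in as_completed(futures):
            name = futures[fut]
            try:
                fut.result()  # re-raises any exception from the worker
                succeeded.append(name)
            except Exception as exc:
                failed[name] = exc
    return succeeded, failed
```

The pool handles the queueing and joins the threads itself, and the returned `failed` dict gives a list of apks to retry after the run.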


How to run it and the result

For example, save the above code as download_1000apk.py and put it on your desktop.
We suggest running it with python download_1000apk.py, although you can also double-click the file to run it.
The running effect is as follows:

That covers the details of batch downloading apk files with Python.