How to batch download music based on Python

This article introduces how to batch download music based on Python, the text of the sample code through the introduction of the very detailed, for everyone to learn or work with certain reference learning value, you can refer to the following friends

Music is the spice of life, and currently a lot of music can only be played but not downloaded. How can we, born technicians, be willing to do that?

Knowledge Points:

requests
regular expression (math.)

Development Environment:

Version: anaconda5.2.0 (python3.6.5)
Editor: pycharm

Third-party libraries:

requests
parsel

Web analytics

Target site:/search?key=%E9%99%88%E7%B2%92

Analyze the real address of the music

Choose a song by Rene Chenride (a horse)e.g.

Open the developer tools and select network -> media -> refresh the page to get the real address of the music.

But the address you get is not readable in the view source, it must be hidden by Baidu Music. There are generally two scenarios when this happens. The first is that JavaScript is used to splice or encrypt the request connection, and the second is that the data is hidden. Since we are not sure what kind of situation is occurring. So we can only take our time to analyze the requested data.

After analyzing we can see that the real music address is present in this API inside /v1/restserver/ting?method=&format=jsonp&callback=jQuery17206453751179783578_1544942124991& ;songid=243093242&from=web&_=1544942128336

And we're requesting that the API return a json data (aka python's dictionary data type). As long as we use the dictionary rules we can extract all our data.

url splicing Get all data

Previously we got the real address of the music, next we are analyzing the url of the real address in anticipation of getting the know-how to download all the music.

A careful analysis of the url reveals that the ? The from parameter followed by _ does not affect the data request even if it does not exist.

And the songid in the latter parameter is actually the unique id of the song, and the from parameter is actually an indication of which platform it's coming from

So when we download the music later, we just need to get the songid of the songs in bulk and then we can download all the songs.

Batch get singid

Using the developer tools, view the web page source code will be able to view the location of the songid, if we analyze the url of a singer's page you will find the same can be constructed.

At this point, the entire web analysis is over.

Realization effects

Full Code

import re
import requests

def get_songid():
  """Get the songid of the music."""
  url = '/artist/2517'
  response = (url=url)
  html = 
  sids = (r'href="/song/(\d+)" rel="external nofollow" ', html)
  return sids

def get_music_url(songid):
  """Get download link"""
  api_url = f'/v1/restserver/ting?method=&format=jsonp&songid={songid}&from=web'
  response = (api_url.format(songid=songid))
  data = ()
  print(data)
  try:
    music_name = data['songinfo']['title']
    music_url = data['bitrate']['file_link']
    return music_name, music_url
  except Exception as e:
    print(e)

def download_music(music_name, music_url):
  """Download Music"""
  response = (music_url)
  content = 
  save_file(music_name+'.mp3', content)

def save_file(filename, content):
  """Save the music."""
  with open(file=filename, mode="wb") as f:
    (content)
if __name__ == "__main__":
  for song_id in get_songid():
    music_name, music_url = get_music_url(song_id)
    download_music(music_name, music_url)

This is the whole content of this article.