Crawling WeChat 8.0 status videos under a Zhihu answer with Python

With the WeChat 8.0 update you can set a personal status, and a status can include a short recorded video. Status videos quickly became popular; see the trending Zhihu question "Is there any silly or cute video or picture for a WeChat 8.0 status?" [1]. For example, I set one up too:

So I thought I'd download the videos and play with them as well. This article describes how to use Python to download all the videos under a Zhihu answer with one click.

Idea: analyze the Zhihu answer page -> locate the video -> find the url to play the video -> download. It's really just two steps: find the url and download it.

Find the URL

An answer may contain more than one video, so analyze a single video first. Open Chrome's developer tools, switch to the Network tab, check Preserve log and Disable cache, select the XHR filter, and refresh the page. The interface is easy to find, as shown below:

The data returned by this interface includes the video's playback URL, title, format, and other information, which is all we need. Copy play_url into the browser's address bar and you will find that the video can be downloaded directly, so this is the URL we are after.
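To make the shape of that response concrete, here is a minimal sketch of pulling the playback URL out of it. The `sample_card` dictionary is hypothetical: it contains only the keys this article's code reads (`title`, `video`, `playlist`, `ld`/`sd`, `play_url`, `format`), and the real response may contain more keys or differ. The helper name `pick_video` is also just an illustration.

```python
# Hypothetical sample shaped like the fields read in this article;
# the real interface response is larger and may differ.
sample_card = {
    "title": "demo",
    "video": {
        "playlist": {
            "ld": {"play_url": "https://example.com/v/demo-ld.mp4", "format": "mp4"},
            "sd": {"play_url": "https://example.com/v/demo-sd.mp4", "format": "mp4"},
        }
    },
}


def pick_video(card: dict, qualities=("ld", "sd")):
    """Return {'url', 'title', 'format'} for the first available quality, or None."""
    playlist = card.get("video", {}).get("playlist", {})
    for quality in qualities:
        entry = playlist.get(quality)
        if entry and entry.get("play_url"):
            return {
                "url": entry["play_url"],
                "title": card.get("title"),
                "format": entry.get("format"),
            }
    return None


print(pick_video(sample_card))
```

Using `.get()` at every level means a response without a video yields `None` instead of raising `KeyError`.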

Next, write code to get the data returned by the interface:

def get(url: str) -> list:
  """
  Get the videos embedded in a Zhihu page.
  Return format:
  [{'url': '', 'title': '', 'format': ''}, ...]
  """
  data = []
  headers = {
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36",
  }
  # Fetch the page and collect the ids of all embedded videos
  with requests.get(url, headers=headers, timeout=10) as rep:
    if rep.status_code == 200:
      ids = re.findall(r"/zvideo/(\d{1,})", rep.text)
      ids = list(set(ids)) # Remove duplicate elements
    else:
      print(f"Request failed, status code {rep.status_code}")
      return []

  if not ids:
    print("Failed to get video, probably no video on this page")
    return []

  # The host part of the API URL is elided in the source; reusing the scheme
  # and host of the page URL is an assumption
  host = re.match(r"https?://[^/]+", url).group(0)

  for video_id in ids:
    print(video_id)
    with requests.get(
      f"{host}/api/v4/zvideos/{video_id}/card",
      headers=headers,
      timeout=10,
    ) as rep:
      if rep.status_code == 200:
        ret_data = rep.json()
        playlist = ret_data["video"]["playlist"]
        title = ret_data.get("title")
        # Prefer the "ld" stream and fall back to "sd"
        temp = playlist.get("ld") or playlist.get("sd")
        if temp:
          single_video = {}
          single_video["url"] = temp.get("play_url")
          single_video["title"] = title
          single_video["format"] = temp.get("format")
          data.append(single_video)
      else:
        print(f"Request failed, status code {rep.status_code}")
        return []
  return data
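The id-extraction step can be tried offline on a small piece of HTML. The sketch below uses a made-up `sample_html` string and a hypothetical helper name, `extract_video_ids`; the only thing taken from the article is the `/zvideo/<digits>` pattern. Unlike `list(set(ids))`, `dict.fromkeys` removes duplicates while keeping the order in which the videos appear on the page.

```python
import re

# Same pattern as in the article: the numeric id that follows "/zvideo/"
ZVIDEO_ID = re.compile(r"/zvideo/(\d+)")


def extract_video_ids(html: str) -> list:
    """Return the unique video ids found in html, in order of first appearance."""
    # dict.fromkeys keeps the first occurrence of each id and preserves order
    return list(dict.fromkeys(ZVIDEO_ID.findall(html)))


# Made-up HTML fragment for illustration only
sample_html = (
    '<a href="/zvideo/111">a</a>'
    '<a href="/zvideo/222">b</a>'
    '<a href="/zvideo/111">again</a>'
)
print(extract_video_ids(sample_html))  # ['111', '222']
```

If the page markup changes, this pattern is the first thing to adjust.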

Download Video

This step is relatively simple: request the video's playback URL, save the streamed content to a file, and optionally show a progress bar. Some videos have an empty title, so in that case we use a timestamp as the file name.

See the code:

def download(file_url, file_name=None, file_type=None, save_path="download", headers=None, timeout=15):
  """
  :param file_url: link of the resource to download
  :param file_name: name of the saved file; defaults to the current date and time
  :param file_type: file type (extension)
  :param save_path: directory to save into, "download" by default, without a trailing "/"
  :param headers: HTTP request headers
  :param timeout: request timeout in seconds
  :return: True if the file was downloaded or already exists, False otherwise
  """
  if file_name is None or file_name == "":
    # Timestamp without spaces or colons, so it is a valid file name everywhere
    file_name = datetime.now().strftime("%Y%m%d_%H%M%S_%f")

  if file_type is None:
    if "." in file_url:
      file_type = file_url.split(".")[-1]
    else:
      file_type = "unknown"

  file_name = file_name + "." + file_type

  if headers is None:
    headers = {
      "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B137 Safari/601.1"
    }

  # Create the target directory if it does not exist yet
  os.makedirs(save_path, exist_ok=True)

  # Skip files that were already downloaded
  file_path = f"{save_path}/{file_name}"
  if os.path.exists(file_path):
    print(f"\033[33m{file_name} already exists, skipping download\033[0m")
    return True

  print(f"Downloading {file_name}")
  try:
    with requests.get(
      file_url, headers=headers, stream=True, timeout=timeout
    ) as rep:
      if rep.status_code != 200:
        print("\033[31mDownload failed\033[0m")
        return False
      # Content-Length may be missing, in which case the size is reported as 0
      file_size = int(rep.headers.get("Content-Length", 0))
      label = "{:.2f}MB".format(file_size / (1024 * 1024))
      with click.progressbar(length=file_size, label=label) as progressbar:
        with open(file_path, "wb") as f:
          for chunk in rep.iter_content(chunk_size=1024):
            if chunk:
              f.write(chunk)
              # The last chunk is usually shorter than 1024 bytes
              progressbar.update(len(chunk))
      print(f"\033[32m{file_name} downloaded successfully\033[0m")
  except Exception as e:
    print("Download failed:", e)
    return False
  return True
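The download function falls back to a timestamp when the title is empty, but a non-empty title can still contain characters that are not allowed in file names (for example `?` or `/`). The sketch below shows one possible way to handle both cases in a single helper. The name `build_file_name` and the choice of replacement character are assumptions for illustration, not part of the original code.

```python
import re
from datetime import datetime


def build_file_name(title, file_type, now=None):
    """Build "<name>.<file_type>", using a timestamp when the title is empty."""
    now = now or datetime.now()
    # Replace whitespace and characters that are invalid in file names
    # on common filesystems with an underscore
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", title or "").strip("_")
    if not cleaned:
        # Empty or missing title: fall back to a timestamp
        cleaned = now.strftime("%Y%m%d_%H%M%S")
    return f"{cleaned}.{file_type}"


print(build_file_name("cute cat?", "mp4"))
print(build_file_name("", "mp4"))
```

The optional `now` parameter only exists so that the timestamp branch can be tested with a fixed date.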

Put everything together and run the download:

import os, sys
import re
import click
import requests
from datetime import datetime

def get(url: str) -> list:
  # See above
  ...

def download(file_url, file_name=None, file_type=None, save_path="download", headers=None, timeout=15):
  # See above
  ...


if __name__ == "__main__":
  # The URL of the question page [1] is truncated in this article,
  # so pass the full URL as the first command-line argument
  videos = get(sys.argv[1])
  for video in videos:
    download(file_url=video["url"], file_name=video["title"], file_type=video["format"], save_path="./download")

The result of the execution is shown below:

Final words

Websites change, so the code in this article may stop working over time; if it does, adjust the regular expressions and parameters yourself. The idea behind the crawl is general: find the video's streaming data and save it. Once you have the idea, writing the code is just legwork.

In addition, if you just want some cool, funny, and cute videos to use as your WeChat 8.0 status, reply "video" in the official account "Python 7" to get the download link for the WeChat 8.0 status video collection:

Answer source

[1] Is there any silly or cute video or picture for a WeChat 8.0 status? :/question/441253090