SoFunction
Updated on 2024-11-19

Python + tkinter implementation of the website download tool

Preface

Many students have recently asked me how to bring the functions of several scripts together.

It is simple: write a small interface, then click with the mouse to run whichever piece of code you want.

Development environment

Python 3.8: interpreter

PyCharm: code editor

Steps for this project

1. First decide on the features. This project has three main ones:

  • videos
  • comments
  • danmaku (bullet comments)

2. Create a simple, clear user interface

Here is what the finished product looks like:

Interface

Import modules

import json
import os
import re
import tkinter as tk
from tkinter import ttk

import requests

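First create a window

root = tk.Tk()
root.title('Bilibili Download Tool')
root.geometry('367x134+200+200')
# Window opacity: a float from 0 to 1, where 0 is fully transparent and 1 is fully opaque
root.attributes("-alpha", 0.9)

# Start the event loop; in the finished script this call comes last, after the widgets are created
root.mainloop()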

Function buttons

text_label_1 = tk.Label(root, text='Select: ', font=('Bold', 15))
text_label_1.grid(row=1, column=0, padx=5, pady=5)

number_int_var = tk.StringVar()
# Create a drop-down list
numberChosen = ttk.Combobox(root, textvariable=number_int_var, width=26)
# Set the values of the drop-down list
numberChosen['values'] = ('Video', 'Danmaku', 'Comments')
# Place it in the grid: row is the row index, column is the column index
numberChosen.grid(row=1, column=1, padx=5, pady=5)
# Set the default value shown in the drop-down list; 0 is an index into numberChosen['values']
numberChosen.current(0)

text_label = tk.Label(root, text='BV number:', font=('Bold', 15))
text_label.grid(row=2, column=0, padx=5, pady=5)

bv_va = tk.StringVar()
entry_1 = tk.Entry(root, font=('Bold', 15), textvariable=bv_va)
entry_1.grid(row=2, column=1)

Button_1 = tk.Button(root, text='Download', font=('Bold', 13))
Button_1.grid(row=2, column=2, padx=5, pady=5)
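
The Download button above is created without a command, so clicking it does nothing yet. One possible way to connect it, sketched here under assumptions, is a dictionary that maps each drop-down value to a function. The names `make_download_callback`, `demo_handlers`, and the stand-in lambdas are hypothetical and do not come from the original code.

```python
def make_download_callback(get_choice, get_bv_id, handlers, notify=print):
    """Build a zero-argument callback suitable for a tkinter Button command.

    get_choice / get_bv_id: callables returning the current drop-down value
    and BV number (for example number_int_var.get and bv_va.get).
    handlers: dict mapping a drop-down value to a function taking a BV number.
    notify: callable used to report status messages (hypothetical hook).
    """
    def on_click():
        choice = get_choice()
        bv_id = get_bv_id().strip()
        if not bv_id:
            notify('Please enter a BV number')
            return None
        handler = handlers.get(choice)
        if handler is None:
            notify(f'Unknown option: {choice}')
            return None
        result = handler(bv_id)
        notify(f'{choice} finished for {bv_id}')
        return result
    return on_click


# Stand-in handlers so the sketch runs without a window or network access.
demo_handlers = {
    'Video': lambda bv: f'video:{bv}',
    'Comments': lambda bv: f'comments:{bv}',
}
messages = []
callback = make_download_callback(
    lambda: 'Video', lambda: ' BV1xx ', demo_handlers, messages.append
)
print(callback())  # video:BV1xx
```

In the real interface the callback could be attached with `Button_1.config(command=make_download_callback(number_int_var.get, bv_va.get, handlers))`, where `handlers` maps 'Video' to the article's `Video` function and the other two values to the danmaku and comment entry points (which would need distinct names, since both are called `main` in the article).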

Writing the main function code

Function 1

We extract the data with regular expressions.

Regular expressions extract or parse data from strings.

The re module's findall() tells the program what to look for and where to look for it:

re.findall('"title":"(.*?)","pubdate"', response.text)

This searches response.text for "title":"(.*?)","pubdate"; the text matched by the parentheses is what we want.
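
To make the capture group concrete, here is a small sketch run against a made-up string. The `sample` text is a hypothetical fragment shaped like the page source, not real site data. It also shows why the pattern needs `(.*?)` rather than `(.?)`: the latter matches at most one character.

```python
import re

# Hypothetical fragment shaped like the page source the article describes.
sample = '{"bvid":"BV0000000000","title":"My first video","pubdate":1650000000}'

# findall returns only the captured group, i.e. the text matched by (.*?).
titles = re.findall('"title":"(.*?)","pubdate"', sample)
print(titles)  # ['My first video']

# (.?) allows zero or one character, so a normal title never matches.
short = re.findall('"title":"(.?)","pubdate"', sample)
print(short)  # []
```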

def Video(bv_id):
    # The site host is missing from this URL in the source; prepend it before running
    url = f'/video/{bv_id}'
    # Disguise the Python code as a browser; copy these values from the browser's developer tools
    headers = {
        # Referer, checked by the site's hotlink protection (host also missing in the source)
        'referer': '/video/',
        # User-Agent: identifies the browser
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.0.0 Safari/537.36'
    }
    # Send the request; a <Response [200]> object means the request succeeded
    response = requests.get(url=url, headers=headers)
    # Get the video title
    title = re.findall('"title":"(.*?)","pubdate"', response.text)[0].replace(' ', '')
    # The stream info sits between a pair of <script> tags
    html_data = re.findall('<script>window.__playinfo__=(.*?)</script>', response.text)[0]
    # Convert the JSON string into a dictionary
    json_data = json.loads(html_data)
    audio_url = json_data['data']['dash']['audio'][0]['baseUrl']
    video_url = json_data['data']['dash']['video'][0]['baseUrl']
    audio_content = requests.get(url=audio_url, headers=headers).content
    video_content = requests.get(url=video_url, headers=headers).content
    # Create the output folder on first use
    if not os.path.exists('video\\'):
        os.mkdir('video\\')
    # Audio and video arrive as separate streams and are saved as separate files
    with open('video\\' + title + '.mp3', mode='wb') as audio:
        audio.write(audio_content)
    with open('video\\' + title + '.mp4', mode='wb') as video:
        video.write(video_content)
    return title
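
The step that pulls the stream addresses out of the page can be tried offline. The following is a minimal sketch, assuming the page embeds a JSON object after `window.__playinfo__=` with the `data.dash.audio` and `data.dash.video` keys used above. `sample_html`, the example.com addresses, and `extract_stream_urls` are hypothetical.

```python
import json
import re

# Hypothetical, heavily shortened page source; real pages carry far more fields.
sample_html = (
    '<html><script>window.__playinfo__={"data":{"dash":{'
    '"video":[{"baseUrl":"https://example.com/v.m4s"}],'
    '"audio":[{"baseUrl":"https://example.com/a.m4s"}]}}}</script></html>'
)


def extract_stream_urls(html):
    """Return (audio_url, video_url) from the embedded playinfo JSON, or None."""
    matches = re.findall('<script>window.__playinfo__=(.*?)</script>', html)
    if not matches:
        return None  # the page layout changed or the request was blocked
    info = json.loads(matches[0])
    dash = info['data']['dash']
    return dash['audio'][0]['baseUrl'], dash['video'][0]['baseUrl']


print(extract_stream_urls(sample_html))
# ('https://example.com/a.m4s', 'https://example.com/v.m4s')
```

Returning None when the pattern is absent avoids the IndexError that indexing `[0]` on an empty findall result would raise.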

Function 2

This feature was covered in an earlier tutorial article.

See: Two ways to get danmaku with Python (one is simple but yields little data, the other yields the full data set)

def get_response(html_url):
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.0.0 Safari/537.36'
    }
    response = requests.get(url=html_url, headers=headers)
    # Let requests detect the encoding so the text decodes correctly
    response.encoding = response.apparent_encoding
    return response


def get_Dm_url(bv_id):
    # The site host is missing from this URL in the source; prepend it before running
    link = f'/video/{bv_id}/'
    html_data = get_response(link).text
    # The link text must match the page literally; 弹幕 is the Chinese word for danmaku
    Dm_url = re.findall('<a href="(.*?)" class="btn btn-default" target="_blank">弹幕</a>', html_data)[0]
    title = re.findall('<input type="text" value="(.*?)"', html_data)[-1]
    return Dm_url, title


def get_Dm_content(Dm_url, title):
    html_data = get_response(Dm_url).text
    # Each danmaku is the text of a <d p="..."> element
    content_list = re.findall('<d p=".*?">(.*?)</d>', html_data)
    if not os.path.exists('danmaku\\'):
        os.mkdir('danmaku\\')
    # Append one danmaku per line
    with open(f'danmaku\\{title}_danmaku.txt', mode='a', encoding='utf-8') as f:
        for content in content_list:
            f.write(content)
            f.write('\n')


def main(bv_id):
    Dm_url, title = get_Dm_url(bv_id)
    get_Dm_content(Dm_url, title)
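
The danmaku pattern can be checked against a small sample before running it on a real download. This sketch assumes the danmaku file is XML in which each `<d>` element holds one comment and its `p` attribute holds metadata. `sample_xml` and its values are made up for illustration.

```python
import re

# Hypothetical danmaku XML: the p attribute is metadata, the element text is the comment.
sample_xml = (
    '<i>'
    '<d p="1.5,1,25,16777215,0,0,0,0">first comment</d>'
    '<d p="3.2,1,25,16777215,0,0,0,0">second comment</d>'
    '</i>'
)

# The lazy .*? inside p="..." skips the metadata; the group captures the text.
content_list = re.findall('<d p=".*?">(.*?)</d>', sample_xml)
print(content_list)  # ['first comment', 'second comment']

# One comment per line, as the saved text file would contain.
text = '\n'.join(content_list) + '\n'
```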

Function 3

A single page with a small amount of data is easy, but to page through results you must analyze the site and find the pattern in its requests.

def get_response(html_url, params=None):
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.0.0 Safari/537.36'
    }
    response = requests.get(url=html_url, params=params, headers=headers)
    return response


def get_oid(bv_id):
    # The site host is missing from this URL in the source; prepend it before running
    link = f'/video/{bv_id}/'
    html_data = get_response(link).text
    # The numeric aid embedded in the page is the oid the comment request expects
    oid = re.findall(r'window.__INITIAL_STATE__={"aid":(\d+),', html_data)[0]
    title = re.findall('"title":"(.*?)","pubdate"', html_data)[0].replace(' ', '')
    return oid, title


def get_content(oid, page, title):
    # The API host is missing from this URL in the source; prepend it before running
    content_url = '/x/v2/reply/main'
    # Query parameters; 'next' is the page number
    data = {
        'csrf': '6b0592355acbe9296460eab0c0a0b976',
        'mode': '3',
        'next': page,
        'oid': oid,
        'plat': '1',
        'type': '1',
    }
    json_data = get_response(content_url, data).json()
    content = '\n'.join([i['content']['message'] for i in json_data['data']['replies']])
    if not os.path.exists('comments\\'):
        os.mkdir('comments\\')
    with open(f'comments\\{title}_comments.txt', mode='a', encoding='utf-8') as f:
        f.write(content)


def main(bv_id):
    oid, title = get_oid(bv_id)
    # Fetch the first five pages of comments
    for page in range(1, 6):
        try:
            get_content(oid, page, title)
        except Exception:
            # Skip pages that fail to download or parse
            pass
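
The loop above requests a fixed five pages and ignores any error. As a hedged alternative, the paging logic can be separated from the network call so that it stops as soon as a page comes back empty. `collect_comments`, `fetch_page`, and `fake_pages` are hypothetical names, and the fake data only mimics the `data.replies[].content.message` structure used above.

```python
def collect_comments(fetch_page, max_pages=5):
    """Call fetch_page(page) for pages 1..max_pages and gather the messages.

    fetch_page is a hypothetical callable returning the decoded JSON for one page.
    Stops early when a page has no replies.
    """
    messages = []
    for page in range(1, max_pages + 1):
        json_data = fetch_page(page)
        # 'data' or 'replies' may be missing or None on the last page
        replies = (json_data.get('data') or {}).get('replies') or []
        if not replies:
            break
        messages.extend(reply['content']['message'] for reply in replies)
    return messages


# Fake three-page data set so the sketch runs offline.
fake_pages = {
    1: {'data': {'replies': [{'content': {'message': 'a'}}, {'content': {'message': 'b'}}]}},
    2: {'data': {'replies': [{'content': {'message': 'c'}}]}},
    3: {'data': {'replies': None}},
}
result = collect_comments(lambda page: fake_pages.get(page, {'data': {}}))
print(result)  # ['a', 'b', 'c']
```

With the real site, `fetch_page` could wrap a call such as `get_response(content_url, data).json()`, with `data['next']` set to the page number.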

That concludes this article on building a website download tool with Python and tkinter.