SoFunction
Updated on 2024-11-21

Python Crawler to Crawl and Visualize Outbreak Data

Key concepts

  1. Basic crawler workflow
  2. json: parsing JSON data
  3. requests: sending web requests from the crawler
  4. pandas: table processing and saving data
  5. pyecharts: visualization

Development environment

Python 3.8 (a fairly stable version), using the Anaconda interpreter distribution, with Jupyter Notebook for writing the data analysis code

PyCharm, a professional code editor (its versions are numbered by year and month)

Full crawler code

Import modules

import requests      # Module for sending web requests
import json          # Module for parsing JSON data
import pprint        # Module for pretty-printing output
import pandas as pd  # A very important module in data analysis

Analyzing the website

First, find the page that holds the data you want to crawl:

/zt2020/page/#/

Then find the URL that the data is actually loaded from.

Send the request

url = '/g2/getOnsInfo?name=disease_h5&_=1638361138568'
response = requests.get(url, verify=False)  # verify=False skips TLS certificate verification
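
The `_=1638361138568` parameter in the request URL above looks like a Unix timestamp in milliseconds, which many sites append as a cache-buster. That reading is an assumption, not something the endpoint documents. A minimal sketch of rebuilding the query string with a fresh value, using only the standard library (`build_query` and `ms_to_utc` are hypothetical helpers, not part of the original code):

```python
import time
from datetime import datetime, timezone
from urllib.parse import urlencode


def build_query(name, now_ms=None):
    """Return a query string of the form 'name=<name>&_=<milliseconds>'."""
    if now_ms is None:
        # Assumption: the server only needs a current millisecond timestamp here.
        now_ms = int(time.time() * 1000)
    return urlencode({"name": name, "_": now_ms})


def ms_to_utc(ms):
    """Convert a millisecond timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Reproduce the query string from the article, then decode its timestamp.
query = build_query("disease_h5", now_ms=1638361138568)
print(query)
print(ms_to_utc(1638361138568).date())
```

Calling `build_query("disease_h5")` without `now_ms` would use the current time instead of the value captured when the article was written.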

Get the data

json_data = response.json()['data']

Parse the data

json_data = json.loads(json_data)  # 'data' is itself a JSON string, so decode it again
china_data = json_data['areaTree'][0]['children'] # List
data_set = []
for i in china_data:
    data_dict = {}
    # Name of region
    data_dict['province'] = i['name']
    # Current confirmed cases
    data_dict['nowConfirm'] = i['total']['nowConfirm']
    # Number of deaths
    data_dict['dead'] = i['total']['dead']
    # Number of recoveries
    data_dict['heal'] = i['total']['heal']
    # Death rate
    data_dict['deadRate'] = i['total']['deadRate']
    # Cure rate
    data_dict['healRate'] = i['total']['healRate']
    data_set.append(data_dict)
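
The parsing step above can be tried offline against a small stand-in payload. This sketch assumes the `data` field is a JSON-encoded string that needs a second decode, and that each region carries a `total` object with the five fields used above. The sample regions, their numbers, and the `extract_rows` helper are invented for illustration.

```python
import json

# Invented stand-in for the response body; the real payload is much larger.
inner = {
    "areaTree": [
        {
            "name": "China",
            "children": [
                {"name": "Province A",
                 "total": {"nowConfirm": 12, "dead": 1, "heal": 30,
                           "deadRate": "2.33", "healRate": "69.77"}},
                {"name": "Province B",
                 "total": {"nowConfirm": 40, "dead": 0, "heal": 5,
                           "deadRate": "0.00", "healRate": "11.11"}},
            ],
        }
    ]
}
# Assumption: the outer object stores the payload as a string under 'data'.
raw_body = json.dumps({"data": json.dumps(inner)})

FIELDS = ("nowConfirm", "dead", "heal", "deadRate", "healRate")


def extract_rows(body):
    """Decode the body twice and flatten each region into one dict."""
    outer = json.loads(body)              # first decode: the outer envelope
    payload = json.loads(outer["data"])   # second decode: the embedded string
    rows = []
    for region in payload["areaTree"][0]["children"]:
        row = {"province": region["name"]}
        row.update({key: region["total"][key] for key in FIELDS})
        rows.append(row)
    return rows


rows = extract_rows(raw_body)
print(rows[0])
```

Building each row from a fixed tuple of field names keeps the column order stable, which matters once the rows are turned into a table.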

Save the data

df = pd.DataFrame(data_set)
df.to_csv('')  # fill in the output CSV path
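
A minimal sketch of the save step on invented rows, written to a temporary directory so it can be run anywhere. The file name `sample.csv`, the `index=False` and `encoding="utf-8-sig"` arguments, and the sample values are choices made for this example, not settings taken from the original code.

```python
import os
import tempfile

import pandas as pd

# Invented rows standing in for data_set.
sample = pd.DataFrame(
    [
        {"province": "省份A", "nowConfirm": 12, "dead": 1, "heal": 30},
        {"province": "省份B", "nowConfirm": 40, "dead": 0, "heal": 5},
    ]
)

# Hypothetical output location inside a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")

# index=False leaves out the row-number column;
# utf-8-sig writes a byte-order mark at the start of the file.
sample.to_csv(path, index=False, encoding="utf-8-sig")

# Read the file back to confirm the round trip.
restored = pd.read_csv(path, encoding="utf-8-sig")
print(restored)
```

The byte-order mark helps spreadsheet programs such as Excel detect UTF-8, which is useful when the province names are Chinese.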

Data visualization

Import modules

from pyecharts import options as opts
from pyecharts.charts import Bar, Line, Pie, Map, Grid

Read the data

df2 = df.sort_values(by=['nowConfirm'],ascending=False)[:9]
df2
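
A minimal sketch of the same top-N selection on invented rows. It also converts the rate columns with `pd.to_numeric`, on the assumption that they may arrive as strings; if they are already numeric, the conversion changes nothing. The sample values and the name `sample_df` are made up for this example.

```python
import pandas as pd

# Invented rows; the rates are deliberately stored as strings.
sample_df = pd.DataFrame(
    [
        {"province": "Province A", "nowConfirm": 12, "deadRate": "2.33", "healRate": "69.77"},
        {"province": "Province B", "nowConfirm": 40, "deadRate": "0.00", "healRate": "11.11"},
        {"province": "Province C", "nowConfirm": 5, "deadRate": "1.50", "healRate": "90.00"},
    ]
)

# Convert the rate columns to floats; unparseable values become NaN.
for column in ("deadRate", "healRate"):
    sample_df[column] = pd.to_numeric(sample_df[column], errors="coerce")

# Sort by current confirmed cases, largest first, and keep the top two rows.
top = sample_df.sort_values(by="nowConfirm", ascending=False).head(2)
print(top["province"].tolist())
```

`head(2)` does the same job as the `[:9]` slice used above, just with a smaller N to fit the three sample rows.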

Mortality and cure rates

line = (
    Line()
    .add_xaxis(list(df['province'].values))
    .add_yaxis("Cure rate", df['healRate'].values.tolist())
    .add_yaxis("Mortality rate", df['deadRate'].values.tolist())
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Mortality and cure rates"),
    )
)
line.render_notebook()

 

Number of deaths and recoveries by region

bar = (
    Bar()
    .add_xaxis(list(df['province'].values)[:6])
    .add_yaxis("Deaths", df['dead'].values.tolist()[:6])
    .add_yaxis("Recoveries", df['heal'].values.tolist()[:6])
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Deaths and recoveries by region"),
        datazoom_opts=[opts.DataZoomOpts()],
    )
)
bar.render_notebook()

That covers crawling epidemic data with a Python crawler and visualizing it. For more on crawling and visualizing data with Python, see my other related articles.