Key concepts
- Basic crawler workflow
- json
- requests: sending web requests from a crawler
- pandas table processing / saving data
- pyecharts Visualization
Development environment
- Python 3.8, a relatively stable interpreter version
- Anaconda distribution with Jupyter Notebook, well suited to writing data analysis code
- PyCharm, a professional code editor (versions are named by release year)
Full crawler code
Import modules
import requests  # Sends web requests
import json
import pprint  # Pretty-prints output
import pandas as pd  # A core module for data analysis
Analyzing the website
First, locate the target data you want to crawl:
/zt2020/page/#/
Then find the URL that serves the data.
Send the request
url = '/g2/getOnsInfo?name=disease_h5&_=1638361138568'
# verify=False skips TLS certificate verification
response = requests.get(url, verify=False)
Get the data
json_data = response.json()['data']
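The request and decode steps above can be wrapped in a small helper. This is only a sketch: the name fetch_json, the session parameter, and the 10-second timeout are assumptions and not part of the original code. Unlike the original call, it leaves certificate verification at its default (enabled).

```python
import requests


def fetch_json(url, session=None, timeout=10):
    """Fetch a URL and return the decoded JSON body (hypothetical helper)."""
    http = session or requests.Session()
    # A timeout keeps the crawler from hanging on a dead connection.
    response = http.get(url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx instead of parsing an error page.
    response.raise_for_status()
    return response.json()
```

With this helper, the step above would read json_data = fetch_json(url)['data']. The optional session argument also makes it possible to pass in a stand-in object when testing without network access.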
Parse the data
# The 'data' field is itself a JSON string, so decode it a second time
json_data = json.loads(json_data)
china_data = json_data['areaTree'][0]['children']  # List of regions
data_set = []
for i in china_data:
    data_dict = {}
    # Region name
    data_dict['province'] = i['name']
    # Current confirmed cases
    data_dict['nowConfirm'] = i['total']['nowConfirm']
    # Deaths
    data_dict['dead'] = i['total']['dead']
    # Recoveries
    data_dict['heal'] = i['total']['heal']
    # Death rate
    data_dict['deadRate'] = i['total']['deadRate']
    # Cure rate
    data_dict['healRate'] = i['total']['healRate']
    data_set.append(data_dict)
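To make the parsing step easier to follow without calling the real endpoint, here is a minimal sketch run against a hand-written payload. The payload only mimics the nesting used above (areaTree, children, total); the region names and all numbers are made up, and storing the rates as strings is an assumption.

```python
import json

# Hypothetical payload: the "data" value is a JSON string nested inside the
# outer JSON object. All names and numbers below are made up.
raw = {
    "data": json.dumps({
        "areaTree": [{
            "name": "Country",
            "children": [
                {"name": "Region A",
                 "total": {"nowConfirm": 5, "dead": 1, "heal": 10,
                           "deadRate": "1.00", "healRate": "9.00"}},
                {"name": "Region B",
                 "total": {"nowConfirm": 12, "dead": 0, "heal": 30,
                           "deadRate": "0.00", "healRate": "8.00"}},
            ],
        }]
    })
}

# Second decoding step: turn the inner JSON string into Python objects.
inner = json.loads(raw["data"])
regions = inner["areaTree"][0]["children"]

# Build one flat dict per region, equivalent to the explicit loop.
fields = ["nowConfirm", "dead", "heal", "deadRate", "healRate"]
data_set = [
    {"province": r["name"], **{f: r["total"][f] for f in fields}}
    for r in regions
]
```

The list comprehension produces the same list of dicts as the loop; which form to use is a matter of taste.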
Save the data
df = pd.DataFrame(data_set)
df.to_csv('')
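As a rough illustration of the save step, the sketch below writes a small made-up table to a temporary CSV file and reads it back. The file name epidemic_sample.csv, the sample rows, and the index=False and encoding choices are assumptions, not taken from the original code.

```python
import os
import tempfile

import pandas as pd

# Made-up rows in the same shape as data_set.
data_set = [
    {"province": "Region A", "nowConfirm": 5, "dead": 1},
    {"province": "Region B", "nowConfirm": 12, "dead": 0},
]
df = pd.DataFrame(data_set)

# Hypothetical output path inside a temporary directory.
csv_path = os.path.join(tempfile.mkdtemp(), "epidemic_sample.csv")

# index=False leaves out the row index column; utf-8-sig writes a BOM,
# which helps Excel recognize UTF-8 when names contain non-ASCII text.
df.to_csv(csv_path, index=False, encoding="utf-8-sig")

# Read the file back to confirm the round trip.
restored = pd.read_csv(csv_path, encoding="utf-8-sig")
```

Without index=False, to_csv also writes the row index as an unnamed first column, which then shows up as an extra column when the file is read back.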
Data visualization
Import modules
from pyecharts import options as opts
from pyecharts.charts import Bar, Line, Pie, Map, Grid
Read the data
df2 = df.sort_values(by=['nowConfirm'], ascending=False)[:9]
df2
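The sort-and-slice pattern above keeps the rows with the largest nowConfirm values; note that [:9] keeps nine rows. A minimal sketch on made-up numbers, using head(2) in place of the slice and showing pandas' nlargest as an alternative:

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({
    "province": ["A", "B", "C", "D"],
    "nowConfirm": [3, 40, 0, 17],
})

# Sort descending by the column, then keep the first two rows.
top2 = df.sort_values(by="nowConfirm", ascending=False).head(2)

# nlargest selects the same rows without sorting the whole frame first.
same = df.nlargest(2, "nowConfirm")
```

Both results keep the original row labels, so call reset_index(drop=True) if a fresh 0-based index is wanted.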
Mortality and cure rates
line = (
    Line()
    .add_xaxis(list(df['province'].values))
    .add_yaxis("Cure rate", df['healRate'].values.tolist())
    .add_yaxis("Mortality rate", df['deadRate'].values.tolist())
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Mortality and cure rates"),
    )
)
line.render_notebook()
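Chart series are safest to pass as plain Python lists of numbers. If the rate columns arrive as strings (an assumption about the payload, not something the original code confirms), a conversion step along these lines could run before charting. The sample values are made up.

```python
import pandas as pd

# Made-up sample; the rates are stored as strings here on purpose.
df = pd.DataFrame({
    "province": ["A", "B"],
    "healRate": ["95.5", "88.0"],
})

# to_numeric parses the strings; errors="coerce" turns bad values into NaN.
# tolist() returns plain Python floats rather than NumPy scalars.
heal_rates = pd.to_numeric(df["healRate"], errors="coerce").round(2).tolist()
```

The resulting list can be passed directly as the second argument of add_yaxis.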
Deaths and recoveries by region
bar = (
    Bar()
    .add_xaxis(list(df['province'].values)[:6])
    .add_yaxis("Deaths", df['dead'].values.tolist()[:6])
    .add_yaxis("Recoveries", df['heal'].values.tolist()[:6])
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Deaths and recoveries by region"),
        datazoom_opts=[opts.DataZoomOpts()],
    )
)
bar.render_notebook()
That covers how to crawl epidemic data with a Python crawler and visualize it. For more on crawling and visualizing data with Python, see my other related articles.