SoFunction
Updated on 2024-11-17

Drawing attractive analysis charts in Python with matplotlib

Preamble

A good analyst needs to know how to make charts look good; a polished chart simply makes a better impression. Today's bag of tricks walks through several common chart types and shows how to draw each one.

Data set introduction

First, the datasets. We use the same two as before: Salary_Ranges_by_Job_Classification and GlobalLandTemperaturesByCity. (To get the datasets, send a reply in the background.)

# Import some common packages
import pandas as pd
import numpy as np
import seaborn as sns

%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib as mpl
plt.style.use('fivethirtyeight')

# Work around Chinese character display issues on Mac
from matplotlib.font_manager import FontProperties

# List the styles available in the local matplotlib installation
print(plt.style.available)
# Pick one of the locally available styles; ggplot is known to look good, so use it
plt.style.use(['ggplot'])

# ['_classic_test', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark-palette', 'seaborn-dark', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'seaborn', 'Solarize_Light2']
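
The list of available styles depends on the installed matplotlib version (for example, the 'seaborn-*' names shown above were renamed to 'seaborn-v0_8-*' in matplotlib 3.6). A minimal sketch of picking a style defensively and applying it to a single figure only; the helper name pick_style and the sample data are assumptions made for illustration, not part of the original article:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def pick_style(preferred, fallback="default"):
    """Return the first style in `preferred` that is installed, else `fallback`."""
    for name in preferred:
        if name in plt.style.available:
            return name
    return fallback


style = pick_style(["ggplot", "fivethirtyeight"])

# plt.style.context applies the style only inside the with-block,
# so the global settings are left untouched afterwards
with plt.style.context(style):
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 4, 3])
    ax.set_title("Styled with " + style)
plt.close(fig)
```

Using plt.style.context instead of plt.style.use is a design choice: it is convenient when only one chart in a notebook should use a different look.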

# Import the datasets

# Load the 1st dataset, Salary_Ranges_by_Job_Classification
salary_ranges = pd.read_csv('./data/Salary_Ranges_by_Job_Classification.csv')

# Load the 2nd dataset, GlobalLandTemperaturesByCity
climate = pd.read_csv('./data/')
# Remove rows with missing values
climate.dropna(axis=0, inplace=True)
# Date conversion: convert dt to datetime and take the year (note the use of map)
climate['dt'] = pd.to_datetime(climate['dt'])
climate['year'] = climate['dt'].map(lambda value: value.year)
# Look only at China; copy so that adding a column does not raise SettingWithCopyWarning
climate_sub_china = climate.loc[climate['Country'] == 'China'].copy()
climate_sub_china['Century'] = climate_sub_china['year'].map(lambda x: int(x/100 + 1))
climate_sub_china.head()
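
The year and century derivation above can be tried on a tiny sample without the real file. The sketch below uses made-up rows (the values are assumptions, only the column names follow the article) and the vectorised .dt accessor in place of map:

```python
import pandas as pd

# Tiny made-up sample; the real file has many more rows and columns
sample = pd.DataFrame({
    "dt": ["1899-12-01", "1900-01-01", "2013-09-01"],
    "Country": ["China", "China", "Denmark"],
    "AverageTemperature": [3.1, 2.4, 15.0],
})

sample["dt"] = pd.to_datetime(sample["dt"])
sample["year"] = sample["dt"].dt.year  # same result as map(lambda value: value.year)

# .copy() makes the subset independent, so adding a column is safe
china = sample.loc[sample["Country"] == "China"].copy()
china["Century"] = china["year"] // 100 + 1  # same rule as int(x/100 + 1)
print(china[["year", "Century"]])
```

Note that this rule assigns years ending in 00 to the following century (1900 becomes 20), exactly as the article's formula does.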

Line chart

A line chart is fairly simple, so there is not much to optimize beyond choosing a color that looks good. The original post included a color chart found online to pick from.

# Select the weather data for Shanghai
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.set_index('dt')
df1.head()

# Line chart
df1.plot(color=['lime'])
plt.title('AverageTemperature Of ShangHai')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()

The above is a single line chart. Multiple lines can be drawn the same way: just add more columns to the DataFrame.

# Prepare the data for a multi-line chart
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'TJ'})
df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SY'})
# Merge the three cities on the date column
df123 = df1.merge(df2, how='inner', on=['dt'])\
.merge(df3, how='inner', on=['dt'])\
.set_index(['dt'])
df123.head()

# Multi-line chart: one line per column
df123.plot()
plt.title('AverageTemperature Of 3 City')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
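
The filter, rename and merge steps above build one column per city. As an alternative sketch (not the article's method), DataFrame.pivot can reshape long-format data into the same wide layout in one step. The sample values below are made up; only the column names follow the climate dataset:

```python
import pandas as pd

# Made-up long-format sample with the same column names as the climate data
long_df = pd.DataFrame({
    "dt": pd.to_datetime(["2010-01-01", "2010-02-01"] * 3),
    "City": ["Shanghai"] * 2 + ["Tianjin"] * 2 + ["Shenyang"] * 2,
    "AverageTemperature": [4.0, 6.5, -3.0, 0.5, -11.0, -6.0],
})

# One row per date, one column per city
wide = long_df.pivot(index="dt", columns="City", values="AverageTemperature")
wide = wide.rename(columns={"Shanghai": "SH", "Tianjin": "TJ", "Shenyang": "SY"})
print(wide)
```

Calling wide.plot() would then draw one line per city. pivot raises an error if a (dt, City) pair appears more than once, so it fits best when each city has exactly one reading per date.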

Pie chart

Next comes the pie chart, where there is more to optimize, for example how far the slices are separated from each other. Let's first draw a "basic" version of the pie chart.

df1 = salary_ranges.groupby('SetID').sum(numeric_only=True)

# "Basic" pie chart
df1['Step'].plot(kind='pie', figsize=(7,7),
autopct='%1.1f%%',
shadow=True)
plt.axis('equal')
plt.show()

# "Premium" pie chart
colors = ['lightgreen', 'lightblue'] # slice colors; other options: ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
explode = [0, 0.2] # how far each slice is pulled out; larger values mean more separation

df1['Step'].plot(kind='pie', figsize=(7, 7),
autopct='%1.1f%%', startangle=90,
shadow=True, labels=None, pctdistance=1.12, colors=colors, explode=explode)
plt.axis('equal')
plt.legend(labels=df1.index, loc='upper right', fontsize=14)
plt.show()
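
The explode list must have one entry per slice. A minimal sketch of building it from the data instead of hard-coding it, here pulling out the smallest slice; the totals and index labels are made-up assumptions, and the matplotlib Axes API is used directly rather than the pandas wrapper:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up totals standing in for df1['Step']
totals = pd.Series([70, 30], index=["A", "B"], name="Step")

# Pull out only the smallest slice: 0.2 for the minimum, 0 for the rest
explode = [0.2 if value == totals.min() else 0 for value in totals]

fig, ax = plt.subplots(figsize=(7, 7))
wedges, texts, autotexts = ax.pie(
    totals, autopct="%1.1f%%", startangle=90,
    labels=None, pctdistance=1.12, explode=explode,
)
ax.axis("equal")  # keep the pie circular
ax.legend(wedges, list(totals.index), loc="upper right", fontsize=14)
plt.close(fig)
```

Deriving explode from the data keeps the chart valid when the number of groups changes, whereas a hard-coded list of two values only works for exactly two slices.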

Scatter plot

A scatter plot offers fewer things to optimize. The ggplot style's color scheme already looks good; as the saying goes, a well-chosen style saves a lot of work!

# Select the weather data for Shanghai and Shenyang
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SH'})

df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SY'})
# Merge the two cities on the date column
df12 = df1.merge(df2, how='inner', on=['dt'])
df12.head()

# Scatter plot
df12.plot(kind='scatter', x='SH', y='SY', figsize=(10, 6), color='darkred')
plt.title('Average Temperature Between ShangHai - ShenYang')
plt.xlabel('ShangHai')
plt.ylabel('ShenYang')
plt.show()

Area chart

# Prepare the data for three cities
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'TJ'})
df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.rename(columns={'AverageTemperature':'SY'})
# Merge the three cities on the date column
df123 = df1.merge(df2, how='inner', on=['dt'])\
.merge(df3, how='inner', on=['dt'])\
.set_index(['dt'])
df123.head()

colors = ['red', 'pink', 'blue'] # area colors; other options: ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
df123.plot(kind='area', stacked=False,
figsize=(20, 10), color=colors)
plt.title('AverageTemperature Of 3 City')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()

Histogram

# Select the weather data for Shanghai
df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.set_index('dt')
df.head()

# The simplest histogram
df['AverageTemperature'].plot(kind='hist', figsize=(8,5), color='grey')
plt.title('ShangHai AverageTemperature Of 2010-2013') # add a title to the histogram
plt.ylabel('Number of month') # add y-label
plt.xlabel('AverageTemperature') # add x-label
plt.show()
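
The histogram above relies on the default of 10 bins; passing bins= to plot(kind='hist') changes that. To see what the bars represent, the bin edges and counts can be computed directly with numpy. A small sketch with made-up monthly temperatures (the values are assumptions, not taken from the dataset):

```python
import numpy as np

# Twelve made-up monthly average temperatures
temps = np.array([4.0, 6.5, 10.2, 15.8, 20.1, 24.3,
                  28.0, 27.5, 23.9, 18.2, 12.6, 6.1])

# Four equal-width bins between the minimum and the maximum
counts, edges = np.histogram(temps, bins=4)
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {count} month(s)")
```

Each count is the height of one bar, which is why the y-axis of the histogram is labelled as a number of months.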

Bar chart

# Select the weather data for Shanghai
df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
.loc[:,['dt','AverageTemperature']]\
.set_index('dt')
df.head()

# Vertical bar chart
df.plot(kind='bar', figsize=(10, 6))
plt.xlabel('Month')
plt.ylabel('AverageTemperature')
plt.title('AverageTemperature of shanghai')
plt.show()

# Horizontal bar chart
df.plot(kind='barh', figsize=(12, 16), color='steelblue')
plt.xlabel('AverageTemperature')
plt.ylabel('Month')
plt.title('AverageTemperature of shanghai')
plt.show()
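
Because the index is a datetime, the bar labels show full timestamps, which can crowd the axis. One possible tweak, sketched here on made-up data, is to format the index as year-month strings before plotting; the variable names monthly and labelled are assumptions for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up monthly values with a datetime index named dt, like df above
monthly = pd.DataFrame(
    {"AverageTemperature": [4.0, 6.5, 10.2]},
    index=pd.to_datetime(["2010-01-01", "2010-02-01", "2010-03-01"]),
)
monthly.index.name = "dt"

# Replace the timestamps with short year-month labels
labelled = monthly.copy()
labelled.index = labelled.index.strftime("%Y-%m")

ax = labelled.plot(kind="bar", figsize=(10, 6), legend=False)
ax.set_xlabel("Month")
ax.set_ylabel("AverageTemperature")
plt.close(ax.figure)
```

The original DataFrame is left unchanged, so the datetime index stays available for filtering or resampling.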
