preamble
A good analyst also needs a few skills for making charts look good, since a polished chart makes the whole analysis look better. Today's bag of tricks introduces a variety of common charts and shows how to draw each of them.
Data set introduction
First, the datasets. We again use the same two datasets as before, Salary_Ranges_by_Job_Classification and GlobalLandTemperaturesByCity. (To get the datasets, reply "plot" in the account backend.)
# Import some common packages
import pandas as pd
import numpy as np
import seaborn as sns

%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib as mpl
plt.style.use('fivethirtyeight')

# Resolve Chinese display issues on Mac
from matplotlib.font_manager import FontProperties

# View the styles that are available locally
print(plt.style.available)
# ['_classic_test', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark-palette', 'seaborn-dark', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'seaborn', 'Solarize_Light2']

# Choose one of the locally available styles; I already knew ggplot looked good, so I chose it
mpl.style.use(['ggplot'])

# Dataset import
# Load the 1st dataset, Salary_Ranges_by_Job_Classification
salary_ranges = pd.read_csv('./data/Salary_Ranges_by_Job_Classification.csv')

# Load the 2nd dataset, GlobalLandTemperaturesByCity
# (the file name is truncated here; point the path at the GlobalLandTemperaturesByCity CSV)
climate = pd.read_csv('./data/')
# Remove missing values
climate.dropna(axis=0, inplace=True)

# Date conversion: convert dt to a date and take the year, note the use of map
climate['dt'] = pd.to_datetime(climate['dt'])
climate['year'] = climate['dt'].map(lambda value: value.year)

# Just look at China (copy so that adding a column does not trigger SettingWithCopyWarning)
climate_sub_china = climate.loc[climate['Country'] == 'China'].copy()
climate_sub_china['Century'] = climate_sub_china['year'].map(lambda x: int(x/100 + 1))
climate.head()
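The year and century steps can also be written with pandas' vectorized `.dt` accessor instead of `map`. The following is a minimal sketch on a small made-up frame (the `sample` rows are invented for illustration and are not taken from the dataset); it assumes `dt` holds ISO-formatted date strings.

```python
import pandas as pd

# Hypothetical rows that mimic the columns used in this article
sample = pd.DataFrame({
    'dt': ['1899-12-01', '1900-01-01', '2010-06-01', None],
    'AverageTemperature': [3.2, 2.9, 24.1, None],
    'City': ['Shanghai', 'Shanghai', 'Shanghai', 'Shanghai'],
    'Country': ['China', 'China', 'China', 'China'],
})

# Drop rows with missing values, as the article does
sample = sample.dropna(axis=0)

# Convert the date column, then derive year and century without map
sample['dt'] = pd.to_datetime(sample['dt'])
sample['year'] = sample['dt'].dt.year
# Same result as int(x/100 + 1): 1899 -> 19, 1900 -> 20, 2010 -> 21
sample['Century'] = sample['year'] // 100 + 1

print(sample[['dt', 'year', 'Century']])
```

On a large frame the vectorized form is usually faster than `map` with a lambda, and it reads closer to what it does.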
line graph
Line charts are relatively simple, so there is little to optimize beyond picking a color that looks good. Below is a color chart from the Internet that you can choose from.
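Color names can also be browsed in code through matplotlib's own color tables. A small sketch, assuming only that matplotlib is installed; `CSS4_COLORS` is matplotlib's dictionary of named colors, and the `candidates` list is just an illustrative pick.

```python
from matplotlib import colors as mcolors

# A few names to compare; any key of CSS4_COLORS can be passed as a color argument
candidates = ['lime', 'darkred', 'steelblue', 'lightgreen', 'gold']

# Map each name to its hex code so the choice can be compared or reused
hex_codes = {name: mcolors.to_hex(name) for name in candidates}

for name, code in hex_codes.items():
    print(f'{name:>12}  {code}')

# Total number of named colors to choose from
print(len(mcolors.CSS4_COLORS), 'named colors available')
```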
# Select the weather data for Shanghai
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .set_index('dt')
df1.head()
# Line graph
df1.plot(color=['lime'])  # line plots take color, not colors
plt.title('AverageTemperature Of ShangHai')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
The above is a single-line graph. Multi-line graphs can be drawn the same way: just add a few more columns.
# Data for the multi-line graph: one frame per city
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'TJ'})
df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SY'})
# Merge on the date so that each city becomes one column
df123 = df1.merge(df2, how='inner', on=['dt'])\
    .merge(df3, how='inner', on=['dt'])\
    .set_index(['dt'])
df123.head()
# Multi-line graph: every column is drawn as one line
df123.plot()
plt.title('AverageTemperature Of 3 City')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
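Merging one frame per city works, but the same wide table can also be built in one step with `pivot`. Below is a hedged sketch on synthetic monthly values (the numbers are made up, and the short column names mirror the `SH`/`TJ`/`SY` renames used in this section); it is an alternative, not the method this article uses.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic long-format data: one row per (date, city); values are invented
dates = pd.date_range('2010-01-01', periods=4, freq='MS')
long_df = pd.DataFrame({
    'dt': list(dates) * 3,
    'City': ['Shanghai'] * 4 + ['Tianjin'] * 4 + ['Shenyang'] * 4,
    'AverageTemperature': [4.0, 6.1, 9.8, 15.2,
                           -3.5, 0.2, 6.4, 14.0,
                           -12.1, -6.3, 1.9, 10.5],
})

# One column per city, indexed by date
wide = long_df.pivot(index='dt', columns='City', values='AverageTemperature')
wide = wide.rename(columns={'Shanghai': 'SH', 'Tianjin': 'TJ', 'Shenyang': 'SY'})

# Each column becomes one line
ax = wide.plot(figsize=(10, 6))
ax.set_title('AverageTemperature Of 3 City')
ax.set_ylabel('AverageTemperature')
ax.set_xlabel('Years')
```

In a notebook, drop the `matplotlib.use('Agg')` line and finish with `plt.show()`.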
pie chart
Next comes the pie chart, which offers a few more things to optimize, for example how far the slices are separated from each other. We first draw a "low-end" pie chart.
# Sum the numeric columns per SetID
df1 = salary_ranges.groupby('SetID').sum(numeric_only=True)

# "Low-end" pie chart
df1['Step'].plot(kind='pie', figsize=(7,7),
                 autopct='%1.1f%%',
                 shadow=True)
plt.axis('equal')
plt.show()

# "Premium" pie chart
colors = ['lightgreen', 'lightblue']  # control the pie chart colors, e.g. ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
explode = [0, 0.2]  # control how far each slice is pulled out; the larger the value, the more separated
df1['Step'].plot(kind='pie', figsize=(7, 7),
                 autopct='%1.1f%%', startangle=90,
                 shadow=True, labels=None, pctdistance=1.12, colors=colors, explode=explode)
plt.axis('equal')
# Slice labels are hidden above and shown in the legend instead
plt.legend(labels=df1.index, loc='upper right', fontsize=14)
plt.show()
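The `explode` list has to be as long as the number of slices, so a hard-coded list breaks as soon as the number of groups changes. A minimal sketch, using invented totals in place of the real `Step` sums (the group names are hypothetical), that builds `explode` from the data and pulls out only the smallest slice:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Invented totals standing in for df1['Step']; the real values come from the dataset
steps = pd.Series({'group_a': 700, 'group_b': 200, 'group_c': 100}, name='Step')

# Offset only the smallest slice; every other slice stays attached
explode = [0.2 if label == steps.idxmin() else 0 for label in steps.index]

palette = ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
ax = steps.plot(kind='pie', figsize=(7, 7), autopct='%1.1f%%', startangle=90,
                labels=None, pctdistance=1.12,
                colors=palette[:len(steps)], explode=explode)
ax.axis('equal')
ax.set_ylabel('')  # hide the series name that pandas puts on the y-axis

# Pass the wedges explicitly so each legend entry is tied to its slice
ax.legend(ax.patches, list(steps.index), loc='upper right', fontsize=14)
```

Slicing the palette to `len(steps)` keeps the color list in step with the data in the same way.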
scatterplot
Scatterplots offer fewer things to optimize. The ggplot2 color scheme is quite good-looking; as the saying goes, a well-chosen style saves a lot of work!
# Select the weather data for Shanghai and Shenyang
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SY'})
# Merge on the date
df12 = df1.merge(df2, how='inner', on=['dt'])
df12.head()
# Scatterplot
df12.plot(kind='scatter', x='SH', y='SY', figsize=(10, 6), color='darkred')
plt.title('Average Temperature Between ShangHai - ShenYang')
plt.xlabel('ShangHai')
plt.ylabel('ShenYang')
plt.show()
area chart
# Data for the area chart: one frame per city
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'TJ'})
df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SY'})
# Merge on the date so that each city becomes one column
df123 = df1.merge(df2, how='inner', on=['dt'])\
    .merge(df3, how='inner', on=['dt'])\
    .set_index(['dt'])
df123.head()
colors = ['red', 'pink', 'blue']  # control the area colors, e.g. ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
# Area plots take color, not colors
df123.plot(kind='area', stacked=False,
           figsize=(20, 10), color=colors)
plt.title('AverageTemperature Of 3 City')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
histogram
# Select the weather data for Shanghai
df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .set_index('dt')
df.head()
# The simplest histogram
df['AverageTemperature'].plot(kind='hist', figsize=(8,5), color='grey')  # hist takes color, not colors
plt.title('ShangHai AverageTemperature Of 2010-2013')  # add a title to the histogram
plt.ylabel('Number of month')  # add y-label
plt.xlabel('AverageTemperature')  # add x-label
plt.show()
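With only a few dozen monthly values, the default of 10 bins can hide the shape of the distribution. The sketch below, on randomly generated temperatures rather than the real Shanghai data, fixes the bin edges explicitly; the 5-degree bin width is an arbitrary choice for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic monthly temperatures (seeded so the sketch is reproducible)
rng = np.random.default_rng(0)
temps = pd.Series(rng.uniform(0, 30, size=44), name='AverageTemperature')

# Explicit edges 0, 5, ..., 30 give six 5-degree bins
edges = np.arange(0, 35, 5)
ax = temps.plot(kind='hist', bins=edges, figsize=(8, 5),
                color='grey', edgecolor='white')
ax.set_title('Synthetic AverageTemperature')
ax.set_ylabel('Number of month')
ax.set_xlabel('AverageTemperature')

# The same counts without drawing, useful for checking the chart
counts, _ = np.histogram(temps, bins=edges)
print(counts)
```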
bar chart
# Select the weather data for Shanghai
df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .set_index('dt')
df.head()
# Vertical bar chart
df.plot(kind='bar', figsize=(10, 6))
plt.xlabel('Month')
plt.ylabel('AverageTemperature')
plt.title('AverageTemperature of shanghai')
plt.show()
# Horizontal bar chart
df.plot(kind='barh', figsize=(12, 16), color='steelblue')
plt.xlabel('AverageTemperature')
plt.ylabel('Month')
plt.title('AverageTemperature of shanghai')
plt.show()
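When the index is a `DatetimeIndex`, a pandas bar chart labels every bar with the full timestamp, which crowds the axis. One common workaround, sketched here on made-up values (`df_demo` is a hypothetical stand-in for the Shanghai frame), is to convert the index to short `YYYY-MM` strings before plotting.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up monthly values with a datetime index named 'dt'
df_demo = pd.DataFrame(
    {'AverageTemperature': [4.2, 6.0, 9.5, 15.1, 20.3, 24.4]},
    index=pd.date_range('2010-01-01', periods=6, freq='MS', name='dt'),
)

# Replace the timestamps with compact year-month labels
df_demo.index = df_demo.index.strftime('%Y-%m')

ax = df_demo.plot(kind='bar', figsize=(10, 6), color='steelblue', legend=False)
ax.set_xlabel('Month')
ax.set_ylabel('AverageTemperature')
ax.set_title('AverageTemperature (synthetic data)')

# Tilt the labels so they do not overlap
plt.setp(ax.get_xticklabels(), rotation=45, ha='right')
```

The same conversion works for `kind='barh'`, where the labels end up on the y-axis.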