Matplotlib is the foundation of many Python visualization packages and the most widely used plotting library in Python. It is powerful but complex, and it is not easy to learn. Since Python entered the 3.0 era, pandas has become steadily more popular, and it now shows up regularly in market analysis, web crawling, financial analysis, and scientific computing.
pandas is a collection of data analysis tools, and its authors have said that plotting in pandas is easier and more powerful than using plt directly. If you need very fine control over chart details, it is still better to work with the underlying matplotlib modules. Most of us rarely have such exacting requirements at work, though, so a single import pandas as pd is enough for most everyday visualization tasks.
Below is a summary of how to plot with pandas, along with some tips for getting started.
I. Line charts
The built-in pandas data types, Series and DataFrame, both have a plot method for generating various types of charts. By default, it generates line charts. This functionality on Series and DataFrame is a simple wrapper around the plot() method of the matplotlib library. See the following sample code:

    import pandas as pd
    import numpy as np

    # 10 rows of random values, indexed by 10 consecutive dates
    df = pd.DataFrame(np.random.randn(10, 4),
                      index=pd.date_range('2018/12/18', periods=10),
                      columns=list('ABCD'))
    df.plot()

Running this code draws a line chart with one line per column (A to D) and the dates along the x-axis.
If the index consists of dates, pandas calls gcf().autofmt_xdate() to format the date labels on the x-axis.
We can use the x and y keywords to plot one column against another.
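As a minimal sketch of the x and y keywords, the example below plots column B against column A instead of against the index. The column names, the seed, and the Agg backend (chosen so the code runs without a display) are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded so the chart is reproducible
df = pd.DataFrame(rng.standard_normal((10, 2)).cumsum(axis=0),
                  columns=["A", "B"])

# Column A supplies the x values and column B the y values
ax = df.plot(x="A", y="B")
line = ax.get_lines()[0]
```

Without x and y, the same call would draw both columns against the row index.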
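A Series is plotted against its index:

    import pandas as pd
    import numpy as np

    # cumulative sum of 10 random values, indexed 0, 10, ..., 90
    s = pd.Series(np.random.randn(10).cumsum(), index=np.arange(0, 100, 10))
    s.plot()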
Most of the pandas plotting methods have an optional ax parameter, which can be a matplotlib subplot object. This gives you more flexibility in the placement of the subplot on the grid layout. The plot method of a DataFrame draws a line for each column in a subplot and automatically creates the legend (as shown here):

    import pandas as pd
    import numpy as np

    # each column becomes one line; the column names fill the legend
    df = pd.DataFrame(np.random.randn(10, 4).cumsum(0),
                      columns=['A', 'B', 'C', 'D'],
                      index=np.arange(0, 100, 10))
    df.plot()
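The ax parameter described above can be sketched as follows: two subsets of the columns are drawn on two subplots created with plt.subplots. The 1-by-2 layout, the titles, and the Agg backend are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.standard_normal((10, 4)).cumsum(axis=0),
                  columns=list("ABCD"), index=np.arange(0, 100, 10))

# One row, two columns of subplots; each plot call targets one of them via ax
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
df[["A", "B"]].plot(ax=axes[0], title="A and B")
df[["C", "D"]].plot(ax=axes[1], title="C and D")
fig.tight_layout()
```

Passing ax keeps pandas from opening a new figure for each call, which is what makes grid layouts possible.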
II. Bar charts with the kind keyword
A bar chart can be generated by adding kind='bar' (vertical bars) or kind='barh' (horizontal bars) to the code that generates a line chart. In this case, the index of the Series or DataFrame is used for the tick labels on the x-axis (bar) or the y-axis (barh):

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    fig, axes = plt.subplots(2, 1)
    data = pd.Series(np.random.rand(16), index=list('abcdefghijklmnop'))
    # vertical bars on the top subplot
    data.plot(kind='bar', ax=axes[0], color='k', alpha=0.7)
    # horizontal bars on the bottom subplot, with the same styling
    data.plot(kind='barh', ax=axes[1], color='k', alpha=0.7)
For a DataFrame, the bar chart groups the values of each row together, with one bar per column, as shown in Figure 8-16:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.random.rand(6, 4),
                      index=['one', 'two', 'three', 'four', 'five', 'six'],
                      columns=pd.Index(['A', 'B', 'C', 'D'], name='Genus'))

The resulting df looks like this (the values are random, so yours will differ):

    Genus         A         B         C         D
    one    0.301686  0.156333  0.371943  0.270731
    two    0.750589  0.525587  0.689429  0.358974
    three  0.381504  0.667707  0.473772  0.632528
    four   0.942408  0.180186  0.708284  0.641783
    five   0.840278  0.909589  0.010041  0.653207
    six    0.062854  0.589813  0.811318  0.060217

    # the column index name 'Genus' becomes the legend title
    df.plot(kind='bar')
III. Bar charts with plot.bar()
A bar chart can also be created with the plot.bar() method:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
    df.plot.bar()

Running this code draws a grouped bar chart: for each of the 10 rows, four bars stand side by side, one per column.
To generate a stacked bar chart, pass stacked=True:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
    df.plot.bar(stacked=True)

Running this code draws one bar per row, with the four column values stacked on top of each other.
To get a horizontal bar chart, use the barh() method:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
    df.plot.barh(stacked=True)
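The kind='bar' keyword and the plot.bar() method are two spellings of the same chart. The sketch below draws a small stacked chart both ways and collects the bar heights for comparison. The 3-by-4 frame, the seed, and the Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
df = pd.DataFrame(rng.random((3, 4)), columns=["a", "b", "c", "d"])

# Same stacked bar chart, requested in two different ways
ax_kind = df.plot(kind="bar", stacked=True)
ax_method = df.plot.bar(stacked=True)

# Each bar is a rectangle patch; its height is the underlying cell value
heights_kind = [p.get_height() for p in ax_kind.patches]
heights_method = [p.get_height() for p in ax_method.patches]
```

Both calls produce 12 rectangles (3 rows times 4 columns), so either spelling can be used interchangeably.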
IV. Histograms
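Histograms can be plotted using the plot.hist() method. We can specify the number of bins.

    import pandas as pd
    import numpy as np

    # three normal samples, shifted to be centered at +1, 0 and -1
    df = pd.DataFrame({'a': np.random.randn(1000) + 1,
                       'b': np.random.randn(1000),
                       'c': np.random.randn(1000) - 1},
                      columns=['a', 'b', 'c'])
    df.plot.hist(bins=20)

Running this code draws the three columns as overlapping histograms on a single set of axes.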
To plot a separate histogram for each column, use the following code:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'a': np.random.randn(1000) + 1,
                       'b': np.random.randn(1000),
                       'c': np.random.randn(1000) - 1},
                      columns=['a', 'b', 'c'])
    # DataFrame.hist draws one subplot per column
    df.hist(bins=20)

Running this code draws three histograms, one subplot per column.
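To make the bins argument concrete, here is a small sketch that counts the values of a single Series with numpy.histogram and compares the counts with the bar heights pandas draws. The seed, the sample size, and the Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
s = pd.Series(rng.standard_normal(1000))

# 20 equal-width bins between the minimum and maximum value
counts, edges = np.histogram(s, bins=20)

# The same binning, drawn by pandas
ax = s.plot.hist(bins=20)
heights = [p.get_height() for p in ax.patches]
```

With bins=20 there are 21 bin edges, and the 20 bar heights add up to the number of observations.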
V. Box plots
Box plots can be drawn by calling Series.plot.box() and DataFrame.plot.box(), or DataFrame.boxplot(), to visualize the distribution of values in each column.
For example, here is a box plot representing five trials of 10 observations of a uniform random variable on [0,1).

    import pandas as pd
    import numpy as np

    # 10 uniform observations on [0, 1) for each of 5 trials
    df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
    df.plot.box()

Running this code draws five boxes, one per column.
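Each box summarizes a column by its quartiles, with a line at the median. As a rough sketch, the code below computes those quartiles with DataFrame.quantile and reads the drawn median lines back from DataFrame.boxplot(return_type="dict"). The seed and the Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
df = pd.DataFrame(rng.random((10, 5)), columns=list("ABCDE"))

# Lower quartile, median and upper quartile of every column
quartiles = df.quantile([0.25, 0.5, 0.75])

# return_type="dict" exposes the matplotlib artists that make up the plot
bp = df.boxplot(return_type="dict")
drawn_medians = [line.get_ydata()[0] for line in bp["medians"]]
```

Comparing drawn_medians with df.median() is a quick way to check which statistic each line in the plot represents.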
VI. Area charts
Area charts can be created using the Series.plot.area() or DataFrame.plot.area() method.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
    # areas are stacked by default
    df.plot.area()

Running this code draws a stacked area chart with one colored band per column.
VII. Scatter plots
Scatter plots can be created using the DataFrame.plot.scatter() method.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.rand(50, 4), columns=['a', 'b', 'c', 'd'])
    df.plot.scatter(x='a', y='b')

Running this code draws 50 points, with column a on the x-axis and column b on the y-axis.
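A scatter plot can also carry a third column as color. The sketch below passes c and colormap to plot.scatter. Using column c for the color, the viridis colormap, the seed, and the Agg backend are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
df = pd.DataFrame(rng.random((50, 4)), columns=["a", "b", "c", "d"])

# Position comes from columns a and b; color comes from column c
ax = df.plot.scatter(x="a", y="b", c="c", colormap="viridis")

# The plotted point coordinates, one (x, y) pair per row
points = np.asarray(ax.collections[0].get_offsets())
```

pandas adds a colorbar next to the plot so the colors can be read back as values of column c.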
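VIII. Pie charts
Pie charts can be created using the DataFrame.plot.pie() method.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(3 * np.random.rand(4), index=['a', 'b', 'c', 'd'], columns=['x'])
    # subplots=True draws one pie per column; here there is only column x
    df.plot.pie(subplots=True)

Running this code draws a single pie chart for column x, with four wedges labeled a to d.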
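A Series can be drawn as a pie chart as well. The sketch below uses fixed, made-up values so that the wedge sizes are easy to check, and passes matplotlib's autopct option to print percentages on the wedges. The values, the figure size, and the Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import numpy as np
import pandas as pd

# Illustrative values only; they sum to 10, so the shares are 10%, 20%, 30%, 40%
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"], name="x")

# autopct is forwarded to matplotlib and formats the percentage labels
ax = s.plot.pie(autopct="%1.1f%%", figsize=(5, 5))

# Angle covered by each wedge, in degrees
angles = [w.theta2 - w.theta1 for w in ax.patches]
```

Each wedge spans 360 degrees times its share of the total, so the largest value here covers 144 degrees.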
That is everything I have to share in this hands-on tutorial on generating charts with pandas. I hope it serves as a useful reference, and thank you for your support.