SoFunction
Updated on 2024-11-07

A detailed guide to using Altair, the Python visualization module

Today I'd like to talk about Altair, a visualization module for Python, and use it to draw some common charts. With Altair, we can spend less effort on the mechanics of data visualization and more time and energy on understanding the data itself and what it means.

What's Altair?

Altair is known as a statistical visualization library because it offers a comprehensive way to explore, understand, and analyze data through classification and aggregation, data transformation, data interaction, chart composition, and so on. Installation is very simple and can be done directly with the pip command, as follows

pip install altair
pip install vega_datasets
pip install altair_viewer

If you use the conda package manager to install the Altair module, the command is as follows

conda install -c conda-forge altair vega_datasets

Altair First Experience

Let's start by drawing a simple bar chart. First we create a DataFrame dataset, with the following code

import pandas as pd

df = pd.DataFrame({"brand":["iPhone","Xiaomi","HuaWei","Vivo"],
                   "profit(B)":[200,55,88,60]})

Next is the code for plotting the bar chart

import altair as alt
import pandas as pd
import altair_viewer

chart = alt.Chart(df).mark_bar().encode(x="brand:N",y="profit(B):Q")
# Display the chart by calling the display() method
altair_viewer.display(chart,inline=True)

output

Looking at the overall syntax: we first use alt.Chart() to specify the dataset, then call an instance method of the form mark_*() to choose the chart style, and finally specify the data represented by the X-axis and Y-axis. You may be curious what N and Q stand for. They are abbreviations of the variable type. In other words, the Altair module needs to know the types of the variables involved in a chart, and only then will the chart be drawn the way we expect.

Here N represents a nominal variable (Nominal); for example, cell phone brands are simply names. Q represents a quantitative variable (Quantitative), which can be either discrete or continuous. In addition, there are time series data, abbreviated as T (Temporal), and ordinal variables, abbreviated as O (Ordinal); an example of the latter is the 1-5 star rating given to merchants in online shopping.

Saving charts

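To save the final chart, we can simply call the save() method. The following code saves the chart object as an HTML file

chart.save("chart.html")  # the file name here is only an example

It can also be saved as a JSON file, and the code is very similar

chart.save("chart.json")  # the file name here is only an example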

Of course, we can also save the chart in an image format, as shown in the following figure

Altair Advanced Operations

Let's build on the above and extend it. For example, to draw a horizontal bar chart, we simply swap the data on the X-axis and the Y-axis, with the following code

chart = alt.Chart(df).mark_bar().encode(x="profit(B):Q", y="brand:N")
chart.save("chart_horizontal.html")  # the file name here is only an example

output

Let's also try drawing a line chart by calling the mark_line() method. The code is as follows

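import numpy as np

## Create a new set of data with date as the row index value
np.random.seed(29)              # fix the seed so the data is reproducible
value = np.random.randn(365)    # one random step per day of 2022
data = np.cumsum(value)         # running total of the daily steps
date = pd.date_range(start="20220101", end="20221231")
df = pd.DataFrame({"num": data}, index=date)

# reset_index() turns the unnamed date index into a column called "index"
line_chart = alt.Chart(df.reset_index()).mark_line().encode(x="index:T", y="num:Q")
line_chart.save("line_chart.html")  # the file name here is only an example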

output

We can also draw a Gantt chart, which is often used in project management. The X-axis shows the time and date, while the Y-axis represents the projects and their progress. The code is as follows

project = [{"project": "Proj1", "start_time": "2022-01-16", "end_time": "2022-03-20"},
 {"project": "Proj2", "start_time": "2022-04-12", "end_time": "2022-11-20"},
 # ......
 ]

df = alt.Data(values=project)
chart = alt.Chart(df).mark_bar().encode(
    alt.X("start_time:T",
          axis=alt.Axis(format="%x",
                        formatType="time",
                        tickCount=3),
          scale=alt.Scale(domain=[alt.DateTime(year=2022, month=1, date=1),
                                  alt.DateTime(year=2022, month=12, date=1)])),
    alt.X2("end_time:T"),
    alt.Y("project:N", axis=alt.Axis(labelAlign="left",
                                     labelFontSize=15,
                                     labelOffset=0,
                                     labelPadding=50)),
    color=alt.Color("project:N", legend=alt.Legend(labelFontSize=12,
                                                   symbolOpacity=0.7,
                                                   titleFontSize=15)))

chart.save("chart_gantt.html")

output

In the chart above, we can see that the team is working on several projects, each at a different stage of progress and each spanning a different period of time, all of which the chart shows very intuitively.

Next, we'll draw a scatter plot by calling the mark_circle() method, with the following code

from vega_datasets import data

df = data.cars()

## Keep only the passenger car data whose region of origin is "USA", i.e., the United States.
df_1 = alt.Chart(df).transform_filter(
    alt.datum.Origin == "USA"
)

chart = df_1.mark_circle().encode(
    alt.X("Horsepower:Q"),
    alt.Y("Miles_per_Gallon:Q")
)

chart.save("chart_dots.html")

output

Of course, we can polish the chart further and add some color to make it look better. The code is as follows

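chart = df_1.mark_circle(
    # radial gradient fill: white at the center fading to red at the edge
    color=alt.RadialGradient("radial", [alt.GradientStop("white", 0.0),
                                        alt.GradientStop("red", 1.0)]),
    size=160
).encode(
    alt.X("Horsepower:Q", scale=alt.Scale(zero=False, padding=20)),
    alt.Y("Miles_per_Gallon:Q", scale=alt.Scale(zero=False, padding=20))
)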

output

We can also vary the size of the points so that different sizes represent different values. The code is as follows

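chart = df_1.mark_circle(
    # radial gradient fill: white at the center fading to red at the edge
    color=alt.RadialGradient("radial", [alt.GradientStop("white", 0.0),
                                        alt.GradientStop("red", 1.0)]),
    size=160
).encode(
    alt.X("Horsepower:Q", scale=alt.Scale(zero=False, padding=20)),
    alt.Y("Miles_per_Gallon:Q", scale=alt.Scale(zero=False, padding=20)),
    # map each car's acceleration to the size of its point
    size="Acceleration:Q"
)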

output

That concludes this detailed look at using Altair, the Python visualization module. For more information about Altair, please see my other related articles!