Updated on 2024-11-19

An overview of which Python library is best for data visualization

Data visualization is a key step in any exploratory data analysis or report, because it gives us insight into a data set at a glance. There are a number of very good business intelligence tools available, such as Tableau, Google Data Studio, and Power BI, which allow us to create graphs with ease.

However, data analysts and data scientists are still accustomed to creating visualizations with Python in Jupyter notebooks. This article compares four of the most popular Python libraries for data visualization: Matplotlib, Seaborn, Plotly Express, and Altair. Each library has its own characteristics, and none is perfect. Knowing the advantages and disadvantages of each one and finding the one that suits you is the key.

Setup

First, let's import all the important libraries. You most likely already have Matplotlib and Seaborn installed on your computer, but you may not have Plotly Express and Altair. You can install them with pip install plotly==4.14.3 and pip install altair dataset.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import altair as alt
import plotly.express as px

Now we will import the dataset. For demonstration purposes, we will only create a data frame with the 15 most populated cities in the United States. I will also fix the capitalization of the city names. It will facilitate the editing process as we create the visualization.

df = pd.read_csv('')
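# .copy() avoids pandas' SettingWithCopyWarning when the City column is reassigned
us = df[df['Country'] == 'us'].copy()
# Capitalize each word of the city names
us['City'] = us['City'].str.title()
# Keep the 15 most populated cities
cities = us[['City', 'Population']].nlargest(15, ['Population'], keep='first')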

Now we should be ready to analyze each library. Are you ready?

Setup difficulty and initial results

Winner: Plotly Express
Losers: Matplotlib, Altair and Seaborn

All of the libraries performed well in this category. They were all easy to set up and the basic edits were good enough for most analyses, but we need to have winners and losers, right?

Matplotlib is easy to set up, and its code is easy to remember. However, the default chart does not look good. It may do the job for analyzing the data, but it would not look good in a business meeting.
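
The chart itself is not reproduced in this text, so here is a minimal sketch of what a default Matplotlib bar chart might look like in code. The `cities` frame is hypothetical placeholder data standing in for the 15-city data set, and the figure size and label rotation are arbitrary choices, not the article's original settings.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit this line in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical placeholder data, not real population figures.
cities = pd.DataFrame({
    "City": ["Delta", "Alpha", "Echo", "Bravo", "Charlie"],
    "Population": [500, 400, 300, 200, 100],
})

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.bar(cities["City"], cities["Population"])  # one bar per city
ax.tick_params(axis="x", rotation=90)  # rotate tick labels so long names do not overlap
```

Note that Matplotlib leaves the axis labels empty unless you set them yourself with `ax.set_xlabel` and `ax.set_ylabel`.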

Seaborn creates a much better chart. It automatically adds the x-axis and y-axis labels. The x-axis tick labels could look better, but for a basic chart this is much better than Matplotlib.
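
As a rough illustration of the automatic axis labels, the sketch below draws the same kind of bar chart with Seaborn. The `cities` frame is again hypothetical placeholder data, and the figure size is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit this line in Jupyter
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical placeholder data, not real population figures.
cities = pd.DataFrame({
    "City": ["Delta", "Alpha", "Echo", "Bravo", "Charlie"],
    "Population": [500, 400, 300, 200, 100],
})

fig, ax = plt.subplots(figsize=(8, 4))
# Seaborn takes the column names and uses them as the axis labels.
sns.barplot(data=cities, x="City", y="Population", ax=ax)
ax.tick_params(axis="x", rotation=90)  # the tick labels still need manual rotation
```

After this call, `ax.get_xlabel()` and `ax.get_ylabel()` return the column names, with no extra labeling code.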

Plotly Express performs very well. A good-looking, professional bar chart can be created with very little code. There is no need to set the figure or font size. It even rotates the x-axis labels. All this with a single line of code. Very impressive!

Altair performs well. It produces a nice-looking chart, but it requires more code. It also sorts the bars alphabetically by default, which isn't horrible and would be helpful in many cases, but I think that should be up to the user to decide.

Editing and customization

Winners: Plotly Express, Seaborn, Matplotlib
Loser: Altair

I believe all four libraries have the potential to be winners. Customizing charts works differently in each one, but if you learn a library well enough you will be able to create beautiful visualizations. Here, however, I am judging how easy it is to edit and customize by imagining myself as a new user.

Matplotlib and Seaborn are very easy to customize, and their documentation is great. Even if you don't find what you're looking for in the documentation, you'll easily find it elsewhere online. They also have the advantage of working together: Seaborn is built on Matplotlib, so if you know how to edit one, you largely know how to edit the other, which is very convenient. If you set the Seaborn theme with

sns.set_style('darkgrid')

it also affects Matplotlib charts, which is probably one reason Matplotlib and Seaborn are two of the more popular data visualization libraries.
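
A small sketch of that interaction: `sns.set_style` works by updating Matplotlib's global `rcParams`, so a chart drawn with plain Matplotlib afterwards picks up the theme. The bar values below are arbitrary placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit this line in Jupyter
import matplotlib.pyplot as plt
import seaborn as sns

grid_before = plt.rcParams["axes.grid"]  # Matplotlib's setting before the theme is applied
sns.set_style("darkgrid")                # Seaborn writes its theme into Matplotlib's rcParams
grid_after = plt.rcParams["axes.grid"]   # the darkgrid theme switches the grid on

# A plain Matplotlib chart, with no Seaborn plotting call, now uses the theme.
fig, ax = plt.subplots()
ax.bar(["A", "B", "C"], [3, 2, 1])
```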

Plotly Express provides nice charts from the start and requires less editing than Matplotlib to get very nice visualizations. Its documentation is easy to understand and full of examples, and it is also available through Shift+Tab in a Jupyter notebook, which is very handy. It also offers the most customization options of all the libraries I tried. You can edit anything, including fonts and label colors, and the best part is that it's effortless.

I found Altair's documentation very confusing. Unlike with the other libraries, the Shift+Tab shortcut gave me no help for Altair. For a beginner, this is a real problem. I was able to do some editing, but finding information about it was stressful. Judging by the time editing took me compared with Matplotlib and Plotly Express, Altair is not a good choice for beginners.

Additional functionality

Winners: Plotly Express and Altair
Losers: Matplotlib and Seaborn

For this category, I'm going to consider features beyond those we implement through code. Matplotlib and Seaborn are very basic in this category: they do not provide any additional editing or interaction options beyond the code. Plotly Express, however, shines here. First of all, its charts are interactive: you can simply hover your mouse over the graph and see information about it.


Altair provides options to save the chart or open its JSON specification in the Vega editor.

Documentation and website

Winners: Plotly Express, Altair, Seaborn, Matplotlib

All of these libraries are well documented. Plotly Express has a nice website with code and visual demos. It's easy to read, and information is easy to find. I love how polished and well-designed the site is, and you can even interact with the charts.


Altair also does a good job with its website. Its documentation on customization isn't the best, but the site looks good and it's easy to find code examples. I wouldn't say it's amazing, but it does the job.


Seaborn's website is ok. Some people say they have the best documentation with code examples included. It can get tricky if you're looking for customization options, but otherwise it's a clean site and its documentation is pretty complete.


Matplotlib has a comprehensive website. In my opinion, it has so much text that finding specific information can be a bit tricky. However, the information is there. They also provide the documentation in PDF format.

Summary

The four libraries I've analyzed in this article are all great. Every visualization library has advantages and disadvantages, and finding the right one for you is the key. My favorite is Plotly Express because it excels in all categories. However, Matplotlib and Seaborn are more popular, and most people already have them installed. Altair is among my least favorites. What is your favorite data visualization library?
