SoFunction
Updated on 2024-11-13

A data visualization case study in Python

1. Description of the problem

Make changes to the diagram on the right:

  • Please change the style of the graphic
  • Please change the data for the x-axis to -10 to 10.
  • Define your own function for the y values.
  • Move the value labels so that they are vertically centered inside each bar.
  • Bin the score data into 5-point intervals and show the number of students in each interval as a histogram.
  • Create ranking data for 10 students over 3 semesters and compare the semesters in a grouped bar chart.
  • Line chart:
    • Adjust the figure so that 5 complete peaks appear.
    • Increase the amplitude of the cos waveform.
    • Turn up the frequency of the sin waveform.
  • Show Beijing's air quality data in a line chart

Show how the monthly average PM index changed from 2010 through 2015, with 6 curves on one graph, one curve per year.

2. Experimental environment

Microsoft Windows 10 version 18363

PyCharm 2020.2.1 (Community Edition)

Python 3.8 (Scrapy 2.4.0 + numpy 1.19.4 + pandas 1.1.4 + matplotlib 3.3.3)

3. Experimental procedures and results

Make changes to the diagram on the right:

  • Please change the style of the graphic
  • Please change the data for the x-axis to -10 to 10.
  • Define your own function for the y values.
  • Move the value labels so that they are vertically centered inside each bar.
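
from matplotlib import pyplot as plt
import numpy as np

# Select the style before creating the figure so that it applies to it
plt.style.use('classic')
fig, ax = plt.subplots()
plt.title("square numbers")

ax.set_xlim(-11, 11)
ax.set_ylim(0, 100)

x = np.array(range(-10, 11))  # integers from -10 to 10
y = x * x                     # the constructed function y = x^2
rect1 = ax.bar(x, y)
for r in rect1:
    # Place the value at half the bar height, centered on that point
    ax.text(r.get_x() + r.get_width() / 2, r.get_height() / 2,
            r.get_height(), ha='center', va='center')
plt.show()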

The chart uses the classic style, the x-axis data are the integers in [-10, 10], the constructed function is y = x², and each value label is vertically centered inside its bar.
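The label placement can also be factored into a small helper. The following is a minimal sketch, not the article's code: the helper name `label_bars_centered` is hypothetical, and the `Agg` backend is chosen only so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display is needed)
from matplotlib import pyplot as plt
import numpy as np


def label_bars_centered(ax, bars):
    """Write each bar's height at the geometric center of the bar."""
    texts = []
    for bar in bars:
        cx = bar.get_x() + bar.get_width() / 2    # horizontal center of the bar
        cy = bar.get_y() + bar.get_height() / 2   # vertical center of the bar
        label = f"{float(bar.get_height()):g}"
        texts.append(ax.text(cx, cy, label, ha="center", va="center"))
    return texts


fig, ax = plt.subplots()
x = np.arange(-10, 11)
bars = ax.bar(x, x * x)
labels = label_bars_centered(ax, bars)
```

Matplotlib 3.4 and later also offer `ax.bar_label(bars, label_type='center')` for the same purpose; it is not available in the matplotlib 3.3.3 listed in the experimental environment.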

Bin the score data into 5-point intervals and show the number of students in each interval as a histogram.

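from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

# The CSV file name is missing in the source; the path is left as it appears there
df = pd.read_csv("./", encoding='utf-8', dtype=str)
# The original dtype argument is missing in the source; float is assumed here
df = pd.DataFrame(df, columns=['score']).astype(float)
section = np.array(range(0, 105, 5))   # bin edges 0, 5, ..., 100
result = pd.cut(df['score'], section)  # intervals (0, 5], (5, 10], ...
count = result.value_counts(sort=False)  # keep the bins in ascending order
plt.style.use('classic')  # select the style before creating the figure
fig, ax = plt.subplots()
ax.set_xlim(0, 100)
# One bar per bin, centered on the bin midpoint and as wide as the bin
rect1 = ax.bar(np.arange(2.5, 100, 5), count, width=5)
for r in rect1:
    # Write the count just above the top of each bar
    ax.text(r.get_x(), r.get_height(), r.get_height())
plt.show()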
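Because the score file is not included, the binning step can be checked on a few made-up values. This is a minimal sketch, assuming the bins are built with `pd.cut`; the scores below are hypothetical and are not the article's data.

```python
import pandas as pd

# Hypothetical scores, used only to illustrate the binning
scores = pd.Series([3, 5, 7, 12, 58, 60, 61, 99, 100])

bins = list(range(0, 105, 5))    # edges 0, 5, ..., 100 give 20 bins
binned = pd.cut(scores, bins)    # right-closed intervals: (0, 5], (5, 10], ...
counts = binned.value_counts(sort=False)  # one count per bin, in bin order

print(counts)
```

With right-closed intervals a score of exactly 0 falls outside every bin; passing `include_lowest=True` to `pd.cut` would include it in the first bin.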

Create ranking data for 10 students over 3 semesters and compare the semesters in a grouped bar chart.

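import random
import pandas as pd

# The constructor call is missing in the source; a list of the ranks 1 to 10 is assumed
semester1 = list(range(1, 11))
semester2 = list(range(1, 11))
semester3 = list(range(1, 11))

# Shuffle each semester independently so every student gets a random rank
random.shuffle(semester1)
random.shuffle(semester2)
random.shuffle(semester3)
df = pd.DataFrame({'semester1': semester1, 'semester2': semester2, 'semester3': semester3})
print(df)
# The output file name is missing in the source and is left empty here
df.to_csv("", encoding="utf-8")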

The code above creates the randomized ranking data.
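If the same rankings need to be regenerated later, the shuffle can be seeded. The sketch below is one possible variant, not the article's code: the function name `make_rankings` and the seed value are hypothetical.

```python
import random


def make_rankings(n_students=10, n_semesters=3, seed=0):
    """Return {'semester1': [...], ...} with a random permutation of ranks per semester."""
    rng = random.Random(seed)  # private generator, so the result is repeatable
    data = {}
    for s in range(1, n_semesters + 1):
        ranks = list(range(1, n_students + 1))
        rng.shuffle(ranks)
        data[f"semester{s}"] = ranks
    return data


rankings = make_rankings()
print(rankings)
```

The resulting dictionary can be passed directly to `pd.DataFrame`, in the same way as the three lists above.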

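import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# The CSV file name is missing in the source; the path is left as it appears there
df = pd.read_csv("./", encoding='utf-8', dtype=str)
# The original dtype argument is missing in the source; int is assumed here
df = pd.DataFrame(df, columns=['semester1', 'semester2', 'semester3']).astype(int)

# 'total' holds the mean rank over the three semesters; sort students by it
df['total'] = (df['semester1'] + df['semester2'] + df['semester3']) / 3
df = df.sort_values('total')

plt.style.use('classic')  # select the style before creating the figure
fig, ax = plt.subplots()
plt.title('RANK')
width = 0.2
x = np.array(range(0, 10))
# Three bars per student, placed side by side
rect1 = ax.bar(x - 2 * width, df['semester1'], width=width, label='semester1')
rect2 = ax.bar(x - width, df['semester2'], width=width, label='semester2')
rect3 = ax.bar(x, df['semester3'], width=width, label='semester3')
for r in (*rect1, *rect2, *rect3):
    # Write the rank just above the top of each bar
    ax.text(r.get_x(), r.get_height(), r.get_height())
plt.legend(ncol=1)
plt.show()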

The code above draws the comparison chart.
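The three series are drawn at x - 2·width, x - width, and x, so each group of bars sits to the left of its tick. If centered groups are preferred, the offsets can be computed instead of written by hand. This is a minimal sketch under that assumption; the helper name `group_offsets` is hypothetical.

```python
import numpy as np


def group_offsets(n_series, width):
    """Offsets that center n_series adjacent bars of the given width on a tick."""
    return (np.arange(n_series) - (n_series - 1) / 2) * width


offsets = group_offsets(3, 0.2)
print(offsets)  # offsets for semester1, semester2, semester3
```

Each series would then be drawn with `ax.bar(x + offsets[k], ...)` for its index `k`.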

Line chart:

  • Adjust the figure so that 5 complete peaks appear.
  • Increase the amplitude of the cos waveform.
  • Turn up the frequency of the sin waveform.
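
import numpy as np
from matplotlib import pyplot as plt

# Five periods of cos(x) fit in [-5*pi, 5*pi], which gives 5 complete peaks
x = np.linspace(-5 * np.pi, 5 * np.pi, 500)
y1 = 3 * np.cos(x)   # amplitude raised to 3
y2 = np.sin(4 * x)   # frequency raised by a factor of 4

plt.style.use('classic')  # select the style before creating the figure
fig, ax = plt.subplots()
# Hide the top and right spines and move the other two to the origin
ax.spines["right"].set_visible(False)
ax.spines["top"].set_visible(False)
ax.spines['bottom'].set_position(('data', 0))
ax.xaxis.set_ticks_position('bottom')
ax.spines['left'].set_position(('data', 0))
ax.yaxis.set_ticks_position('left')
plt.plot(x, y1, color='blue', linestyle='-', label='y=3cosx')
# The label now matches the plotted function (the source says sin3x but plots sin(4x))
plt.plot(x, y2, color='red', linestyle='-', label='y=sin4x')
plt.legend()
plt.show()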
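The effect of the range, the amplitude, and the frequency on the number of peaks can be checked numerically. The following is a minimal sketch, not part of the article's code: the helper `count_peaks` is hypothetical, and 501 samples are used (an assumption) so that the cosine peaks land on sample points.

```python
import numpy as np


def count_peaks(y):
    """Count strict local maxima among the interior samples of y."""
    inner = y[1:-1]
    return int(np.sum((inner > y[:-2]) & (inner > y[2:])))


x = np.linspace(-5 * np.pi, 5 * np.pi, 501)
cos_wave = 3 * np.cos(x)   # amplitude 3, period 2*pi
sin_wave = np.sin(4 * x)   # amplitude 1, period pi/2

cos_peaks = count_peaks(cos_wave)
sin_peaks = count_peaks(sin_wave)
print(cos_peaks, sin_peaks)
```

Widening or narrowing the x range changes how many periods fit, the factor in front of the function sets the peak height, and the factor inside the argument sets how many peaks fit into the same range.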

Show Beijing's air quality data in a line chart

Show how the monthly average PM index changed from 2010 through 2015, with 6 curves on one graph, one curve per year.

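import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Read everything as strings, then keep only the three columns that are needed
orig_df = pd.read_csv("./BeijingPM20100101_20151231.csv", encoding='utf-8', dtype=str)
orig_df = pd.DataFrame(orig_df, columns=['year', 'month', 'PM_US Post'])
# Drop rows with a missing value; copy so the assignments below modify df itself
df = orig_df.dropna(axis=0, how='any').copy()
df['month'] = df['month'].astype(int)
df['year'] = df['year'].astype(int)
df['PM_US Post'] = df['PM_US Post'].astype(int)
df.reset_index(drop=True, inplace=True)
num = len(df)
record = 0  # row index at which the next year's data begins
plt.style.use('classic')  # select the style before creating the figure
fig, ax = plt.subplots()
plt.title("2010-2015 Beijing average PM2.5(from PM_US Post) per month")

for nowyear in range(2010, 2016):
    i = record
    result = [0] * 13  # result[m] is the mean for month m; index 0 is unused
    nowsum = 0
    cntday = 0
    nowmonth = 1
    while i < num:
        if df['month'][i] == nowmonth:
            cntday = cntday + 1
            nowsum = nowsum + df['PM_US Post'][i]
        else:
            if df['year'][i] != nowyear:
                # First row of the next year: store December and stop
                record = i
                result[nowmonth] = nowsum / cntday
                break
            # First row of the next month: store the finished month and restart
            result[nowmonth] = nowsum / cntday
            cntday = 1
            nowsum = df['PM_US Post'][i]
            nowmonth = df['month'][i]
        i = i + 1
    else:
        # The data ended without a break (the last year): store its final month,
        # which no later row would otherwise trigger
        if cntday:
            result[nowmonth] = nowsum / cntday
    result = result[1:]
    x = np.array(range(1, 13))
    plt.plot(x, result, linestyle='-', label=str(nowyear))
plt.legend()
plt.show()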
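The same monthly averages can be obtained with a pandas `groupby` instead of the manual loop. This is a minimal sketch of that alternative, not the article's method, and the few rows below are made-up values standing in for the real file.

```python
import pandas as pd

# Hypothetical rows with the same three columns as the Beijing data
df = pd.DataFrame({
    "year": [2010, 2010, 2010, 2011, 2011],
    "month": [1, 1, 2, 1, 1],
    "PM_US Post": [100, 120, 80, 60, 90],
})

# Mean per (year, month), then one column per year with the months as the index
monthly = df.groupby(["year", "month"])["PM_US Post"].mean().unstack("year")
print(monthly)
```

Calling `monthly.plot()` on the resulting table draws one line per year; months without data appear as NaN rather than 0.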
