To sort a pandas.DataFrame or pandas.Series, use the sort_values() and sort_index() methods.
Note that the sort() method, which existed in older versions, was deprecated and has since been removed; use sort_values() or sort_index() instead.
- Sort by elements sort_values()
  - Ascending, Descending (parameter ascending)
  - Multicolumn sorting
  - Handling of missing value NaN (parameter na_position)
  - Change original object (parameter inplace)
  - Sort by row direction (parameter axis)
- Sort by index (row name/column name) sort_index()
  - Sort by row name index
  - Ascending, Descending (parameter ascending)
  - Change original object (parameter inplace)
  - Sort by column name (parameter axis)
Take the following data as an example.
import pandas as pd

df = pd.read_csv('./data/17/sample_pandas_normal.csv')
print(df)
#       name  age state  point
# 0    Alice   24    NY     64
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 3     Dave   68    TX     70
# 4    Ellen   24    CA     88
# 5    Frank   30    NY     57
The examples use a pandas.DataFrame, but pandas.Series also has sort_values() and sort_index(), and the usage is the same.
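Since the examples in this article only use a DataFrame, here is a minimal sketch of the same two methods on a pandas.Series. The Series `s` and its labels are made up for illustration and are not part of the sample CSV.

```python
import pandas as pd

# Hypothetical Series for illustration (not taken from the sample CSV).
s = pd.Series([30, 10, 20], index=['b', 'c', 'a'])

# Sort by values. No `by` argument is needed because a Series has one column.
s_by_value = s.sort_values()
print(s_by_value.tolist())     # [10, 20, 30]
print(list(s_by_value.index))  # ['c', 'a', 'b']

# Sort by index labels instead of values.
s_by_index = s.sort_index()
print(s_by_index.tolist())     # [20, 30, 10]
print(list(s_by_index.index))  # ['a', 'b', 'c']
```

The parameters ascending, na_position, and inplace described below are also accepted by the Series versions of both methods.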
Sort by elements sort_values()
Use the sort_values() method to sort by element values.
Specify the label (column name) of the column to be sorted in the first argument (by).
df_s = df.sort_values('state')
print(df_s)
#       name  age state  point
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88
# 0    Alice   24    NY     64
# 5    Frank   30    NY     57
# 3     Dave   68    TX     70
Ascending, Descending (parameter ascending)
The default is ascending order. To sort in descending order, set the ascending parameter to False.
df_s = df.sort_values('state', ascending=False)
print(df_s)
#       name  age state  point
# 3     Dave   68    TX     70
# 0    Alice   24    NY     64
# 5    Frank   30    NY     57
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88
Multicolumn sorting
If you specify the first argument as a list, you can sort by multiple columns.
The first column in the list is the primary sort key, and each later column breaks ties in the ones before it. This is equivalent to sorting sequentially from the back of the list and sorting by the first column last.
df_s = df.sort_values(['state', 'age'])
print(df_s)
#       name  age state  point
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88
# 1      Bob   42    CA     92
# 0    Alice   24    NY     64
# 5    Frank   30    NY     57
# 3     Dave   68    TX     70

df_s = df.sort_values(['age', 'state'])
print(df_s)
#       name  age state  point
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88
# 0    Alice   24    NY     64
# 5    Frank   30    NY     57
# 1      Bob   42    CA     92
# 3     Dave   68    TX     70
If the ascending parameter is specified as a list, you can select ascending or descending order for each column.
df_s = df.sort_values(['age', 'state'], ascending=[True, False])
print(df_s)
#       name  age state  point
# 2  Charlie   18    CA     70
# 0    Alice   24    NY     64
# 4    Ellen   24    CA     88
# 5    Frank   30    NY     57
# 1      Bob   42    CA     92
# 3     Dave   68    TX     70
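The "sorted from the back of the list" description can be checked directly. The sketch below rebuilds the sample data inline (assumed to match the CSV shown above) and compares a single multicolumn sort with two successive single-column sorts. It passes kind='mergesort' so that each step is stable; that choice is an assumption made for this illustration and is not required for ordinary multicolumn sorting.

```python
import pandas as pd

# Sample data rebuilt inline (assumed to match the CSV used in this article).
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24, 42, 18, 68, 24, 30],
    'state': ['NY', 'CA', 'CA', 'TX', 'CA', 'NY'],
    'point': [64, 92, 70, 70, 88, 57],
})

# One call: 'age' is the primary key, 'state' breaks ties.
multi = df.sort_values(['age', 'state'])

# Two calls, from the back of the list to the front.
# A stable sort keeps the 'state' order among rows with equal 'age'.
step = (
    df.sort_values('state', kind='mergesort')
      .sort_values('age', kind='mergesort')
)

print(list(multi.index))   # [2, 4, 0, 5, 1, 3]
print(multi.equals(step))  # True
```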
Handling of missing value NaN (parameter na_position)
By default, rows with the missing value NaN are placed at the end.
df_nan = df.copy()
df_nan['age'] = df_nan['age'].astype(float)  # NaN needs a float column
df_nan.iloc[:2, 1] = float('nan')
print(df_nan)
#       name   age state  point
# 0    Alice   NaN    NY     64
# 1      Bob   NaN    CA     92
# 2  Charlie  18.0    CA     70
# 3     Dave  68.0    TX     70
# 4    Ellen  24.0    CA     88
# 5    Frank  30.0    NY     57

df_nan_s = df_nan.sort_values('age')
print(df_nan_s)
#       name   age state  point
# 2  Charlie  18.0    CA     70
# 4    Ellen  24.0    CA     88
# 5    Frank  30.0    NY     57
# 3     Dave  68.0    TX     70
# 0    Alice   NaN    NY     64
# 1      Bob   NaN    CA     92
If the parameter na_position is set to 'first', rows with NaN are placed at the beginning.
df_nan_s = df_nan.sort_values('age', na_position='first')
print(df_nan_s)
#       name   age state  point
# 0    Alice   NaN    NY     64
# 1      Bob   NaN    CA     92
# 2  Charlie  18.0    CA     70
# 4    Ellen  24.0    CA     88
# 5    Frank  30.0    NY     57
# 3     Dave  68.0    TX     70
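One detail that is easy to miss: na_position is applied independently of ascending, so NaN stays at the end in descending order too unless na_position='first' is given. A small sketch with a made-up Series (the values are illustrative, not from the sample CSV):

```python
import pandas as pd

# Hypothetical values; index 1 holds the missing value.
s = pd.Series([24.0, float('nan'), 18.0, 68.0])

# Descending order: NaN is still placed last by default.
desc = s.sort_values(ascending=False)
print(list(desc.index))        # [3, 0, 2, 1]

# Descending order with NaN moved to the front.
desc_first = s.sort_values(ascending=False, na_position='first')
print(list(desc_first.index))  # [1, 3, 0, 2]
```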
To remove missing values or replace them with another value, see the following article.
pandas: Remove, replace, and extract missing values NaN (dropna, fillna, isnull)
Change original object (parameter inplace)
By default, a new sorted object will be returned, but if the inplace parameter is True, the original object itself will be changed.
df.sort_values('state', inplace=True)
print(df)
#       name  age state  point
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88
# 0    Alice   24    NY     64
# 5    Frank   30    NY     57
# 3     Dave   68    TX     70
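A common pitfall with inplace=True is assigning the result back to a variable: in that mode the method returns None. The sketch below uses a small made-up DataFrame (`df_demo` is a hypothetical name) to show this.

```python
import pandas as pd

# Hypothetical data for illustration.
df_demo = pd.DataFrame({'state': ['NY', 'CA', 'TX'], 'point': [64, 92, 70]})

# With inplace=True the object is modified and nothing is returned.
returned = df_demo.sort_values('state', inplace=True)

print(returned)             # None
print(list(df_demo.index))  # [1, 0, 2]
```

Writing `df_demo = df_demo.sort_values('state', inplace=True)` would therefore replace the DataFrame with None.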
Sort by row direction (parameter axis)
As in the previous examples, sorting is done in the column direction (vertically) by default, that is, rows are reordered.
To sort in the row direction (horizontally), set the parameter axis to 1; by then takes a row label instead of a column name. The other parameters work the same as in the previous examples.
Since an error occurs if numbers and strings are mixed in the row being sorted, the string columns are dropped here and only the numeric columns are kept. For the drop() method, see the following article.
Delete specified rows and columns (drop)
df_d = df.drop(['name', 'state'], axis=1)
print(df_d)
#    age  point
# 1   42     92
# 2   18     70
# 4   24     88
# 0   24     64
# 5   30     57
# 3   68     70

# Reorder the columns by the values in the row labeled 1
df_d.sort_values(by=1, axis=1, ascending=False, inplace=True)
print(df_d)
#    point  age
# 1     92   42
# 2     70   18
# 4     88   24
# 0     64   24
# 5     57   30
# 3     70   68
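With only two columns the effect of axis=1 is hard to see, so here is a slightly wider sketch. The DataFrame `df_num` and its row labels 'x' and 'y' are made up for illustration; the point is that by names a row label and the columns are what get reordered.

```python
import pandas as pd

# Hypothetical all-numeric data with string row labels.
df_num = pd.DataFrame(
    {'a': [3, 1], 'b': [1, 2], 'c': [2, 3]},
    index=['x', 'y'],
)

# Reorder columns by the values in row 'x' (a=3, b=1, c=2).
by_x = df_num.sort_values(by='x', axis=1)
print(list(by_x.columns))       # ['b', 'c', 'a']

# Reorder columns by the values in row 'y' (a=1, b=2, c=3), descending.
by_y_desc = df_num.sort_values(by='y', axis=1, ascending=False)
print(list(by_y_desc.columns))  # ['c', 'b', 'a']
```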
Sort by index (row name/column name) sort_index()
Use the sort_index() method to sort by index (row name/column name).
Sort by row name index
By default, sort_index() sorts in the column direction (vertically) based on the row name.
print(df)
#       name  age state  point
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88
# 0    Alice   24    NY     64
# 5    Frank   30    NY     57
# 3     Dave   68    TX     70

df_s = df.sort_index()
print(df_s)
#       name  age state  point
# 0    Alice   24    NY     64
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 3     Dave   68    TX     70
# 4    Ellen   24    CA     88
# 5    Frank   30    NY     57
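Because sort_values() keeps each row's original index label, sort_index() can be used to undo a value sort. The following sketch uses a small made-up DataFrame (`df_demo` and its contents are hypothetical) to show the round trip.

```python
import pandas as pd

# Hypothetical data with distinct states so the sorted order is unambiguous.
df_demo = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Dave'],
    'state': ['NY', 'CA', 'TX', 'AZ'],
})

# Sorting by value shuffles the index labels along with the rows.
shuffled = df_demo.sort_values('state')
print(list(shuffled.index))      # [3, 1, 0, 2]

# Sorting by index restores the original row order.
restored = shuffled.sort_index()
print(restored.equals(df_demo))  # True
```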
Ascending, Descending (parameter ascending)
As with sort_values(), the default value is ascending. To use descending order, set the ascending parameter to False.
df_s = df.sort_index(ascending=False)
print(df_s)
#       name  age state  point
# 5    Frank   30    NY     57
# 4    Ellen   24    CA     88
# 3     Dave   68    TX     70
# 2  Charlie   18    CA     70
# 1      Bob   42    CA     92
# 0    Alice   24    NY     64
Change original object (parameter inplace)
As with sort_values(), the inplace parameter can be specified. If it is True, the original object itself is changed.
df.sort_index(inplace=True)
print(df)
#       name  age state  point
# 0    Alice   24    NY     64
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 3     Dave   68    TX     70
# 4    Ellen   24    CA     88
# 5    Frank   30    NY     57
Sort by column name (parameter axis)
As with sort_values(), setting axis=1 sorts in the row direction (horizontally), reordering the columns by column name. The other parameters can be used as in the previous examples.
df_s = df.sort_index(axis=1)
print(df_s)
#    age     name  point state
# 0   24    Alice     64    NY
# 1   42      Bob     92    CA
# 2   18  Charlie     70    CA
# 3   68     Dave     70    TX
# 4   24    Ellen     88    CA
# 5   30    Frank     57    NY

df.sort_index(axis=1, ascending=False, inplace=True)
print(df)
#   state  point     name  age
# 0    NY     64    Alice   24
# 1    CA     92      Bob   42
# 2    CA     70  Charlie   18
# 3    TX     70     Dave   68
# 4    CA     88    Ellen   24
# 5    NY     57    Frank   30
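As a compact check of the column-name sort, the sketch below uses a one-row made-up DataFrame (`df_demo` is a hypothetical name) and inspects only the resulting column order. It also confirms that each value moves together with its column label.

```python
import pandas as pd

# Hypothetical single-row frame using the same column names as the sample data.
df_demo = pd.DataFrame(
    {'name': ['Alice'], 'age': [24], 'state': ['NY'], 'point': [64]}
)

# Columns are reordered by label; the row order is untouched.
asc = df_demo.sort_index(axis=1)
desc = df_demo.sort_index(axis=1, ascending=False)

print(list(asc.columns))   # ['age', 'name', 'point', 'state']
print(list(desc.columns))  # ['state', 'point', 'name', 'age']
print(asc.loc[0, 'age'])   # 24
```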