SoFunction
Updated on 2024-11-20

pandas null data handling methods explained

This article explains how to handle null (missing) data in pandas and walks through the methods with sample code. It is intended as a reference for study or work.

Method 1: Direct deletion

1. Check whether rows or columns contain null values. Below, df is a DataFrame. In any() and all(), axis=0 gives one result per column and axis=1 gives one result per row. Each call returns a Series of row or column labels paired with Boolean values.

The isnull method:

View rows: df.isnull().any(axis=1)

View columns: df.isnull().any(axis=0)

The notnull method:

View rows: df.notnull().all(axis=1)

View columns: df.notnull().all(axis=0)

Example:

df.isnull().any(axis=1) # Detect whether each row contains a null value
0 False
1 True
2 False
3 True
4 False
5 True
6 False
7 True
8 False
9 False
dtype: bool

Note: Any of the Boolean results above can be inverted with ~ to get the opposite result.
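The checks above can be tried on a small table. This is a minimal sketch: the DataFrame and its column names are made up for illustration and are not from the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: rows 1 and 3 each contain one null value.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0, 4.0],
        "b": [5.0, 6.0, 7.0, np.nan],
        "c": [9.0, 10.0, 11.0, 12.0],
    }
)

rows_with_null = df.isnull().any(axis=1)  # one Boolean per row
cols_with_null = df.isnull().any(axis=0)  # one Boolean per column
rows_complete = df.notnull().all(axis=1)  # True where the row has no nulls

# ~ inverts a Boolean Series, so these two describe the same rows.
assert (~rows_with_null).equals(rows_complete)

print(rows_with_null.tolist())  # [False, True, False, True]
print(cols_with_null.tolist())  # [True, True, False]
```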

2. Building on step 1, pass the Boolean result to df.loc[] to select the rows that contain null values, for example:

df.loc[df.isnull().any(axis=1)]

To get the index labels of these rows, use the index attribute, for example: drop_index = df.loc[df.isnull().any(axis=1)].index

After getting these index labels, you can delete the rows with the drop method:

df.drop(labels=drop_index, axis=0)

Note: In drop, axis has the opposite meaning from any() and all() above: axis=0 drops rows and axis=1 drops columns.
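The select, index, and drop steps can be chained on a small table. This is a sketch under assumptions: the sample data is hypothetical, and the variable names other than drop_index are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: rows 1 and 3 each contain one null value.
df = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0, 4.0], "b": [5.0, 6.0, 7.0, np.nan]}
)

mask = df.isnull().any(axis=1)     # Boolean mask of rows that contain nulls
rows_with_null = df.loc[mask]      # the matching rows
drop_index = rows_with_null.index  # their index labels
cleaned = df.drop(labels=drop_index, axis=0)  # new DataFrame without them

print(list(drop_index))        # [1, 3]
print(cleaned.index.tolist())  # [0, 2]
```

Note that drop returns a new DataFrame here; df itself still has all four rows.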

Summarized in 4 steps:

I. Use isnull or notnull to build a Boolean filter: df.isnull().any(axis=1)

II. Use loc to select the matching rows: df.loc[df.isnull().any(axis=1)]

III. Take the index of these rows: df.loc[df.isnull().any(axis=1)].index

IV. Delete them with drop: df.drop(labels=drop_index, axis=0)

Method 2: Fill in the null value

The steps are the same as the first few steps of Method 1

isnull()

notnull()

dropna(): filters out missing data (df.dropna() can drop either rows or columns and drops rows by default: axis=0 means rows, axis=1 means columns)

fillna(): fills in missing data (you can supply the fill value yourself, or fill from values already in the table)
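As a quick illustration of the axis argument of dropna() described above, here is a minimal sketch on made-up data (the DataFrame and variable names are assumptions for the example):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: only row 1 of column "a" is null.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0],
        "b": [4.0, 5.0, 6.0],
    }
)

rows_dropped = df.dropna(axis=0)  # drops row 1, which contains a null
cols_dropped = df.dropna(axis=1)  # drops column "a", which contains a null

print(rows_dropped.index.tolist())    # [0, 2]
print(cols_dropped.columns.tolist())  # ['b']
```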

1. Use dropna (not commonly used): df.dropna(axis=0)

2. Use fillna (commonly used):

I. df.fillna(value=666) replaces every null value with 666

II. df.fillna(method='ffill', axis=0) # axis=0 fills in the vertical direction (axis: 0 is vertical, 1 is horizontal), and ffill takes the preceding value; combined, each null value is filled with the value above it in the same column

III. df.fillna(method='bfill', axis=1) # axis=1 fills in the horizontal direction (axis: 0 is vertical, 1 is horizontal), and bfill takes the following value; combined, each null value is filled with the value to its right in the same row
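Filling with a constant can be checked on a small table. This is a minimal sketch with hypothetical data; 666 is the placeholder value used in the text.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one null in each column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

filled = df.fillna(value=666)  # returns a new DataFrame; df is unchanged

print(filled["a"].tolist())  # [1.0, 666.0, 3.0]
print(filled["b"].tolist())  # [666.0, 5.0, 6.0]
```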

To summarize: ffill (forward) and bfill (backward) determine whether the preceding or the following value is used, and axis determines whether the fill runs vertically or horizontally.
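The two directions can be compared side by side. The sketch below uses made-up data and calls DataFrame.ffill() and DataFrame.bfill() directly, because pandas 2.1 and later deprecate the method argument of fillna in favor of these two methods; on older versions, fillna(method='ffill') and fillna(method='bfill') should give the same results.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: nulls at (row 1, "a") and (row 2, "b").
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0],
        "b": [4.0, 5.0, np.nan],
        "c": [7.0, 8.0, 9.0],
    }
)

filled_down = df.ffill(axis=0)        # each null takes the value above it
filled_from_right = df.bfill(axis=1)  # each null takes the value to its right

print(filled_down["a"].tolist())        # [1.0, 1.0, 3.0]
print(filled_down["b"].tolist())        # [4.0, 5.0, 5.0]
print(filled_from_right["a"].tolist())  # [1.0, 5.0, 3.0]
print(filled_from_right["b"].tolist())  # [4.0, 5.0, 9.0]
```

A null with no neighbor in the chosen direction (for example, a null in the first row under ffill with axis=0) stays null.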
