Pandas 2.2 DataFrame
Indexing, iteration
Method | Description |
---|---|
DataFrame.head([n]) | Return the first n rows of the DataFrame |
DataFrame.at | Quickly access or modify a single value by row label and column label |
DataFrame.iat | Quickly access or modify a single value by integer row and column position |
DataFrame.loc | Access or modify data by label (row labels and column labels) |
DataFrame.iloc | Access or modify data by integer position (row and column numbers) |
DataFrame.insert(loc, column, value[, …]) | Insert a new column at the specified location of the DataFrame |
DataFrame.__iter__() | Iterate over the column names of the DataFrame |
DataFrame.items() | Iterate over (column name, column data) pairs |
DataFrame.keys() | Return the column names of the DataFrame |
DataFrame.iterrows() | Iterate over the rows as (index, Series) pairs |
DataFrame.itertuples([index, name]) | Iterate over the rows as named tuples |
DataFrame.pop(item) | Remove the specified column from the DataFrame and return it |
DataFrame.tail([n]) | Return the last n rows of the DataFrame |
DataFrame.xs(key[, axis, level, drop_level]) | Extract a cross-section from the DataFrame |
DataFrame.get(key[, default]) | Get the specified column from the DataFrame, or default if it is missing |
DataFrame.isin(values) | Check whether each element of the DataFrame is contained in the given values |
DataFrame.where(cond[, other, inplace, …]) | Keep elements where the condition holds and replace the rest |
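As a quick illustration of several methods from the table above, here is a minimal sketch. The example data and variable names are assumptions made for this illustration, not part of the original article.

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Scalar access by label (at) and by integer position (iat)
first_a = df.at[0, "A"]        # row label 0, column "A"
last_b = df.iat[3, 1]          # row position 3, column position 1

# Slicing by label (loc) and by integer position (iloc)
by_label = df.loc[1:2, "A"]    # label slices include both endpoints
by_position = df.iloc[1:3, 0]  # position slices exclude the end

# Insert a new column at position 1, then remove it again with pop
df.insert(1, "C", [10, 20, 30, 40])
popped = df.pop("C")

# Element-wise membership test
membership = df.isin([1, 8])

# Iterate over the rows as named tuples
rows = [(row.Index, row.A, row.B) for row in df.itertuples()]

print(first_a, last_b)
print(membership)
print(rows)
```

Note that loc and iloc treat the end of a slice differently, which is a common source of off-by-one mistakes.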
DataFrame.where()
DataFrame.where(cond, other=nan, *, inplace=False, axis=None, level=None)
The DataFrame.where() method filters the elements of a DataFrame based on a condition. Where the condition is True, the original element is kept; where it is False, the element is replaced with the value given by the other parameter.
Parameters
- cond: The Boolean condition. It can be a Boolean Series, a Boolean DataFrame, a Boolean array, or a callable that returns one of these.
- other: Optional. The replacement used where the condition is False. It can be a scalar, a Series, a DataFrame, or a callable. Defaults to NaN.
- inplace: Boolean. If True, the original DataFrame is modified directly; otherwise a new DataFrame is returned. Defaults to False.
- axis: The axis to align along, if alignment is needed. 0 or 'index' refers to rows, 1 or 'columns' refers to columns. Defaults to None.
- level: The index level to use for alignment when the index is a multi-level index. Defaults to None.
Return value
- If inplace=False, a new DataFrame is returned.
- If inplace=True, None is returned.
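Besides Boolean objects and scalars, pandas also accepts callables for cond and other. Each callable receives the DataFrame and returns the condition or the replacement values. The following is a minimal sketch; the data and the choice of lambdas are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# cond as a callable: keep even values
# other as a callable: replace the remaining values with their negation
result = df.where(lambda d: d % 2 == 0, lambda d: -d)
print(result)

# inplace defaults to False, so the original DataFrame is left unchanged
print(df)
```

Here the odd values come back negated while the even values are kept, and df itself still holds the original numbers.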
Example
Suppose we have the following DataFrame:
    import pandas as pd

    data = {
        'A': [1, 2, 3, 4],
        'B': [5, 6, 7, 8]
    }
    df = pd.DataFrame(data)
    print("Original DataFrame:")
    print(df)
Output:
    Original DataFrame:
       A  B
    0  1  5
    1  2  6
    2  3  7
    3  4  8
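Example 1: Replace values using a Boolean condition
Keep the rows where column A is at most 2 and replace the rest with NaN. The condition is a Boolean Series, so it applies to every column and whole rows are replaced. The columns become float because NaN is a floating-point value:
    result = df.where(df['A'] <= 2)
    print("\nReplace rows where column A is greater than 2 with NaN:")
    print(result)
Output:
    Replace rows where column A is greater than 2 with NaN:
         A    B
    0  1.0  5.0
    1  2.0  6.0
    2  NaN  NaN
    3  NaN  NaN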
Example 2: Use a Boolean condition with a custom replacement value
Replace the rows where column A is greater than 2 with 0:
    result = df.where(df['A'] <= 2, other=0)
    print("\nReplace rows where column A is greater than 2 with 0:")
    print(result)
Output:
    Replace rows where column A is greater than 2 with 0:
       A  B
    0  1  5
    1  2  6
    2  0  0
    3  0  0
Example 3: Combine several Boolean conditions
Keep only the rows where column A is at most 2 and column B is at most 6, and replace all other rows with NaN. Combining the two comparisons with & yields a single Boolean Series, so the condition again applies to whole rows:
    cond = (df['A'] <= 2) & (df['B'] <= 6)
    result = df.where(cond)
    print("\nReplace rows where column A is greater than 2 or column B is greater than 6 with NaN:")
    print(result)
Output:
    Replace rows where column A is greater than 2 or column B is greater than 6 with NaN:
         A    B
    0  1.0  5.0
    1  2.0  6.0
    2  NaN  NaN
    3  NaN  NaN
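The condition in Example 3 is a single Boolean Series, so whole rows are kept or replaced. To test each column against its own threshold, a Boolean DataFrame of the same shape can be passed instead. The sketch below assumes thresholds of 2 for column A and 7 for column B, chosen here only so that the two columns behave differently.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Build a Boolean DataFrame with the same shape as df:
# column A is compared with 2, column B with 7 (illustrative thresholds).
cond = pd.DataFrame({"A": df["A"] <= 2, "B": df["B"] <= 7})

# Each element is kept or replaced according to its own entry in cond
result = df.where(cond)
print(result)
```

In row 2 the value in column A is replaced with NaN while the value in column B is kept, which a row-wise Series condition cannot express.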
Example 4: Use inplace=True to modify the original DataFrame directly
Replace the rows where column A is greater than 2 with 0, modifying the original DataFrame directly:
    df.where(df['A'] <= 2, other=0, inplace=True)
    print("\nModify the original DataFrame directly:")
    print(df)
Output:
    Modify the original DataFrame directly:
       A  B
    0  1  5
    1  2  6
    2  0  0
    3  0  0
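Because a call with inplace=True returns None, assigning its result back to the variable would lose the DataFrame. The following minimal sketch contrasts the two styles; the variable names returned and df2 are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# In-place style: df is modified and the call returns None
returned = df.where(df["A"] <= 2, other=0, inplace=True)
print(returned)  # None

# Reassignment style: the call returns a new DataFrame
df2 = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})
df2 = df2.where(df2["A"] <= 2, other=0)

# Both styles end with the same values
print(df.values.tolist() == df2.values.tolist())  # True
```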
Example 5: Using a multi-level index
Suppose we have a DataFrame with a multi-level index, built again from the original data:
    index = pd.MultiIndex.from_tuples([('a', 'x'), ('a', 'y'), ('b', 'x'), ('b', 'y')], names=['first', 'second'])
    df = pd.DataFrame(data, index=index)
    print("Original DataFrame:")
    print(df)
Output:
    Original DataFrame:
                  A  B
    first second
    a     x       1  5
          y       2  6
    b     x       3  7
          y       4  8
Use the where method and specify the level parameter. The level is used only when alignment is needed; the rows where column A is at most 2 are kept and the others are replaced with NaN:
    result = df.where(df['A'] <= 2, level='first')
    print("\nUse the where method and specify the level parameter:")
    print(result)
Output:
    Use the where method and specify the level parameter:
                    A    B
    first second
    a     x       1.0  5.0
          y       2.0  6.0
    b     x       NaN  NaN
          y       NaN  NaN
Summary
The DataFrame.where() method provides a flexible way to filter and replace elements of a DataFrame based on a condition. A Boolean Series, a Boolean array, or a Boolean DataFrame specifies which elements are kept and which are replaced. The other parameter sets the replacement value (NaN by default), and the inplace parameter controls whether the original DataFrame is modified directly. This is useful for data cleaning and preprocessing.
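As one possible data-cleaning use, the sketch below marks invalid readings as missing and then fills them. The data, the column names, and the rule that negative values are invalid are all assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical sensor readings; negative values are assumed to be invalid
readings = pd.DataFrame({
    "temp": [21.5, -999.0, 22.1],
    "humidity": [40.0, 42.0, -1.0],
})

# Keep non-negative values and replace the rest with NaN (the default other)
cleaned = readings.where(readings >= 0)

# Count the invalid entries per column
invalid_counts = cleaned.isna().sum()

# Fill each gap with the median of the valid values in its column
filled = cleaned.fillna(cleaned.median())

print(cleaned)
print(invalid_counts)
print(filled)
```

Marking invalid values as NaN first keeps them out of the median, so the sentinel -999.0 does not distort the fill value.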