SoFunction
Updated on 2025-04-23

pandas DataFrame.where() implementation example

Pandas 2.2 DataFrame

Indexing, iteration

Method: Description
  • DataFrame.head([n]): Returns the first n rows of the DataFrame
  • DataFrame.at: Accesses or modifies a single value by row label and column label
  • DataFrame.iat: Accesses or modifies a single value by integer row and column position
  • DataFrame.loc: Accesses or modifies data by labels (row labels and column labels)
  • DataFrame.iloc: Accesses or modifies data by integer positions (row and column numbers)
  • DataFrame.insert(loc, column, value[, …]): Inserts a new column at the specified position of the DataFrame
  • DataFrame.__iter__(): Iterates over the column names of the DataFrame
  • DataFrame.items(): Iterates over (column name, column data) pairs
  • DataFrame.keys(): Returns the column names of the DataFrame
  • DataFrame.iterrows(): Iterates over the rows as (index, Series) pairs
  • DataFrame.itertuples([index, name]): Iterates over the rows as named tuples
  • DataFrame.pop(item): Removes the specified column from the DataFrame and returns it
  • DataFrame.tail([n]): Returns the last n rows of the DataFrame
  • DataFrame.xs(key[, axis, level, drop_level]): Extracts a cross-section from the DataFrame
  • DataFrame.get(key[, default]): Gets the data for the specified column, or default if the column does not exist
  • DataFrame.isin(values): Checks whether each element of the DataFrame is contained in the given collection of values
  • DataFrame.where(cond[, other, inplace, …]): Keeps or replaces elements of the DataFrame based on a condition
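As a quick orientation, several of the methods listed above can be tried on a small frame. This is only a sketch: the row labels "w" to "z" and the extra column "C" are made up for illustration.

```python
import pandas as pd

# Small illustrative frame; the row labels are hypothetical.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]},
                  index=["w", "x", "y", "z"])

first_two = df.head(2)            # first 2 rows
last_one = df.tail(1)             # last row
by_label = df.at["x", "B"]        # single value by label
by_pos = df.iat[2, 0]             # single value by integer position
rows_xy = df.loc["x":"y", "A"]    # label slice, end label included
top_left = df.iloc[:2, :1]        # position slice, end excluded

df.insert(1, "C", [9, 10, 11, 12])   # new column at position 1
col_names = list(df.keys())          # column names after the insert
popped = df.pop("C")                 # removes column C and returns it

membership = df.isin([1, 8])         # element-wise membership test
missing = df.get("Z", default=None)  # no column "Z", so the default is returned
pairs = list(df.itertuples(index=False, name=None))  # rows as plain tuples
```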

DataFrame.where()

The DataFrame.where(cond, other=nan, *, inplace=False, axis=None, level=None) method filters the elements of a DataFrame based on a condition. Where the condition is True, the element is kept; where the condition is False, the element is replaced with the value given by the other parameter.
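The keep-or-replace rule can be compared with numpy.where, which takes the same three ingredients in a different order. The following is a minimal sketch, assuming an all-integer frame and a scalar replacement value of -1 chosen only for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})
cond = df % 2 == 0   # True for even values, which are kept

# pandas form: keep where cond is True, otherwise use -1
kept = df.where(cond, other=-1)

# roughly the same selection written with numpy.where(condition, x, y)
same = pd.DataFrame(np.where(cond, df, -1), index=df.index, columns=df.columns)

print(kept)
```

Unlike numpy.where, the pandas method returns a DataFrame with the original index and column labels, so the result does not need to be wrapped again.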

Parameters

  • cond: The Boolean condition. It can be a Boolean Series, a Boolean DataFrame, a Boolean array, or a callable that returns one of these.
  • other: Optional. The value used where the condition is False. Default is NaN.
  • inplace: Boolean. If True, the original DataFrame is modified directly; otherwise a new DataFrame is returned. Default is False.
  • axis: The axis along which to align when alignment is needed. 0 or 'index' means by row, 1 or 'columns' means by column. Default is None.
  • level: If the index is a multi-level index, the level to use for alignment. Default is None.
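Two of the parameters above are easier to see in code: cond and other may be callables, and axis controls how a Series passed as other is lined up with the frame. This is a sketch only; the replacement values in row_fill are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# cond and other as callables: each one receives the DataFrame itself.
# Even values are kept, odd values are replaced by ten times their value.
scaled = df.where(lambda d: d % 2 == 0, lambda d: d * 10)

# other as a Series: axis=0 aligns it with the row index,
# so every row has its own replacement value.
row_fill = pd.Series([-1, -2, -3, -4], index=df.index)
filled = df.where(df > 2, row_fill, axis=0)

print(scaled)
print(filled)
```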

Return value

  • If inplace=False, a new DataFrame is returned.
  • If inplace=True, None is returned.

Example

Suppose we have a DataFrame as follows:

import pandas as pd
import numpy as np

data = {
    'A': [1, 2, 3, 4],
    'B': [5, 6, 7, 8]
}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

Output:

Original DataFrame:
   A  B
0  1  5
1  2  6
2  3  7
3  4  8

Example 1: Replace the value with a boolean condition

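Keep the rows where column A is at most 2 and replace the values in all other rows with NaN. Because the condition is a Series, it is applied to every column, so column B is replaced in the same rows:

result = df.where(df['A'] <= 2)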
print("\nReplace the value greater than 2 in column A with NaN:")
print(result)

Output:

Replace the value greater than 2 in column A with NaN:
     A    B
0  1.0  5.0
1  2.0  6.0
2  NaN  NaN
3  NaN  NaN

Example 2: Use Boolean conditions and custom replacement values

Same condition as in Example 1, but the rows where column A is greater than 2 are filled with 0 instead of NaN:

result = df.where(df['A'] <= 2, other=0)
print("\nReplace the value greater than 2 in column A with 0:")
print(result)

Output:

Replace the value greater than 2 in column A with 0:
   A  B
0  1  5
1  2  6
2  0  0
3  0  0
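The outputs of Examples 1 and 2 differ in more than the fill value: Example 1 prints floats and Example 2 prints integers. A likely explanation is that NaN cannot be stored in an integer column, so pandas converts the columns to float64, while an integer fill value leaves the dtype alone. A short check, reusing the same data:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

with_nan = df.where(df["A"] <= 2)             # NaN fill forces a float dtype
with_zero = df.where(df["A"] <= 2, other=0)   # integer fill keeps the integer dtype

print(with_nan.dtypes)
print(with_zero.dtypes)
```

If integer columns must be preserved, passing an explicit integer value for other, as in Example 2, avoids the conversion.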

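Example 3: Replace values with a Boolean DataFrame

Replace the values greater than 2 in column A and the values greater than 6 in column B with NaN, using a Boolean DataFrame that holds one condition per column:

# One Boolean column per DataFrame column; True means "keep"
cond = pd.DataFrame({'A': df['A'] <= 2, 'B': df['B'] <= 6})
result = df.where(cond)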
print("\nReplace the value greater than 2 in column A and the value greater than 6 in column B with NaN:")
print(result)

Output:

Replace the value greater than 2 in column A and the value greater than 6 in column B with NaN:
     A    B
0  1.0  5.0
1  2.0  6.0
2  NaN  NaN
3  NaN  NaN

Example 4: Use inplace=True to directly modify the original DataFrame

Fill the rows where column A is greater than 2 with 0, modifying the original DataFrame directly:

df.where(df['A'] <= 2, other=0, inplace=True)
print("\nModify the original DataFrame directly:")
print(df)

Output:

Modify the original DataFrame directly:
   A  B
0  1  5
1  2  6
2  0  0
3  0  0

Example 5: Using Multilevel Indexing

Suppose we have a DataFrame with a multi-level index:

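# Rebuild df from the original data dict, this time with a two-level row index
index = pd.MultiIndex.from_tuples([('a', 'x'), ('a', 'y'), ('b', 'x'), ('b', 'y')], names=['first', 'second'])
df = pd.DataFrame(data, index=index)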
print("Original DataFrame:")
print(df)

Output:

Original DataFrame:
              A  B
first second      
a     x       1  5
      y       2  6
b     x       3  7
      y       4  8

Use the where method and specify the level parameter:

result = df.where(df['A'] <= 2, level='first')
print("\nUse the where method and specify the level parameter:")
print(result)

Output:

Use the where method and specify the level parameter:
                A    B
first second          
a     x       1.0  5.0
      y       2.0  6.0
b     x       NaN  NaN
      y       NaN  NaN

Summary

DataFrame.where() provides a flexible way to keep or replace elements of a DataFrame based on a condition. The condition can be a Boolean Series, a Boolean array, or a Boolean DataFrame, and it decides which elements are kept and which are replaced. The other parameter sets the replacement value, which defaults to NaN. The inplace parameter controls whether the original DataFrame is modified directly. This makes the method useful for data cleaning and preprocessing.
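As one possible illustration of the data-cleaning use mentioned above, the sketch below treats negative readings as invalid. The column names, the values, and the choice of filling with the column mean are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical sensor readings; negative numbers stand for invalid entries.
readings = pd.DataFrame({
    "temp": [21.5, -999.0, 22.1],
    "humidity": [40.0, 42.0, -1.0],
})

# Keep valid values; invalid ones become NaN.
cleaned = readings.where(readings >= 0)

# One way to continue: fill each gap with the mean of its column.
filled = cleaned.fillna(cleaned.mean())

print(cleaned)
print(filled)
```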
