SoFunction
Updated on 2024-11-18

Examples of deleting specified rows/columns of data with Python pandas

1. Filter out missing data: dropna()

import pandas as pd
import numpy as np
df=pd.DataFrame({"record":[np.nan,"Sub-healthy|Pan Light|45 years old","Disease | Jang Si",np.nan],"date":[np.nan,20210102,20210103,20210104]},index=["one","two","three","four"])

1) Filter out all rows containing NaN values

df.dropna()  # default axis=0

2) Filter out all columns containing NaN values

df.dropna(axis=1)

3) Filter out rows whose elements are all NaN values

df.dropna(axis=0,how="all")

4) Filter out columns whose elements are all NaN values

df.dropna(axis=1,how="all")

5) Filter out rows that have missing values in the specified columns

df.dropna(subset=["record"],axis=0)

By default dropna() returns a new DataFrame. To modify the original data directly, set the parameter inplace=True.
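The five dropna() variants above can be compared side by side in a short runnable sketch. The positions of the NaN values in the sample data are an assumption made for illustration (row "one" entirely missing, row "four" missing only its "record"); the variable names are likewise hypothetical.

```python
import numpy as np
import pandas as pd

# Assumed sample data: row "one" is all NaN, row "four" lacks only "record".
df = pd.DataFrame(
    {
        "record": [np.nan, "Sub-healthy|Pan Light|45 years old", "Disease | Jang Si", np.nan],
        "date": [np.nan, 20210102, 20210103, 20210104],
    },
    index=["one", "two", "three", "four"],
)

rows_any = df.dropna()                    # drop rows containing any NaN
cols_any = df.dropna(axis=1)              # drop columns containing any NaN
rows_all = df.dropna(axis=0, how="all")   # drop rows that are entirely NaN
cols_all = df.dropna(axis=1, how="all")   # drop columns that are entirely NaN
by_record = df.dropna(subset=["record"])  # drop rows where "record" is NaN

print(list(rows_any.index))    # ['two', 'three']
print(list(cols_any.columns))  # [] (both columns contain a NaN)
print(list(rows_all.index))    # ['two', 'three', 'four']
print(list(cols_all.columns))  # ['record', 'date']
print(list(by_record.index))   # ['two', 'three']

# None of the calls above changed df. With inplace=True the call
# modifies df itself and returns None.
result = df.dropna(inplace=True)
print(result, len(df))         # None 2
```

Because inplace=True returns None, assigning its result back to df would discard the data; either reassign the returned DataFrame or use inplace=True, not both.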

2. Delete duplicate values: drop_duplicates()

df=pd.DataFrame({'state':[1,1,2,2,1,2,2],'pop':['a','b','c','d','b','c','d']})

Syntax: df.drop_duplicates(subset, keep, inplace), where keep is one of {'first', 'last', False} and defaults to 'first'.

'first': keeps the first occurrence of each duplicate and deletes the second and subsequent occurrences.

'last': keeps the last occurrence of each duplicate and deletes the occurrences before it.

False: deletes every row that has a duplicate, keeping none of them.

1) keep="first"

df.drop_duplicates(keep="first")

2) keep="last"

df.drop_duplicates(keep="last")

3) keep=False

df.drop_duplicates(keep=False)

4) Delete rows that are duplicated in the specified columns only

df.drop_duplicates(subset=["state"],keep="first")

By default drop_duplicates() returns a new DataFrame. To modify the original data directly, set the parameter inplace=True.
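To make the effect of keep and subset concrete, here is a minimal sketch using the same sample data; the variable names are hypothetical. Note that the surviving rows keep their original index labels.

```python
import pandas as pd

df = pd.DataFrame({
    "state": [1, 1, 2, 2, 1, 2, 2],
    "pop": ["a", "b", "c", "d", "b", "c", "d"],
})
# Rows 1/4, 2/5 and 3/6 are pairwise identical; row 0 is unique.

first = df.drop_duplicates(keep="first")    # keep the earliest copy of each
last = df.drop_duplicates(keep="last")      # keep the latest copy of each
unique_only = df.drop_duplicates(keep=False)  # drop every duplicated row

# Compare on the "state" column only: one row per distinct state value.
by_state = df.drop_duplicates(subset=["state"], keep="first")

print(list(first.index))        # [0, 1, 2, 3]
print(list(last.index))         # [0, 4, 5, 6]
print(list(unique_only.index))  # [0]
print(list(by_state.index))     # [0, 2]
```

If a consecutive index is needed afterwards, calling reset_index(drop=True) on the result is one common option.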

3. Delete specified rows and columns: drop()

df=pd.DataFrame(np.arange(16).reshape(4,4),columns=["one","two","three","four"])

1) Delete the specified column

df.drop(["one"],axis=1)

Alternatively, del df["one"] also removes the specified column, but this method is not recommended because it always modifies the source data directly.

2) Delete the specified row

df.drop([0],axis=0)

By default drop() returns a new DataFrame. To modify the original data directly, set the parameter inplace=True.
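The following sketch contrasts drop(), which returns a new DataFrame, with del, which changes the object it is applied to. The last step, dropping rows selected by a boolean condition, is not shown in the text above; it is included as one common way to delete rows by condition, and the threshold used is arbitrary.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4),
                  columns=["one", "two", "three", "four"])

without_col = df.drop(["one"], axis=1)  # equivalent to df.drop(columns=["one"])
without_row = df.drop([0], axis=0)      # equivalent to df.drop(index=[0])

print(df.shape)           # (4, 4): the original is untouched
print(without_col.shape)  # (4, 3)
print(without_row.shape)  # (3, 4)

# del removes the column from the object itself, so work on a copy
# if the original must be preserved.
working = df.copy()
del working["one"]
print(list(working.columns))  # ['two', 'three', 'four']

# Assumed extension: delete rows matching a condition by passing the
# index labels of the matching rows to drop().
small_only = df.drop(df[df["one"] > 4].index)
print(list(small_only.index))  # [0, 1]
```

Passing labels through the columns= or index= keywords avoids having to remember which axis number means rows and which means columns.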
