SoFunction
Updated on 2024-11-17

pandas implementation of one/multiple columns for data interval filtering

It's a waste of life if you don't live it.

The following table

If you want to filter out rows with age greater than or equal to 18 and less than or equal to 30:

It would be easy to straighten out this kind of job in mysql with awhereThat's it.

There is also syntax equivalent to where in pandas.loc

Filtering data for 18<=age<=30

import pandas as pd

stu = pd.read_csv("data/", index_col='id')

# Leave 18<=age<=30
def age_18_to30(a):
    return 18 <= a <= 30

# Leave 85<=score
def level_a(s):
    return 85 <= s

# Using loc generates a new series
stu = [stu['age'].apply(age_18_to30)]
# Or use the following lambda expression.
# stu = [stu['age'].apply(lambda a:18<=a<=30)]

print(stu)

Results:

Add another filter at this point

AGE greater than or equal to 18, less than or equal to 30 and a score greater than or equal to 85:

Code:

import pandas as pd

stu = pd.read_csv("data/", index_col='id')


# Leave 18<=age<=30
def age_18_to30(a):
    return 18 <= a <= 30


# Leave 85<=score
def level_a(s):
    return 85 <= s

stu = [stu['age'].apply(age_18_to30)].loc[stu['score'].apply(level_a)]

print(stu)

Results:

One of the columns, the Get Mo column, we have been using thestu['age'], this can also be written as:

Then the whole job worked!

Documentation:

F:\Project\python\src\WangYiYun\DataAnalysis\19_.py

Full Code:

# @DATE : 2021-1-2
# @TIME : 16:13
# @USER : kirin
import pandas as pd

stu = pd.read_csv("data/", index_col='id')

# Leave 18<=age<=30
def age_18_to30(a):
    return 18 <= a <= 30

# Leave 85<=score
def level_a(s):
    return 85 <= s

# Using loc generates a new series
# stu = [stu['age'].apply(age_18_to30)]
stu = [stu['age'].apply(age_18_to30)].loc[stu['score'].apply(level_a)]
# or don't use stu['age']:
# stu = [(age_18_to30)].loc[(level_a)]

# Use lambda expressions.
# stu = [(lambda a: 18 <= a <= 30)].loc[(lambda s: 85 <= s)]

# The code is too long to return to a car: (space + backslash + return)
# stu = [(lambda a: 18 <= a <= 30)]. \
#     loc[(lambda s: 85 <= s)]

print(stu)

summarize

The above is a personal experience, I hope it can give you a reference, and I hope you can support me more.