SoFunction
Updated on 2024-11-16

How to flexibly add new empty fields in pandas

Adding new empty fields flexibly in pandas

Let's start with the requirement

We read certain fields (e.g. A, B, C, D) of the data from MongoDB. If the data is missing one of those fields (e.g. the 'D' field), add that field and set its value to null.

Solution

import pandas as pd
import numpy as np

# Build a 3x3 sample frame with columns A, B, C
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df1 = pd.DataFrame(a, index=['row0', 'row1', 'row2'], columns=list('ABC'))
df1

Result of df1:


Create an empty DataFrame with the specified fields

df2 = pd.DataFrame(columns=['A', 'B', 'C', 'D'])

Then concatenate the two with pd.concat

pd.concat([df2, df1])

The final result is as follows:
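As a self-contained sketch of the steps above, the following builds df1, concatenates it onto an empty frame that declares all four columns, and checks that the missing 'D' column comes out null. It also shows `DataFrame.reindex` as an alternative that is not part of the original approach. The names `wanted`, `result`, and `alt` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Sample frame that lacks the 'D' column
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df1 = pd.DataFrame(a, index=['row0', 'row1', 'row2'], columns=list('ABC'))

# Fields we want to end up with (illustrative name)
wanted = ['A', 'B', 'C', 'D']

# Approach from the article: concatenate onto an empty frame with all fields.
# Recent pandas versions may emit a FutureWarning about empty entries here.
df2 = pd.DataFrame(columns=wanted)
result = pd.concat([df2, df1])

# Alternative (an assumption, not the article's method): reindex the columns.
# Columns that do not exist in df1 are created and filled with NaN.
alt = df1.reindex(columns=wanted)

print(result)
print(alt)
```

With `reindex` the existing columns keep their numeric dtype, which can matter if the frame is used for arithmetic afterwards.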

Python pandas data cleansing: assigning values to empty fields by condition

Assigning values to empty fields conditionally with pandas

Find the null values and look for a pattern in them

The Savings field in the travel data is partially empty. Consider filling each null with the mean Savings of the corresponding destination. For example, one Beijing to Xi'an journey has a null Savings value. The steps to investigate are:

1. Find the rows where the Savings field is empty:

import numpy as np
import pandas as pd
data[data['Savings'].isnull()]

2. Extract the destinations of those rows:

data.loc[data['Savings'].isnull(), ['Destination']]

3. Compute the mean Savings for each destination and place of departure:

round(data.groupby(['Destination', 'Place of departure'])['Savings'].mean())
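The three inspection steps can be tried end to end on a small frame. The sketch below assumes a hypothetical `data` frame: the column names follow the article, but the values are invented for illustration, as are the names `missing`, `missing_dest`, and `group_mean`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; values are invented for illustration only
data = pd.DataFrame({
    'Place of departure': ['Beijing', 'Beijing', 'Shanghai', 'Beijing'],
    'Destination': ["Xi'an", "Xi'an", 'Chengdu', "Xi'an"],
    'Savings': [100.0, np.nan, 300.0, 200.0],
})

# 1. Rows where Savings is null
missing = data[data['Savings'].isnull()]

# 2. Destinations of those rows
missing_dest = data.loc[data['Savings'].isnull(), ['Destination']]

# 3. Rounded mean Savings per (destination, place of departure);
#    mean() skips NaN values by default
group_mean = round(data.groupby(['Destination', 'Place of departure'])['Savings'].mean())

print(missing)
print(missing_dest)
print(group_mean)
```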

Use the fillna function to fill the nulls:

1. Create a new DataFrame to hold the updated data.

2. Select the rows for one destination.

3. Fill that destination's null values with the destination's mean.

4. Append the processed rows to the new DataFrame.

5. Repeat steps 2-4 for each destination.

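datafillna = pd.DataFrame()
place = data['Destination'].unique()
for pla in place:
    # Boolean mask selecting the rows for this destination
    t = data['Destination'] == pla
    # Fill nulls with this destination's column means (numeric columns only)
    a = data[t].fillna(data[t].mean(numeric_only=True))
    # Append the filled rows to the result
    datafillna = pd.concat([datafillna, a])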
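The same per-destination fill can also be written without an explicit loop. The following is a minimal sketch using `groupby(...).transform('mean')`, which is an alternative to the loop rather than the article's own method. The sample values and the names `dest_mean` and `filled` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; values are invented for illustration only
data = pd.DataFrame({
    'Destination': ["Xi'an", "Xi'an", 'Chengdu', 'Chengdu', "Xi'an"],
    'Savings': [100.0, np.nan, 300.0, 500.0, 200.0],
})

# Mean Savings of each row's destination, aligned to the original index
dest_mean = data.groupby('Destination')['Savings'].transform('mean')

# Fill only the null Savings values with their destination's mean;
# assign() returns a new frame and leaves `data` unchanged
filled = data.assign(Savings=data['Savings'].fillna(dest_mean))

print(filled)
```

Because `transform` returns a Series aligned to the original rows, the row order is preserved and no intermediate frames need to be concatenated.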

Summary

The above is based on personal experience. I hope it serves as a useful reference, and I appreciate your continued support.