Flexibly adding new empty fields in pandas
Let's start with the requirement
Read certain fields of the data from MongoDB (e.g. A, B, C, D). If the data lacks one of those fields (e.g. the 'D' field), add that field and set its value to null.
Solution
import pandas as pd
import numpy as np

# Sample data that has fields A, B and C but no D
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
df1 = pd.DataFrame(a, index=['row0', 'row1', 'row2'], columns=list('ABC'))
df1
Result of df1: a 3x3 table with rows row0, row1, row2 and columns A, B, C.
Create an empty DataFrame with the required fields:
df2 = pd.DataFrame(columns=['A', 'B', 'C', 'D'])
Then join the two with the concat method:
pd.concat([df2, df1])
The final result has columns A, B, C and D, with D set to NaN in every row.
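As a side note, the same requirement can also be sketched with `DataFrame.reindex`, which is a different technique from the concat approach above. The `records` list here is a hypothetical stand-in for documents returned by a MongoDB query, and the names `wanted` and `df_full` are chosen only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for documents read from MongoDB;
# none of them contains a 'D' field.
records = [
    {'A': 1, 'B': 2, 'C': 3},
    {'A': 4, 'B': 5, 'C': 6},
    {'A': 7, 'B': 8, 'C': 9},
]

# The fields the result is required to have
wanted = ['A', 'B', 'C', 'D']

df = pd.DataFrame(records)

# reindex keeps the columns that exist and adds the missing ones,
# filling them with NaN
df_full = df.reindex(columns=wanted)
print(df_full)
```

This variant needs no empty helper DataFrame, and it also fixes the column order to the one given in `wanted`.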
Python pandas data cleaning: filling empty fields by condition
Find the null values and look for a pattern in them
The Savings field in the travel data is partially empty. One option is to fill each null value with the mean savings of the corresponding destination. For example, there is a Beijing to Xi'an trip whose savings value is null. The steps are as follows:
1. Find the rows where the Savings field is empty:
import numpy as np
import pandas as pd

# data is the travel DataFrame, loaded beforehand
data[data['Savings'].isnull()]
2. Extract the destinations of those rows:
data.loc[data['Savings'].isnull(), ['Destination']]
3. Compute the mean savings for each destination and place of departure:
round(data.groupby(['Destination', 'Place of departure'])['Savings'].mean())
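The three exploration steps above can be tried end to end on a small table. This is only a sketch: the column names follow the article, but the rows in `data` are invented for illustration and are not the author's travel data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; values are invented for illustration only
data = pd.DataFrame({
    'Place of departure': ['Beijing', 'Beijing', 'Shanghai', 'Beijing'],
    'Destination': ["Xi'an", "Xi'an", "Xi'an", 'Chengdu'],
    'Savings': [300.0, np.nan, 500.0, 200.0],
})

# Step 1: rows where Savings is null
missing_rows = data[data['Savings'].isnull()]

# Step 2: the destinations of those rows
missing_dest = data.loc[data['Savings'].isnull(), ['Destination']]

# Step 3: mean savings per (destination, place of departure) pair, rounded;
# mean() skips NaN values by default
group_mean = (
    data.groupby(['Destination', 'Place of departure'])['Savings']
    .mean()
    .round()
)

print(missing_rows)
print(missing_dest)
print(group_mean)
```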
Use the fillna function to solve the problem:
1. Create a new DataFrame to hold the updated data.
2. Get the list of destinations.
3. Fill the null values of one destination with that destination's mean.
4. Append the processed rows to the new DataFrame.
5. Repeat steps 2-4 for every destination.
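# Step 1: new DataFrame that collects the filled rows
datafillna = pd.DataFrame()
# Step 2: all distinct destinations
place = data['Destination'].unique()
for pla in place:
    # Boolean mask selecting the rows of this destination
    t = data['Destination'] == pla
    print(t)
    # Step 3: fill nulls with the column means of this destination
    a = data[t].fillna(data[t].mean(numeric_only=True))
    print(a)
    # Step 4: add the processed rows to the new DataFrame
    datafillna = pd.concat([datafillna, a])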
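As an alternative to the loop, the same fill can be sketched with `groupby(...).transform('mean')`. This is a different technique from the one in the article, and the rows in `data` are again hypothetical sample values. Here `transform` returns the mean of each row's destination, aligned to the original index, so `fillna` can use it directly.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; values are invented for illustration only
data = pd.DataFrame({
    'Place of departure': ['Beijing', 'Beijing', 'Shanghai', 'Beijing'],
    'Destination': ["Xi'an", "Xi'an", "Xi'an", 'Chengdu'],
    'Savings': [300.0, np.nan, 500.0, 200.0],
})

# Mean savings of each row's destination, one value per original row
dest_mean = data.groupby('Destination')['Savings'].transform('mean')

# Work on a copy so the original data stays unchanged
filled = data.copy()
filled['Savings'] = filled['Savings'].fillna(dest_mean)
print(filled)
```

Unlike the loop, this version keeps the rows in their original order and only touches the Savings column.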
Summary
The above is based on my personal experience. I hope it serves as a useful reference.