1. Preface
A small tip for working, use pandas to read the main table and subtable, use the designated columns of the main table, and update the designated columns of the subtable.
Case:
Main table:
uid name
0 101 Xiaobai
1 102 Xiaohong
2 103 XiaolanSub-table:
name zb_uid
0 Xiaobai None
1 Xiaohong None
2 Little Green None
Requirements: When the name column data of the main table is the same as the name column data of the subtable name column data, the corresponding uid of the main table is assigned to the zb_uid of the subtable
2. Basic cases
1. Create the main table data
import pandas as pd #Main representation datadata_sheet = { 'uid': [101, 102, 103], 'name': ['noob', 'Little Red', 'Little Blue'], } df_sheet = (data_sheet) print('Main Table:') print(df_sheet)
-
data_sheet
is a dictionary where the keys'uid'
Complied with a list containing three integers, indicating the user's unique identity; key'name'
Corresponding to a list containing three strings, indicating the user's name. -
(data_sheet)
Bundledata_sheet
Dictionary conversion topandas
ofDataFrame
Objectdf_sheet
。DataFrame
is a two-dimensional tabular data structure similar to a spreadsheet or SQL table. - Finally passed
print
The function prints out the contents of the main table.
2. Create a mapping dictionary
# Create a mapping dictionary, map name to uidname_uid_map = df_sheet.set_index('name')['uid'].to_dict() print('Mapping Dictionary:',name_uid_map)
-
df_sheet.set_index('name')
Bundledf_sheet
of'name'
Set the column to index, so that the corresponding row can be positioned by name. -
['uid']
Selectuid
column data. -
to_dict()
Convert the selected data into a dictionary. The key of the dictionary is the name and the value is the corresponding one.uid
。 - Finally passed
print
The function prints out this mapping dictionary.
3. Create subtable data
# Sub-expression datadata_sheet1 = { 'name': ['noob', 'Little Red', 'Little Green'], 'zb_uid': [None,None,None] } df_sheet1 = (data_sheet1) print('Subtable:') print(df_sheet1)
-
data_sheet1
is a dictionary, key'name'
Complied with a list containing three strings, indicating the user's name; key'zb_uid'
The corresponding one contains threeNone
The list of values, herezb_uid
The initial values are all empty and will be updated later. -
(data_sheet1)
Willdata_sheet1
Dictionary conversion toDataFrame
Objectdf_sheet1
。 - Finally passed
print
The function prints out the contents of the subtable.
4. Update the zb_uid column of the subtable
# Update the zb_uid column of the subtabledf_sheet1['zb_uid'] = df_sheet1['name'].map(name_uid_map).fillna(df_sheet1['zb_uid'])
-
df_sheet1['name'].map(name_uid_map)
Will be based onname_uid_map
Dictionary, willdf_sheet1
middle'name'
Each value of the column is mapped to the correspondinguid
. if'name'
The column's value isname_uid_map
If it does not exist in the dictionary, it will be mapped asNaN
。 -
fillna(df_sheet1['zb_uid'])
Put the mapping resultNaN
Value fordf_sheet1
The original one of Zhongyuan'zb_uid'
Column value padding. Here becausezb_uid
The initial value isNone
,existpandas
It will be considered asNaN
, so the actual operation is to keep the originalNaN
Value. - Finally assign the updated value to
df_sheet1
of'zb_uid'
List.
5. Complete code
import pandas as pd #Main representation datadata_sheet = { 'uid': [101, 102, 103], 'name': ['noob', 'Little Red', 'Little Blue'], } df_sheet = (data_sheet) print('Main Table:') print(df_sheet) # Create a mapping dictionary, map name to uidname_uid_map = df_sheet.set_index('name')['uid'].to_dict() print('Mapping Dictionary:',name_uid_map) # Sub-expression datadata_sheet1 = { 'name': ['noob', 'Little Red', 'Little Green'], 'zb_uid': [None,None,None] } df_sheet1 = (data_sheet1) print('Subtable:') print(df_sheet1) # Update the zb_uid column of the subtabledf_sheet1['zb_uid'] = df_sheet1['name'].map(name_uid_map).fillna(df_sheet1['zb_uid']) print("Updated subtable:") print(df_sheet1)
Running results:
Main table:
uid name
0 101 Xiaobai
1 102 Xiaohong
2 103 Xiaolan
Sub-table:
name zb_uid
0 Xiaobai None
1 Xiaohong None
2 Little Green None
Mapping dictionary: {'Xiaobai': 101, 'Xiaohong': 102, 'Xiaolan': 103}
Updated subtable:
name zb_uid
0 Xiaobai 101.0
1 Xiaohong 102.0
2 Little Green NaN
6. Summary
The main function of this code is based on the name anduid
The corresponding relationship of is updated in the subtablezb_uid
The value of the column. If the name in the subtable exists in the main table, use the correspondinguid
Fillzb_uid
; If not, keepzb_uid
is empty.
3. Upgrade cases
Based on the basic case, update the name fields of the main table and the subtable according to the mapping dictionary of names and pinyin:
import pandas as pd def get_namePingyingMap(): // The following method is to manually build a mapping dictionary. Of course, if you have a database, you can also read it from the database and then build a mapping dictionary. name_pingying_map = {'noob':'xiaobai','Little Red':'xiaohong','Little Blue':'xiaolan','Little Green':'xiaolù'} print('name_pingying_map map dictionary:', name_pingying_map) return name_pingying_map #Main representation datadata_sheet = { 'uid': [101, 102, 103], 'name': ['noob', 'Little Red', 'Little Blue'], } df_sheet = (data_sheet) print('Main Table:') print(df_sheet) # Update the name column of the main tablename_pingying_map = get_namePingyingMap() df_sheet['name'] = df_sheet['name'].map(name_pingying_map).fillna(df_sheet['name']) print("Updated main table:") print(df_sheet) # Create a mapping dictionary, map name to uidname_uid_map = df_sheet.set_index('name')['uid'].to_dict() print('name_uid_map map dictionary:',name_uid_map) # Sub-expression datadata_sheet1 = { 'name': ['noob', 'Little Red', 'Little Green'], 'zb_uid': [None,None,None] } df_sheet1 = (data_sheet1) print('Subtable:') print(df_sheet1) # Update the name column of the subtabledf_sheet1['name'] = df_sheet1['name'].map(name_pingying_map).fillna(df_sheet1['name']) # Update the zb_uid column of the subtabledf_sheet1['zb_uid'] = df_sheet1['name'].map(name_uid_map).fillna(df_sheet1['zb_uid']) print("Updated subtable:") print(df_sheet1)
Running results:
Main table:
uid name
0 101 Xiaobai
1 102 Xiaohong
2 103 Xiaolan
name_pingying_map mapping dictionary: {'Xiaobai': 'xiaobai', 'Xiaohong': 'xiaohong', 'Xiaolan': 'xiaolan', 'Xiaogreen': 'xiaolù'}
Updated main table:
uid name
0 101 xiaobai
1 102 xiaohong
2 103 xiaolan
name_uid_map map dictionary: {'xiaobai': 101, 'xiaohong': 102, 'xiaolan': 103}
Sub-table:
name zb_uid
0 Xiaobai None
1 Xiaohong None
2 Little Green None
Updated subtable:
name zb_uid
0 xiaobai 101.0
1 xiaohong 102.0
2 xiaolù NaN
This is the article about Pandas' tips for updating sub-table columns using the main table. For more related Pandas to update sub-table columns, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!