Pandas merge
pd.merge() is a function in the pandas library that merges two DataFrame objects. Its common arguments are:
- left: the left DataFrame to be merged.
- right: the right DataFrame to be merged.
- how: the merge method, one of 'left', 'right', 'outer' and 'inner' (the default is 'inner').
- on: the column or columns to merge on, given as a single column name or a list of column names. These columns must exist in both DataFrames.
- left_on and right_on: the names of the key columns in the left and right DataFrame respectively. Use these two parameters instead of on when the key columns have different names in the two DataFrames.
- suffixes: the suffixes appended to non-key columns that have the same name in both DataFrames, so that they can be told apart. The default is ('_x', '_y').
Sample code:
import pandas as pd

# Create two DataFrames
df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': [1, 2, 3, 4]})
df2 = pd.DataFrame({'key': ['B', 'D', 'E', 'F'], 'value': [5, 6, 7, 8]})

# Merge the two DataFrames on the 'key' column
merged = pd.merge(df1, df2, on='key')
print(merged)
Run results:
  key  value_x  value_y
0   B        2        5
1   D        4        6
In this example, two DataFrame objects, df1 and df2, are created, and both have a column named 'key'. pd.merge() merges them on the 'key' column and the result is stored in the merged variable. Because how is not given, the default inner merge is used, so only the keys present in both DataFrames ('B' and 'D') appear in the result. In the printed output, value_x and value_y are the 'value' columns from df1 and df2 respectively.
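The example above uses on, which requires the key column to have the same name in both DataFrames. As a minimal sketch of the left_on, right_on and suffixes arguments described earlier, the following assumes a second DataFrame whose key column is called 'id'. The column name 'id', the variable names and the suffix strings are made up for illustration.

```python
import pandas as pd

# Hypothetical data: the key column is 'key' on the left and 'id' on the right
left = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': [1, 2, 3, 4]})
right = pd.DataFrame({'id': ['B', 'D', 'E', 'F'], 'value': [5, 6, 7, 8]})

# Match left['key'] against right['id'] and use custom suffixes
# for the overlapping 'value' column instead of the default _x / _y
merged = pd.merge(
    left,
    right,
    left_on='key',
    right_on='id',
    suffixes=('_left', '_right'),
)
print(merged)
```

With left_on and right_on, both key columns ('key' and 'id') are kept in the result, and the two 'value' columns become value_left and value_right.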
Keep all rows of the left DataFrame
To keep every row of the left DataFrame, pass how='left' to pd.merge(). The how parameter controls how the two DataFrame objects are joined and accepts 'left', 'right', 'outer' and 'inner'. With how='left', pd.merge() keeps all rows of the left DataFrame and fills in the columns of the right DataFrame wherever the key matches. Left rows without a match get NaN in those columns.
Sample code:
import pandas as pd

# Create two DataFrames
df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': [1, 2, 3, 4]})
df2 = pd.DataFrame({'key': ['B', 'D', 'E', 'F'], 'value': [5, 6, 7, 8]})

# Keep every row of the left DataFrame
merged = pd.merge(df1, df2, on='key', how='left')
print(merged)
Run results:
  key  value_x  value_y
0   A        1      NaN
1   B        2      5.0
2   C        3      NaN
3   D        4      6.0
In this example, df1 and df2 are merged on the 'key' column with how='left'. The result contains every row of df1. The keys 'A' and 'C' have no match in df2, so their value_y is NaN. Because NaN is a floating-point value, the value_y column becomes float, which is why 5 and 6 are shown as 5.0 and 6.0. The df2 rows with keys 'E' and 'F' have no match in df1 and do not appear in the result.
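To see how the four values of how differ, the sketch below merges the same df1 and df2 with each method and collects the keys that end up in the result. The keys_by_how dictionary is only a helper introduced for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': [1, 2, 3, 4]})
df2 = pd.DataFrame({'key': ['B', 'D', 'E', 'F'], 'value': [5, 6, 7, 8]})

# Collect the keys present in the result of each merge method
keys_by_how = {}
for how in ('inner', 'left', 'right', 'outer'):
    merged = pd.merge(df1, df2, on='key', how=how)
    keys_by_how[how] = merged['key'].tolist()
    print(how, keys_by_how[how])
```

'inner' keeps only the keys found in both DataFrames (B, D), 'left' keeps all keys of df1 (A to D), 'right' keeps all keys of df2 (B, D, E, F), and 'outer' keeps the union of both (A to F), with NaN wherever one side has no matching row.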