The full code is below; the step-by-step flow is explained in the code comments:
# -*- coding: utf-8 -*-
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql import SQLContext
from pyspark import SparkContext

# Initialize a pandas DataFrame
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['row1', 'row2'],
                  columns=['c1', 'c2', 'c3'])
# Print the pandas data
print(df)

# Initialize Spark
sc = SparkContext()
if __name__ == "__main__":
    spark = SparkSession \
        .builder \
        .appName("testDataFrame") \
        .getOrCreate()
    # Build a Spark DataFrame from native Python data
    sentenceData = spark.createDataFrame([
        (0.0, "I like Spark"),
        (1.0, "Pandas is useful"),
        (2.0, "They are coded by Python")
    ], ["label", "sentence"])
    # Display the Spark data
    sentenceData.select("label").show()

    # Convert the pandas DataFrame to a Spark DataFrame
    sqlContext = SQLContext(sc)
    spark_df = sqlContext.createDataFrame(df)
    # Display the converted data
    spark_df.select("c1").show()

    # Convert the Spark DataFrame back to a pandas DataFrame
    pandas_df = spark_df.toPandas()
    # Print the converted data
    print(pandas_df)
Program Results:
The above example of interconversion between pandas DataFrames and Spark DataFrames is all I have to share with you.