preamble
Pandas is a data analysis package for python, a tool based on NumPy provides a large number of data structures and functions that can easily handle structured data, common data structures are:
Series: one-dimensional array, similar to one-dimensional array in Numpy.
DataFrame: two-dimensional tabular data structure, DataFrame can be understood as the container of Series.
Time- Series: Time-indexed Series
Panel: a three-dimensional array, which can be interpreted as a DataFrame container.
1. Import txt file
The content used in this article is as follows:
Importing a txt file using the read_table() method
import pandas as pd df = pd.read_table(r'C:\Users\admin\Desktop\') print(df)
result:
I am Li Hua. Today, I had left a dozen people in my class to do cleaning, but they all ran away, leaving me alone to finish all the work. It was raining on the way home.
0 Too much rain washed all the spokes off the front tire of my bike and I was very angry. To get revenge for what happened today, I rode the rims back to...
1 and threw out all the chairs in the classroom.
This method is a generic function for importing files separated by the use of separators into a DataFrame. It can import not only .txt files but also .csv files.
df = pd.read_table(r'C:\Users\admin\Desktop\Chinese\Data_Analysis_Test_Sheet.csv') print(df)
result:
Region, province, city
0 Dalian, Liaoning, Northeast China
1 Northwest, Shaanxi, Xi'an
2 South China, Guangdong, Shenzhen
3 North China, Beijing, Beijing
4 Wuhan, Hubei, Central China
read_table() method and read_csv() method of other parameter usage and read_table() method is basically the same, and then will not repeat.
2. Import sql file
2.1 Installing the dependency library pymysql
The python connection to MySQL uses pymysql, which needs to be installed manually.
import pandas as pd import pymysql con = (host='127.0.0.1', # Database address, locally 127.0.0.1 or localhost user='root', # Username password='123456', # Password db='test', # Database name charset='utf-8') # Database encoding, typically utf-8 sql = "select * from employees" df = pd.read_sql(sql, con) print(df)
At this point, an error is reported
Error solved after modifying charset='utf8':
con = (host='127.0.0.1', # Database address, locally 127.0.0.1 or localhost user='root', # Username password='123456', # Password db='test', # Database name charset='utf8') # Database encoding, typically utf-8 sql = "select * from employees" df = pd.read_sql(sql, con) print(df)
result:
eID NAME sex birth jobs firJob hiredate
0 1 Zhang Zhang Male 1990-06-21 Teacher 2005-06-20 2009-08-26
1 2 Lu Huaide Male 2004-06-29 Laborer 2018-08-01 2021-01-06
2 3 Zhao Yanni Female 1994-07-06 Salesperson 2004-09-21 2019-05-23
The database query results are provided here as a comparison:
Note: When python uses pymysql to interact with MySQL, the encoding can only be written as utf8, don't write utf-8 out of habit.
3. Summary
Importing data mainly uses the read_x() method in pandas, where x represents the format of the file to be imported.
In addition to the previously introduced read_excel() for importing .xlsx files, read_csv() for importing .csv files, read_table() for importing txt, and read_sql() for importing sql files.
There are a number of other methods listed here. These are not used much and are not described in depth here. If you use it later, then we will update it.
pd.read_xml() pd.read_html() pd.read_json() pd.read_clipboard() pd.read_feather() pd.read_fwf() pd.read_gbq() pd.read_orc()
summarize
to this article on pandas learning txt and sql file basic operation of the article is introduced to this, more related pandas txt and sql file operation content please search for my previous posts or continue to browse the following related articles I hope you will support me in the future!