SoFunction
Updated on 2024-11-15

Example analysis of two ways to operate a MySQL database in Python [pymysql and pandas]

This article uses examples to describe two ways to operate a MySQL database in Python. They are shared here for your reference, as follows:

First: Use pymysql

The code is as follows:

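import pymysql
# Open a database connection
db = pymysql.connect(host='1.1.1.1', port=3306, user='root', passwd='123123', db='test', charset='utf8')
cursor = db.cursor()  # Use the cursor() method to get a cursor for operations
sql = "select * from test0811"
cursor.execute(sql)  # Run the query
info = cursor.fetchall()  # Fetch all rows of the result
db.commit()  # Commit (only required after statements that modify data)
cursor.close()  # Close the cursor
db.close()  # Close the database connection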

[Pictures: the contents of the data table test0811, and the result read out by the code above]

pymysql is a Python module for working with MySQL databases. First, import the pymysql module:

import pymysql

Connect to the database with pymysql's connect() method. Its main parameters are explained below:

  • host: the address of the MySQL server. If the database runs on the local machine, use localhost or 127.0.0.1; if it runs on another server, give that server's IP address.
  • port: the port number of the service. The default is 3306, which is used when the parameter is omitted.
  • user: the user name for logging in to the database
  • passwd: the password of that user account (an alias of the password parameter)
  • db: the name of the database to operate on (an alias of the database parameter)
  • charset: the character encoding. Set it to utf8 so that Chinese characters are stored without being garbled.

Note: port=3306 is an integer and is written without quotes; all other values are strings and must be enclosed in quotes.

The db object in the code is the connection between Python and MySQL. db.cursor() returns a cursor object for the connection, through which SQL statements are executed. Other commonly used methods of the connection are commit(), which commits changes to the database; rollback(), which rolls back, that is, cancels the current uncommitted operation; and close(), which closes the connection.
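The commit() and rollback() pattern described above can be sketched as follows. This is only an illustration under stated assumptions: the standard-library sqlite3 module stands in for MySQL so that the code runs without a server, and the failing INSERT is contrived to trigger the rollback. With pymysql, db would come from pymysql.connect(...) and the exception class would be pymysql.MySQLError.

```python
import sqlite3

# Assumption: sqlite3 replaces MySQL here; both follow the same DB-API pattern.
db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute("CREATE TABLE test0811 (id INTEGER, name TEXT, score TEXT)")
db.commit()

try:
    cursor.execute("INSERT INTO test0811 VALUES (1, 'Xiaohong', '80')")
    # Contrived error: four values for a table with three columns.
    cursor.execute("INSERT INTO test0811 VALUES (2, 'Xiaoming', '90', 'extra')")
    db.commit()      # reached only if every statement succeeded
except sqlite3.Error:
    db.rollback()    # cancel the uncommitted insert of row 1

cursor.execute("SELECT COUNT(*) FROM test0811")
row_count = cursor.fetchone()[0]   # number of rows that survived

cursor.close()
db.close()
```

Because the second statement fails before commit() is called, the rollback also discards the first insert, so the table stays empty.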

Those are some of the methods of the connection object db. The methods of the cursor object are just as important, because they are what actually operate on the database. The commonly used cursor methods are listed in the following table:

Method                           Description
close()                          Close the cursor; it cannot be used afterwards.
execute(query[, args])           Execute one SQL statement, optionally with parameters.
executemany(query, pseq)         Execute the SQL statement once for each parameter set in the sequence pseq.
fetchone()                       Return the next row of the query result.
fetchall()                       Return all remaining rows of the query result.
fetchmany([size])                Return the next size rows.
nextset()                        Move to the next result set.
scroll(value, mode='relative')   Move the cursor within the result set. With mode='relative', move value rows from the current row; with mode='absolute', move to row value counted from the first row of the result set.
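A minimal sketch of how executemany(), fetchone(), fetchmany() and fetchall() behave on one result set. It assumes sqlite3 as a stand-in for MySQL so that it runs without a server; the column names id, name and score are taken from the dtype mapping used later in the article. Note that sqlite3 uses "?" placeholders where pymysql uses "%s", and that sqlite3 returns lists of tuples where pymysql returns tuples of tuples by default.

```python
import sqlite3

db = sqlite3.connect(":memory:")   # assumption: stand-in for pymysql.connect(...)
cursor = db.cursor()
cursor.execute("CREATE TABLE test0811 (id INTEGER, name TEXT, score TEXT)")

rows = [(1, 'Xiaohong', '80'), (2, 'Xiaoming', '90'), (3, 'Xiaomei', '87'),
        (4, 'GG', '67'), (5, 'MM', '78')]
# executemany runs the statement once per parameter tuple in the sequence.
cursor.executemany("INSERT INTO test0811 VALUES (?, ?, ?)", rows)
db.commit()

cursor.execute("SELECT * FROM test0811 ORDER BY id")
first = cursor.fetchone()        # the next single row
next_two = cursor.fetchmany(2)   # the next two rows
remaining = cursor.fetchall()    # everything that is left

cursor.close()
db.close()
```

Each fetch call continues from where the previous one stopped, so the three calls together consume the five rows exactly once.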

That covers the basic use of pymysql; the remaining database operations (insert, delete, update, query) are a matter of writing SQL statements. SQL is powerful, but it is sometimes not enough on its own. Combining Python's flexibility with the power of SQL makes more things possible, and pymysql serves as the bridge between the two.

The result of running the code (the second picture) shows that the data read out is stored in a two-dimensional tuple, namely ((1, 'Xiaohong', '80'), (2, 'Xiaoming', '90'), (3, 'Xiaomei', '87'), (4, 'GG', '67'), (5, 'MM', '78')). A tuple cannot be modified, only read, which makes further data processing somewhat inconvenient. The second method below reads the data into a DataFrame, which is easier to work with.
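To illustrate the limitation, here is a small sketch of what working with the raw tuple result involves: each row has to be copied into a mutable structure before values can be changed. The column names are an assumption taken from the dtype mapping used in the pandas code of this article.

```python
# Result in the shape returned by cursor.fetchall(), copied from the text above.
info = ((1, 'Xiaohong', '80'), (2, 'Xiaoming', '90'), (3, 'Xiaomei', '87'),
        (4, 'GG', '67'), (5, 'MM', '78'))

# Assumed column names (id, name, score).
columns = ('id', 'name', 'score')

# Copy each immutable row into a dict so that its values can be changed.
records = [dict(zip(columns, row)) for row in info]
for record in records:
    record['score'] = int(record['score'])   # scores are stored as strings

average = sum(r['score'] for r in records) / len(records)
```

This works, but every conversion and aggregation has to be written by hand.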

Second: Use pandas

The code is as follows:

import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.types import CHAR, INT
# Replace username, passwd, host and dbname with your own values
connect_info = 'mysql+pymysql://username:passwd@host:3306/dbname?charset=utf8'
engine = create_engine(connect_info)  # use SQLAlchemy to build the connection engine
sql = "SELECT * FROM test0811"  # SQL query
df = pd.read_sql(sql=sql, con=engine)  # read the data into DataFrame 'df'
# Write df to table 'test1'
# (each CHAR length must be large enough for the longest value in that column)
df.to_sql(name = 'test1',
      con = engine,
      if_exists = 'append',
      index = False,
      dtype = {'id': INT(),
          'name': CHAR(length=2),
          'score': CHAR(length=2)
          }
      )

A pandas DataFrame has a row index and a column index, which makes it a convenient structure for holding data from database tables. The read_sql and to_sql functions in pandas read data from and write data to a MySQL database. The two functions are described below.

pandas.read_sql

The signature is as follows:
pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)

The pandas.read_sql documentation describes each parameter in detail, in English (do not shy away from the English; it is worth reading the original). Reference: /pandas-docs/stable/generated/pandas.read_sql.html

The commonly used parameters are sql, an SQL command or a table name, and con, the database connection, which can be built with SQLAlchemy or pymysql. For basic reads from the database, supplying sql and con is enough. The other parameters can stay at their defaults and are only needed for special requirements; see the documentation if you are interested.
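As a hedged illustration of two of the optional parameters, the sketch below passes params and index_col to read_sql. It assumes an in-memory sqlite3 database in place of MySQL so that it runs without a server, and it stores score as an integer for the sake of the comparison, unlike the string scores shown earlier. With pymysql the placeholder would be "%s" instead of "?".

```python
import sqlite3
import pandas as pd

# Assumption: sqlite3 stands in for the MySQL connection or engine.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test0811 (id INTEGER, name TEXT, score INTEGER)")
con.executemany("INSERT INTO test0811 VALUES (?, ?, ?)",
                [(1, 'Xiaohong', 80), (2, 'Xiaoming', 90), (3, 'Xiaomei', 87)])
con.commit()

# params fills the placeholder; index_col turns the id column into the row index.
df = pd.read_sql("SELECT * FROM test0811 WHERE score > ? ORDER BY id",
                 con, params=(85,), index_col='id')
con.close()
```

Passing values through params rather than formatting them into the SQL string leaves the quoting to the database driver.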

The con in the code is a database connection engine built with SQLAlchemy, i.e., by sqlalchemy.create_engine(). This function generates an engine object from a URL that describes the database, typically of the form:

dialect+driver://username:password@host:port/database

dialect indicates the type of database, such as sqlite, mysql, postgresql, oracle, or mssql. driver is the name of the DBAPI used to connect to the database; here pymysql is used (the usual choice on Python 3, whereas MySQLdb was commonly used on Python 2). If driver is not specified, the dialect's default DBAPI is used.
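The URL form above can be assembled as in the following sketch. The helper name build_mysql_url and the sample credentials are hypothetical. The point it shows is that characters such as "@" or "/" in a user name or password need to be URL-encoded, for example with urllib.parse.quote_plus, so that they are not mistaken for the URL's own separators.

```python
from urllib.parse import quote_plus

def build_mysql_url(user, password, host, port, database, charset="utf8"):
    """Hypothetical helper: assemble a dialect+driver URL for pymysql."""
    # quote_plus escapes characters that would clash with the URL separators.
    return (f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
            f"@{host}:{port}/{database}?charset={charset}")

# Made-up credentials, chosen to contain special characters.
connect_info = build_mysql_url("root", "p@ss/word", "127.0.0.1", 3306, "test")
```

The resulting string can then be passed to create_engine() in the same way as the hand-written connect_info in the code above.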

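Instead of creating an engine with SQLAlchemy, you can also pass a connection created directly with the DBAPI, as in the following code. Note that recent pandas versions emit a warning for DBAPI connections other than sqlite3 and recommend an SQLAlchemy connectable.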

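import pymysql
# Placeholder values: replace them with your own connection settings
con = pymysql.connect(host='localhost', user='username', password='password', database='dbname', charset='utf8')
df = pd.read_sql(sql, con)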

DataFrame.to_sql

The signature is as follows:
DataFrame.to_sql(name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None)

The main parameters are described below; for details, refer to the documentation: /pandas-docs/stable/generated/.to_sql.html
  • name: the name of the output table
  • con: the engine that connects to the database
  • if_exists: one of three modes {"fail", "replace", "append"}, default "fail". fail: if the table exists, raise a ValueError; replace: if the table exists, drop it and write the new data in its place; append: if the table exists, add the rows after the existing data. In all three modes the table is created if it does not exist.
  • index: whether to write the index of the DataFrame as a separate column, default True
  • index_label: when index is True, the column label to use for the index column in the table
  • dtype: the SQL data types of the columns, given as a dictionary {column_name: sql_dtype}. Common data types are INT() and CHAR(length=x). Note: INT and CHAR must be written in capitals, and INT() does not need a length.
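The three if_exists modes can be tried out with the sketch below. It assumes an in-memory sqlite3 connection in place of the MySQL engine so that it runs without a server, and it uses a two-row sample DataFrame made up from the data shown earlier. The dtype parameter is left out because the SQLAlchemy types in the article apply to an SQLAlchemy engine.

```python
import sqlite3
import pandas as pd

con = sqlite3.connect(":memory:")   # assumption: stand-in for the MySQL engine
df = pd.DataFrame({'id': [1, 2],
                   'name': ['Xiaohong', 'Xiaoming'],
                   'score': ['80', '90']})

df.to_sql(name='test1', con=con, if_exists='append', index=False)   # creates the table
df.to_sql(name='test1', con=con, if_exists='append', index=False)   # adds two more rows
rows_after_append = len(pd.read_sql("SELECT * FROM test1", con))

df.to_sql(name='test1', con=con, if_exists='replace', index=False)  # drops and recreates
rows_after_replace = len(pd.read_sql("SELECT * FROM test1", con))

try:
    df.to_sql(name='test1', con=con, if_exists='fail', index=False)
    failed = False
except ValueError:
    failed = True    # the table already exists, so 'fail' raises

con.close()
```

Appending twice doubles the row count, replace resets the table to the DataFrame's two rows, and fail refuses to touch an existing table.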


I hope this article helps you with your Python programming.