Complete example of implementing batch table partition of MySQL database

For the problem of large data volume of single tables, if the data supports sharding, using table partitioning is a good choice. So how does MySQL implement table partitioning?

1. Table partitioning conditions

1. Database storage engine support: InnoDB and MyISAM engines

2. Database version support: after MySQL 5.1 (the versions are different, the specific feature support may be different)

3. The data must have one or more partition keys: the key (field) that is a partition must be part of the primary key (union primary key)

4. Partition definition: Each partition must clearly define the data range

5. Partition maintenance: Over time, it may be necessary to add new partitions or delete old partitions to maintain the performance and structure of the database

2. The difference between regular tables and partition tables

Comparison of regular tables and partition tables

	General table	Partition table
Data structure	All data is stored in a single data file	Data is logically divided into multiple parts, possibly stored in multiple files or even multiple disks
Query optimization	Scan the entire table data by default during query	Access only relevant partition data
I/O operation	Add, delete or modify row operations directly act on the entire table	Only operate on a single partition, and do not affect other partition data.
Backup and restore	Usually back up the entire table data	You can back up or restore a specific partition separately
Storage Management	All data storage	Data scattered to multiple partitions
Extensibility	As the data volume grows, you may encounter performance bottlenecks	Easier to scale horizontally, you can handle larger data sets by adding new partitions without changing the application logic
Limits and complexity	Relatively simple, without special creation or maintenance requirements	Design and implementation are more complex and need to consider how to correctly set up partitioning policies to meet business needs

From the above, is there a great advantage in partition tables? But there are also some limits for partition tables:

Restrictions on partition tables

	General table	Partition table
Foreign key constraints	√	×
Full text index	√	× (previous versions of 5.6) ⍻ (5.6 or later version)
Temporary table	√	×
Column modification	√	×
Specific ALTER TABLE statements	√	× (modify primary key, unique key, etc.)
Performance impact	Data volume impact	Adding, deleting, and merging table partitions may cause locking tables and affecting performance
Backup and Recovery Tool Support	Usually the tools support	Not all backup and recovery tools fully support all features of partition tables
Data replication of primary and secondary servers	No special requirements	Partition rules must be consistent, any mismatch may lead to replication failure or data inconsistency
Partition type limitations	none	Storage engines may limit partition types
The behavior of query optimizer	Simple index	Table partition + index, complex queries with special circumstances may have table partition clipping failure problems

3. Creation of table partitions

Three key points in table partition creation: creating tables, setting partition keys, and setting sharding strategies

Example:

CREATE TABLE sales (
    id INT NOT NULL,
    sale_date DATE NOT NULL,
    amount DECIMAL(10,2)
)
PARTITION BY RANGE (YEAR(sale_date)) (
    PARTITION p0 VALUES LESS THAN (2020),
    PARTITION p1 VALUES LESS THAN (2021),
    PARTITION p2 VALUES LESS THAN (2022),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

Create table name: sales

Partition key: year() result of sale_date field, that is, year of sale_date field

Sharding strategy: p0 partition stores data less than 2020, p1 partition stores data less than 2021, p2 partition stores data less than 2022, and p3 partition stores data other yearly (Note: The data "bag" here is very important, be careful when setting it)

Note: The sharding strategy here is "LESS THAN xxx", which means data smaller than the subsequent strategy. For example, the data smaller than the specified year belongs to this partition, so the word "data baffle" is used above.

4. Convert existing tables to partition table scripts

Because the creation structure of the table is different, existing tables cannot be converted directly into partitioned tables. To convert existing tables into partitioned tables, the following steps are required:

1. Create a new partition table with the same field structure based on existing tables and define relevant partition strategies

2. Migrate data to partition table

3. Delete the old table and rename the partition table to the original table

The specific implementation script is as follows:

CREATE DEFINER=`root`@`%` PROCEDURE `convert_table_to_partition`(IN tbl_name VARCHAR(200),OUT out_status INT)
BEGIN
    DECLARE done INT DEFAULT FALSE;
    -- Output status，Start execution status 100，Execution successful status 200，Execution failed status 50
    SET out_status = 100;
    
    -- Create a new empty table，Not included in table partition（To convert to a partition table，Must be an empty table）
    SET @create_empty_tbl_sql = CONCAT(
        'CREATE TABLE ', tbl_name, '_partitioned LIKE ', tbl_name, ';'
    );
    PREPARE stmt FROM @create_empty_tbl_sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    -- Get all unique year_no and month_no Combining as a build partition defines partition key
    SET @partition_def = '';
    SET @query = CONCAT(
        'SELECT GROUP_CONCAT(
            CONCAT("PARTITION p_", year_no, "_", LPAD(month_no, 2, "0"), 
                   " VALUES LESS THAN (", 
                   CASE WHEN month_no = 12 THEN year_no + 1 ELSE year_no END, ", ", 
                   CASE WHEN month_no = 12 THEN 1 ELSE month_no + 1 END, ")")
            ORDER BY year_no, month_no SEPARATOR ",\n"
        ) INTO @partition_def
        FROM (
            SELECT DISTINCT year_no, month_no
            FROM ', tbl_name, '
            ORDER BY year_no, month_no
        ) AS unique_years_months;'
    );
    PREPARE stmt FROM @query;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    -- Debugging information：Output partition definition string
    --SELECT tbl_name,@partition_def;

    -- Check if there is a valid partition definition
    IF @partition_def IS NULL OR @partition_def = '' THEN
        SELECT tbl_name,'No data found for partitioning. Skipping partition creation and data migration.';
        -- Add empty tables directly p_max Partitions are used to capture future data
        SET @partition_def = '\nPARTITION p_max VALUES LESS THAN (MAXVALUE, MAXVALUE)';
    ELSE
        -- Add to p_max Partitions are used to capture future data
        SET @partition_def = CONCAT(@partition_def, ',\nPARTITION p_max VALUES LESS THAN (MAXVALUE, MAXVALUE)');
    END IF;

    -- use ALTER TABLE Add to分区定义
    SET @add_partitions_sql = CONCAT(
        'ALTER TABLE ', tbl_name, '_partitioned 
        PARTITION BY RANGE COLUMNS(year_no, month_no) (', @partition_def, ');'
    );

    -- Debugging information：输出Add to分区的 SQL Statement
    --SELECT tbl_name,@add_partitions_sql;

    PREPARE stmt FROM @add_partitions_sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    -- Migrate data to new partition table
    SET @insert_into_partitioned_sql = CONCAT(
        'INSERT INTO ', tbl_name, '_partitioned SELECT * FROM ', tbl_name, ';'
    );

    -- Debugging information：Output the inserted data SQL Statement
    -- SELECT tbl_name,@insert_into_partitioned_sql;

    PREPARE stmt FROM @insert_into_partitioned_sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    -- Verify that the data migration is successful
    SET @count_original = CONCAT('SELECT COUNT(*) INTO @count_original FROM ', tbl_name);
    PREPARE stmt FROM @count_original;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    SET @count_partitioned = CONCAT('SELECT COUNT(*) INTO @count_partitioned FROM ', tbl_name, '_partitioned');
    PREPARE stmt FROM @count_partitioned;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    -- 比较原表and新分区表的数据行数
    -- SELECT tbl_name,@count_original, @count_partitioned;

    
    -- If the data migration is successful，Delete the old table and rename the new table（Whether there is data or not，Delete cache tables）
    IF @count_original = @count_partitioned THEN
        -- Delete old table
        SET @drop_old_table_sql = CONCAT('DROP TABLE IF EXISTS ', tbl_name);
        PREPARE stmt FROM @drop_old_table_sql;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;

        -- Rename the new table to the old table name
        SET @rename_tables_sql = CONCAT('RENAME TABLE ', tbl_name, '_partitioned TO ', tbl_name);
        PREPARE stmt FROM @rename_tables_sql;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;

        -- SELECT tbl_name,'Table conversion and data migration completed successfully.';
        SET out_status = 200;
    ELSE
        -- SELECT tbl_name,'Data migration failed, check the logs for more information.';
        SET out_status = 50;
    END IF;
    
END

The above script is a complete conversion of existing tables into partition tables with "year_no" and "month_no" fields as partition keys. The main steps are:

1) Create a new empty table with the existing table as a template, which does not contain the table partition (to be converted to a partitioned table, it must be an empty table)

2) Get all unique year_no and month_no combinations and build partition definition strings (partition strategy that needs to be divided for existing data analysis)

3) Check whether the partition definition is valid. If there is no partition definition, it is strongly recommended to create a default partition strategy p_max to store future data.

4) Update empty tables and add relevant partitioning policies

5) Migrate historical data to partition table

6) Data migration verification (verify data integrity)

7) Delete the old table (recycle the table name)

8) Rename the new partition table to the original table name

5. Batch conversion table to partition table

Bulk converts regular tables into partition tables, with the specific scripts as follows:

CREATE DEFINER=`root`@`%` PROCEDURE `tables_convert_to_partition`()
BEGIN
    DECLARE done INT DEFAULT FALSE;
    DECLARE tbl_name VARCHAR(64);
    DECLARE convert_status INT;
    DECLARE cur CURSOR FOR
        SELECT TABLE_NAME
        FROM information_schema.TABLES
        WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE 'ai_result_%';
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

    OPEN cur;

    read_loop: LOOP
        FETCH cur INTO tbl_name;
        IF done THEN
            LEAVE read_loop;
        END IF;
      
        -- Debugging information：Output the converted table
        SELECT tbl_name,'covering...';

        CALL convert_table_to_partition(tbl_name,@status);
        
        SET convert_status = @status;
        
        -- According to the returned status, the corresponding processing will be performed
        CASE convert_status
            WHEN 100 THEN
                -- Start status，Can be ignored，Because this is the expected initial state
                SELECT tbl_name, 'Started conversion.';
            WHEN 200 THEN
                -- Completed successfully
                SELECT tbl_name, 'Conversion and data migration completed successfully.';
            WHEN 50 THEN
                -- fail
                SELECT tbl_name, 'Data migration failed. Check the logs for more information.';
            ELSE
                -- Unknown status
                SELECT tbl_name, CONCAT('Unknown status: ', status);
        END CASE;
    END LOOP;

    CLOSE cur;
END

Here is a table starting with "ai_result_" as an example, converting all related tables into partitioned tables. When executing this stored procedure, the operator must have information_schema database read permissions, so that the relevant table names can be queried for conversion.

This script is recommended to execute the script at one time to avoid frequent conversions and prevent table locking (so the table name prefix has been fixed in the code and needs to be modified according to your own needs)

6. Table partition maintenance: Add table partition

The table partition is created through the above process. In theory, the historical data has been table partitioned, and the future data can also be stored in the p_max partition. However, if the p_max partition data is not maintained, there will also be too much data. Therefore, we need to regularly cut the p_max partition and add related table partitions. This operation needs to be executed before the data enters. The specific execution script is as follows:

CREATE DEFINER=`root`@`%` PROCEDURE `add_monthly_partitions`(IN tbl_name VARCHAR(64), IN year_no INT, IN month_no INT,OUT out_status INT)
BEGIN
    DECLARE done INT DEFAULT FALSE;
    
    -- Output status，Start execution status 100，Execution successful status 200，Execution failed status 50
    SET out_status = 100;
    
    -- Check whether the partition to be added already exists
    SET @partition_exists = EXISTS (SELECT 1 FROM information_schema.PARTITIONS 
                                    WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = tbl_name 
                                      AND PARTITION_NAME = CONCAT('p_', year_no, '_', LPAD(month_no, 2, '0')));
                                      
    IF @partition_exists THEN
        -- If the partition already exists，Return message directly
        -- SELECT CONCAT('Partition p_', year_no, '_', LPAD(month_no, 2, '0'), ' already exists. No action taken.') AS message;
        SET out_status = 200;
    ELSE
      -- Check if the table already exists p_max Partition
      SET @has_p_max = EXISTS (SELECT 1 FROM information_schema.PARTITIONS 
                               WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = tbl_name 
                                 AND PARTITION_NAME = 'p_max');

      -- 构建添加Partition的 SQL Statement
      IF @has_p_max THEN
          -- If there is p_max Partition，则重新组织Partition，Will p_max 分割成新Partition和更新后的 p_max
          SET @reorganize_partition_sql = CONCAT(
              'ALTER TABLE ', tbl_name, ' REORGANIZE PARTITION p_max INTO (
                  PARTITION p_', year_no, '_', LPAD(month_no, 2, '0'), 
                  ' VALUES LESS THAN (', 
                  CASE WHEN month_no = 12 THEN year_no + 1 ELSE year_no END, ', ', 
                  CASE WHEN month_no = 12 THEN 1 ELSE month_no + 1 END, '),
                  PARTITION p_max VALUES LESS THAN (MAXVALUE, MAXVALUE)
              )'
          );
          PREPARE stmt FROM @reorganize_partition_sql;
          EXECUTE stmt;
          DEALLOCATE PREPARE stmt;

          -- SELECT CONCAT('Partition p_', year_no, '_', LPAD(month_no, 2, '0'), ' and updated p_max added successfully.') AS message;
          SET out_status = 200;
      ELSE
          -- If it does not exist p_max Partition，则直接添加新Partition
          SET @add_partition_sql = CONCAT(
              'ALTER TABLE ', tbl_name, ' ADD PARTITION (
                  PARTITION p_', year_no, '_', LPAD(month_no, 2, '0'), 
                  ' VALUES LESS THAN (', 
                  CASE WHEN month_no = 12 THEN year_no + 1 ELSE year_no END, ', ', 
                  CASE WHEN month_no = 12 THEN 1 ELSE month_no + 1 END, ')
              )'
          );
          PREPARE stmt FROM @add_partition_sql;
          EXECUTE stmt;
          DEALLOCATE PREPARE stmt;
          
          -- SELECT CONCAT('Partition p_', year_no, '_', LPAD(month_no, 2, '0'), ' added successfully.') AS message;
          SET out_status = 200;
      END IF;
    END IF;
END

The execution process of the above script is as follows:

1) Check whether the partition to be added already exists (if it already exists, it will not be added, it will not be added if it does not exist)

2) Check whether the p_max partition exists in the table (detect the partition to be cut, if it exists, cut the partition, if it does not exist, create the partition)

3) Cut the p_max partition into a new partition and a new p_max partition (the sharding strategy of the p_max partition will be adjusted here)

7. Batch maintenance: Add table partition in batches

Add table partitions to related tables in batches, and the specific scripts are as follows:

CREATE DEFINER=`root`@`%` PROCEDURE `tables_add_monthly_partition`(IN tbl_prefix VARCHAR(64), IN year_no INT, IN month_no INT,OUT out_status INT)
BEGIN
    DECLARE done INT DEFAULT FALSE;
    DECLARE tbl_name VARCHAR(64);
    DECLARE cur CURSOR FOR
        SELECT TABLE_NAME
        FROM information_schema.TABLES
        WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE CONCAT(tbl_prefix, '%');
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

    -- Output status，Start execution status 100，Execution successful status 200，Execution failed status 50
    SET out_status = 100;

    -- Open the cursor
    OPEN cur;

    read_loop: LOOP
        FETCH cur INTO tbl_name;
        IF done THEN
            LEAVE read_loop;
        END IF;

        -- Check if the table is already a partition table
        SET @is_partitioned = EXISTS (SELECT 1 FROM information_schema.PARTITIONS 
                                      WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = tbl_name);

        IF NOT @is_partitioned THEN
            -- If the table is not a partition table，Call first convert_table_to_partition Convert
            CALL convert_table_to_partition(tbl_name,@status);
            
            -- After conversion, check again whether it is successfully converted to partition table
            SET @is_partitioned = EXISTS (SELECT 1 FROM information_schema.PARTITIONS 
                                          WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = tbl_name);
            
            IF NOT @is_partitioned THEN
                -- If the conversion fails，Skip subsequent operations and output error messages
                SELECT CONCAT('Failed to convert table ', tbl_name, ' to partitioned. Skipping.') AS message;
                SET out_status = 50;
                ITERATE read_loop;
            END IF;
        END IF;

        -- Call add_monthly_partitions Add partition to the current table
        CALL add_monthly_partitions(tbl_name, year_no, month_no,@status);

        -- Optional：Output operation results（For debugging）
        -- SELECT CONCAT('Processed table: ', tbl_name) AS status;
    END LOOP;

    -- Close cursor
    CLOSE cur;

    -- Output completion information
    -- SELECT CONCAT('Batch partition addition completed for tables with prefix "', tbl_prefix, '".') AS message;
    SET out_status = 200;
END

To add table partitions in batches, you need to pass in the relevant table prefix, such as "ai_result_" in the above example. This script will convert the non-partitioned table into a partitioned table, and then add the corresponding table partition to the partitioned table. The specific execution process is as follows:

1) Get all relevant tables

2) Traversal to determine whether the table is a partition table

3) Non-partitioned tables are converted to partitioned tables

4) Add a table partitioning strategy to the partition table

Please execute this script carefully. We have a comparison of regular tables and partition tables above. Execution of scripts is very simple (it is automatically completed in batches), but please consider carefully the consequences of execution!

Summarize

This is the end of this article about MySQL database implementing batch table partitioning. For more related content on MySQL batch table partitioning, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!