For the problem of large data volume of single tables, if the data supports sharding, using table partitioning is a good choice. So how does MySQL implement table partitioning?
1. Table partitioning conditions
1. Database storage engine support: InnoDB and MyISAM engines
2. Database version support: after MySQL 5.1 (the versions are different, the specific feature support may be different)
3. The data must have one or more partition keys: the key (field) that is a partition must be part of the primary key (union primary key)
4. Partition definition: Each partition must clearly define the data range
5. Partition maintenance: Over time, it may be necessary to add new partitions or delete old partitions to maintain the performance and structure of the database
2. The difference between regular tables and partition tables
Comparison of regular tables and partition tables
General table | Partition table | |
---|---|---|
Data structure | All data is stored in a single data file | Data is logically divided into multiple parts, possibly stored in multiple files or even multiple disks |
Query optimization | Scan the entire table data by default during query | Access only relevant partition data |
I/O operation | Add, delete or modify row operations directly act on the entire table | Only operate on a single partition, and do not affect other partition data. |
Backup and restore | Usually back up the entire table data | You can back up or restore a specific partition separately |
Storage Management | All data storage | Data scattered to multiple partitions |
Extensibility | As the data volume grows, you may encounter performance bottlenecks | Easier to scale horizontally, you can handle larger data sets by adding new partitions without changing the application logic |
Limits and complexity | Relatively simple, without special creation or maintenance requirements | Design and implementation are more complex and need to consider how to correctly set up partitioning policies to meet business needs |
From the above, is there a great advantage in partition tables? But there are also some limits for partition tables:
Restrictions on partition tables
General table | Partition table | |
---|---|---|
Foreign key constraints | √ | × |
Full text index | √ |
× (previous versions of 5.6) ⍻ (5.6 or later version) |
Temporary table | √ | × |
Column modification | √ | × |
Specific ALTER TABLE statements | √ | × (modify primary key, unique key, etc.) |
Performance impact | Data volume impact | Adding, deleting, and merging table partitions may cause locking tables and affecting performance |
Backup and Recovery Tool Support | Usually the tools support | Not all backup and recovery tools fully support all features of partition tables |
Data replication of primary and secondary servers | No special requirements | Partition rules must be consistent, any mismatch may lead to replication failure or data inconsistency |
Partition type limitations | none | Storage engines may limit partition types |
The behavior of query optimizer | Simple index | Table partition + index, complex queries with special circumstances may have table partition clipping failure problems |
3. Creation of table partitions
Three key points in table partition creation: creating tables, setting partition keys, and setting sharding strategies
Example:
CREATE TABLE sales ( id INT NOT NULL, sale_date DATE NOT NULL, amount DECIMAL(10,2) ) PARTITION BY RANGE (YEAR(sale_date)) ( PARTITION p0 VALUES LESS THAN (2020), PARTITION p1 VALUES LESS THAN (2021), PARTITION p2 VALUES LESS THAN (2022), PARTITION p3 VALUES LESS THAN MAXVALUE );
Create table name: sales
Partition key: year() result of sale_date field, that is, year of sale_date field
Sharding strategy: p0 partition stores data less than 2020, p1 partition stores data less than 2021, p2 partition stores data less than 2022, and p3 partition stores data other yearly (Note: The data "bag" here is very important, be careful when setting it)
Note: The sharding strategy here is "LESS THAN xxx", which means data smaller than the subsequent strategy. For example, the data smaller than the specified year belongs to this partition, so the word "data baffle" is used above.
4. Convert existing tables to partition table scripts
Because the creation structure of the table is different, existing tables cannot be converted directly into partitioned tables. To convert existing tables into partitioned tables, the following steps are required:
1. Create a new partition table with the same field structure based on existing tables and define relevant partition strategies
2. Migrate data to partition table
3. Delete the old table and rename the partition table to the original table
The specific implementation script is as follows:
CREATE DEFINER=`root`@`%` PROCEDURE `convert_table_to_partition`(IN tbl_name VARCHAR(200),OUT out_status INT) BEGIN DECLARE done INT DEFAULT FALSE; -- Output status,Start execution status 100,Execution successful status 200,Execution failed status 50 SET out_status = 100; -- Create a new empty table,Not included in table partition(To convert to a partition table,Must be an empty table) SET @create_empty_tbl_sql = CONCAT( 'CREATE TABLE ', tbl_name, '_partitioned LIKE ', tbl_name, ';' ); PREPARE stmt FROM @create_empty_tbl_sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- Get all unique year_no and month_no Combining as a build partition defines partition key SET @partition_def = ''; SET @query = CONCAT( 'SELECT GROUP_CONCAT( CONCAT("PARTITION p_", year_no, "_", LPAD(month_no, 2, "0"), " VALUES LESS THAN (", CASE WHEN month_no = 12 THEN year_no + 1 ELSE year_no END, ", ", CASE WHEN month_no = 12 THEN 1 ELSE month_no + 1 END, ")") ORDER BY year_no, month_no SEPARATOR ",\n" ) INTO @partition_def FROM ( SELECT DISTINCT year_no, month_no FROM ', tbl_name, ' ORDER BY year_no, month_no ) AS unique_years_months;' ); PREPARE stmt FROM @query; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- Debugging information:Output partition definition string --SELECT tbl_name,@partition_def; -- Check if there is a valid partition definition IF @partition_def IS NULL OR @partition_def = '' THEN SELECT tbl_name,'No data found for partitioning. Skipping partition creation and data migration.'; -- Add empty tables directly p_max Partitions are used to capture future data SET @partition_def = '\nPARTITION p_max VALUES LESS THAN (MAXVALUE, MAXVALUE)'; ELSE -- Add to p_max Partitions are used to capture future data SET @partition_def = CONCAT(@partition_def, ',\nPARTITION p_max VALUES LESS THAN (MAXVALUE, MAXVALUE)'); END IF; -- use ALTER TABLE Add to分区定义 SET @add_partitions_sql = CONCAT( 'ALTER TABLE ', tbl_name, '_partitioned PARTITION BY RANGE COLUMNS(year_no, month_no) (', @partition_def, ');' ); -- Debugging information:输出Add to分区的 SQL Statement --SELECT tbl_name,@add_partitions_sql; PREPARE stmt FROM @add_partitions_sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- Migrate data to new partition table SET @insert_into_partitioned_sql = CONCAT( 'INSERT INTO ', tbl_name, '_partitioned SELECT * FROM ', tbl_name, ';' ); -- Debugging information:Output the inserted data SQL Statement -- SELECT tbl_name,@insert_into_partitioned_sql; PREPARE stmt FROM @insert_into_partitioned_sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- Verify that the data migration is successful SET @count_original = CONCAT('SELECT COUNT(*) INTO @count_original FROM ', tbl_name); PREPARE stmt FROM @count_original; EXECUTE stmt; DEALLOCATE PREPARE stmt; SET @count_partitioned = CONCAT('SELECT COUNT(*) INTO @count_partitioned FROM ', tbl_name, '_partitioned'); PREPARE stmt FROM @count_partitioned; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- 比较原表and新分区表的数据行数 -- SELECT tbl_name,@count_original, @count_partitioned; -- If the data migration is successful,Delete the old table and rename the new table(Whether there is data or not,Delete cache tables) IF @count_original = @count_partitioned THEN -- Delete old table SET @drop_old_table_sql = CONCAT('DROP TABLE IF EXISTS ', tbl_name); PREPARE stmt FROM @drop_old_table_sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- Rename the new table to the old table name SET @rename_tables_sql = CONCAT('RENAME TABLE ', tbl_name, '_partitioned TO ', tbl_name); PREPARE stmt FROM @rename_tables_sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- SELECT tbl_name,'Table conversion and data migration completed successfully.'; SET out_status = 200; ELSE -- SELECT tbl_name,'Data migration failed, check the logs for more information.'; SET out_status = 50; END IF; END
The above script is a complete conversion of existing tables into partition tables with "year_no" and "month_no" fields as partition keys. The main steps are:
1) Create a new empty table with the existing table as a template, which does not contain the table partition (to be converted to a partitioned table, it must be an empty table)
2) Get all unique year_no and month_no combinations and build partition definition strings (partition strategy that needs to be divided for existing data analysis)
3) Check whether the partition definition is valid. If there is no partition definition, it is strongly recommended to create a default partition strategy p_max to store future data.
4) Update empty tables and add relevant partitioning policies
5) Migrate historical data to partition table
6) Data migration verification (verify data integrity)
7) Delete the old table (recycle the table name)
8) Rename the new partition table to the original table name
5. Batch conversion table to partition table
Bulk converts regular tables into partition tables, with the specific scripts as follows:
CREATE DEFINER=`root`@`%` PROCEDURE `tables_convert_to_partition`() BEGIN DECLARE done INT DEFAULT FALSE; DECLARE tbl_name VARCHAR(64); DECLARE convert_status INT; DECLARE cur CURSOR FOR SELECT TABLE_NAME FROM information_schema.TABLES WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE 'ai_result_%'; DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE; OPEN cur; read_loop: LOOP FETCH cur INTO tbl_name; IF done THEN LEAVE read_loop; END IF; -- Debugging information:Output the converted table SELECT tbl_name,'covering...'; CALL convert_table_to_partition(tbl_name,@status); SET convert_status = @status; -- According to the returned status, the corresponding processing will be performed CASE convert_status WHEN 100 THEN -- Start status,Can be ignored,Because this is the expected initial state SELECT tbl_name, 'Started conversion.'; WHEN 200 THEN -- Completed successfully SELECT tbl_name, 'Conversion and data migration completed successfully.'; WHEN 50 THEN -- fail SELECT tbl_name, 'Data migration failed. Check the logs for more information.'; ELSE -- Unknown status SELECT tbl_name, CONCAT('Unknown status: ', status); END CASE; END LOOP; CLOSE cur; END
Here is a table starting with "ai_result_" as an example, converting all related tables into partitioned tables. When executing this stored procedure, the operator must have information_schema database read permissions, so that the relevant table names can be queried for conversion.
This script is recommended to execute the script at one time to avoid frequent conversions and prevent table locking (so the table name prefix has been fixed in the code and needs to be modified according to your own needs)
6. Table partition maintenance: Add table partition
The table partition is created through the above process. In theory, the historical data has been table partitioned, and the future data can also be stored in the p_max partition. However, if the p_max partition data is not maintained, there will also be too much data. Therefore, we need to regularly cut the p_max partition and add related table partitions. This operation needs to be executed before the data enters. The specific execution script is as follows:
CREATE DEFINER=`root`@`%` PROCEDURE `add_monthly_partitions`(IN tbl_name VARCHAR(64), IN year_no INT, IN month_no INT,OUT out_status INT) BEGIN DECLARE done INT DEFAULT FALSE; -- Output status,Start execution status 100,Execution successful status 200,Execution failed status 50 SET out_status = 100; -- Check whether the partition to be added already exists SET @partition_exists = EXISTS (SELECT 1 FROM information_schema.PARTITIONS WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = tbl_name AND PARTITION_NAME = CONCAT('p_', year_no, '_', LPAD(month_no, 2, '0'))); IF @partition_exists THEN -- If the partition already exists,Return message directly -- SELECT CONCAT('Partition p_', year_no, '_', LPAD(month_no, 2, '0'), ' already exists. No action taken.') AS message; SET out_status = 200; ELSE -- Check if the table already exists p_max Partition SET @has_p_max = EXISTS (SELECT 1 FROM information_schema.PARTITIONS WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = tbl_name AND PARTITION_NAME = 'p_max'); -- 构建添加Partition的 SQL Statement IF @has_p_max THEN -- If there is p_max Partition,则重新组织Partition,Will p_max 分割成新Partition和更新后的 p_max SET @reorganize_partition_sql = CONCAT( 'ALTER TABLE ', tbl_name, ' REORGANIZE PARTITION p_max INTO ( PARTITION p_', year_no, '_', LPAD(month_no, 2, '0'), ' VALUES LESS THAN (', CASE WHEN month_no = 12 THEN year_no + 1 ELSE year_no END, ', ', CASE WHEN month_no = 12 THEN 1 ELSE month_no + 1 END, '), PARTITION p_max VALUES LESS THAN (MAXVALUE, MAXVALUE) )' ); PREPARE stmt FROM @reorganize_partition_sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- SELECT CONCAT('Partition p_', year_no, '_', LPAD(month_no, 2, '0'), ' and updated p_max added successfully.') AS message; SET out_status = 200; ELSE -- If it does not exist p_max Partition,则直接添加新Partition SET @add_partition_sql = CONCAT( 'ALTER TABLE ', tbl_name, ' ADD PARTITION ( PARTITION p_', year_no, '_', LPAD(month_no, 2, '0'), ' VALUES LESS THAN (', CASE WHEN month_no = 12 THEN year_no + 1 ELSE year_no END, ', ', CASE WHEN month_no = 12 THEN 1 ELSE month_no + 1 END, ') )' ); PREPARE stmt FROM @add_partition_sql; EXECUTE stmt; DEALLOCATE PREPARE stmt; -- SELECT CONCAT('Partition p_', year_no, '_', LPAD(month_no, 2, '0'), ' added successfully.') AS message; SET out_status = 200; END IF; END IF; END
The execution process of the above script is as follows:
1) Check whether the partition to be added already exists (if it already exists, it will not be added, it will not be added if it does not exist)
2) Check whether the p_max partition exists in the table (detect the partition to be cut, if it exists, cut the partition, if it does not exist, create the partition)
3) Cut the p_max partition into a new partition and a new p_max partition (the sharding strategy of the p_max partition will be adjusted here)
7. Batch maintenance: Add table partition in batches
Add table partitions to related tables in batches, and the specific scripts are as follows:
CREATE DEFINER=`root`@`%` PROCEDURE `tables_add_monthly_partition`(IN tbl_prefix VARCHAR(64), IN year_no INT, IN month_no INT,OUT out_status INT) BEGIN DECLARE done INT DEFAULT FALSE; DECLARE tbl_name VARCHAR(64); DECLARE cur CURSOR FOR SELECT TABLE_NAME FROM information_schema.TABLES WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE CONCAT(tbl_prefix, '%'); DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE; -- Output status,Start execution status 100,Execution successful status 200,Execution failed status 50 SET out_status = 100; -- Open the cursor OPEN cur; read_loop: LOOP FETCH cur INTO tbl_name; IF done THEN LEAVE read_loop; END IF; -- Check if the table is already a partition table SET @is_partitioned = EXISTS (SELECT 1 FROM information_schema.PARTITIONS WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = tbl_name); IF NOT @is_partitioned THEN -- If the table is not a partition table,Call first convert_table_to_partition Convert CALL convert_table_to_partition(tbl_name,@status); -- After conversion, check again whether it is successfully converted to partition table SET @is_partitioned = EXISTS (SELECT 1 FROM information_schema.PARTITIONS WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = tbl_name); IF NOT @is_partitioned THEN -- If the conversion fails,Skip subsequent operations and output error messages SELECT CONCAT('Failed to convert table ', tbl_name, ' to partitioned. Skipping.') AS message; SET out_status = 50; ITERATE read_loop; END IF; END IF; -- Call add_monthly_partitions Add partition to the current table CALL add_monthly_partitions(tbl_name, year_no, month_no,@status); -- Optional:Output operation results(For debugging) -- SELECT CONCAT('Processed table: ', tbl_name) AS status; END LOOP; -- Close cursor CLOSE cur; -- Output completion information -- SELECT CONCAT('Batch partition addition completed for tables with prefix "', tbl_prefix, '".') AS message; SET out_status = 200; END
To add table partitions in batches, you need to pass in the relevant table prefix, such as "ai_result_" in the above example. This script will convert the non-partitioned table into a partitioned table, and then add the corresponding table partition to the partitioned table. The specific execution process is as follows:
1) Get all relevant tables
2) Traversal to determine whether the table is a partition table
3) Non-partitioned tables are converted to partitioned tables
4) Add a table partitioning strategy to the partition table
Please execute this script carefully. We have a comparison of regular tables and partition tables above. Execution of scripts is very simple (it is automatically completed in batches), but please consider carefully the consequences of execution!
Summarize
This is the end of this article about MySQL database implementing batch table partitioning. For more related content on MySQL batch table partitioning, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!