1. Problem analysis
Deleting a large amount of data in a single statement causes three main problems:
- Long-held locks: one huge delete holds its locks until the transaction finishes, blocking other transactions for a long time.
- Log growth: MySQL writes transaction log records (and binary log events, if enabled) for the deleted data. A huge delete can make the log files very large or even fill the disk.
- Performance impact: deleting a large amount of data at once consumes a lot of CPU and IO, which seriously degrades the overall performance of the database.
To avoid these problems, delete in batches so that each statement puts only limited pressure on the database.
2. Several ways to delete massive data in batches
Method 1: Use LIMIT to delete in batches
Deleting in batches with LIMIT is a common way to handle massive data: each statement deletes a fixed number of rows, and the statement is repeated until nothing is left to delete.
Example SQL:
Suppose we want to delete all rows in the logs table that were created before a certain date:
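    -- Set the number of rows deleted per batch
    SET @BATCH_SIZE = 1000;

    -- LIMIT does not accept a user variable directly,
    -- so the value is bound through a prepared statement
    PREPARE del_batch FROM
      'DELETE FROM logs WHERE create_time < ''2023-01-01'' LIMIT ?';

    -- Delete one batch; repeat this statement until no rows are affected
    EXECUTE del_batch USING @BATCH_SIZE;

    DEALLOCATE PREPARE del_batch;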
The statement above can be placed in a stored procedure or called in a loop from the application layer. Each execution deletes at most BATCH_SIZE rows, which shortens lock time and limits the log generated per statement.
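The application-layer loop mentioned above could look roughly like the sketch below. It is deliberately driver-agnostic: delete_batch is a hypothetical callable that you supply, assumed to run one DELETE ... LIMIT statement, commit it, and return the number of rows it removed. The names delete_in_batches and pause_seconds are illustrative and do not come from any library.

```python
import time
from typing import Callable


def delete_in_batches(
    delete_batch: Callable[[int], int],
    batch_size: int = 1000,
    pause_seconds: float = 0.0,
) -> int:
    """Call delete_batch(batch_size) until a batch comes back short.

    delete_batch is assumed to execute one DELETE ... LIMIT batch_size,
    commit it, and return the number of rows it removed.
    Returns the total number of rows deleted.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    total = 0
    while True:
        deleted = delete_batch(batch_size)
        total += deleted
        # A short batch means no matching rows remain.
        if deleted < batch_size:
            return total
        # Optional pause so other transactions get a turn between batches.
        if pause_seconds > 0:
            time.sleep(pause_seconds)
```

With a MySQL driver, delete_batch would typically execute the DELETE statement from this section, commit, and return the cursor's row count. The exact calls depend on the driver, so they are left out here.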
Advantages:
- Caps the number of rows per delete, which shortens lock time and limits the log generated per statement.
Disadvantages:
- Requires a loop, so the logic is slightly more complicated.
Note:
- The LIMIT value for each batch can be tuned to the actual environment. A value between 500 and 5000 is usually a reasonable choice.
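One possible way to tune the value is to time each batch and nudge the size toward a target duration. The sketch below assumes a target of 0.5 seconds per batch, which is an arbitrary illustrative figure, and clamps the result to the 500 to 5000 range suggested above. next_batch_size is a hypothetical helper, not part of MySQL or any driver.

```python
def next_batch_size(
    current: int,
    elapsed_seconds: float,
    target_seconds: float = 0.5,  # assumed target, tune for your workload
    minimum: int = 500,
    maximum: int = 5000,
) -> int:
    """Scale the batch size so the next batch takes roughly target_seconds."""
    if elapsed_seconds <= 0:
        # Too fast to measure: grow cautiously by doubling.
        proposed = current * 2
    else:
        proposed = int(current * target_seconds / elapsed_seconds)
    # Keep the result inside the allowed range.
    return max(minimum, min(maximum, proposed))
```

A batch that took twice the target halves the next size, and a batch that took half the target doubles it, always within the configured bounds.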
Method 2: Delete in batches by primary key range
If the rows to be deleted are contiguous on the primary key (such as an auto-increment ID), you can delete them in batches by primary key range. Each batch then seeks directly to its rows through the primary key, instead of scanning for matching rows as a LIMIT-based delete may have to do when the filter column is not indexed, which improves deletion efficiency.
Example SQL:
Assume the primary key of the logs table is id:
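    DELIMITER $$
    -- WHILE is only valid inside a stored program, so the loop lives in a procedure
    CREATE PROCEDURE delete_logs_by_id_range()
    BEGIN
      DECLARE start_id BIGINT DEFAULT 0;
      DECLARE max_id BIGINT;
      -- Look up the upper bound once (NULL means no matching rows, so the loop is skipped)
      SELECT MAX(id) INTO max_id FROM logs WHERE create_time < '2023-01-01';
      WHILE start_id <= max_id DO
        DELETE FROM logs
        WHERE id BETWEEN start_id AND start_id + 999
          AND create_time < '2023-01-01';
        -- Move to the next range of 1000 ids
        SET start_id = start_id + 1000;
      END WHILE;
    END $$
    DELIMITER ;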
Advantages:
- Each batch is located directly by primary key range, so no repeated scan for matching rows is needed.
Disadvantages:
- You need to know the primary key range, and the method suits tables whose primary keys are contiguous.
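If the range loop runs in application code rather than in the database, the range bookkeeping can be kept separate from the SQL. The sketch below only generates the inclusive (start, end) pairs. It assumes min_id and max_id were obtained beforehand, for example with SELECT MIN(id), MAX(id) FROM logs WHERE create_time < '2023-01-01', and id_ranges is a hypothetical helper name.

```python
from typing import Iterator, Tuple


def id_ranges(min_id: int, max_id: int, step: int = 1000) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) id ranges covering min_id..max_id."""
    if step <= 0:
        raise ValueError("step must be positive")
    start = min_id
    while start <= max_id:
        # The last range is shortened so it never runs past max_id.
        end = min(start + step - 1, max_id)
        yield start, end
        start = end + 1
```

Each pair would then be bound to a statement of the form DELETE FROM logs WHERE id BETWEEN start AND end AND create_time < '2023-01-01'. Gaps in the id sequence are harmless: a range with missing ids simply deletes fewer rows.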
Method 3: Delete in batches with a custom stored procedure
The batch deletion logic can be encapsulated in a stored procedure, which then controls the batch loop automatically.
Example SQL:
    DELIMITER $$
    CREATE PROCEDURE batch_delete_logs()
    BEGIN
      DECLARE done INT DEFAULT FALSE;
      DECLARE batch_size INT DEFAULT 1000;
      WHILE NOT done DO
        DELETE FROM logs WHERE create_time < '2023-01-01' LIMIT batch_size;
        -- Check if there is still data left
        IF ROW_COUNT() < batch_size THEN
          SET done = TRUE;
        END IF;
      END WHILE;
    END $$
    DELIMITER ;
Execute the stored procedure:

    CALL batch_delete_logs();
Advantages:
- The stored procedure automates the loop with clear logic, so the SQL does not have to be run manually many times.
Disadvantages:
- Only usable where stored procedures are supported. It is best suited to deletions carried out in small batches.
Method 4: Create a new table to replace the old table
In some cases, a large amount of data can be removed from a large table by building a new table instead of deleting rows: first copy the data that needs to be retained into the new table, then drop the old table. This avoids a large-scale DELETE and reduces lock time and log overhead.
Steps:
- Create a new table with the same structure as the old table.
- Insert the data that needs to be retained into the new table.
- Replace the old table with the new one: drop the old table and give the new table the original name.
Example SQL:
    -- Create a new table
    CREATE TABLE logs_new LIKE logs;

    -- Insert data to be retained
    INSERT INTO logs_new SELECT * FROM logs WHERE create_time >= '2023-01-01';

    -- Swap the tables in one atomic RENAME so that logs never goes missing,
    -- then drop the old table
    RENAME TABLE logs TO logs_old, logs_new TO logs;
    DROP TABLE logs_old;
Advantages:
- Avoids a large-scale DELETE, reducing lock time and log volume.
Disadvantages:
- Additional disk space is required to store the new table's data.
- Under heavy traffic, rows written to the old table while the copy is running are not carried over, so writes may need to be paused or otherwise coordinated with locking.
3. Performance optimization suggestions
- Avoid mass deletion during peak business hours. Run it during off-peak hours, such as at night.
- Set the batch size appropriately. When deleting in batches, tune the LIMIT value to the actual situation. It should not be so large that locks are held for a long time.
- Turn off unnecessary logs. In some extreme cases, MySQL's binary log (binlog) can be turned off to reduce log overhead, but this is risky (changes that skip the binlog are not replicated and cannot be replayed in point-in-time recovery) and should be done with caution and only with full understanding.
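A scheduled cleanup job can guard itself against running at the wrong time with a check like the one below, called before each batch. This is only a sketch: the 01:00 to 05:00 default window is an assumed example, and in_off_peak_window is a hypothetical helper name.

```python
import datetime


def in_off_peak_window(
    now: datetime.time,
    start: datetime.time = datetime.time(1, 0),  # assumed window start
    end: datetime.time = datetime.time(5, 0),    # assumed window end
) -> bool:
    """Return True if now falls in [start, end), allowing windows that cross midnight."""
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, for example 23:00 to 04:00.
    return now >= start or now < end
```

The job would pass datetime.datetime.now().time() and stop deleting as soon as the function returns False, resuming in the next window.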
Summary
Method | Applicable scenarios | Advantages | Disadvantages |
---|---|---|---|
Batch delete with LIMIT | Data that needs to be deleted in batches | Simple logic, reduces lock time | Requires a loop |
Batch delete by primary key range | Tables with contiguous primary keys | Efficient, each batch is located by primary key | Range must be determined manually |
Custom batch-delete stored procedure | Deletion in small batches | Automated operation | Requires database support for stored procedures |
New table replacement | Very large amounts of data | Avoids a large DELETE and reduces log overhead | Requires additional disk space |
Depending on the business scenario and requirements, choosing an appropriate batch deletion method improves MySQL's deletion efficiency and reduces the impact on the database. I hope this article helps with your MySQL data cleanup and maintenance!