SoFunction
Updated on 2025-03-04

How do you optimize a MySQL table with tens of millions of rows?

1. Data capacity: roughly how many rows the table will hold in 1-3 years, and roughly how many bytes each row takes;

2. Data items: whether there are large fields, and whether the values of those fields are updated frequently;
3. Query SQL conditions: which columns often appear in WHERE, GROUP BY, and ORDER BY clauses;
4. Update SQL conditions: how many columns often appear in the WHERE clause of UPDATE or DELETE statements;
5. Statement mix: what is the ratio SELECT : UPDATE+DELETE : INSERT?

6. What is the order of magnitude of the total number of SQL statements executed each day against the large table and its related tables?
7. Workload on the table: is the business mainly updates or mainly queries?
8. What physical database server and database server architecture do you plan to use?
9. What level of concurrency do you expect?
10. Which storage engine will you choose: InnoDB or MyISAM?
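Question 1 lends itself to a quick back-of-the-envelope calculation. The sketch below is only an illustration: the function name, the daily row count, the row size, and the `overhead_factor` (a guess meant to cover indexes and page fill) are all assumptions, not figures from this article, and should be replaced with numbers measured on your own data.

```python
def estimate_table_size(rows_per_day, bytes_per_row, years, overhead_factor=1.5):
    """Rough row count and on-disk size after `years` of steady growth."""
    total_rows = int(rows_per_day * 365 * years)
    raw_bytes = total_rows * bytes_per_row
    # overhead_factor is an assumed multiplier for indexes and page fill;
    # measure the real value on a sample of production data.
    estimated_bytes = raw_bytes * overhead_factor
    gib = 1024 ** 3
    return {
        "rows": total_rows,
        "raw_gib": raw_bytes / gib,
        "estimated_gib": estimated_bytes / gib,
    }


if __name__ == "__main__":
    # Hypothetical workload: 50,000 new rows per day, about 200 bytes per row.
    for years in (1, 3):
        est = estimate_table_size(50_000, 200, years)
        print(f"{years} year(s): {est['rows']:,} rows, "
              f"about {est['estimated_gib']:.1f} GiB")
```

Running the estimate for both ends of the 1-3 year range shows whether the table stays in the tens of millions of rows or grows past it.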

Once you have rough answers to the 10 questions above, how to design such a large table should be clear.
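The statement mix from question 5 can be estimated from the `Com_select`, `Com_update`, `Com_delete`, and `Com_insert` counters that MySQL reports in `SHOW GLOBAL STATUS`. The following is a minimal sketch that assumes you have already copied those counters into a dictionary; the function name and the sample numbers are made up for illustration.

```python
def statement_mix(counters):
    """Return the share of SELECT, UPDATE+DELETE, and INSERT statements."""
    select = counters.get("Com_select", 0)
    update_delete = counters.get("Com_update", 0) + counters.get("Com_delete", 0)
    insert = counters.get("Com_insert", 0)
    total = select + update_delete + insert
    if total == 0:
        raise ValueError("no statements counted")
    return {
        "select": select / total,
        "update_delete": update_delete / total,
        "insert": insert / total,
    }


if __name__ == "__main__":
    # Made-up counter values, as if read from SHOW GLOBAL STATUS.
    sample = {"Com_select": 800, "Com_update": 100,
              "Com_delete": 50, "Com_insert": 50}
    for kind, share in statement_mix(sample).items():
        print(f"{kind}: {share:.0%}")
```

These counters accumulate from server start, so sampling them twice and subtracting gives the mix for a specific period such as one day.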

As for optimization, if that means the table has already been created and its structure cannot be changed, it is recommended to give the InnoDB engine more memory to reduce the disk I/O load, because I/O is often the bottleneck of a database server.
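One way to judge whether InnoDB has enough memory is the buffer pool hit ratio, computed from the status counters `Innodb_buffer_pool_read_requests` (logical reads) and `Innodb_buffer_pool_reads` (reads that had to go to disk). The sketch below assumes those two values have already been read from `SHOW GLOBAL STATUS`; the function name and sample numbers are illustrative only. The memory itself is normally granted through `innodb_buffer_pool_size`.

```python
def buffer_pool_hit_ratio(read_requests, disk_reads):
    """Fraction of logical reads served from memory instead of disk."""
    if read_requests <= 0:
        raise ValueError("read_requests must be positive")
    return 1.0 - disk_reads / read_requests


if __name__ == "__main__":
    # Made-up values for Innodb_buffer_pool_read_requests and
    # Innodb_buffer_pool_reads.
    ratio = buffer_pool_hit_ratio(read_requests=1_000_000, disk_reads=5_000)
    print(f"buffer pool hit ratio: {ratio:.2%}")
```

A ratio that stays well below 1 under normal load suggests the working set does not fit in memory and disk I/O is absorbing the difference.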

In addition, when performance problems are to be solved by optimizing the index structure, it is recommended to first rewrite the SQL statements themselves so that they run faster, and to rely on reorganizing the indexes alone only as a last resort. Of course, this assumes that the indexes have already been created well. If the workload is mainly reads, you can consider enabling query_cache (the query cache was removed in MySQL 8.0, so this applies only to older versions).

Also adjust some parameter values: sort_buffer_size, read_buffer_size, read_rnd_buffer_size, join_buffer_size
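These four buffers are allocated per session when a query needs them, so raising them multiplies across connections. The sketch below gives a rough ceiling for that memory under stated assumptions: the function names, the buffer sizes, and the connection count are hypothetical, and the result is only an approximation because a single query can allocate more than one join buffer and most sessions never allocate all four at once.

```python
MIB = 1024 * 1024

PER_SESSION_BUFFERS = (
    "sort_buffer_size",
    "read_buffer_size",
    "read_rnd_buffer_size",
    "join_buffer_size",
)


def per_session_buffer_bytes(settings):
    """Sum of the four per-session buffers, in bytes."""
    return sum(settings[name] for name in PER_SESSION_BUFFERS)


def rough_session_memory(settings, max_connections):
    """Approximate ceiling if every connection used each buffer once."""
    return per_session_buffer_bytes(settings) * max_connections


if __name__ == "__main__":
    # Hypothetical settings; substitute the values from your own server.
    settings = {
        "sort_buffer_size": 2 * MIB,
        "read_buffer_size": 1 * MIB,
        "read_rnd_buffer_size": 1 * MIB,
        "join_buffer_size": 2 * MIB,
    }
    total = rough_session_memory(settings, max_connections=500)
    print(f"rough ceiling: {total / MIB:.0f} MiB")
```

Comparing this figure with the memory left after the InnoDB buffer pool helps avoid setting the buffers so high that the server starts swapping.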

Other suggestions:

1. Use indexes and avoid full scans; lookups by primary key are very fast even with hundreds of millions of rows;
2. Denormalize the design, trading space for time, and avoid joins. Some join operations can be implemented in application code, and there is no need to have the database perform them;
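The idea of doing a join in code can be sketched as follows. The example assumes the application has already run two simple queries, one for orders and one fetching the matching users by primary key, and then combines the results with a dictionary lookup. The table contents, field names, and function name are hypothetical.

```python
def join_in_code(orders, users_by_id):
    """Attach the user's name to each order with a dictionary lookup."""
    joined = []
    for order in orders:
        user = users_by_id.get(order["user_id"])
        # Keep the order even when the user row is missing (like a LEFT JOIN).
        joined.append({**order, "user_name": user["name"] if user else None})
    return joined


if __name__ == "__main__":
    # Stand-ins for the rows returned by two separate primary-key queries.
    orders = [
        {"order_id": 1, "user_id": 10, "amount": 25},
        {"order_id": 2, "user_id": 11, "amount": 40},
        {"order_id": 3, "user_id": 99, "amount": 5},
    ]
    users_by_id = {
        10: {"id": 10, "name": "alice"},
        11: {"id": 11, "name": "bob"},
    }
    for row in join_in_code(orders, users_by_id):
        print(row)
```

Each database query stays a cheap primary-key lookup, and the cost of matching rows moves to the application, which is usually easier to scale out than the database.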