Detailed steps to optimize SQL queries using indexes

Core goal: enable the database to quickly locate the required data rows by scanning indexes (small-scale, ordered data structures), rather than scanning the entire table (very slow under large data volumes).

1. Understand the role and scenarios of indexes

Accelerate search:Main optimizationWHEREclauses andJOIN ONThe search speed of the condition.
Accelerated sorting:optimizationORDER BYclause, avoid file sorting (filesort)。
Accelerated grouping:optimizationGROUP BYClauses to help quickly find the same grouping.
Uniqueness constraints: UNIQUEIndexing can also ensure the uniqueness of data.

2. Identify the query that needs indexing

Before you start adding indexes, find out which queries are performance bottlenecks or are executed frequently. We have also introduced it before. Interested friends can go and read the previous article. Let’s review it:

MySQL Slow Query Log:Position SQL with long execution time.
APM tools (such as SkyWalking):View the time-consuming calls in the request.
EXPLAINanalyze:Execution of SQL that suspects there is a problemEXPLAIN, check the execution plan.
Business Analysis:Think about core business processes and high-frequency query scenarios.

Focus on queries involving the following operations:

Filter (WHERE): findByEmail(String email), findAllByStatus(OrderStatus status)
Connection (JOIN):Load the associated entity, such as querying orders and their user information.
Sort (ORDER BY): findAllByOrderByCreatedAtDesc()
Grouping (GROUP BY):Statistical query.

3. Master the key index types

Single-Column Index:Create an index for a single column. Suitable for simple, precise matching or range queries based on this column.

CREATE INDEX idx_users_email ON users (email);

Composite / Multi-Column Index:Create indexes for multiple column combinations.Extremely important, suitable for multiple conditionsWHEREThe clause may also need to be satisfiedWHEREandORDER BY / GROUP BYquery.

-- Applicable to WHERE status = ? AND created_at &gt; ?
CREATE INDEX idx_orders_status_created ON orders (status, created_at);

- Leftmost Prefix Rule:Joint index(a, b, c)Can supportWHERE a=?、WHERE a=? AND b=?、WHERE a=? AND b=? AND c=?queries, but are not usually supportedWHERE b=?orWHERE a=? AND c=?。The order of columns is crucial.
Covering Index:If an index contains all columns required for the query (SELECT, WHERE, ORDER BY, etc.), the database can return results directly from the index, without accessing the data table (returning to the table), and has extremely high performance.

-- Query: SELECT user_id, status FROM orders WHERE order_date &gt; ?
-- Overwrite index:
CREATE INDEX idx_orders_date_user_status ON orders (order_date, user_id, status);

Unique Index:Ensure that the value of the index column is unique, usually used for unique identification on business (such as user email address and mobile phone number), and also has the query acceleration function of ordinary indexes.

CREATE UNIQUE INDEX uk_users_email ON users (email);

Full-Text Index:Used forTEXTType data for keyword search.

4. Create and manage indexes in Spring Boot projects

Below we apply the theory to the project practice:

The wrong way (strictly forbidden to use in production environments!):
- Relying on JPA/Hibernate-auto=updateorcreate。
- reason:
  - updateThe behavior is unpredictable, and data may be lost or unexpected changes may occur.
  - createThe entire database will be deleted!
  - Version control and teamwork are not possible.
  - Bypassed the necessary Code Review and database change management processes.
- @Table(indexes = ...)or@IndexAnnotations: These annotations are mainly forddl-autoUsed, or used to generate DDL scripts for use by other tools,They should not be directly relied on to automatically create/update indexes in production environments。
The correct way (production environment standard):
- Use Database Migration Tools: FlywayorLiquibaseIt's from the Spring Boot projectBest Practices and Essential Tools。
- Workflow:
  - Add dependencies:existorAdd Spring Boot Starter dependencies for Flyway or Liquibase.

<!-- Flyway Example -->
<dependency>
    <groupId></groupId>
    <artifactId>flyway-core</artifactId>
</dependency>
<dependency> <!-- If using MySQL -->
    <groupId></groupId>
    <artifactId>flyway-mysql</artifactId>
</dependency>

<!-- Liquibase Example -->
<dependency>
    <groupId></groupId>
    <artifactId>liquibase-core</artifactId>
</dependency>

Create a migration script:existsrc/main/resources/db/migrationCreate SQL scripts under (Flyway default) or the specified path (Liquibase). Script naming must comply with the tool's version specification (for example, Flyway:V1__Initial_schema.sql, V2__Add_index_on_users_email.sql）。
Writing DDL:Use standard in SQL scriptsCREATE INDEXStatement to define index.

-- V2__Add_index_on_users_email.sql
CREATE INDEX idx_users_email ON users (email);

-- V3__Add_composite_index_on_orders.sql
CREATE INDEX idx_orders_user_status ON orders (user_id, status);

-- V4__Add_unique_index_on_products.sql
CREATE UNIQUE INDEX uk_products_sku ON products (sku);

Run the application:When the Spring Boot application is started, Flyway/Liquibase will automatically detect and execute new migration scripts in version order, applying index changes to the database.

advantage:
- Version control:Index changes can be included in our Git management repository like code.
- Repeatable:The same changes can be applied in any environment.
- automation:Conveniently integrated into the CI/CD process.
- Teamwork:Clearly record Schema's change history.
- Safety:The changes have been scripted and versioned, reducing manual operation errors.

5. Example of indexing strategies for common Spring Boot query scenarios

Scenario: Find entities through unique business identification (such asUser findByEmail(String email);)

SQL : SELECT * FROM users WHERE email = ?
Indexing strategy:existemailCreate on the columnUnique Index。

CREATE UNIQUE INDEX uk_users_email ON users (email);

Scenario: A paging list filtered by status and sorted by time (such asPage<Order> findByStatusOrderByCreatedAtDesc(OrderStatus status, Pageable pageable);)

SQL : SELECT * FROM orders WHERE status = ? ORDER BY created_at DESC LIMIT ?, ?
Indexing strategy:CreateJoint index,Includestatusandcreated_at。statusIt is filtered equally and put in front;created_atIt is sorting and putting it behind.

CREATE INDEX idx_orders_status_created ON orders (status, created_at);

Advanced (cover index):If only a few columns are needed (such asid, order_no, status, created_at), overwrite indexes can be created to avoid back to tables:

CREATE INDEX idx_orders_status_created_cover ON orders (status, created_at, id, order_no);

Scenario: Loading the associated entity (such as obtaining orders and their user informationOrder order = (id); User user = ();)

JPA may generate (depending on FetchType):
- One-time JOIN:SELECT ... FROM orders o LEFT JOIN users u ON o.user_id = WHERE = ?
- N+1 (If LAZY Fetching and subsequently access user): Check the order first, then according toorder.user_idCheck user.
Indexing strategy: Must be in the foreign key column (orders.user_idCreate an index on ) .

CREATE INDEX idx_orders_user_id ON orders (user_id);

In this way, whether it is a JOIN query or a second query in N+1, it can be quickly passed.user_idFind the corresponding order or user.
Scenario: Multi-condition filtering query (such asList<Product> findByNameContainingAndCategoryAndPriceBetween(String name, String category, BigDecimal minPrice, BigDecimal maxPrice);)
- SQL : SELECT * FROM products WHERE category = ? AND price BETWEEN ? AND ? AND name LIKE ?(NoticeLIKEThe usage will affect the index efficiency)
- Indexing strategy:Create a joint index. Usually, columns with equal value query and high selectivity are placed in front of them. Range query (BETWEEN) andLIKEPut it behind.
  - index:(category, price, name). This can be used firstcategoryFilter, then usepricePerform a range scan.nameOnLIKEin the case of'%keyword%'Then this index is invalid; if it is'prefix%'There may be some effects.
  - ifnameThe query is more frequent or more selective, you can also consider(name, category, price)and use the prefix index. It needs to be analyzed based on actual conditions.

CREATE INDEX idx_products_category_price_name ON products (category, price, name);
-- or，if name Prefix index required
-- CREATE INDEX idx_products_category_price_name ON products (category, price, name(20));

6. Verify the index effect

After adding the index, you must verify that it is used correctly and valid:

useEXPLAIN:
- Gets the SQL generated by the Spring Boot application.
- Replace the placeholder with the actual parameters.
- Execute in MySQL clientEXPLAIN [your SQL query];。
- Check the output:
  - keyDoes the column show the index name you expect to use?
  - typeIs the column the better type (such asref, range, eq_ref),avoidALL？
  - rowsIs the number of rows scanned by column estimates significantly reduced?
  - ExtraIs there a column?Using filesortorUsing temporary? Did it appearUsing index(overwrite index)?
Performance Test:
- In the test environment, simulate the load and compare the query response time before and after adding the index.
monitor:
- Observe the time-consuming changes in the corresponding database calls in the APM tool.
- Observe whether the previous slow SQL disappears or decreases in the slow query log.

Summarize:

Using indexes is key to optimizing SQL query performance in Spring Boot projects. The core steps include: identifying slow queries -> Understanding query patterns -> Selecting the appropriate index type (single column, union, overwrite, etc.) ->Create indexes in versioned SQL scripts using the Database Migration Tool (Flyway/Liquibase)-> UseEXPLAINand monitor verification effect.

The above is the detailed content of the detailed steps of SpringBoot using indexes to optimize SQL queries. For more information about SpringBoot index optimization SQL queries, please pay attention to my other related articles!