spring-data-jpa使用oracle作為數據源時in查詢在1000條以上，報ORA_01795的錯誤詳情 - java,jpa,oracle 行走的五花肉博客

目的

解決項目中使用spring-data-jpa，採用oracle 11g作為數據源時，當in查詢後面的條件超過1000條後，oracle報ORA_01795的異常。

思路

問題在於當前版本的oracle不支持單個in查詢超過1000的情形，思路是通過jpa提供給我們的有實體類生成SQL後、未執行前的攔截器，對生成的SQL進行判斷，如果存在上述的超過1000的情形，將SQL拆分成多個in的組合、通過or連接。
例如：

# jpa轉換的SQL通常是如下形式，假如in (?,?,?....)有超過1000個？
select id,name,gender from user where id in(? ,? ,? ,? ,? ,? ,? ,? ,? ,?, ?);
# 將其拆分
select id,name,gender from user where (id in(? ,? ,? ,? ,? ,?) or id in (,? ,? ,?)  or id in(?, ?));

代碼

package com.orient.collab.provider.common.jpa.interceptor;

import org.hibernate.resource.jdbc.spi.StatementInspector;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.boot.autoconfigure.orm.jpa.HibernatePropertiesCustomizer;
import org.springframework.context.annotation.Configuration;

import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * description:
 * when using oracle as data source, custom a sql statement interceptor to solve "select or delete with 'in' " limit 1000 cause !!!
 * !!!不支持not in
 * @author MorningSun
 * @version 1.0
 * @since JDK1.8
 * date 2022/4/27 19:16
 */
@Configuration
@ConditionalOnProperty(name = "spring.datasource.driver-class-name", havingValue = "oracle.jdbc.driver.OracleDriver")
public class JPAConfigurationOnUsingOracle implements HibernatePropertiesCustomizer {

    /**
     *         StringBuilder sql = new StringBuilder("select *.* from . . where *.* in(");
     *         for (int i = 0; i < 999; i++) {
     *             sql.append("? , ");
     *         }
     *         sql.append("?)");
     *         System.out.println(sql.length());//4033
     */
    private static final int SQL_THRESHOLD_LENGTH = 4033;//超過這個才去匹配
    private final static int IN_CAUSE_LIMIT = 1000;
    private final static Pattern pattern = Pattern.compile("[^(^\\()\\s]+\\s+in\\s*\\(+(\\?\\s*,\\s*){"+(IN_CAUSE_LIMIT - 2) +",}+\\?\\)", Pattern.CASE_INSENSITIVE);

    @Override
    public void customize(Map<String, Object> hibernateProperties) {
        hibernateProperties.put("hibernate.session_factory.statement_inspector", new StatementInspector(){

            @Override
            public String inspect(String sql) {
                if (sql.length() <= SQL_THRESHOLD_LENGTH){
                    return sql;
                }

                return this.rewriteSqlToAvoidORA_01795(sql);
            }

            private String rewriteSqlToAvoidORA_01795(String sql) {
//                long startTime = System.currentTimeMillis();
                Matcher matcher = pattern.matcher(sql);
                while (matcher.find()) {
                    String inExpression = matcher.group();
                    long countOfParameters = inExpression.chars().filter(ch -> ch == '?').count();
                    String fieldName = inExpression.substring(0, inExpression.indexOf(' '));
                    StringBuilder transformedInExpression = new StringBuilder(" ( ").append(fieldName).append(" in (");
                    for (int i = 0; i < countOfParameters; i++) {
                        if (i != 0 && i % IN_CAUSE_LIMIT == 0) {
                            transformedInExpression
                                    .deleteCharAt(transformedInExpression.length() - 1)
                                    .append(") or ").append(fieldName).append(" in (");
                        }
                        transformedInExpression.append("?,");
                    }
                    transformedInExpression.deleteCharAt(transformedInExpression.length() - 1).append("))");
                    sql = sql.replaceFirst(Pattern.quote(inExpression), transformedInExpression.toString());
                }
//                long endTime = System.currentTimeMillis();
//                System.out.println("拆分sql耗時："+(endTime - startTime)+"ms");//2ms
                return sql;
            }
        });
    }

}

問題

不支持not in,思路其實一樣的，not in 需要通過and連接，該場景更少，故不作考慮；
關於性能：
- 凡是使用了jpa的dao層操作，均會走到該攔截器，所有基於jpa生成的sql的特徵定義一個sql長度的閾值，生成的sql的length如果小於該閾值，直接返回（實際這個基本已經排除了絕大部分的場景）；
- 關於正則匹配，可能有多種寫法，由於缺乏對底層的相關機制如：回溯、貪心算法等了解，尚有優化的地方（自己試着寫了幾種不同的形式測了一下，耗時都在1、2ms，所以可能出現的性能瓶頸可能不是sql的轉化上，更多的是sql的查詢執行上）

行走的五花肉博客

行走的五花肉博客

博客 / 詳情

spring-data-jpa使用oracle作為數據源時in查詢在1000條以上，報ORA_01795的錯誤

目的

思路

代碼

問題

發佈評論

Product

Company

Support

Company

博客 / 詳情

spring-data-jpa使用oracle作為數據源時in查詢在1000條以上 ，報ORA_01795的錯誤

目的

思路

代碼

問題

發佈 評論

spring-data-jpa使用oracle作為數據源時in查詢在1000條以上，報ORA_01795的錯誤

發佈評論