在NDV良好的列生成直方图导致谓词为该列等值查询的SQL执行计划异常

2020-09-22 00:00:00 索引 字段 执行 计划 直方图

这是来自一个真实案例的测试,该库由于频繁的全表扫描导致直接路径读,由于DML的操作导致大量的Enquiry:KO-Fast Object Checkpoint等待事件,导致系统缓慢,问题时段会持续一个小时左右,导致系统整个缓慢,需要调优处理,我们将等待时间长的SQL都拉出来,逐个分析,发现等待严重的一个SQL的全表扫描无法消除,但是从字段分布看,只需要返回3行记录,但是Oracle依然选择全扫,我们通过模拟这个过程,重现问题根源,后通过本文的后的方法消除全表扫描,使得Oracle找到正确的执行计划。

创建测试表

create table a_t1(id number,district varchar2(70));


插入数据

begin

for i in 1..100000 loop

insert into a_t1 values(i,'ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST'||i);

end loop;

commit;

end;

/



SCOTT@orcl1>select column_name,num_distinct,num_buckets,avg_col_len,histogram from user_tab_columns where table_name='A_T1';


COLUMN_NAME                    NUM_DISTINCT NUM_BUCKETS AVG_COL_LEN HISTOGRAM

------------------------------ ------------ ----------- ----------- ---------------

ID                                                                                            NONE

DISTRINCT                                                         NONE



SCOTT@orcl1>select count(distinct DISTRINCT) from a_t1;


COUNT(DISTINCT DISTRINCT)

------------------------

                  100000


Elapsed: 00:00:00.05


在字段DISTRINCT创建索引

SCOTT@orcl1>  create index idx_at1_DISTRINCT on a_t1(DISTRINCT);


Index created.


Elapsed: 00:00:00.43


收集表统计信息

SCOTT@orcl1>select column_name,num_distinct,num_buckets,avg_col_len,histogram from user_tab_columns where table_name='A_T1';


COLUMN_NAME                    NUM_DISTINCT NUM_BUCKETS AVG_COL_LEN HISTOGRAM

------------------------------ ------------ ----------- ----------- ---------------

ID                                                  100000              1                              5                  NONE

DISTRINCT                                   99568               1                             46                NONE


Elapsed: 00:00:00.01


SCOTT@orcl1>select table_name,column_name,avg_col_len,num_buckets,histogram from  user_tab_col_statistics where table_name='A_T1';


TABLE_NAME                     COLUMN_NAME                    AVG_COL_LEN NUM_BUCKETS HISTOGRAM

------------------------------ ------------------------------ ----------- ----------- ---------------

A_T1                                ID                                                 5                      1 NONE

A_T1                                DISTRINCT                                  46                     1         NONE


Elapsed: 00:00:00.02


此时没有直方图信息,我们查询SQL

SCOTT@orcl1>select * from a_t1 where DISTRINCT='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552';


        ID DISTRINCT

---------- ----------------------------------------------------------------------

       552 ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552


Elapsed: 00:00:00.02


Execution Plan

----------------------------------------------------------

Plan hash value: 1710520156


-------------------------------------------------------------------------------------------------

| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |                   |     1 |    51 |     4   (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| A_T1              |     1 |    51 |     4   (0)| 00:00:01 |

|*  2 |   INDEX RANGE SCAN          | IDX_AT1_DISTRINCT |     1 |       |     3   (0)| 00:00:01 |

-------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


   2 - access("DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552')



Statistics

----------------------------------------------------------

          1  recursive calls

            db block gets

          5  consistent gets

            physical reads

            redo size

        640  bytes sent via SQL*Net to client

        524  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

            sorts (memory)

            sorts (disk)

          1  rows processed

这是优的执行计划,因为谓词字段值只有一个。下面我们收集索引列的直方图。

SCOTT@orcl1>exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'A_T1',estimate_percent=>10,method_opt=>'for all indexed columns' , cascade => true);


PL/SQL procedure successfully completed.


Elapsed: 00:00:01.55


SCOTT@orcl1> select table_name,column_name,avg_col_len,num_buckets,histogram from  user_tab_col_statistics where table_name='A_T1';


TABLE_NAME                     COLUMN_NAME                    AVG_COL_LEN NUM_BUCKETS HISTOGRAM

------------------------------ ------------------------------ ----------- ----------- ---------------

A_T1                           ID                                                                       5             1                  NONE

A_T1                           DISTRINCT                                                       46            1                 FREQUENCY


Elapsed: 00:00:00.07

此时是频率直方图,看直方图的桶数据统计

SCOTT@orcl1>select table_name,column_name,endpoint_number,endpoint_value from user_histograms where table_name='A_T1';


TABLE_NAME                     COLUMN_NAME                    ENDPOINT_NUMBER ENDPOINT_VALUE

------------------------------ ------------------------------ --------------- --------------

A_T1                           DISTRINCT                                 9829     3.3884E+35    <<<<都在一个桶中

A_T1                           ID                                                       1

A_T1                           ID                                           1         100000


再看执行计划

SCOTT@orcl1>/


        ID DISTRINCT

---------- ----------------------------------------------------------------------

       552 ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552


Elapsed: 00:00:00.01


Execution Plan

----------------------------------------------------------

Plan hash value: 2172434975


--------------------------------------------------------------------------

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |      |   100K|  4980K|   241   (2)| 00:00:03 |

|*  1 |  TABLE ACCESS FULL| A_T1 |   100K|  4980K|   241   (2)| 00:00:03 |

--------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


   1 - filter("DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552')



Statistics

----------------------------------------------------------

            recursive calls

            db block gets

        822  consistent gets

            physical reads

            redo size

        636  bytes sent via SQL*Net to client

        524  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

            sorts (memory)

            sorts (disk)

          1  rows processed

此时是错误的执行计划,CBO选择了全表扫描。

这里我们来梳理下结果,我们有一个表a_t1 字段DISTRINCT都是值,通过索引找数据应该快,谓词是"DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552' ,所以,因为业务知道数据情况并创建了索引,这是正确的方法,但是Orale确没有走索引,而是直接选择了全表扫,这里需要分析Oracle为了不走索引,也就是索引提供的信息Oracle认为成本更高。下面我们看强制走索引的执行计划。


SCOTT@orcl1>select /*+ index(a_t1,IDX_AT1_DISTRINCT) */ * from a_t1 where DISTRINCT='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552';


        ID DISTRINCT

---------- ----------------------------------------------------------------------

       552 ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552


Elapsed: 00:00:00.00


Execution Plan

----------------------------------------------------------

Plan hash value: 1710520156


-------------------------------------------------------------------------------------------------

| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |                   |   100K|  4980K| 20940   (1)| 00:04:12 |

|   1 |  TABLE ACCESS BY INDEX ROWID| A_T1              |   100K|  4980K| 20940   (1)| 00:04:12 |

|*  2 |   INDEX RANGE SCAN          | IDX_AT1_DISTRINCT |   100K|       |  1309   (1)| 00:00:16 |

-------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


   2 - access("DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552')



Statistics

----------------------------------------------------------

            recursive calls

            db block gets

          5  consistent gets

            physical reads

            redo size

        640  bytes sent via SQL*Net to client

        523  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

            sorts (memory)

            sorts (disk)

          1  rows processed

我们看第2步骤,|*  2 |   INDEX RANGE SCAN          | IDX_AT1_DISTRINCT |   100K|       |  1309   (1)| 00:00:16 |,也就是Oracle认为要是走索引找一行数据需要返回10万行数据,显然这里跟我们想象的有出入,我们认为索引针对该值"DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552’只有一条符合条件记录,但是Oracle认为有10万条记录满足条件,走索引成本太高,这里我们就有理由相信Oracle这里有问题,通过MOS查找我们找到一篇文章。

Histogram of Character Columns Longer Than 32 Characters Causes Incorrect SQL Plan of Table Full Scan (Doc ID 2413826.1)

也就是当字段平均长度超过32字节,Oracle直方图只会存储前32个字节,这样的索引列当该列因为统计生成了直方图时,这个直方图只会根据字典的前32个字节生成,导致Oracle根据直方图计算执行计划时选择走全表扫描


解决方法:删除列上的直方图(与我们刚创建该表没有收集统计信息一样)

SCOTT@orcl1>exec DBMS_STATS.DELETE_COLUMN_STATS(ownname=>'SCOTT',tabname=>'A_T1',colname=>'DISTRINCT',col_stat_type=>'HISTOGRAM');


PL/SQL procedure successfully completed.


Elapsed: 00:00:00.07

SCOTT@orcl1>select table_name,column_name,avg_col_len,num_buckets,histogram from  user_tab_col_statistics where table_name='A_T1'


TABLE_NAME                     COLUMN_NAME                    AVG_COL_LEN NUM_BUCKETS HISTOGRAM

------------------------------ ------------------------------ ----------- ----------- ---------------

A_T1                           ID                                         5              1 NONE

A_T1                           DISTRINCT                               46             1 NONE


再次执行该语句看执行计划,此时Oracle不再基于直方图,因为该字段的良好分布特性,走索引是Oracle优选择,这次删除直方图后,走对了执行计划。

SCOTT@orcl1>select * from a_t1 where DISTRINCT='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552'  ;


        ID DISTRINCT

---------- ----------------------------------------------------------------------

       552 ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552


Elapsed: 00:00:00.00


Execution Plan

----------------------------------------------------------

Plan hash value: 1710520156


-------------------------------------------------------------------------------------------------

| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |                   |     1 |    51 |     4   (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| A_T1              |     1 |    51 |     4   (0)| 00:00:01 |

|*  2 |   INDEX RANGE SCAN          | IDX_AT1_DISTRINCT |     1 |       |     3   (0)| 00:00:01 |

-------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


   2 - access("DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552')



Statistics

----------------------------------------------------------

          1  recursive calls

            db block gets

          5  consistent gets

            physical reads

            redo size

        640  bytes sent via SQL*Net to client

        523  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

            sorts (memory)

            sorts (disk)

          1  rows processed






相关文章