字段设计中的空值与索引使用

2021-08-23 00:00:00 索引查询数据字段谓词

1创建测试表和索引
create table nulltest ( col1 number, col2 number,col3 number not null, col4 number not null);
create index nullind1 on nulltest (col1);
create index notnullind3 on nulltest (col3);
说明：在表nulltest的允许空值的col1创建索引，不允许空值的col3创建索引，
查询1： select col1 from nulltest t; 这个查询只是查询列col的值，我们在该字段创建了索引，从常规考虑这个查询
直接从索引拿数即可。下面看该查询计划。
Execution Plan
----------------------------------------------------------
Plan hash value: 3036052294

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 30000 | 11 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| NULLTEST | 10000 | 30000 | 11 (0)| 00:00:01 |
------------------------------------------------------------------------------
分析：因为字段col1允许空值，而b-tree索引不存储空值，此时查找字段col1的所有记录，从正确率考虑不可能
走索引（可能缺数据）。

查询2 使用hint强制走索引。
select /*+ index(t nullind1) */ col1 from nulltest t;

Execution Plan
----------------------------------------------------------
Plan hash value: 3036052294

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 30000 | 11 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| NULLTEST | 10000 | 30000 | 11 (0)| 00:00:01 |
------------------------------------------------------------------------------

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------

1 - SEL$1 / T@SEL$1
U - index(t nullind1)

说明：此时依然走全表扫描，数据库首先要保证找到用户需要的所有数据，而因为没有非空约束，索引不能保证找到全部数据。

查询3 select /*+ index(t) */ col1 from nulltest t;
Execution Plan
----------------------------------------------------------
Plan hash value: 524144022

---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 30000 | 49 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| NULLTEST | 10000 | 30000 | 49 (0)| 00:00:01 |
| 2 | INDEX FULL SCAN | NOTNULLIND3 | 10000 | | 20 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

分析：hint要求走索引，此时Oracle选择使用索引NOTNULLIND3，但是要查找的列为col1，且谓词没有关于索引字段col3的过滤条件，此时索引
走index full scan ,再大量回表操作，显然这个操作效率很低。但是由于非空字段col3可以找到所有的记录，所以虽然效率低，但是可以找到正确的数据。

查询4 select /*+ index(t notnullind3) */ col1 from nulltest t;
这个查询直接指定使用索引notnullind3，由于索引字段的非空约束，此时执行计划与查询3类似。

查询5 select col1 from nulltest t where col1 is not null;
我们要使用到索引nullind1,此时必须告诉引擎，我要找的是该字段的非空数据，此时会走索引，因为这些数据可以从索引准确的找出来，我们执行上述查询
Execution Plan
----------------------------------------------------------
Plan hash value: 2862843337

---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 30000 | 7 (0)| 00:00:01 |
|* 1 | INDEX FAST FULL SCAN| NULLIND1 | 10000 | 30000 | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("COL1" IS NOT NULL)

查询6 select col1 from nulltest t where col1 between 1 and 100;
这个查询的特点是谓词间接说明，我们要找的数据是非空数据，所以此时会走索引。此时走范围扫描。
Plan hash value: 3031189548

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 300 | 2 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| NULLIND1 | 100 | 300 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("COL1">=1 AND "COL1"<=100)

查询7 查询非空字段
select col3 from nulltest t;
执行计划如下
Execution Plan
----------------------------------------------------------
Plan hash value: 3819562278

------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 30000 | 7 (0)| 00:00:01 |
| 1 | INDEX FAST FULL SCAN| NOTNULLIND3 | 10000 | 30000 | 7 (0)| 00:00:01 |
------------------------------------------------------------------------------------
此时走索引NOTNULLIND3，因为该字段非空，索引存储了用户需要的所有数据，不需要回表，所以走索引效率高，且可以找到准确的用户数据。
总结：对于字段设计好指定not null约束，如果确实允许空值则对于特殊的查询需要根据字段特点做相应改写如增加 not null或者类似谓词，可以有效提高查询效率。

相关文章