下面一条SQL能够使用索引吗?
select object_id,object_name,object_type from t1 where object_id is null;
有人会说,索引不存储null值,所以这个SQL不会使用索引;也有人会说,能够使用索引啊,只要在object_id列上建位图索引就可以使用索引了,另外对于CLUSTER表的KEY列,null值也可以存储在索引中(也就是能够进行索引)。
本文要探讨的是,null值不能进行索引的真正含义:
实际上对于null值,除了位图索引、CLUSTER表的KEY列,也是有可能索引的。那就是复合索引,也就是多列索引。对于普通的索引,null值不能进行索引的正确理解应该是,对于某一行,索引的所有列的值都是null值时,该行才不能被索引。
那么对于上面的SQL语句,我们就可以想办法让它走索引。下面举例说明:
SQL> desc t1
名称 是否为空? 类型
----------------------------------------- -------- --------------
OWNER VARCHAR2(30)
OBJECT_NAME NOT NULL VARCHAR2(128)
SUBOBJECT_NAME VARCHAR2(30)
OBJECT_ID NUMBER
DATA_OBJECT_ID NUMBER
OBJECT_TYPE VARCHAR2(18)
CREATED DATE
LAST_DDL_TIME DATE
TIMESTAMP VARCHAR2(19)
STATUS VARCHAR2(7)
TEMPORARY VARCHAR2(1)
GENERATED VARCHAR2(1)
SECONDARY VARCHAR2(1)
SQL> create index t1_idx on t1(object_id,object_name);
SQL> analyze table t1 compute statistics for table for all indexes for all indexed columns;
表已分析。
SQL> select object_id,object_name,object_type from t1 where object_id is null;
OBJECT_ID OBJECT_NAME OBJECT_TYPE
---------- ------------------------------ ------------------------------
XTY DATABASE LINK
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=38)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=3 Card=1 Bytes
=38)
2 1 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=2 Card
=1)
我们可以看到,这个SQL语句走了我们建的那个索引T1_IDX。
然而,如果我们把object_name列改为允许NULL值,那么情况就发生了变化:
SQL>alter table t1 modify (object_name null);
SQL> select object_id,object_name,object_type from t1 where object_id is null;
OBJECT_ID OBJECT_NAME OBJECT_TYPE
---------- ------------------------------ ------------------------------
XTY DATABASE LINK
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=128 Card=1 Bytes=38)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=128 Card=1 Bytes=38)
把object_name列改为允许NULL值后,object_id和object_name列 有可能同时为NULL值,这个时候就存在着不被索引的行(就是有可能存在object_id值为null的行没有被索引),这样就只能走全表扫描了。
我们也有一种更简便的方法,不需要与其他列建立复合索引,而是与一个常量值建立复合索引:
SQL> create index t1_idx on t1(object_id,0);
SQL> analyze table t1 compute statistics for table for all indexes for all indexed columns;
SQL> explain plan for select object_id,object_name,object_type from t1 where object_id is null;
已解释。
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 81 | 3 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 81 | 3 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 1 | | 2 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
2 - access("T1"."OBJECT_ID" IS NULL)
Note: cpu costing is off
引申开来,如果有类似于col_name is null这样的条件的SQL语句,希望能够使用索引,那么将col_name与其他NOT NULL的列或一个常量值建一个复合索引,也是一个可能的选择。
index