dblink优化案例演示
dblink优化思路
在我们的业务系统中,存在部分数据库由于dbtime过高导致的告警,经分析等待事件皆关于dblink.SQL*Net more data from dblink ,SQL*Net message from dblink
分析dblink导致的性能问题一般从几个方面考虑。
1 remote操作慢,分析远端执行计划
2 remote数据源与本地数据集Jion顺序异常,也就是执行计划不合理
3 网络不稳定。
下面我们通过一个例子体验下Dblink可能带来的问题,以及如何通过远端优化SQL,提高SQL整体的执行效率。
下面是一个查询
employees表在本地departments@orcl表在远端
SQL> select e.first_name,e.email,d.department_name from employees e,departments@orcl d where e.department_id=d.department_id
2 and d.department_name like 'M%';
FIRST_NAME EMAIL
---------------------------------------- --------------------------------------------------
DEPARTMENT_NAME
------------------------------------------------------------
Michael MHARTSTE
Marketing
Pat PFAY
Marketing
Execution Plan
----------------------------------------------------------
Plan hash value: 2986906154
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Inst |IN-OUT|
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 23 | 897 | 4 (0)| 00:00:01 | | |
| 1 | NESTED LOOPS | | 23 | 897 | 4 (0)| 00:00:01 | | |
| 2 | NESTED LOOPS | | 23 | 897 | 4 (0)| 00:00:01 | | |
| 3 | REMOTE | DEPARTMENTS | 2 | 42 | 3 (0)| 00:00:01 | ORCL | R->S |
|* 4 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 10 | | 0 (0)| 00:00:01 | | |
| 5 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 10 | 180 | 1 (0)| 00:00:01 | | |
------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
Remote SQL Information (identified by operation id):
----------------------------------------------------
3 - SELECT "DEPARTMENT_ID","DEPARTMENT_NAME" FROM "DEPARTMENTS" "D" WHERE "DEPARTMENT_NAME" LIKE 'M%'
(accessing 'ORCL' )
分析:这里核心在3,4两个步骤的嵌套操作,这里先执行第3步骤,将原始SQL需要的数据从远端全部拉到本地,这个操作称为remote操作,这个数据集
与本地索引EMP_DEPARTMENT_IX关于department_id字段做嵌套处理,其实这个join的顺序还是优的,嵌套循环将远端表的数据和作为驱动表,从而
减少了对网络的循环调用。如果出现相关dblink的等待这个Remote操作就很可能需要优化,看看远端操作的执行计划,看看问题卡在哪里
。
我们在远端执行SQL:SELECT "DEPARTMENT_ID","DEPARTMENT_NAME" FROM "DEPARTMENTS" "D" WHERE "DEPARTMENT_NAME" LIKE 'M%'
SQL> l
1* SELECT "DEPARTMENT_ID","DEPARTMENT_NAME" FROM "DEPARTMENTS" "D" WHERE "DEPARTMENT_NAME" LIKE 'M%'
SQL> /
DEPARTMENT_ID DEPARTMENT_NAME
------------- ------------------------------------------------------------
20 Marketing
170 Manufacturing
Execution Plan
----------------------------------------------------------
Plan hash value: 4167016233
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 32 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| DEPARTMENTS | 2 | 32 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DEPARTMENT_NAME" LIKE 'M%')
我们看到此时的操作是全表扫描,如果该表很大,同时这个SQL的执行频率又突然增加,就很可能引起DBtime过高,发生系统告警,所以我们需要优化这条
SQL。这里根据谓词条件创建一个索引。
SQL> create index idx_dept_name on departments(DEPARTMENT_NAME);
Index created.
SQL>
SQL> SELECT "DEPARTMENT_ID","DEPARTMENT_NAME" FROM "DEPARTMENTS" "D" WHERE "DEPARTMENT_NAME" LIKE 'M%';
DEPARTMENT_ID DEPARTMENT_NAME
------------- ------------------------------------------------------------
170 Manufacturing
20 Marketing
Execution Plan
----------------------------------------------------------
Plan hash value: 37820000
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 32 | 2 (0) | 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEPARTMENTS | 2 | 32 | 2 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_DEPT_NAME | 2 | | 1 (0) | 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("DEPARTMENT_NAME" LIKE 'M%')
filter("DEPARTMENT_NAME" LIKE 'M%')
这里通过索引做了优化,使得SQL执行速度更快,这里也可以通过组合索引进一步瘦表来优化,具体不演示了。
下面我们使用hint让本地表做驱动表。
select /*+ ordered */ e.first_name,e.email,d.department_name from employees e,departments@orcl d where e.department_id=d.department_id
and d.department_name like 'M%';
Execution Plan
----------------------------------------------------------
Plan hash value: 2553653536
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Inst |IN-OUT|
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 23 | 897 | 6 (0)| 00:00:01 | | |
|* 1 | HASH JOIN | | 23 | 897 | 6 (0)| 00:00:01 | | |
| 2 | TABLE ACCESS FULL| EMPLOYEES | 107 | 1926 | 3 (0)| 00:00:01 | | |
| 3 | REMOTE | DEPARTMENTS | 2 | 42 | 3 (0)| 00:00:01 | ORCL | R->S |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
Remote SQL Information (identified by operation id):
----------------------------------------------------
3 - SELECT "DEPARTMENT_ID","DEPARTMENT_NAME" FROM "DEPARTMENTS" "D" WHERE
"DEPARTMENT_NAME" LIKE 'M%' (accessing 'ORCL' )
分析:此时Oracle很聪明,走了hash,而没有使用order这个hint前,其选择使用nested loop操作,如果此时继续使用nested loop,因为remote表是被驱动表,将会被
多次访问,而每访问一次就会执行一次remote操作,这个耗时是惊人的所以,Oracle选择了hash操作,两个结果集做一次hash计算即可。
下面我们强制使用嵌套,且本地表做驱动表。看看执行新计划的成本如何。
select /*+ ordered use_nl(e ,d) */ e.first_name,e.email,d.department_name from employees e,departments@orcl d where e.department_id=d.department_id
and d.department_name like 'M%';
Execution Plan
----------------------------------------------------------
Plan hash value: 1026140700
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Inst |IN-OUT|
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 23 | 897 | 110 (0)| 00:00:01 | | |
| 1 | NESTED LOOPS | | 23 | 897 | 110 (0)| 00:00:01 | | |
| 2 | TABLE ACCESS FULL | EMPLOYEES | 107 | 1926 | 3 (0) | 00:00:01 | | |
| 3 | REMOTE | DEPARTMENTS | 1 | 21 | 1 (0) | 00:00:01 | ORCL | R->S |
--------------------------------------------------------------------------------------------------
Remote SQL Information (identified by operation id):
----------------------------------------------------
3 - SELECT /*+ USE_NL ("D") */ "DEPARTMENT_ID","DEPARTMENT_NAME" FROM "DEPARTMENTS"
"D" WHERE :1="DEPARTMENT_ID" AND "DEPARTMENT_NAME" LIKE 'M%' (accessing 'ORCL' )
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
2 - SEL$1 / E@SEL$1
U - use_nl(e ,d)
分析:这里本地表做嵌套循环的驱动表,使得远端操作多次执行,驱动表返回多少行记录,被驱动表就被访问多少次,如果此时远端操作的执行计划不是优,这个网络操作
的成本将是巨大的,这里cost为110远远高于走hash的6以及走远端数据集做驱动表的nested loop;
这里我们通过优化远端表以及优化连接顺序说明如何首先dblink的优化。只要顺着这个思路,就可以有效分析和解决Dblink带来的性能问题。
相关文章