物化视图刷新失败ora-600[qcsprfro_tree:jrs present]的案例分析
告警日志的信息如下
Fatal NI connect error 12545, connecting to:
(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=x.x.x.43)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=x.x.x.43)(P
ORT=1521)))(CONNECT_DATA=(SERVICE_NAME=srv1)(CID=(PROGRAM=oracle)(HOST=host2)(USER=oracle))))
VERSION INFORMATION:
TNS for Linux: Version 12.2.0.1.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 12.2.0.1.0 - Production
Time: 30-JUL-2021 08:55:53
Tracing not turned on.
Tns error struct:
ns main err code: 12545
TNS-12545: Connect failed because target host or object does not exist
ns secondary err code: 12560
nt main err code: 515
TNS-00515: Connect failed because target host or object does not exist <<<<<<怀疑监听的网络连接问题
nt secondary err code: 111
nt OS err code: 0
Errors in file /oracle/diag/rdbms/xxxx18/xxxx182/trace/xxxx182_j000_296318.trc (incident=529181):
ORA-00600: internal error code, arguments: [qcsprfro_tree:jrs present], [0x7FFFF213B1A0], [], [], [], [], [], [], [], [], [] <<<<<<查询可能是bug导致
, []
Incident details in: /oracle/diag/rdbms/xxxx18/xxxx182/incident/incdir_529181/xxxx182_j000_296318_i529181.trc
2021-07-30T08:55:54.837946+08:00
*****************************************************************
An internal routine has requested a dump of selected redo.
This usually happens following a specific internal error, when
analysis of the redo logs will help Oracle Support with the
diagnosis.
It is recommended that you retain all the redo logs generated (by
all the instances) during the past 12 hours, in case additional
redo dumps are required to help with the diagnosis.
*****************************************************************
2021-07-30T08:56:41.214166+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2021-07-30T08:56:41.214621+08:00
On demand
MV user1.mv_tab_LOCATION was not refreshed successfully. <<<<<<<物化视图刷新失败
Number of MV refresh failures: 1.
Encountered error ORA-12008.
kkzifr3g: Encountered error ORA-12008. <<<<<<<物化视图失败遇到的错误是ora-12008
2021-07-30T08:56:41.216277+08:00
Errors in file /oracle/diag/rdbms/xxxx18/xxxx182/trace/xxxx182_j000_296318.trc:
ORA-12012: error on auto execute of job 36
ORA-12008: error in materialized view or zonemap refresh path <<<<<<物化视图失败遇到的错误是ora-12008的解释
ORA-00600: internal error code, arguments: [qcsprfro_tree:jrs present], [0x7FFFF213B1A0], [], [], [], [], [], [], [], [], [] <<<<<<触发600错误
, []
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 85
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 245
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 2352
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 2908
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 3191
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 3247
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 41
ORA-06512: at "SYS.DBMS_IREFRESH", line 703
ORA-06512: at "SYS.DBMS_REFRESH", line 214
ORA-06512: at line 1
2021-07-30T08:56:41.231984+08:00
Errors in file /oracle/diag/rdbms/xxxx18/xxxx182/trace/xxxx182_j000_296318.trc:
ORA-12012: error on auto execute of job 36
ORA-12008: error in materialized view or zonemap refresh path
ORA-00600: internal error code, arguments: [qcsprfro_tree:jrs present], [0x7FFFF213B1A0], [], [], [], [], [], [], [], [], []
, []
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 85
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 245
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 2352
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 2908
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 3191
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 3247
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 41
ORA-06512: at "SYS.DBMS_IREFRESH", line 703
ORA-06512: at "SYS.DBMS_REFRESH", line 214
ORA-06512: at line 1
opidrv aborting process J000 ospid (296318) as a result of ORA-12012
2021-07-30T08:56:41.399291+08:00
Dumping diagnostic data in directory=[cdmp_20210730085641], requested by (instance=2, osid=296318 (J000)), summary=[incident
=529181].
2021-07-30T08:57:11.601981+08:00
下面我们分析下物化视图的具体信息,看看该物化视图有什么特点。
job 36,每隔10分钟刷新物化视图。
09:13:53 SQL> select job,what,LOG_USER,SCHEMA_USER,INTERVAL,FAILURES from dba_jobs where job=36
JOB WHAT LOG_USER SCHEMA_USER INTERVAL FAILURES
---------- ---------------------------------------- ---------- -------------------- -------------------- ----------
36 dbms_refresh.refresh('"user1"."MV_GIS_TR user1 user1 SYSDATE+ 1/24/6 4
AINLINE_LOCATION"');
1 row selected.
exec dbms_refresh.refresh('"MV"."MV_EMP"');
[oracle@host2 ~]$oerr ora 06512
06512, 00000, "at %sline %s"
// *Cause: Backtrace message as the stack is unwound by unhandled
// exceptions.
// *Action: Fix the problem causing the exception or write an exception
// handler for this condition. Or you may need to contact your
// application administrator or DBA.
[oracle@host2 ~]$oerr ora 12008
12008, 00000, "error in materialized view or zonemap refresh path"
// *Cause: Table SNAP$_<mview_name> reads rows from the view
// MVIEW$_<mview_name>, which is a view on the master table
// (the master may be at a remote site). Any
// error in this path will cause this error at refresh time.
// For fast refreshes, the table <master_owner>.MLOG$_<master>
// is also referenced.
// *Action: Examine the other messages on the stack to find the problem.
// See if the objects SNAP$_<mview_name>, MVIEW$_<mview_name>,
// <mowner>.<master>@<dblink>, <mowner>.MLOG$_<master>@<dblink>
// still exist.
确认下物化视图对象以及创建时间,看是否是新建的对象,从查询结果看已经是很早创建的了。
SQL> select owner,object_name,object_type,created from dba_objects where object_name='mv_tab_LOCATION' and owner='user1';
OWNER OBJECT_NAME OBJECT_TYPE CREATED
-------------------- ---------------------------------------- ----------------------- -------------------
user1 mv_tab_LOCATION TABLE 2019-02-26 15:10:51
user1 mv_tab_LOCATION MATERIALIZED VIEW 2019-02-26 15:10:56
下面查询该物化视图的定义
METADATA
--------------------------------------------------------------------------------
CREATE MATERIALIZED VIEW "user1"."mv_tab_LOCATION" ("IDX", "CC", "TYPE1", "TYPE2", "SOURCE_NODE_NAME", "TAR
GET_NODE_NAME", "PLAN_SOURCE_TIME", "PLAN_TARGET_TIME", "SOURCE_TIME", "TARGET_T
IME", "CURRENT_NODE_NAME", "CURRENT_NODE_PLAN_SOURCE_TIME", "CURRENT_NODE_PLAN_T
ARGET_TIME", "CURRENT_NODE_SOURCE_TIME", "CURRENT_NODE_TARGET_TIME", "LONGITUDE"
, "LATITUDE", "DELAY_TIME", "LOCO_COUNT", "LOCO_TYPE", "LOCO_NO", "SOURCE_BUREAU
_CODE", "SOURCE_BUREAU_NAME", "SOURCE_BUREAU_SHORTNAME", "TARGET_BUREAU_CODE", "
TARGET_BUREAU_NAME", "TARGET_BUREAU_SHORTNAME", "CURRENT_BUREAU_CODE", "CURRENT_
BUREAU_NAME", "CURRENT_BUREAU_SHORTNAME")
ORGANIZATION HEAP PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
NOCOMPRESS LOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "TBS_user1DEFAULT"
BUILD IMMEDIATE
USING INDEX
REFRESH FORCE ON DEMAND START WITH sysdate+0 NEXT SYSDATE+ 1/24/6
USING DEFAULT LOCAL ROLLBACK SEGMENT
USING ENFORCED CONSTRAINTS DISABLE ON QUERY COMPUTATION DISABLE QUERY REWRITE
AS SELECT
........
........
FROM gis1 GIS
LEFT JOIN bconf_org T1
ON T1.ORGCODE = LPAD(GIS.SOURCE_BUREAU_CODE, 2, '0')
LEFT JOIN bconf_org T2
ON T2.ORGCODE = LPAD(GIS.TARGET_BUREAU_CODE, 2, '0')
LEFT JOIN bconf_org T3
ON T3.ORGCODE = LPAD(GIS.CURRENT_BUREAU_CODE, 2, '0')
发现物化视图定义中同步数据的表有一个为同义词,且该同义词为一个dblink定义的对象
SQL> select object_name,object_type,owner from dba_objects where object_name='gis1' and owner='user1'
OBJECT_NAME OBJECT_TYPE OWNER
------------------------------ ----------------------- --------------------
gis1 SYNONYM user1
SQL> select object_name,object_type,owner from dba_objects where object_name='bconf_org' and owner='user1';
OBJECT_NAME OBJECT_TYPE OWNER
------------------------------ ----------------------- --------------------
bconf_org TABLE user1
同义词定义查询,其为dblink的一张远程表。
SQL> select dbms_metadata.get_ddl(upper('SYNONYM'),upper('gis1'),upper('user1')) METADATA from dual;
METADATA
--------------------------------------------------------------------------------
CREATE OR REPLACE EDITIONABLE SYNONYM "user1"."gis1" FOR "T_
XD"."gis1"@"DBLINK_srv1"
下面看下该dblink指向哪里,
SQL> select owner,username,host from dba_db_links where db_link='DBLINK_srv1' and owner='user1';
OWNER USERNAME HOST
-------------------- -------------------- --------------------------------------------------
user1 T_XD (DESCRIPTION =
(ADDRESS_LIST =(ADDRESS =(PROTOCOL = T
CP)(HOST = x.x.x.43)(PORT = 1521))
(ADDRESS =(PROTOCOL = TCP)(HO
ST = x.x.x.43)(PORT = 1521)))
(CONNECT_DATA =(SERVICE_NAME
= srv1)))
开始通过网络排查确实报错跟数据库日志一样,开始怀疑是本机地址被防火墙在应用层屏蔽了,尝试i使用sqlplus连接数据库,发现与数据库日志执行job
时类似的告警。并且此时通过vip公网地址问题一样,但是在dblink数据库服务器端时没问题的。
sqlplus system/passwordx.x.x.43:1521/srv1
TNS-12545: Connect failed because target host or object does not exist
这个问题我们就卡在这,之前的用户操作就是节点2换硬件,早上重启2节点集群,43地址跑在2节点上。重启后物化视图刷新开始报错。
后续我们将问题聚焦到600错误,根据MOS的提示ORA-600 [qcsprfro_tree:jrs present] When Materialized View Refreshed
(Doc ID 1380002.1),现象为ORA-12008 & ORA-00600 errors in Alert.log Verifying the tracefile could point to using
the materialezed view through dblink; 这个条件跟我们现象很相似,报错信息以及物化视图通过dblink刷新数据。
根据该Doc 要求重建物化视图,我们尝试重新编译该视图看看问题是否依旧
SQL> select owner,mview_name,STALENESS from dba_mviews where mview_name='mv_tab_LOCATION'
OWNER MVIEW_NAME STALENESS
-------------------- ------------------------------ -------------------
user1 mv_tab_LOCATION FRESH
SQL> alter MATERIALIZED VIEW user1.mv_tab_LOCATION compile;
Materialized view altered.
SQL> select owner,mview_name,STALENESS from dba_mviews where mview_name='mv_tab_LOCATION';
OWNER MVIEW_NAME STALENESS
-------------------- ------------------------------ -------------------
user1 mv_tab_LOCATION FRESH
手工执行该刷新,故障恢复。
SQL> exec dbms_refresh.refresh('"user1"."mv_tab_LOCATION"');
PL/SQL procedure successfully completed.
后续该问题还是断续报了几次错,不是那么频繁,根据MOS提示的bug,这个问题得通过重建解决,后续需要跟业务沟通
找个时间窗口重建该雾化视图。
说明:在断续报警几次后,通过监控没有再发生,是否重建还是根据监控再试试把。
相关文章