ASM Disk Header Block Analysis and Recovery from Disk Header Corruption (Restoring from the Automatic Backup)

2021-09-14 00:00:00

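ASM disks are formatted in allocation units (AUs); the default AU size is 1 MB. An ASM disk holds two kinds of data: metadata and user data (data files, archived logs, log files, password files). The first AU of an ASM disk (AU 0) is used to store ASM metadata. That AU is formatted as metadata blocks of 4 KB each, so on a disk with a 1 MB AU the first AU contains 256 blocks, and the first of them, block 0, is the ASM disk header.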

The ASM disk header stores data about the disk group and about the disk itself.

SQL> select group_number,name,path, mount_status,header_status,state from v$asm_disk where group_number=1 order by name;

GROUP_NUMBER NAME PATH MOUNT_STATUS HEADER_STATUS STATE
------------ ---------- -------------------- --------------------- -------------------- -------------------- ----------------------
1 DISK1 /dev/sde1 CACHED MEMBER NORMAL
1 DISK2 /dev/sdf1 CACHED MEMBER NORMAL
1 DISK3 /dev/sdg1 CACHED MEMBER NORMAL
1 DISK4 /dev/sdh1 CACHED MEMBER NORMAL

Reading the disk header with kfed read:
[grid@rac1 ~]$ kfed read /dev/sde1
kfbh.endian: 1 ; 0x000: 0x01 <<<<<< system byte order: 0 = big-endian, 1 = little-endian
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD <<<<<< ASM block type; KFBTYP_DISKHEAD marks a disk header block
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0 <<<<<< ASM block number
kfbh.block.obj: 2147483652 ; 0x008: disk=4
kfbh.check: 1472778488 ; 0x00c: 0x57c8d0f8
kfbh.fcn.base: 1302 ; 0x010: 0x00000516
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8 <<<<< a bare ORCLDISK indicates a non-ASMLIB disk; an ASMLIB disk shows ORCLDISK followed by its label
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186647040 ; 0x020: 0x0b200200
kfdhdb.dsknum: 4 ; 0x024: 0x0004 <<<<<< ASM disk number
kfdhdb.grptyp: 2 ; 0x026: KFDGTP_NORMAL <<<<<< redundancy type
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER <<<<<< ASM disk header status
kfdhdb.dskname: DISK1 ; 0x028: length=5 <<<<<< ASM disk name
kfdhdb.grpname: DATA1 ; 0x048: length=5 <<<<<< ASM disk group name
kfdhdb.fgname: DATA1_FG1 ; 0x068: length=9 <<<<<< ASM failure group name
kfdhdb.siteguid[0]: 0 ; 0x088: 0x00
kfdhdb.siteguid[1]: 0 ; 0x089: 0x00
kfdhdb.siteguid[2]: 0 ; 0x08a: 0x00
kfdhdb.siteguid[3]: 0 ; 0x08b: 0x00
kfdhdb.siteguid[4]: 0 ; 0x08c: 0x00
kfdhdb.siteguid[5]: 0 ; 0x08d: 0x00
kfdhdb.siteguid[6]: 0 ; 0x08e: 0x00
kfdhdb.siteguid[7]: 0 ; 0x08f: 0x00
kfdhdb.siteguid[8]: 0 ; 0x090: 0x00
kfdhdb.siteguid[9]: 0 ; 0x091: 0x00
kfdhdb.siteguid[10]: 0 ; 0x092: 0x00
kfdhdb.siteguid[11]: 0 ; 0x093: 0x00
kfdhdb.siteguid[12]: 0 ; 0x094: 0x00
kfdhdb.siteguid[13]: 0 ; 0x095: 0x00
kfdhdb.siteguid[14]: 0 ; 0x096: 0x00
kfdhdb.siteguid[15]: 0 ; 0x097: 0x00
kfdhdb.ub1spare[0]: 0 ; 0x098: 0x00
kfdhdb.ub1spare[1]: 0 ; 0x099: 0x00
kfdhdb.ub1spare[2]: 0 ; 0x09a: 0x00
kfdhdb.ub1spare[3]: 0 ; 0x09b: 0x00
kfdhdb.ub1spare[4]: 0 ; 0x09c: 0x00
kfdhdb.ub1spare[5]: 0 ; 0x09d: 0x00
kfdhdb.ub1spare[6]: 0 ; 0x09e: 0x00
kfdhdb.ub1spare[7]: 0 ; 0x09f: 0x00
kfdhdb.ub1spare[8]: 0 ; 0x0a0: 0x00
kfdhdb.ub1spare[9]: 0 ; 0x0a1: 0x00
kfdhdb.ub1spare[10]: 0 ; 0x0a2: 0x00
kfdhdb.ub1spare[11]: 0 ; 0x0a3: 0x00
kfdhdb.ub1spare[12]: 0 ; 0x0a4: 0x00
kfdhdb.ub1spare[13]: 0 ; 0x0a5: 0x00
kfdhdb.ub1spare[14]: 0 ; 0x0a6: 0x00
kfdhdb.ub1spare[15]: 0 ; 0x0a7: 0x00
kfdhdb.crestmp.hi: 33121553 ; 0x0a8: HOUR=0x11 DAYS=0x8 MNTH=0x9 YEAR=0x7e5 <<<<< time the disk was added to the disk group
kfdhdb.crestmp.lo: 1820732416 ; 0x0ac: USEC=0x0 MSEC=0x18b SECS=0x8 MINS=0x1b <<<<< time the disk was added to the disk group
kfdhdb.mntstmp.hi: 33121617 ; 0x0b0: HOUR=0x11 DAYS=0xa MNTH=0x9 YEAR=0x7e5 <<<<< time the disk was most recently mounted
kfdhdb.mntstmp.lo: 437690368 ; 0x0b4: USEC=0x0 MSEC=0x1a8 SECS=0x21 MINS=0x6
kfdhdb.secsize: 512 ; 0x0b8: 0x0200 <<<<< sector size of the ASM disk
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000 <<<<< ASM metadata block size
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000 <<<<<< AU size in bytes; the default is 1 MB
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize: 1023 ; 0x0c4: 0x000003ff <<<<< disk size, in AUs
kfdhdb.pmcnt: 2 ; 0x0c8: 0x00000002
kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000
kfdhdb.redomirrors[0]: 0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]: 0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]: 0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]: 0 ; 0x0de: 0x0000
kfdhdb.dbcompat: 168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi: 33121513 ; 0x0e4: HOUR=0x9 DAYS=0x7 MNTH=0x9 YEAR=0x7e5
kfdhdb.grpstmp.lo: 1539068928 ; 0x0e8: USEC=0x0 MSEC=0x315 SECS=0x3b MINS=0x16
kfdhdb.vfstart: 0 ; 0x0ec: 0x00000000
kfdhdb.vfend: 0 ; 0x0f0: 0x00000000
kfdhdb.spfile: 0 ; 0x0f4: 0x00000000 <<<<< AU number of the ASM spfile
kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000 <<<<< ASM spfile flag; 1 means the spfile is located in the AU given by kfdhdb.spfile
kfdhdb.flags: 0 ; 0x0fc: 0x00000000
kfdhdb.f1b1fcn.base: 0 ; 0x100: 0x00000000
kfdhdb.f1b1fcn.wrap: 0 ; 0x104: 0x00000000
kfdhdb.ip[0]: 110 ; 0x108: 0x6e
kfdhdb.ip[1]: 121 ; 0x109: 0x79
kfdhdb.ip[2]: 1 ; 0x10a: 0x01
kfdhdb.ip[3]: 145 ; 0x10b: 0x91
kfdhdb.modstmp: 1631264793 ; 0x10c: 0x613b2019
kfdhdb.checklbl: 0 ; 0x110: 0x00
kfdhdb.verlbl: 0 ; 0x111: 0x00
kfdhdb.ub2spare: 0 ; 0x112: 0x0000
kfdhdb.sitelbl: ; 0x114: length=0
kfdhdb.fglbl: ; 0x124: length=0
kfdhdb.vsnnum: 318767104 ; 0x144: 0x13000000
kfdhdb.patchvsn: 0 ; 0x148: 0x0000
kfdhdb.operation: 0 ; 0x14a: 0x0000
kfdhdb.xtnd[0]: 0 ; 0x14c: 0x0000
kfdhdb.xtnd[1]: 0 ; 0x14e: 0x0000
kfdhdb.xtnd[2]: 0 ; 0x150: 0x0000
kfdhdb.xtnd[3]: 0 ; 0x152: 0x0000
kfdhdb.xtnd[4]: 0 ; 0x154: 0x0000
kfdhdb.xtnd[5]: 0 ; 0x156: 0x0000
kfdhdb.ub4spare[0]: 0 ; 0x158: 0x00000000
kfdhdb.ub4spare[1]: 0 ; 0x15c: 0x00000000
kfdhdb.ub4spare[2]: 0 ; 0x160: 0x00000000
kfdhdb.ub4spare[3]: 0 ; 0x164: 0x00000000
kfdhdb.ub4spare[4]: 0 ; 0x168: 0x00000000
kfdhdb.ub4spare[5]: 0 ; 0x16c: 0x00000000
kfdhdb.ub4spare[6]: 0 ; 0x170: 0x00000000
kfdhdb.ub4spare[7]: 0 ; 0x174: 0x00000000
kfdhdb.ub4spare[8]: 0 ; 0x178: 0x00000000
kfdhdb.ub4spare[9]: 0 ; 0x17c: 0x00000000
kfdhdb.ub4spare[10]: 0 ; 0x180: 0x00000000
kfdhdb.ub4spare[11]: 0 ; 0x184: 0x00000000
kfdhdb.ub4spare[12]: 0 ; 0x188: 0x00000000
kfdhdb.ub4spare[13]: 0 ; 0x18c: 0x00000000
kfdhdb.ub4spare[14]: 0 ; 0x190: 0x00000000
kfdhdb.ub4spare[15]: 0 ; 0x194: 0x00000000
kfdhdb.ub4spare[16]: 0 ; 0x198: 0x00000000
kfdhdb.ub4spare[17]: 0 ; 0x19c: 0x00000000
kfdhdb.ub4spare[18]: 0 ; 0x1a0: 0x00000000
kfdhdb.ub4spare[19]: 0 ; 0x1a4: 0x00000000
kfdhdb.ub4spare[20]: 0 ; 0x1a8: 0x00000000
kfdhdb.ub4spare[21]: 0 ; 0x1ac: 0x00000000
kfdhdb.ub4spare[22]: 0 ; 0x1b0: 0x00000000
kfdhdb.ub4spare[23]: 0 ; 0x1b4: 0x00000000
kfdhdb.ub4spare[24]: 0 ; 0x1b8: 0x00000000
kfdhdb.ub4spare[25]: 0 ; 0x1bc: 0x00000000
kfdhdb.ub4spare[26]: 0 ; 0x1c0: 0x00000000
kfdhdb.ub4spare[27]: 0 ; 0x1c4: 0x00000000
kfdhdb.ub4spare[28]: 0 ; 0x1c8: 0x00000000
kfdhdb.ub4spare[29]: 0 ; 0x1cc: 0x00000000
kfdhdb.ub4spare[30]: 0 ; 0x1d0: 0x00000000
kfdhdb.acdb.aba.seq: 0 ; 0x1d4: 0x00000000
kfdhdb.acdb.aba.blk: 0 ; 0x1d8: 0x00000000
kfdhdb.acdb.ents: 0 ; 0x1dc: 0x0000
kfdhdb.acdb.ub2spare: 0 ; 0x1de: 0x0000
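The dump of this ASM metadata block has two kinds of fields: block header information prefixed with kfbh, and disk header information prefixed with kfdhdb.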
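
The packed timestamp fields (crestmp, mntstmp, grpstmp) can also be decoded by hand. The following is a minimal sketch, assuming the bit layout implied by the decoded values that kfed prints in the dump above; the layout is inferred from that output rather than taken from Oracle documentation, and the function name decode_kfed_stamp is made up for this example.

```shell
# decode_kfed_stamp HI LO: print a packed kfed timestamp in readable form.
# Assumed layout (inferred from the kfed dump, not from Oracle docs):
#   HI: bits 0-4 hour, 5-9 day, 10-13 month, 14 and up year
#   LO: bits 0-9 usec, 10-19 msec, 20-25 seconds, 26-31 minutes
decode_kfed_stamp() {
  hi=$1
  lo=$2
  year=$(( hi >> 14 ))
  month=$(( (hi >> 10) & 15 ))
  day=$(( (hi >> 5) & 31 ))
  hour=$(( hi & 31 ))
  min=$(( (lo >> 26) & 63 ))
  sec=$(( (lo >> 20) & 63 ))
  msec=$(( (lo >> 10) & 1023 ))
  printf '%04d-%02d-%02d %02d:%02d:%02d.%03d\n' \
    "$year" "$month" "$day" "$hour" "$min" "$sec" "$msec"
}

# crestmp.hi and crestmp.lo of DISK1, copied from the dump
decode_kfed_stamp 33121553 1820732416
```

For the crestmp values of DISK1 this prints 2021-09-08 17:27:08.395, which agrees with the HOUR/DAYS/MNTH/YEAR and MSEC/SECS/MINS values shown by kfed.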


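Disk header backup
The ASM disk header is stored in the first metadata block of the first AU, that is, block 0 of AU 0. Given the AU size and the metadata block size, the number of the second-to-last block in an AU can be computed; that block of AU 1 holds the backup copy of the header, which kfed can then read.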

[grid@rac1 ~]$ ausize=`kfed read /dev/sde1 | grep ausize | tr -s ' ' | cut -d' ' -f2`
[grid@rac1 ~]$ blksize=`kfed read /dev/sde1 | grep blksize | tr -s ' ' | cut -d' ' -f2`
[grid@rac1 ~]$ let n=$ausize/$blksize-2
[grid@rac1 ~]$ echo $n
254
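
The same calculation can be wrapped in two small helpers. This is only a sketch: kfed_value and backup_blkn are names invented here, and two lines copied from the dump stand in for a live kfed call so the snippet runs without an ASM disk.

```shell
# kfed_value FIELD: read a kfed dump on stdin and print the value of FIELD.
kfed_value() {
  awk -v f="$1:" '$1 == f { print $2; exit }'
}

# backup_blkn AUSIZE BLKSIZE: number of the second-to-last block in an AU.
backup_blkn() {
  echo $(( $1 / $2 - 2 ))
}

# Two lines copied from the kfed output stand in for "kfed read /dev/sde1".
dump='kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000'

ausize=$(printf '%s\n' "$dump" | kfed_value kfdhdb.ausize)
blksize=$(printf '%s\n' "$dump" | kfed_value kfdhdb.blksize)
n=$(backup_blkn "$ausize" "$blksize")
echo "$n"
```

Matching the field name exactly with awk avoids picking up other lines that merely contain the string ausize or blksize. On a real system the dump would come from kfed read instead of the embedded sample.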

Next, read the second-to-last block of AU 1, that is, block 254:
[grid@rac1 ~]$ kfed read /dev/sde1 aun=1 blkn=254
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 254 ; 0x004: blk=254
kfbh.block.obj: 2147483652 ; 0x008: disk=4
kfbh.check: 1472778246 ; 0x00c: 0x57c8d006
kfbh.fcn.base: 1302 ; 0x010: 0x00000516
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186647040 ; 0x020: 0x0b200200
kfdhdb.dsknum: 4 ; 0x024: 0x0004
kfdhdb.grptyp: 2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DISK1 ; 0x028: length=5
kfdhdb.grpname: DATA1 ; 0x048: length=5
kfdhdb.fgname: DATA1_FG1 ; 0x068: length=9
kfdhdb.siteguid[0]: 0 ; 0x088: 0x00
kfdhdb.siteguid[1]: 0 ; 0x089: 0x00
kfdhdb.siteguid[2]: 0 ; 0x08a: 0x00
kfdhdb.siteguid[3]: 0 ; 0x08b: 0x00
kfdhdb.siteguid[4]: 0 ; 0x08c: 0x00
kfdhdb.siteguid[5]: 0 ; 0x08d: 0x00
kfdhdb.siteguid[6]: 0 ; 0x08e: 0x00
kfdhdb.siteguid[7]: 0 ; 0x08f: 0x00
kfdhdb.siteguid[8]: 0 ; 0x090: 0x00
kfdhdb.siteguid[9]: 0 ; 0x091: 0x00
kfdhdb.siteguid[10]: 0 ; 0x092: 0x00
kfdhdb.siteguid[11]: 0 ; 0x093: 0x00
kfdhdb.siteguid[12]: 0 ; 0x094: 0x00
kfdhdb.siteguid[13]: 0 ; 0x095: 0x00
kfdhdb.siteguid[14]: 0 ; 0x096: 0x00
kfdhdb.siteguid[15]: 0 ; 0x097: 0x00
kfdhdb.ub1spare[0]: 0 ; 0x098: 0x00
kfdhdb.ub1spare[1]: 0 ; 0x099: 0x00
kfdhdb.ub1spare[2]: 0 ; 0x09a: 0x00
kfdhdb.ub1spare[3]: 0 ; 0x09b: 0x00
kfdhdb.ub1spare[4]: 0 ; 0x09c: 0x00
kfdhdb.ub1spare[5]: 0 ; 0x09d: 0x00
kfdhdb.ub1spare[6]: 0 ; 0x09e: 0x00
kfdhdb.ub1spare[7]: 0 ; 0x09f: 0x00
kfdhdb.ub1spare[8]: 0 ; 0x0a0: 0x00
kfdhdb.ub1spare[9]: 0 ; 0x0a1: 0x00
kfdhdb.ub1spare[10]: 0 ; 0x0a2: 0x00
kfdhdb.ub1spare[11]: 0 ; 0x0a3: 0x00
kfdhdb.ub1spare[12]: 0 ; 0x0a4: 0x00
kfdhdb.ub1spare[13]: 0 ; 0x0a5: 0x00
kfdhdb.ub1spare[14]: 0 ; 0x0a6: 0x00
kfdhdb.ub1spare[15]: 0 ; 0x0a7: 0x00
kfdhdb.crestmp.hi: 33121553 ; 0x0a8: HOUR=0x11 DAYS=0x8 MNTH=0x9 YEAR=0x7e5
kfdhdb.crestmp.lo: 1820732416 ; 0x0ac: USEC=0x0 MSEC=0x18b SECS=0x8 MINS=0x1b
kfdhdb.mntstmp.hi: 33121617 ; 0x0b0: HOUR=0x11 DAYS=0xa MNTH=0x9 YEAR=0x7e5
kfdhdb.mntstmp.lo: 437690368 ; 0x0b4: USEC=0x0 MSEC=0x1a8 SECS=0x21 MINS=0x6
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize: 1023 ; 0x0c4: 0x000003ff
kfdhdb.pmcnt: 2 ; 0x0c8: 0x00000002
kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn: 0 ; 0x0d4: 0x00000000
kfdhdb.redomirrors[0]: 0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]: 0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]: 0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]: 0 ; 0x0de: 0x0000
kfdhdb.dbcompat: 168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi: 33121513 ; 0x0e4: HOUR=0x9 DAYS=0x7 MNTH=0x9 YEAR=0x7e5
kfdhdb.grpstmp.lo: 1539068928 ; 0x0e8: USEC=0x0 MSEC=0x315 SECS=0x3b MINS=0x16
kfdhdb.vfstart: 0 ; 0x0ec: 0x00000000
kfdhdb.vfend: 0 ; 0x0f0: 0x00000000
kfdhdb.spfile: 0 ; 0x0f4: 0x00000000
kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000
kfdhdb.flags: 0 ; 0x0fc: 0x00000000
kfdhdb.f1b1fcn.base: 0 ; 0x100: 0x00000000
kfdhdb.f1b1fcn.wrap: 0 ; 0x104: 0x00000000
kfdhdb.ip[0]: 110 ; 0x108: 0x6e
kfdhdb.ip[1]: 121 ; 0x109: 0x79
kfdhdb.ip[2]: 1 ; 0x10a: 0x01
kfdhdb.ip[3]: 145 ; 0x10b: 0x91
kfdhdb.modstmp: 1631264793 ; 0x10c: 0x613b2019
kfdhdb.checklbl: 0 ; 0x110: 0x00
kfdhdb.verlbl: 0 ; 0x111: 0x00
kfdhdb.ub2spare: 0 ; 0x112: 0x0000
kfdhdb.sitelbl: ; 0x114: length=0
kfdhdb.fglbl: ; 0x124: length=0
kfdhdb.vsnnum: 318767104 ; 0x144: 0x13000000
kfdhdb.patchvsn: 0 ; 0x148: 0x0000
kfdhdb.operation: 0 ; 0x14a: 0x0000
kfdhdb.xtnd[0]: 0 ; 0x14c: 0x0000
kfdhdb.xtnd[1]: 0 ; 0x14e: 0x0000
kfdhdb.xtnd[2]: 0 ; 0x150: 0x0000
kfdhdb.xtnd[3]: 0 ; 0x152: 0x0000
kfdhdb.xtnd[4]: 0 ; 0x154: 0x0000
kfdhdb.xtnd[5]: 0 ; 0x156: 0x0000
kfdhdb.ub4spare[0]: 0 ; 0x158: 0x00000000
kfdhdb.ub4spare[1]: 0 ; 0x15c: 0x00000000
kfdhdb.ub4spare[2]: 0 ; 0x160: 0x00000000
kfdhdb.ub4spare[3]: 0 ; 0x164: 0x00000000
kfdhdb.ub4spare[4]: 0 ; 0x168: 0x00000000
kfdhdb.ub4spare[5]: 0 ; 0x16c: 0x00000000
kfdhdb.ub4spare[6]: 0 ; 0x170: 0x00000000
kfdhdb.ub4spare[7]: 0 ; 0x174: 0x00000000
kfdhdb.ub4spare[8]: 0 ; 0x178: 0x00000000
kfdhdb.ub4spare[9]: 0 ; 0x17c: 0x00000000
kfdhdb.ub4spare[10]: 0 ; 0x180: 0x00000000
kfdhdb.ub4spare[11]: 0 ; 0x184: 0x00000000
kfdhdb.ub4spare[12]: 0 ; 0x188: 0x00000000
kfdhdb.ub4spare[13]: 0 ; 0x18c: 0x00000000
kfdhdb.ub4spare[14]: 0 ; 0x190: 0x00000000
kfdhdb.ub4spare[15]: 0 ; 0x194: 0x00000000
kfdhdb.ub4spare[16]: 0 ; 0x198: 0x00000000
kfdhdb.ub4spare[17]: 0 ; 0x19c: 0x00000000
kfdhdb.ub4spare[18]: 0 ; 0x1a0: 0x00000000
kfdhdb.ub4spare[19]: 0 ; 0x1a4: 0x00000000
kfdhdb.ub4spare[20]: 0 ; 0x1a8: 0x00000000
kfdhdb.ub4spare[21]: 0 ; 0x1ac: 0x00000000
kfdhdb.ub4spare[22]: 0 ; 0x1b0: 0x00000000
kfdhdb.ub4spare[23]: 0 ; 0x1b4: 0x00000000
kfdhdb.ub4spare[24]: 0 ; 0x1b8: 0x00000000
kfdhdb.ub4spare[25]: 0 ; 0x1bc: 0x00000000
kfdhdb.ub4spare[26]: 0 ; 0x1c0: 0x00000000
kfdhdb.ub4spare[27]: 0 ; 0x1c4: 0x00000000
kfdhdb.ub4spare[28]: 0 ; 0x1c8: 0x00000000
kfdhdb.ub4spare[29]: 0 ; 0x1cc: 0x00000000
kfdhdb.ub4spare[30]: 0 ; 0x1d0: 0x00000000
kfdhdb.acdb.aba.seq: 0 ; 0x1d4: 0x00000000
kfdhdb.acdb.aba.blk: 0 ; 0x1d8: 0x00000000
kfdhdb.acdb.ents: 0 ; 0x1dc: 0x0000
kfdhdb.acdb.ub2spare: 0 ; 0x1de: 0x0000

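This content is the same as the disk header block of the ASM disk (AU 0, block 0); in the two dumps only kfbh.block.blk and kfbh.check differ.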

If the disk header is damaged or lost, kfed repair /dev/sde1 is enough to restore it; if the AU size is not 1 MB, the AU size must be specified as well.
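
As a rough illustration of the AU-size rule, the helper below prints (but does not run) the repair command. It assumes that kfed takes the AU size as ausz=<bytes>; that parameter name does not appear in this article, so check the help output of your kfed version first. The name repair_cmd is hypothetical.

```shell
# repair_cmd DEVICE AUSIZE_BYTES: print, without executing, a kfed repair command.
# Assumption: kfed accepts the AU size as ausz=<bytes>; verify this against
# the help output of the installed kfed before relying on it.
repair_cmd() {
  dev=$1
  ausize=$2
  if [ "$ausize" -eq 1048576 ]; then
    # Default 1 MB AU: no size argument needed.
    echo "kfed repair $dev"
  else
    echo "kfed repair $dev ausz=$ausize"
  fi
}

repair_cmd /dev/sde1 1048576
repair_cmd /dev/sde1 4194304
```

Printing the command first gives a chance to review it before anything is written to the disk.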


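The ASM disk header is stored in the first block of AU 0 (aun=0 blkn=0), and a copy of it in the second-to-last block of AU 1 (aun=1 blkn=254). The two can be compared with:

kfed read /dev/sde1 aun=0 blkn=0 | egrep 'name|size|type';

kfed read /dev/sde1 aun=1 blkn=254 | egrep 'name|size|type';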



Next, confirm the block type of the ASM disk header blocks:
[grid@rac1 ~]$ kfed find /dev/sde1 | grep 'Block 0'
Block 0 has type 1

[grid@rac1 ~]$ kfed find /dev/sde1 aun=1 | grep '510'
Block 510 has type 1
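An AU is 1024 KB and a metadata block is 4 KB, so one AU holds 256 blocks and two AUs hold 512, numbered 0 to 511. Counted from the start of the disk, the second-to-last block of the second AU is therefore block 510 (256 + 254).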
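
The conversion from an (AU number, block number) pair to a block number counted from the start of the disk can be sketched as follows; abs_block is a name chosen for this example.

```shell
# abs_block AUN BLKN AUSIZE BLKSIZE: block number counted from the start of the disk.
abs_block() {
  # blocks per AU = AUSIZE / BLKSIZE; whole AUs before AUN come first
  echo $(( $1 * ($3 / $4) + $2 ))
}

# aun=1 blkn=254 on a disk with 1 MB AUs and 4 KB metadata blocks
abs_block 1 254 1048576 4096
```

With the values used in this article the result is 510, the block that kfed find reports as type 1.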

[grid@rac1 ~]$ kfed read /dev/sde1 aun=0 blkn=0 | egrep 'name|size|type';
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname: DISK1 ; 0x028: length=5
kfdhdb.grpname: DATA1 ; 0x048: length=5
kfdhdb.fgname: DATA1_FG1 ; 0x068: length=9
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.dsksize: 1023 ; 0x0c4: 0x000003ff

[grid@rac1 ~]$ kfed read /dev/sde1 aun=1 blkn=254 | egrep 'name|size|type';
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname: DISK1 ; 0x028: length=5
kfdhdb.grpname: DATA1 ; 0x048: length=5
kfdhdb.fgname: DATA1_FG1 ; 0x068: length=9
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.dsksize: 1023 ; 0x0c4: 0x000003ff
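
Rather than comparing the two outputs by eye, saved dumps can be compared mechanically. The sketch below ignores kfbh.block.blk and kfbh.check, the two fields that differ between the header and its backup in the full dumps shown earlier; the helper names are invented, and a few sample lines stand in for real kfed output.

```shell
# strip_volatile FILE: drop the two kfbh fields that differ between the copies.
strip_volatile() {
  grep -Ev '^kfbh\.(block\.blk|check):' "$1"
}

# headers_match FILE_A FILE_B: succeed when all remaining fields are identical.
headers_match() {
  [ "$(strip_volatile "$1")" = "$(strip_volatile "$2")" ]
}

# Demo: a few lines from the two dumps, saved to scratch files.
primary=$(mktemp)
backup=$(mktemp)
printf '%s\n' 'kfbh.block.blk: 0 ; 0x004: blk=0' \
  'kfbh.check: 1472778488 ; 0x00c: 0x57c8d0f8' \
  'kfdhdb.dskname: DISK1 ; 0x028: length=5' > "$primary"
printf '%s\n' 'kfbh.block.blk: 254 ; 0x004: blk=254' \
  'kfbh.check: 1472778246 ; 0x00c: 0x57c8d006' \
  'kfdhdb.dskname: DISK1 ; 0x028: length=5' > "$backup"

if headers_match "$primary" "$backup"; then result=match; else result=differ; fi
echo "$result"
rm -f "$primary" "$backup"
```

In practice the two files would be produced by redirecting kfed read with aun=0 blkn=0 and with aun=1 blkn=254 into files.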

Next, simulate corruption of the first block of the first AU:
[grid@rac1 ~]$ dd if=/dev/zero of=/dev/sde1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00536697 s, 763 kB/s
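
Because this dd command destroys the header of a live disk, it may be worth rehearsing it on a scratch file first. The sketch below is an illustration only: it uses a temporary file in place of a device and shows that bs=4096 count=1 touches just the first 4 KB block. Note that conv=notrunc is needed for a regular file, which dd would otherwise truncate; a block device is not truncated.

```shell
img=$(mktemp)   # scratch file standing in for a disk; never a real ASM device

# Write two 4 KB blocks of 0xFF bytes so that zeroed bytes are easy to spot.
head -c 8192 /dev/zero | LC_ALL=C tr '\000' '\377' > "$img"

# Zero block 0 only; conv=notrunc keeps dd from truncating the rest of the file.
dd if=/dev/zero of="$img" bs=4096 count=1 conv=notrunc 2>/dev/null

# Count the non-zero bytes left in each block.
blk0=$(head -c 4096 "$img" | LC_ALL=C tr -d '\000' | wc -c | tr -d ' ')
blk1=$(tail -c 4096 "$img" | LC_ALL=C tr -d '\000' | wc -c | tr -d ' ')
echo "block 0 non-zero bytes: $blk0"
echo "block 1 non-zero bytes: $blk1"
rm -f "$img"
```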

Now read the disk header block with kfed:
[grid@rac1 ~]$ kfed read /dev/sde1 aun=0 blkn=0
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
000000000 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[grid@rac1 ~]$ dd if=/dev/zero of=/dev/sdg1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00084925 s, 4.8 MB/s
[grid@rac1 ~]$ kfed read /dev/sdg1 aun=0 blkn=0
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
000000000 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
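
The check for a wiped header can be scripted by looking at kfbh.type in the kfed output. This is a small sketch with an invented helper name, header_state; the sample lines are copied from the dumps in this article and stand in for a live kfed call.

```shell
# header_state: read a kfed dump on stdin; print "ok" if kfbh.type names
# KFBTYP_DISKHEAD, otherwise print "damaged".
header_state() {
  if awk '$1 == "kfbh.type:" && $5 == "KFBTYP_DISKHEAD" { found = 1 }
          END { exit !found }'; then
    echo ok
  else
    echo damaged
  fi
}

# Sample lines taken from the zeroed header and from the intact header.
printf '%s\n' 'kfbh.type: 0 ; 0x002: KFBTYP_INVALID' | header_state
printf '%s\n' 'kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD' | header_state
```

On a real system the input would be the output of kfed read with aun=0 blkn=0 for the disk being checked.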

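Next, look at the ASM disk status and the alert log. With the disk group still mounted, the view continues to report all four disks as CACHED / MEMBER / NORMAL: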
SQL> select group_number,name,path, mount_status,header_status,state from v$asm_disk where group_number=1 order by 2;

GROUP_NUMBER NAME PATH MOUNT_STATUS HEADER_STA STATE
------------ -------------------- ---------- --------------------- ---------- ------------------------
1 DISK1 /dev/sde1 CACHED MEMBER NORMAL
1 DISK2 /dev/sdf1 CACHED MEMBER NORMAL
1 DISK3 /dev/sdg1 CACHED MEMBER NORMAL
1 DISK4 /dev/sdh1 CACHED MEMBER NORMAL

Check whether the automatically backed-up ASM disk header block is intact:
[grid@rac1 ~]$ kfed read /dev/sde1 aun=1 blkn=254 | egrep 'name|size|type';
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD <<<< the backup header block is intact
kfdhdb.dskname: DISK1 ; 0x028: length=5
kfdhdb.grpname: DATA1 ; 0x048: length=5
kfdhdb.fgname: DATA1_FG1 ; 0x068: length=9
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.dsksize: 1023 ; 0x0c4: 0x000003ff

Now repair the disk header:
kfed repair /dev/sde1

[grid@rac1 ~]$ kfed read /dev/sde1 aun=0 blkn=0 | egrep 'name|size|type'
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname: DISK1 ; 0x028: length=5
kfdhdb.grpname: DATA1 ; 0x048: length=5
kfdhdb.fgname: DATA1_FG1 ; 0x068: length=9
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.dsksize: 1023 ; 0x0c4: 0x000003ff

kfed repair /dev/sde1
[grid@rac1 ~]$ kfed read /dev/sdf1 aun=0 blkn=0 | egrep 'name|size|type'
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname: DISK2 ; 0x028: length=5
kfdhdb.grpname: DATA1 ; 0x048: length=5
kfdhdb.fgname: DATA1_FG1 ; 0x068: length=9
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.dsksize: 1023 ; 0x0c4: 0x000003ff






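As long as one failure group is intact, the database keeps running without impact. If only one failure group is left and one of its two disks fails, the impact on the workload depends on how the data is distributed. Here we wiped the disk headers of the disks in a failure group and dismounted the disk group; mounting it again then fails: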
SQL> alter diskgroup data1 mount;
alter diskgroup data1 mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA1" cannot be mounted
ORA-15040: diskgroup is incomplete

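The ASM alert log follows. After the ERROR entry for the failed attempt, it records a later mount of DATA1 that succeeds: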
ERROR: alter diskgroup data1 mount
2021-09-14T10:30:58.379413+08:00
SQL> alter diskgroup data1 mount
2021-09-14T10:30:58.543249+08:00
NOTE: cache registered group DATA1 1/0x1DAAEF49
NOTE: cache began mount (first) of group DATA1 1/0x1DAAEF49
2021-09-14T10:30:59.576263+08:00
NOTE: Assigning number (1,5) to disk (/dev/sdf1)
NOTE: Assigning number (1,3) to disk (/dev/sdh1)
NOTE: Assigning number (1,4) to disk (/dev/sde1)
NOTE: Assigning number (1,2) to disk (/dev/sdg1)
2021-09-14T10:30:59.710636+08:00
cluster guid (dd16d1353f274f80ff1bc81e95eeee15) generated for PST Hbeat for instance 1
2021-09-14T10:31:05.783569+08:00
NOTE: GMON heartbeating for grp 1 (DATA1)
GMON querying group 1 at 40 for pid 40, osid 3851
2021-09-14T10:31:05.796914+08:00
NOTE: cache is mounting group DATA1 created on 2021/09/07 09:22:59
NOTE: cache opening disk 2 of grp 1: DISK3 path:/dev/sdg1
NOTE: group 1 (DATA1) high disk header ckpt advanced to fcn 0.1715
NOTE: 09/14/21 10:31:05 DATA1.F1X0 found on disk 2 au 101 fcn 0.1715 datfmt 1
NOTE: cache opening disk 3 of grp 1: DISK4 path:/dev/sdh1
NOTE: cache opening disk 4 of grp 1: DISK1 path:/dev/sde1
NOTE: cache opening disk 5 of grp 1: DISK2 path:/dev/sdf1
NOTE: 09/14/21 10:31:05 DATA1.F1X0 found on disk 5 au 2 fcn 0.1715 datfmt 1
2021-09-14T10:31:05.807241+08:00
NOTE: cache mounting (first) normal redundancy group 1/0x1DAAEF49 (DATA1)
2021-09-14T10:31:05.833786+08:00
* allocate domain 1, valid ? 0
kjbdomatt send to inst 2
2021-09-14T10:31:06.013656+08:00
NOTE: attached to recovery domain 1
2021-09-14T10:31:06.069544+08:00
NOTE: crash recovery of group DATA1 will recover thread=1 ckpt=5.217 domain=1 inc#=2 instnum=2
NOTE: crash recovery of group DATA1 will recover thread=2 ckpt=4.26 domain=1 inc#=4 instnum=1
validate pdb 1, flags x4, valid 0, pdb flags x204
* validated domain 1, flags = 0x200
NOTE: advancing ckpt for group 1 (DATA1) thread=1 ckpt=5.217 domain inc# 0
NOTE: advancing ckpt for group 1 (DATA1) thread=2 ckpt=4.26 domain inc# 0
NOTE: cache recovered group 1 to fcn 0.1750
NOTE: redo buffer size is 256 blocks (1056768 bytes)
2021-09-14T10:31:06.317112+08:00
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATA1)
NOTE: LGWR found thread 1 closed at ABA 5.216 lock domain=0 inc#=0 instnum=2
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATA1)
2021-09-14T10:31:06.433571+08:00
NOTE: LGWR opened thread 1 (DATA1) at fcn 0.1750 ABA 6.217 lock domain=1 inc#=4 instnum=1 gx.incarn=497741641 mntstmp=2021/09/14 10:31:06.406000
2021-09-14T10:31:06.436440+08:00
NOTE: cache mounting group 1/0x1DAAEF49 (DATA1) succeeded
NOTE: cache ending mount (success) of group DATA1 number=1 incarn=0x1daaef49
2021-09-14T10:31:06.621728+08:00
NOTE: Instance updated compatible.asm to 11.2.0.2.0 for grp 1 (DATA1).
2021-09-14T10:31:06.624398+08:00
NOTE: Instance updated compatible.asm to 11.2.0.2.0 for grp 1 (DATA1).
2021-09-14T10:31:06.627258+08:00
NOTE: Instance updated compatible.rdbms to 10.1.0.0.0 for grp 1 (DATA1).
2021-09-14T10:31:06.629069+08:00
NOTE: Instance updated compatible.rdbms to 10.1.0.0.0 for grp 1 (DATA1).
2021-09-14T10:31:06.648852+08:00
SUCCESS: diskgroup DATA1 was mounted
2021-09-14T10:31:06.716456+08:00
SUCCESS: alter diskgroup data1 mount
2021-09-14T10:31:07.009372+08:00
NOTE: diskgroup resource ora.DATA1.dg is online
2021-09-14T10:31:45.578603+08:00
NOTE: Flex client id 0x0 [prod1:prod:rac-cluster] attempting to connect
NOTE: will remove stale ownerid 0x41e8e55d02db63de for client prod1:prod:rac-cluster (asmb#:1, startid:1082913100)
(reconnected as ownerid 0x41e8e55d02e04c12, startid:1083234689)
NOTE: registered owner id 0x41e8e55d02e04c12 for prod1:prod:rac-cluster
NOTE: Flex client prod1:prod:rac-cluster registered, osid 4385, mbr 0x0, asmb 4377 (reg:1159154903)
2021-09-14T10:31:47.634655+08:00
NOTE: found stale ownerid 0x41e8e55d02db63de for client prod1:prod:rac-cluster
NOTE: removing stale ownerid 0x41e8e55d02db63de for client prod1:prod:rac-cluster (reg:3398679131)
NOTE: released resources held for client id 0x41e8e55d02db63de (reg:3398679131)
2021-09-14T10:31:49.414085+08:00
NOTE: m-asmb client prod1:prod:rac-cluster assigned CGID 0x10005 for group 2
2021-09-14T10:31:53.350782+08:00
NOTE: client prod1:prod:rac-cluster mounted group 2 (DATA)
2021-09-14T10:32:02.783312+08:00
NOTE: m-asmb client prod1:prod:rac-cluster assigned CGID 0x10006 for group 1
2021-09-14T10:32:05.370710+08:00
NOTE: client prod1:prod:rac-cluster mounted group 1 (DATA1)
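
A long alert log like this one can be reduced to the final outcome of the mount command with a short filter. The helper name mount_outcome is made up for this sketch, and three lines copied from the log above serve as sample input.

```shell
# mount_outcome GROUP: read alert log text on stdin and print the last
# ERROR or SUCCESS line for "alter diskgroup GROUP mount" (case-insensitive).
mount_outcome() {
  grep -Ei "^(ERROR|SUCCESS): alter diskgroup $1 mount" | tail -n 1
}

# Sample lines copied from the alert log in this article.
log='ERROR: alter diskgroup data1 mount
SUCCESS: diskgroup DATA1 was mounted
SUCCESS: alter diskgroup data1 mount'

outcome=$(printf '%s\n' "$log" | mount_outcome data1)
echo "$outcome"
```

Against a real system the input would be the ASM alert log file itself rather than the embedded sample.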









