Identifying the problem
A user's database could no longer be opened after a disk failure combined with a storage power loss. Judging only by the errors in the database alert log, the problem looks quite simple:
Fri Oct 26 10:33:53 2018
Recovery of Online Redo Log: Thread 1 Group 3 Seq 39 Reading mem 0
  Mem# 0: /fs/fs/oradata/orcl/redo03.log
Block recovery stopped at EOT rba 39.77.16
Block recovery completed at rba 39.77.16, scn 0.1002048587
ORACLE Instance orcl (pid = 8) - Error 600 encountered while recovering transaction (9, 30) on object 9149.
Fri Oct 26 10:33:53 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_smon_192644.trc:
ORA-00600: internal error code, arguments: [6856], [0], [43], [], [], [], [], []
Fri Oct 26 10:33:56 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_smon_192644.trc:
ORA-00600: internal error code, arguments: [4194], [33], [36], [], [], [], [], []
Doing block recovery for file 2 block 713
Block recovery from logseq 39, block 82 to scn 1002048595
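For this kind of error the fix is fairly obvious: SMON fails with ORA-00600 [4194] while recovering a transaction, which points at the undo, so taking the affected rollback segments out of play is enough. Once they are masked, the database opens without trouble. It will crash again soon afterwards, though, so the undo tablespace has to be rebuilt to get past the problem for good.
After the database is open, a closer look shows that the alert log still contains quite a few errors: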
Fri Oct 26 11:01:46 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:
ORA-00600: internal error code, arguments: [17147], [0x110549070], [], [], [], [], [], []
Fri Oct 26 11:01:46 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_m001_373218.trc:
ORA-00600: internal error code, arguments: [kdddgb5], [196650], [0], [], [], [], [], []
ORA-600 encountered when generating server alert SMG-4120
Fri Oct 26 11:01:47 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:
ORA-00600: internal error code, arguments: [KGHALO4], [0x11047F6F0], [], [], [], [], [], []
ORA-600 encountered when generating server alert SMG-4121
Fri Oct 26 11:01:48 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:
ORA-00600: internal error code, arguments: [KGHALO4], [0x11047F6F0], [], [], [], [], [], []
ORA-600 encountered when generating server alert SMG-4121
Fri Oct 26 11:01:50 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_m001_373218.trc:
ORA-00600: internal error code, arguments: [kdddgb5], [196650], [0], [], [], [], [], []
Fri Oct 26 11:02:22 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:
ORA-00600: internal error code, arguments: [17114], [0x110549070], [], [], [], [], [], []
Fri Oct 26 11:02:23 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:
ORA-00600: internal error code, arguments: [kebm_mmon_main_1], [39], [], [], [], [], [], []
ORA-00039: error during periodic action
ORA-00600: internal error code, arguments: [17114], [0x110549070], [], [], [], [], [], []
Fri Oct 26 11:03:30 2018
Restarting dead background process MMON
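With this many ORA-00600 entries, a quick tally by first argument can make it easier to see which internal errors dominate. The following is only a minimal sketch: the helper name `count_ora600` is hypothetical, and the sample text is a handful of lines copied from the alert log excerpt above.

```python
import re
from collections import Counter

# Captures the first bracketed argument of each ORA-00600 entry.
ORA600_RE = re.compile(
    r"ORA-00600: internal error code, arguments: \[([^\]]*)\]"
)


def count_ora600(alert_text: str) -> Counter:
    """Count ORA-00600 occurrences, grouped by their first argument."""
    return Counter(ORA600_RE.findall(alert_text))


# A few lines taken from the alert log excerpt (hypothetical variable name).
sample = """\
ORA-00600: internal error code, arguments: [17147], [0x110549070], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kdddgb5], [196650], [0], [], [], [], [], []
ORA-00600: internal error code, arguments: [KGHALO4], [0x11047F6F0], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [KGHALO4], [0x11047F6F0], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kdddgb5], [196650], [0], [], [], [], [], []
"""

counts = count_ora600(sample)
for first_arg, occurrences in counts.most_common():
    print(first_arg, occurrences)
```

Because the pattern does not depend on line boundaries, it should also work on log text whose line breaks were lost when it was pasted.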
In addition, because the alert log had reported corrupt blocks earlier, the SYSTEM datafile was checked with dbv, which confirmed that a small number of corrupt blocks do exist:
DBVERIFY: Release 10.2.0.4.0 - Production on Fri Oct 26 10:37:20 2018

Copyright (c) 1982, 2007, Oracle. All rights reserved.

DBVERIFY - Verification starting : FILE = system01.dbf
DBV-00200: Block, DBA 4255202, already marked corrupt
Block Checking: DBA = 4258751, Block Type = KTB-managed data block
data header at 0x11022a05c
kdbchk: fsbo(596) wrong, (hsz 4178)
Page 64447 failed with check code 6129
Block Checking: DBA = 4259386, Block Type = KTB-managed data block
**** kdxcofbo = 208 != 24
---- end index block validation
Page 65082 failed with check code 6401
Block Checking: DBA = 4269609, Block Type = Unlimited data segment header
Incorrect extent count in the extent map: 16777317
Block Checking: DBA = 4269612, Block Type = KTB-managed data block
**** kdxcofbo = 224 != 216
---- end index block validation
Page 75308 failed with check code 6401
Block Checking: DBA = 4269615, Block Type = KTB-managed data block
**** actual rows locked by itl 2 = 1 != # in trans. header = 0
---- end index block validation
Page 75311 failed with check code 6401
Page 85271 is influx - most likely media corrupt
Corrupt block relative dba: 0x00414d17 (file 1, block 85271)
Fractured block found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0x00414d17
 last change scn: 0x0000.3afaf495 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xfe830601
 check value in block header: 0x96c6
 computed block checksum: 0x3c6b

Page 85383 is influx - most likely media corrupt
Corrupt block relative dba: 0x00414d87 (file 1, block 85383)
Fractured block found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0x00414d87
 last change scn: 0x0000.3b6b9d19 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x970f0601
 check value in block header: 0xe825
 computed block checksum: 0x3c6b

DBVERIFY - Verification complete

Total Pages Examined : 640000
Total Pages Processed (Data) : 116312
Total Pages Failing (Data) : 1
Total Pages Processed (Index): 65914
Total Pages Failing (Index): 3
Total Pages Processed (Other): 64634
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 393138
Total Pages Marked Corrupt : 3
Total Pages Influx : 2
Highest block SCN : 1002028510 (0.1002028510)
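The dbv output mixes decimal DBA values, hexadecimal rdba values and page numbers, so it can help to decode them into file and block numbers before looking up the owning segments. The sketch below assumes the usual smallfile layout, in which the top 10 bits of a relative DBA hold the relative file number and the low 22 bits hold the block number; the function name `decode_dba` is hypothetical.

```python
from typing import Tuple

BLOCK_BITS = 22                       # assumed: low 22 bits = block number
BLOCK_MASK = (1 << BLOCK_BITS) - 1    # remaining high bits = relative file number


def decode_dba(dba: int) -> Tuple[int, int]:
    """Split a relative data block address into (relative file#, block#)."""
    return dba >> BLOCK_BITS, dba & BLOCK_MASK


# Values taken from the dbv output above.
reported = (4255202, 4258751, 4259386, 4269609, 4269612, 4269615,
            0x00414D17, 0x00414D87)
for dba in reported:
    file_no, block_no = decode_dba(dba)
    print(f"dba={dba} file={file_no} block={block_no}")
```

Under that assumption, DBA 4258751 decodes to file 1, block 64447, which agrees with the "Page 64447" line that dbv prints for the same block, and 0x00414d17 decodes to file 1, block 85271, as dbv itself reports.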
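The corrupt blocks reported by dbv are not hard to deal with either. Some belong to indexes on business tables, and almost all of the rest belong to AWR-related base tables. Two of the corrupt blocks involve SYS-owned base tables and an index: the index I_H_OBJ#_COL# and the tables COM$ and HISTGRM$.
The business indexes are simple: drop and recreate them. The SYS index can be dropped and rebuilt after setting event 38003.
As for the base tables COM$ and HISTGRM$, they are not core bootstrap$ objects, so they can in fact be repaired as well.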
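Solution
That said, since this incident involved a storage power loss and a damaged undo, rebuilding the database is the safer option. One last point: the slightly unusual thing about this database is that it is 1.2 TB in total, of which the LOB column of a single table accounts for 980 GB, so rebuilding it is relatively slow. For a large table with a LOB column, it is usually advisable to process the data in chunks; otherwise ORA-01555 errors are likely. Below is a commonly used rowid-based chunking script, for reference: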
set verify off
undefine rowid_ranges
undefine segment_name
undefine owner
set head off
set pages 0
set trimspool on
select 'where rowid between ''' ||
       sys.dbms_rowid.rowid_create(1, d.oid, c.fid1, c.bid1, 0) ||
       ''' and ''' ||
       sys.dbms_rowid.rowid_create(1, d.oid, c.fid2, c.bid2, 9999) || '''' || ';'
  from (select distinct b.rn,
               first_value(a.fid) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) fid1,
               last_value(a.fid) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) fid2,
               first_value(decode(sign(range2 - range1), 1, a.bid + ((b.rn - a.range1) * a.chunks1), a.bid)) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) bid1,
               last_value(decode(sign(range2 - range1), 1, a.bid + ((b.rn - a.range1 + 1) * a.chunks1) - 1, (a.bid + a.blocks - 1))) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) bid2
          from (select fid, bid, blocks, chunks1,
                       trunc((sum2 - blocks + 1 - 0.1) / chunks1) range1,
                       trunc((sum2 - 0.1) / chunks1) range2
                  from (select /*+ rule */
                               relative_fno fid,
                               block_id bid,
                               blocks,
                               sum(blocks) over() sum1,
                               trunc((sum(blocks) over()) / &&rowid_ranges) chunks1,
                               sum(blocks) over(order by relative_fno, block_id) sum2
                          from dba_extents
                         where segment_name = upper('&&segment_name')
                           and owner = upper('&&owner'))
                 where sum1 > &&rowid_ranges) a,
               (select rownum - 1 rn from dual connect by level <= &&rowid_ranges) b
         where b.rn between a.range1 and a.range2) c,
       (select max(data_object_id) oid
          from dba_objects
         where object_name = upper('&&segment_name')
           and owner = upper('&&owner')
           and data_object_id is not null) d
/
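The idea behind the script may be easier to follow outside SQL: order the segment's extents by file and block, treat them as one long run of blocks, and cut that run into the requested number of ranges. The Python sketch below illustrates that idea only. It is not a line-by-line port of the script (it spreads the remainder blocks differently), and `split_extents`, `Extent` and `Range` are hypothetical names.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple

Extent = Tuple[int, int, int]        # (relative_fno, block_id, blocks)
Range = Tuple[int, int, int, int]    # (fid1, bid1, fid2, bid2)


def split_extents(extents: List[Extent], n_ranges: int) -> List[Range]:
    """Cut a segment's extents into up to n_ranges contiguous block ranges."""
    if n_ranges < 1:
        raise ValueError("n_ranges must be at least 1")
    if not extents:
        return []
    extents = sorted(extents)                      # order by file, then block
    ends = list(accumulate(e[2] for e in extents))  # cumulative block counts
    total = ends[-1]
    n_ranges = min(n_ranges, total)                # never produce empty ranges

    def locate(offset: int) -> Tuple[int, int]:
        """Map a 0-based logical block offset to (file, block)."""
        i = bisect_right(ends, offset)             # extent containing the offset
        fid, bid, blocks = extents[i]
        return fid, bid + offset - (ends[i] - blocks)

    ranges = []
    for k in range(n_ranges):
        first = k * total // n_ranges
        last = (k + 1) * total // n_ranges - 1
        ranges.append(locate(first) + locate(last))
    return ranges


# Made-up extents: two 8-block extents in file 4, one 16-block extent in file 5.
demo = split_extents([(4, 9, 8), (4, 17, 8), (5, 129, 16)], 4)
for fid1, bid1, fid2, bid2 in demo:
    print(f"file {fid1} block {bid1}  ->  file {fid2} block {bid2}")
```

Each resulting tuple corresponds to one generated predicate: the script turns the first and last block of a range into rowids with dbms_rowid.rowid_create (row 0 and row 9999 respectively) and emits a `where rowid between ... and ...` clause for it.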
Summary
That is all for this article. Hopefully it is a useful reference for your study or work. If you have questions, feel free to leave a comment, and thank you for supporting VeVb武林网.