DG库gap故障案例

环境说明:

主备库均为asm方式建库。数据库版本为11.2.0.4。
主库:192.168.99.237 oracle1
备库:192.168.99.238 oracle2
实验内容:主要先关闭备库,在主库做归档丢失备库无法同步归档文件,备库产生GAP,然后增量备份恢复备库。

实验模拟gap故障:

1、先检查主备库的dg状态,dg状态正常:

主库:

SQL> select switchover_status,database_role,open_mode from v$database;

SWITCHOVER_STATUS    DATABASE_ROLE    OPEN_MODE
-------------------- ---------------- --------------------
TO STANDBY           PRIMARY          READ WRITE
SQL>  select thread#,process,status,sequence#,DELAY_MINS from v$managed_standby;

   THREAD# PROCESS   STATUS        SEQUENCE# DELAY_MINS
---------- --------- ------------ ---------- ----------
         0 ARCH      CONNECTED             0          0
         0 ARCH      CONNECTED             0          0
         1 ARCH      CLOSING             120          0
         0 ARCH      CONNECTED             0          0
         0 RFS       IDLE                  0          0
         1 RFS       IDLE                121          0
         0 RFS       IDLE                  0          0
         1 MRP0      APPLYING_LOG        121          0

备库:

SQL> select switchover_status,database_role,open_mode from v$database;

SWITCHOVER_STATUS    DATABASE_ROLE    OPEN_MODE
-------------------- ---------------- --------------------
NOT ALLOWED          PHYSICAL STANDBY READ ONLY WITH APPLY

2、备库操作:下边取消实时应用,也就是关掉了mpr进程,再把备库关闭。

SQL> alter database recover managed standby database cancel;

Database altered.

SQL> select thread#,process,status,sequence#,DELAY_MINS from v$managed_standby;

   THREAD# PROCESS   STATUS        SEQUENCE# DELAY_MINS
---------- --------- ------------ ---------- ----------
         0 ARCH      CONNECTED             0          0
         0 ARCH      CONNECTED             0          0
         1 ARCH      CLOSING             120          0
         0 ARCH      CONNECTED             0          0
         0 RFS       IDLE                  0          0
         1 RFS       IDLE                121          0
         0 RFS       IDLE                  0          0
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

3、主库上操作:模拟归档丢失,这时备库已经关掉,RFS进程无法把主库归档传送过去。

3.1、主库切换产生归档
SQL> alter system switch logfile;

System altered.

SQL> /

System altered.

SQL> /

System altered.

SQL> select max(sequence#) from v$archived_log;

MAX(SEQUENCE#)
--------------
           123
3.2、主库上查看刚才产生的归档日志文件:
image.png
3.3、备库上查看之前的归档文件信息:
image.png
3.4、删除主库中的122和123归档:
ASMCMD> rm -rf thread_1_seq_122.303.1082899809
ASMCMD> rm -rf thread_1_seq_123.304.1082899811

备注:这里让主库模拟丢失的本来是121和122和123,但是备库一直出现不了GAP的状态,后来一看备库日志说的是121已经再被运输中(in transit),所以说这里如果备库重新启动121是直接被应用的,如果把121号归档rm掉,是产生不了GAP的,所以要模拟rm掉归档 就rm 122号归档,也就是切换的第二个归档,这里需要注意下。

4、备库上:启动备库,查看GAP

4.1、启动备库,打开实时同步,检查gap信息:
SQL> alter database recover managed standby database using current logfile disconnect from session;

Database altered.

SQL> select thread#,process,status,sequence#,DELAY_MINS from v$managed_standby;

   THREAD# PROCESS   STATUS        SEQUENCE# DELAY_MINS
---------- --------- ------------ ---------- ----------
         0 ARCH      CONNECTED             0          0
         0 ARCH      CONNECTED             0          0
         0 ARCH      CONNECTED             0          0
         1 ARCH      CLOSING             121          0
         0 RFS       IDLE                  0          0
         0 RFS       IDLE                  0          0
         1 RFS       IDLE                125          0
         1 MRP0      WAIT_FOR_GAP        122          0

8 rows selected.
SQL> select * from v$archive_gap;

   THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
         1           122            123
4.2、备库:查看alert日志信息,也能看到报错:
image.png
4.3、备库:在备库查看一下缺失的归档文件,发现和主库丢失的一样,所以就出现了gap错误。
image.png

5、主库:查询到122号归档的scn号,然后做增量备份,这里要做的是在主库上查询到122号归档的scn号,也就是在对122号归档做操作之后的记录,然后在备库增量恢复。

5.1、查询到24号归档之前的操作
SQL> select FIRST_CHANGE#  from v$archived_log where SEQUENCE#=122;

FIRST_CHANGE#
-------------
      1062975
5.2、查看当前的归档号
SQL> select current_scn from v$database;

CURRENT_SCN
-----------
    1066169

6、模拟在test01表空间下创建2个数据文件:

SQL> alter  tablespace test01 add datafile '+datadg' size 100M;

Tablespace altered.

SQL> alter  tablespace test01 add datafile '+datadg' size 200M;

Tablespace altered.
image.png

7、主库上做基于1062975的备份(也就是SEQUENCE#=122号的scn号)。

RMAN> backup as compressed backupset incremental from SCN 1062975 database format '/home/oracle/standby_%d_%T_%U.bak' include current controlfile for standby filesperset=5 tag 'FOR STANDBY';

Starting backup at 10-SEP-21
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=130 device type=DISK
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=+DATADG/orcl/datafile/system.260.1082804591
input datafile file number=00002 name=+DATADG/orcl/datafile/sysaux.261.1082804595
input datafile file number=00008 name=+DATADG/orcl/datafile/test01.273.1082877021
input datafile file number=00004 name=+DATADG/orcl/datafile/users.264.1082804607
input datafile file number=00009 name=+DATADG/orcl/datafile/test01.274.1082904145
channel ORA_DISK_1: starting piece 1 at 10-SEP-21
channel ORA_DISK_1: finished piece 1 at 10-SEP-21
piece handle=/home/oracle/standby_ORCL_20210910_0r08njpo_1_1.bak tag=FOR STANDBY comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00003 name=+DATADG/orcl/datafile/undotbs1.262.1082804599
input datafile file number=00005 name=+DATADG/orcl/datafile/test01.266.1082807589
input datafile file number=00010 name=+DATADG/orcl/datafile/test01.275.1082904199
input datafile file number=00006 name=+DATADG/orcl/datafile/test01.267.1082807665
input datafile file number=00007 name=+DATADG/orcl/datafile/test01.272.1082876997
channel ORA_DISK_1: starting piece 1 at 10-SEP-21
channel ORA_DISK_1: finished piece 1 at 10-SEP-21
piece handle=/home/oracle/standby_ORCL_20210910_0s08njps_1_1.bak tag=FOR STANDBY comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including standby control file in backup set
channel ORA_DISK_1: starting piece 1 at 10-SEP-21
channel ORA_DISK_1: finished piece 1 at 10-SEP-21
piece handle=/home/oracle/standby_ORCL_20210910_0t08njpv_1_1.bak tag=FOR STANDBY comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 10-SEP-21

这里查看一下备份的文件:

[root@oracle1 oracle]# ll
-rw-r----- 1 oracle asmadmin 1409024 Sep 10 14:46 standby_ORCL_20210910_0r08njpo_1_1.bak
-rw-r----- 1 oracle asmadmin 1302528 Sep 10 14:46 standby_ORCL_20210910_0s08njps_1_1.bak
-rw-r----- 1 oracle asmadmin 1114112 Sep 10 14:46 standby_ORCL_20210910_0t08njpv_1_1.bak

8、主库:把主库备份集传输到备库/home/oracle下

[root@oracle1 oracle]# scp standby_ORCL_20210910_0*  oracle@192.168.99.238:/home/oracle
oracle@192.168.99.238's password:
Permission denied, please try again.
oracle@192.168.99.238's password:
standby_ORCL_20210910_0r08njpo_1_1.bak                                                 100% 1376KB   1.3MB/s   00:00
standby_ORCL_20210910_0s08njps_1_1.bak                                                 100% 1272KB   1.2MB/s   00:00
standby_ORCL_20210910_0t08njpv_1_1.bak                                                 100% 1088KB   1.1MB/s   00:00

9、备库:在备库上进行恢复scn号,首先恢复控制文件:

9.1、关闭备库,启动到nomount状态下:
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup nomount
ORACLE instance started.

Total System Global Area 2471931904 bytes
Fixed Size                  2255752 bytes
Variable Size             620758136 bytes
Database Buffers         1828716544 bytes
Redo Buffers               20201472 bytes
9.2、先用备份集恢复控制文件,并把数据库启动到mount阶段:
RMAN> restore standby controlfile from  '/home/oracle/standby_ORCL_20210910_0t08njpv_1_1.bak';

Starting restore at 10-SEP-21
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=131 device type=DISK
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
output file name=+DATADG/orcl2/controlfile/control01.ctl
output file name=+FLASHDG/orcl2/controlfile/control02.ctl
Finished restore at 10-SEP-21

RMAN> alter database mount;
database mounted
released channel: ORA_DISK_1
9.3、注册备库原有数据文件:
RMAN> catalog start with '+datadg';

Starting implicit crosscheck backup at 10-SEP-21
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=135 device type=DISK
Crosschecked 1 objects
Finished implicit crosscheck backup at 10-SEP-21

Starting implicit crosscheck copy at 10-SEP-21
using channel ORA_DISK_1
Crosschecked 2 objects
Finished implicit crosscheck copy at 10-SEP-21

searching for all files in the recovery area
cataloging files...
cataloging done

List of Cataloged Files
9.4、注册增量备文件:
RMAN> catalog start with '/home/oracle';

searching for all files that match the pattern /home/oracle

List of Files Unknown to the Database
=====================================
File Name: /home/oracle/standby_ORCL_20210910_0t08njpv_1_1.bak
File Name: /home/oracle/.bash_history
File Name: /home/oracle/.bashrc
File Name: /home/oracle/.bash_logout
File Name: /home/oracle/standby_ORCL_20210910_0r08njpo_1_1.bak
File Name: /home/oracle/.bash_profile
File Name: /home/oracle/.kshrc
File Name: /home/oracle/.ssh/known_hosts
File Name: /home/oracle/standby_ORCL_20210910_0s08njps_1_1.bak
File Name: /home/oracle/.viminfo

Do you really want to catalog the above files (enter YES or NO)? yes
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /home/oracle/standby_ORCL_20210910_0t08njpv_1_1.bak
File Name: /home/oracle/standby_ORCL_20210910_0r08njpo_1_1.bak
File Name: /home/oracle/standby_ORCL_20210910_0s08njps_1_1.bak
9.5、还原增量数据文件(如果没有新增数据文件 跳过):

主库查询增加的数据文件号:(上面模拟的是添加的9号和10号文件)

SQL> select  file_name,file_id from dba_data_files;

FILE_NAME                                                       FILE_ID
------------------------------------------------------------ ----------
+DATADG/orcl/datafile/system.260.1082804591                           1
+DATADG/orcl/datafile/sysaux.261.1082804595                           2
+DATADG/orcl/datafile/undotbs1.262.1082804599                         3
+DATADG/orcl/datafile/users.264.1082804607                            4
+DATADG/orcl/datafile/test01.266.1082807589                           5
+DATADG/orcl/datafile/test01.267.1082807665                           6
+DATADG/orcl/datafile/test01.272.1082876997                           7
+DATADG/orcl/datafile/test01.273.1082877021                           8
+DATADG/orcl/datafile/test01.274.1082904145                           9
+DATADG/orcl/datafile/test01.275.1082904199                          10

恢复9号和10号数据文件:

RMAN> restore datafile 9;

Starting restore at 10-SEP-21
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=194 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00009 to +DATADG/orcl2/datafile/test01.274.1082904145
channel ORA_DISK_1: reading from backup piece /home/oracle/standby_ORCL_20210910_0r08njpo_1_1.bak
channel ORA_DISK_1: piece handle=/home/oracle/standby_ORCL_20210910_0r08njpo_1_1.bak tag=FOR STANDBY
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 10-SEP-21

RMAN> restore datafile 10;

Starting restore at 10-SEP-21
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00010 to +DATADG/orcl2/datafile/test01.275.1082904199
channel ORA_DISK_1: reading from backup piece /home/oracle/standby_ORCL_20210910_0s08njps_1_1.bak
channel ORA_DISK_1: piece handle=/home/oracle/standby_ORCL_20210910_0s08njps_1_1.bak tag=FOR STANDBY
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
Finished restore at 10-SEP-21
9.6、recover问题:recover增量恢复备份文件,发现系统提示,需要把所有数据文件都要restore。
RMAN> recover database noredo;

Starting recover at 10-SEP-21
using channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 09/10/2021 15:06:58
RMAN-06094: datafile 1 must be restored

RMAN> restore datafile 1;

Starting restore at 10-SEP-21
using channel ORA_DISK_1

channel ORA_DISK_1: restoring datafile 00001
input datafile copy RECID=9 STAMP=1082904975 file name=+DATADG/orcl2/datafile/system.258.1082872823
destination for restore of datafile 00001: +DATADG/orcl2/datafile/system.260.1082804591
channel ORA_DISK_1: copied datafile copy of datafile 00001
output file name=+DATADG/orcl2/datafile/system.306.1082905653 RECID=0 STAMP=0
Finished restore at 10-SEP-21

RMAN> recover database noredo;

Starting recover at 10-SEP-21
using channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 09/10/2021 15:07:49
RMAN-06094: datafile 2 must be restored
9.7、把剩下所有数据文件全部restore:
RMAN> restore datafile 1;
RMAN> restore datafile 2;
RMAN> restore datafile 3;
RMAN> restore datafile 4;
RMAN> restore datafile 5;
RMAN> restore datafile 6;
RMAN> restore datafile 7;
RMAN> restore datafile 8;
9.8、再次recover增量恢复备份文件:
RMAN>  recover database noredo;

Starting recover at 10-SEP-21
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=70 device type=DISK
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: +DATADG/orcl2/datafile/system.306.1082905653
destination for restore of datafile 00002: +DATADG/orcl2/datafile/sysaux.307.1082905779
destination for restore of datafile 00004: +DATADG/orcl2/datafile/users.309.1082905809
destination for restore of datafile 00008: +DATADG/orcl2/datafile/test01.313.1082905851
channel ORA_DISK_1: reading from backup piece /home/oracle/standby_ORCL_20210910_0r08njpo_1_1.bak
channel ORA_DISK_1: piece handle=/home/oracle/standby_ORCL_20210910_0r08njpo_1_1.bak tag=FOR STANDBY
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00003: +DATADG/orcl2/datafile/undotbs1.308.1082905795
destination for restore of datafile 00005: +DATADG/orcl2/datafile/test01.310.1082905821
destination for restore of datafile 00006: +DATADG/orcl2/datafile/test01.311.1082905833
destination for restore of datafile 00007: +DATADG/orcl2/datafile/test01.312.1082905841
channel ORA_DISK_1: reading from backup piece /home/oracle/standby_ORCL_20210910_0s08njps_1_1.bak
channel ORA_DISK_1: piece handle=/home/oracle/standby_ORCL_20210910_0s08njps_1_1.bak tag=FOR STANDBY
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

Finished recover at 10-SEP-21
9.9、备库:打开数据库,并启动数据库至mount状态下,打开闪回区:
SQL> alter database open;

Database altered.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
Database mounted.
SQL> alter database flashback on;

Database altered.

SQL> alter database open;

Database altered.

10、备库:打开MRP进程,并查看进程状态:

10.1、备库下打开MPR进程(MPR进程状态为APPLYING_LOG),查看到所有进程正常:
SQL> alter database recover managed standby database using current logfile disconnect from session;

Database altered.
SQL> select thread#,process,status,sequence#,DELAY_MINS from v$managed_standby;

   THREAD# PROCESS   STATUS        SEQUENCE# DELAY_MINS
---------- --------- ------------ ---------- ----------
         0 ARCH      CONNECTED             0          0
         0 ARCH      CONNECTED             0          0
         0 ARCH      CONNECTED             0          0
         1 ARCH      CLOSING             128          0
         0 RFS       IDLE                  0          0
         0 RFS       IDLE                  0          0
         1 RFS       IDLE                129          0
         1 MRP0      APPLYING_LOG        129          0

8 rows selected.
10.2、备库:查看gap状态为空,证明gap故障解决:
SQL> select * from v$archive_gap;

no rows selected

11、查询dg两个库的状态:

11.1、主库:
[oracle@oracle1 tmp]$ sqlplus / as sysdba
SQL> select switchover_status,database_role,open_mode from v$database;

SWITCHOVER_STATUS    DATABASE_ROLE    OPEN_MODE
-------------------- ---------------- --------------------
TO STANDBY           PRIMARY          READ WRITE
11.2、备库:
SQL> select switchover_status,database_role,open_mode from v$database;

SWITCHOVER_STATUS    DATABASE_ROLE    OPEN_MODE
-------------------- ---------------- --------------------
NOT ALLOWED          PHYSICAL STANDBY READ ONLY WITH APPLY

12、验证:查看之前dg损坏后,在主库创建的2和文件是否存在:

image.png

13、截止到目前我们把gap故障已经解决,数据库dg状态正常。

©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 213,752评论 6 493
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 91,100评论 3 387
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 159,244评论 0 349
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 57,099评论 1 286
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 66,210评论 6 385
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 50,307评论 1 292
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 39,346评论 3 412
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 38,133评论 0 269
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 44,546评论 1 306
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 36,849评论 2 328
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 39,019评论 1 341
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 34,702评论 4 337
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 40,331评论 3 319
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 31,030评论 0 21
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,260评论 1 267
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 46,871评论 2 365
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 43,898评论 2 351

推荐阅读更多精彩内容