14 Replies Latest reply: Sep 16, 2013 4:46 AM by ANDREAS.JANKOWIAK

SMSAP/SMO 3.3 on AIX: Clone Database does not work

helmut NetApp Employee Novice

Hi, 

 

I installed SMSAP 3.3.0 in an AIX environment and found the following problem, which looks very much like a bug to me.

 

When cloning a DB from an existing Snapshot backup and selecting the new 3.3 function "Recover Database" in the GUI wizard (or, when using the CLI, running smsap clone create WITHOUT the "-no-resetlogs" option), the operation fails with these messages:

 

--[ERROR] SMSAP-04083: Error Recovering cloned database

.....

--[ERROR] SMSAP-13032: Cannot perform operation: Clone Create.  Root cause: ORACLE-00001: Error executing SQL: [ALTER DATABASE OPEN RESETLOGS;]

 

We use FCP LUNs and volume groups (AIX standard).

 

If you take a closer look at the logs, you can see that soon after SMSAP mounts the archive logs from the backup at the temporary location, and while it is preparing for the database recovery, it suddenly issues a disconnect of the archive logs. After that, it is clear why the recovery fails.

 

In the log file, this part of the sequence looks like this:

........

ORACLE-00000: Executing SQL command: SELECT STATUS FROM V$INSTANCE;

2013-03-06 10:09:10,116 [RMI TCP Connection(2)-10.10.11.133] [DEBUG]: ORACLE-20007: Database instance ABC is in state MOUNTED.

2013-03-06 10:09:10,116 [RMI TCP Connection(2)-10.10.11.133] [DEBUG]: Opening connection for JDBC descriptor jdbc:oracle:thin:sys/XXXXXXXX@(DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.10.11.133)(PORT = 1527)))(CONNECT_DATA = (SID = ABC)))

2013-03-06 10:09:10,122 [RMI TCP Connection(2)-10.10.11.133] [DEBUG]: Adding logrequest null already tried from backup Snap1_logs

2013-03-06 10:09:10,123 [RMI TCP Connection(2)-10.10.11.133] [INFO ]: SMSAP-03055: Disconnecting backup Snap1_logs.

  --> It suddenly starts disconnecting the archive log file system!

2013-03-06 10:09:10,520 [default0 aed43035b10a261cacf8a6b4f6185072] [INFO ]: SD-00016: Discovering storage resources for oraarchvg_0.

2013-03-06 10:09:10,559 [default0 aed43035b10a261cacf8a6b4f6185072] [DEBUG]: EXE-00000: Executing shell command:

0:/bin/sh -c "/usr/sbin/snapdrive" "storage" "show" "-vg" "oraarchvg_0"

  ........

2013-03-06 10:09:13,798 [default0 aed43035b10a261cacf8a6b4f6185072] [DEBUG]: EXE-00000: Executing shell command:

0:/bin/sh -c "/usr/sbin/snapdrive" "snap" "disconnect" "-vg" "oraarchvg_0" "-full"

1:/usr/sbin/snapdrive snap disconnect -vg oraarchvg_0 -full

2013-03-06 10:09:19,508 [Execution Monitor Thread [/usr/sbin/snapdrive snap disconnect -vg oraarchvg_0 -full]] [DEBUG]: EXE-0000 1: Shell result [0:00:05.710] (Exit Value: 0):

 

deleting disk group oraarchvg_0

  - fs /opt/NetApp/smsap/mnt/-oracle-KL1-oraarch-20130306100853455_0 ... deleted

  - hostvol oraarchvg_0/kl1oraarchlv_0 ... deleted

  - dg oraarchvg_0 ... deleted

  - LUN janis:/vol/SnapManager_20130306100853461_clone_prodclu_oraarchvg/clone_prodclu_oraarchvg_01.lun ... disconnected

  - deleting volume clone ... janis:/vol/SnapManager_20130306100853461_clone_prodclu_oraarchvg  done

 

2013-03-06 10:09:19,509 [default0 aed43035b10a261cacf8a6b4f6185072] [INFO ]: SD-00038: Finished disconnecting volume groups [oraarchvg_0].

2013-03-06 10:09:19,565 [RMI TCP Connection(2)-10.10.11.133] [ERROR]: SMSAP-04083: Error Recovering cloned database
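The ordering described above (a disconnect of the archive log backup logged before the recovery error) can be checked mechanically. The following is a minimal sketch, not part of SMSAP: the function names `parse_events` and `disconnect_before_failure` are hypothetical, the regular expression assumes the timestamped line layout shown in the excerpt, and the sample lines are copied from that excerpt.

```python
import re
from datetime import datetime

# Assumed layout: "<timestamp> [<thread>] [<LEVEL>]: <CODE>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>.*?)\] \[(?P<level>[A-Z ]+)\]: "
    r"(?P<code>[A-Z]+-\d+): (?P<msg>.*)$"
)


def parse_events(lines):
    """Return (timestamp, level, code, message) for every line that matches."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, m.group("level").strip(), m.group("code"), m.group("msg")))
    return events


def disconnect_before_failure(events,
                              disconnect_code="SMSAP-03055",
                              failure_code="SMSAP-04083"):
    """Return (disconnect_ts, failure_ts) if a disconnect precedes the failure, else None."""
    disconnect_ts = None
    for ts, _level, code, _msg in events:
        if code == disconnect_code and disconnect_ts is None:
            disconnect_ts = ts
        elif code == failure_code and disconnect_ts is not None:
            return disconnect_ts, ts
    return None


# Sample lines taken from the log excerpt above.
sample = [
    "2013-03-06 10:09:10,116 [RMI TCP Connection(2)-10.10.11.133] [DEBUG]: ORACLE-20007: Database instance ABC is in state MOUNTED.",
    "2013-03-06 10:09:10,123 [RMI TCP Connection(2)-10.10.11.133] [INFO ]: SMSAP-03055: Disconnecting backup Snap1_logs.",
    "2013-03-06 10:09:19,509 [default0 aed43035b10a261cacf8a6b4f6185072] [INFO ]: SD-00038: Finished disconnecting volume groups [oraarchvg_0].",
    "2013-03-06 10:09:19,565 [RMI TCP Connection(2)-10.10.11.133] [ERROR]: SMSAP-04083: Error Recovering cloned database",
]

events = parse_events(sample)
hit = disconnect_before_failure(events)
if hit:
    gap = (hit[1] - hit[0]).total_seconds()
    print(f"Disconnect logged {gap:.3f} s before the recovery error")
```

A match only shows the order of the log entries; it does not by itself prove that the disconnect caused the failure.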

 

I reproduced this on two AIX Systems with different SAP Instances.

It happened when cloning to an alternate host as well as when cloning to the same host.

 

 

I downgraded to SMSAP 3.2P3 and the problem disappeared: cloning to the same host or to another host worked without any error with SMSAP 3.2.

 

 

I think it would be a good idea for NetApp to set up a test in its qualification lab so that the issue can be fixed.

It should be easily reproducible, at least with AIX and FCP LUNs, and maybe even with other Unix operating systems.

 

Helmut

  • Re: SMSAP/SMO 3.3 on AIX: Clone Database does not work
    ashwath NetApp Employee Novice

    Hi,

     

    I am currently looking into the issue you reported and will get back to you soon with an update.

     

    Thanks,

    Ashwath

  • Re: SMSAP/SMO 3.3 on AIX: Clone Database does not work
    ashwath NetApp Employee Novice

    Hi Helmut,

     

    Could you please send me the complete Oracle and SMO logs?

     

    Thanks,

    Ashwath

  • Re: SMSAP/SMO 3.3 on AIX: Clone Database does not work
    SHEETZDBA Novice

    I think I have a similar issue on Red Hat 5 using SMO 3.3. Cloning worked in 3.2, but it no longer does. The archivelog ASM disk is removed before the operation is complete.

     

    --[ INFO] SMO-03054: Mounting backup Backup1 on host dcvmtestdb.xxx.com to feed archivelogs.

    --[ INFO] SD-00028: Beginning to connect device(s) [/dev/sdf] from snapshot smo_dcvmtestdb_test2db_f_h_2_8aa08ae33d7ec35b013d7ec362300001_1.

    --[ INFO] SD-00029: Finished connecting device(s) [/dev/sdf] from snapshot smo_dcvmtestdb_test2db_f_h_2_8aa08ae33d7ec35b013d7ec362300001_1.

    --[ INFO] ASM-10002: Assigning ASMLib disk identifier SM_DSK_1363631733567 to device /dev/sdm1.

    --[ INFO] ASM-00500: Changing ASM Disk Group name in header of disk /dev/sdm1 from REC to REC_20130318143533654_0.

    --[ INFO] ASM-00000: Mounting ASM Disk Group REC_20130318143533654_0 on host dcvmtestdb.xxx.com.

    --[ INFO] SMO-03061: Perform recovery using backup Backup1

    --[ INFO] SMO-03081: Perform recovery using all connected backups Backup1

    --[ INFO] ORACLE-30023: Beginning batch recovery process for database Clone1.

    --[ INFO] ORACLE-30008: Beginning recovery process for database Clone1.

    --[ INFO] ORACLE-30010: Database recovery point objective: earliest consistent point

    --[ INFO] ORACLE-30017: Recovery point objective of earliest consistent point reached after applying archived/online redo logs [+REC_20130318143533654_0/test2db/archivelog/2013_03_18/thread_1_seq_1.256.810397791].

    --[ INFO] ORACLE-30024: Finished batch recovery process for database Clone1.

    --[ INFO] SMO-03066: Recovery succeeded.

    -->REMOVES THE ARCHIVELOG ASM DISK

    --[ INFO] SMO-03055: Disconnecting backup Backup1.

    --[ INFO] ASM-00002: Dropping ASM Disk Group REC_20130318143533654_0.

    --[ INFO] ASM-10003: Deleting ASMLib disk SM_DSK_1363631733567.

    --[ INFO] SD-00019: Discovering storage resources for all system devices.

    --[ INFO] SD-00020: Finished storage discovery for all system devices.

    --[ INFO] SD-00034: Beginning to disconnect device(s) [/dev/sdm].

    --[ INFO] SD-00035: Finished disconnecting device(s) [/dev/sdm].

    --[ INFO] SMO-07320: Opening database "Clone1" with RESETLOGS option.

    --[ INFO] SMO-07316: Shutting down database "Clone1".

    --[ INFO] SMO-07321: Starting up database "Clone1" in MOUNT mode.

    --[ INFO] SMO-07307: Generating new Database ID for "Clone1".

    --[ERROR] SMO-07315: Error running DBNEWID utility: .

    --[ERROR] SMO-13032: Cannot perform operation: Clone Create.  Root cause: SMO-07315: Error running DBNEWID utility: .

    -->ERROR DUE TO QUERYING THE ASM DISK IT ALREADY REMOVED

    --[ERROR] FLOW-11019: Failure in Disconnect: ASM-10000: Error executing ASMLib command "/etc/init.d/oracleasm querydisk -d SM_DSK_1363631733567": Disk "SM_DSK_1363631733567" does not exist or is not instantiated

    --[ERROR] FLOW-11008: Operation failed: ASM-10000: Error executing ASMLib command "/etc/init.d/oracleasm querydisk -d SM_DSK_1363631733567": Disk "SM_DSK_1363631733567" does not exist or is not instantiated

    --[ERROR] SMO-08002: Error while aborting operation: SMO-11008: Error destroying cloned storage: FLOW-11019: Failure in Disconnect: ASM-10000: Error executing ASMLib command "/etc/init.d/oracleasm querydisk -d SM_DSK_1363631733567": Disk "SM_DSK_1363631733567" does not exist or is not instantiated
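For the console-style output above, which carries no timestamps, the same kind of ordering check can be made on message codes alone. This is a rough sketch under the assumption that every entry has the form `--[LEVEL] CODE: message`; the helper names `entries_in_order` and `happens_before` are hypothetical, and the sample lines are copied from the output above.

```python
import re

# Assumed layout: "--[ INFO] SMO-03055: message" or "--[ERROR] SMO-07315: message"
ENTRY_RE = re.compile(
    r"^--\[\s*(?P<level>[A-Z]+)\s*\] (?P<code>[A-Z]+-\d+): (?P<msg>.*)$"
)


def entries_in_order(lines):
    """Return (level, code) pairs in the order they were logged."""
    result = []
    for line in lines:
        m = ENTRY_RE.match(line.strip())
        if m:
            result.append((m.group("level"), m.group("code")))
    return result


def happens_before(entries, first_code, second_code):
    """True if both codes occur and first_code is logged before second_code."""
    codes = [code for _level, code in entries]
    if first_code not in codes or second_code not in codes:
        return False
    return codes.index(first_code) < codes.index(second_code)


# Sample lines taken from the output above.
sample = [
    '--[ INFO] SMO-03066: Recovery succeeded.',
    '--[ INFO] SMO-03055: Disconnecting backup Backup1.',
    '--[ INFO] ASM-10003: Deleting ASMLib disk SM_DSK_1363631733567.',
    '--[ INFO] SMO-07320: Opening database "Clone1" with RESETLOGS option.',
    '--[ INFO] SMO-07307: Generating new Database ID for "Clone1".',
    '--[ERROR] SMO-07315: Error running DBNEWID utility: .',
]

entries = entries_in_order(sample)
# Is the backup disconnected before the clone is opened with RESETLOGS?
print(happens_before(entries, "SMO-03055", "SMO-07320"))
# Is the ASMLib disk deleted before the DBNEWID error?
print(happens_before(entries, "ASM-10003", "SMO-07315"))
```

Comparing the result against output from a working 3.2 run, if one is available, would show whether the order of these steps actually changed between versions.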
