14 Replies Latest reply: Jul 21, 2013 9:06 PM by aborzenkov

Replace FAS2040 controller module

jgiang72 Novice

I need assistance from your experts. We have a FAS2040 set up in an active/active configuration, and both heads have disks and storage assigned to them. This is strictly a CIFS/NFS environment without block or FC. Last week controller 2 died on us and we are currently in takeover mode on controller 1. I purchased a refurbished controller this week to replace controller 2. I took controller 2 out of the chassis, removed the CF boot card, NVRAM battery, and SFP module, and put them in the new controller that I purchased. I put it back in the storage system. I interrupted the boot with Ctrl+C and got into the BMC shell via the console. I issued BMC config and noticed that the BMC does not have the settings of my old controller, so I can't reach the BMC shell via SSH. So I went ahead and updated the BMC IP, gateway, etc., but was unable to update the controller name. I can now reach the BMC interface, but the password I have on my system doesn't work with the new controller via the BMC shell.

 

I thought it was supposed to boot from the CF card and load all the configuration from my dead controller onto the new controller, and that all I would have to do is assign the disks to the new system ID, but apparently that is not the case. So right now the new controller is up in the chassis but doesn't have the correct configuration. I have the controller sitting at the boot loader.

What is the correct step-by-step way to get the new controller up and running with the configuration from the dead controller, without wiping out the existing data and configuration on my NetApp, and have it own the disks that belonged to the dead controller?

So here is a quick summary of my system state. Controller 1 is currently in takeover mode. Controller 2 is in the chassis with the CF card from my old controller, but it has not booted up and does not have the correct configuration. Controller 2 is at the LOADER-B prompt.

 

Please help, as the NetApp instructions for replacing a FAS20xx controller module seem outdated or incorrect.

 

Thanks!

J

  • Re: Replace FAS2040 controller module
    jgiang72 Novice

    Our ONTAP version is 8.1 7-Mode, by the way...

  • Re: Replace FAS2040 controller module
    aborzenkov Grand Marshal

    The BMC is synchronized by Data ONTAP when it boots, so you need to complete the controller replacement and perform a giveback to allow Data ONTAP to boot on the replacement controller. I reviewed the controller replacement instructions and personally found them pretty much accurate. Did you try to follow them before stating that they are outdated?

  • Re: Replace FAS2040 controller module
    jgiang72 Novice

    Ok, here is what I have tried.

    1. Moved the CF card and battery over to the new controller
    2. Reconnected all cabling and put the controller back in the chassis while controller 1 was still in takeover mode
    3. Interrupted the boot process via the console with Ctrl+C during boot
    4. From the prompt “LOADER-B>” I typed boot_diags
    5. Ran mb
    6. Exit
    7. boot_ontap
    8. Pressed Ctrl+C during boot and got the menu below

    Please choose one of the following:

     

    (1) Normal Boot.

    (2) Boot without /etc/rc.

    (3) Change password.

    (4) Clean configuration and initialize all disks.

    (5) Maintenance mode boot.

    (6) Update flash from backup config.

    (7) Install new software first.

    (8) Reboot node.

    Selection (1-8)?

    9. Selected option 5 to enter Maintenance mode
    10. I then got the following message

    In a High Availability configuration, you MUST ensure that the partner node is (and remains) down, or that takeover is manually disabled on the partner node, because High Availability software is not started or fully enabled in Maintenance mode.

     

    FAILURE TO DO SO CAN RESULT IN YOUR FILESYSTEMS BEING DESTROYED

     

    NOTE: It is okay to use 'show/status' sub-commands such as 'disk show or aggr status' in Maintenance mode while the partner is up.

    Jul 18 13:43:19 [localhost:shelf.config.spha:info]: System is using single path HA attached storage only.

    Please answer yes or no.

    Continue with boot? no

    11. I selected no at the prompt above, as there is no reference to it anywhere in the document.
    12. The system halted after that. I was not able to continue, and the controller rebooted, so I had to press Ctrl+C to get it to the loader prompt and am leaving it there for now, as I don't want it to wipe out my configuration or data, or bring the other controller down, just to be on the safe side.

    Any input on how to get this working again is greatly appreciated...

    • Re: Replace FAS2040 controller module
      saranraj456 Cyclist

      You can answer "yes" at the "Continue with boot?" prompt.


      Saran

    • Re: Replace FAS2040 controller module
      aborzenkov Grand Marshal

      It is safe to boot into maintenance mode just to record the system ID. The prompt says so as well: "It is okay to use 'show/status' sub-commands such as 'disk show or aggr status' in Maintenance mode while the partner is up."

       

      If you already know the new system ID, you can simply skip this and proceed with disk reassignment.

      • Re: Replace FAS2040 controller module
        jgiang72 Novice

        I can look up the system ID in the BMC without entering maintenance mode, but need to get into maintenance mode to reassign the disk. After selecting option 5, I got the prompt "In a High Availability configuration, you MUST ensure that the partner node is (and remains) down, or that takeover is manually disabled on the partner node, because High Availability software is not started or fully enabled in Maintenance mode.  FAILURE TO DO SO CAN RESULT IN YOUR FILESYSTEMS BEING DESTROYED". Nowhere does the document tell me what to do or how to make sure "that takeover is manually disabled on the partner node". I wish that netapp can produce a clear and better document and, moreover, that they would have a NetApp engineer monitor the forum and help us out. Right now the two of us cannot even agree on the correct way of doing it...

        • Re: Replace FAS2040 controller module
          aborzenkov Grand Marshal

          need to get into maintenance mode to reassign the disk

          You do NOT need to go into maintenance mode for that. What gave you that idea? The documentation quite clearly states that it is done from the partner controller.

          I wish that netapp can produce a clear and better document

          There is a feedback button. But I again have the feeling that you did not even read the documentation.

          • Re: Replace FAS2040 controller module
            jgiang72 Novice

            Below is straight from the NetApp document. Read step 1 on page 11 of "Replacing the controller module in a FAS20xx system". Did I misread that, or did you?

             

            Reassigning disks on a system operating in 7-Mode

            You must reassign disks before you boot the software. Some of the steps are different depending on whether the system is stand-alone or in an HA pair.

            About this task

            •    You must apply the commands in these steps on the correct systems:

            •    The target node is the node on which you are performing maintenance.

            •    The partner  node is the HA partner of the target node.

            •    Do not issue any commands relating to aggregates until the entire procedure is completed.

            Steps

            1. If you have not already done so, reboot the target node, interrupt the boot process by entering Ctrl-C, and then select the option to boot to Maintenance mode from the displayed menu.

            You must enter y  when prompted to override the system ID due to a system ID mismatch.

            2. View the new system IDs by entering the following command

            • Re: Replace FAS2040 controller module
              aborzenkov Grand Marshal

              And where, pray, do you see that you need to perform reassignment in maintenance mode?

               

              It explains how to look up the system ID.

              • Re: Replace FAS2040 controller module
                jgiang72 Novice

                Didn't I cut and paste that section from the document, put it in bold, and also tell you where to find it? I have a feeling that you are not reading at all and are just jumping to conclusions about what you think it should be. Thanks for all the comments. I was hoping someone could collaborate with a how-to and something more constructive than this... I'm not sure which document you're reading, but what you stated is certainly nowhere to be found in the document below.

                Here is the link to the document:   https://library.netapp.com/ecm/ecm_download_file/ECMM1280334

                Thank you and have a nice day.

                 

                Anyone besides ABORZENKOV who has experience replacing a FAS2040 controller, do feel free to help me out. So far this is going nowhere...

                • Re: Replace FAS2040 controller module
                  aborzenkov Grand Marshal

                  I have performed controller replacement more than once, which is why I have reason to state that the replacement procedure is correct. If you have reason to believe this procedure is incorrect, you need to open a support case and discuss it with a NetApp engineer.

                   

                  For the last time - you do not perform disk reassignment from maintenance mode. You halt the controller after confirming the system ID and reassign the disks from the partner node.
