16 Replies Latest reply: Oct 6, 2013 4:52 AM by CHRISMAKI

Unable to add second node - 8.2 cDOT - ESXi 5.1

CHRISMAKI

I am having problems adding a second node to my virtual cluster. The first node started up fine and I ran through the cluster create script. I've got the first two vNICs on a separate vSwitch as they are the cluster interfaces. Here's what happens when the second node boots:

 

1: Join or create? join

2: Are these the IPs you want (169.254.*)? yes

3: Enter the name of the cluster: [ClusterName] <enter>

4: Joining cluster …

5: Network set up …

6: Node check …

7: Restarting Cluster Setup …

8: Revert to step #1

 

I can ping the two IPs presented in step 2 from Node 1, so the networking is set up properly, but the second node won't join. Any thoughts on what I'm missing?

  • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
    CHRISMAKI

    I should also note that the only error message I get is:

     

    Error: Cluster join membership failed.

     

    It does this during the Node Check so perhaps it's related to that?

  • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
    SEANLUCEOST

    It sounds like you haven't changed the second node's System ID and Serial Number.

     

    From page 31 of the Installation and Setup Guide:

    9. Press the space bar when the "Hit [Enter] to boot immediately, or any other key for command prompt. Booting in 10 seconds..." message is displayed.

              You should see a VLOADER> prompt.

    10. Change the Serial Number and System ID for this node:

              VLOADER> setenv SYS_SERIAL_NUM 4034389-06-2

              VLOADER> setenv bootarg.nvram.sysid 4034389062

    11. Verify that the information was saved correctly by entering the following two commands:

              VLOADER> printenv SYS_SERIAL_NUM

              VLOADER> printenv bootarg.nvram.sysid

    12. Enter the boot command to boot the node: VLOADER> boot

              The simulator begins the boot process with the new system id and serial number

     

    This needs to be done before you boot the second node for the first time.  If you have already done an 'option 4' on the second node, unpack a new copy and start fresh.

    It is critical that you use the values provided for the Serial Number and System ID as the new 8.2 licenses are node locked based on these values.
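
A typo in either value is easy to make at the VLOADER> prompt. In the two values quoted above, the System ID happens to be the serial number with the hyphens removed. The sketch below uses that observation as a quick sanity check before typing them in. The helper name serial_to_sysid is hypothetical, and the relationship is only an assumption drawn from these sample values, not something the guide states as a rule.

```shell
#!/bin/sh
# Sanity-check the serial number and System ID quoted from the guide.
# Assumption: for these sample values, the sysid is the serial number
# with the hyphens removed. This is an observation, not a documented rule.

serial_to_sysid() {
    # Strip every hyphen from the serial number passed as $1.
    printf '%s\n' "$1" | tr -d '-'
}

SERIAL="4034389-06-2"   # value for SYS_SERIAL_NUM
SYSID="4034389062"      # value for bootarg.nvram.sysid

if [ "$(serial_to_sysid "$SERIAL")" = "$SYSID" ]; then
    echo "ok: sysid matches serial"
else
    echo "mismatch: check for a typo" >&2
fi
```

This runs on any workstation with a POSIX shell; it does not touch the simulator.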

     

    To get the most out of the 8.2 simulator check out these blog posts:

    http://www.cosonok.com/2013/08/clustered-ontap-82-sim-maximizing.html

    http://www.cosonok.com/2013/09/a-new-sim-recipe.html

     

    Here is a link to the install guide for the sim: http://support.netapp.com/knowledge/docs/simulate_ontap/Simulate_ONTAP_8.2_Installation_and_Setup_Guide.pdf

     

    I hope this helps!

     

    Sean Luce

    Open Systems Technologies

    • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
      CHRISMAKI

      RTFM would probably have been a well-deserved answer, but after performing those steps and changing both parameters I ran into a new problem. I even deployed a fresh copy of the VM that had never been booted, as per the instructions, but ended up with the following:

       

      --------------------------------------------------

      PANIC: Can't find device with WWN 0x1400322304. Remove '/sim/dev/,disks/reservations' and restart. in SK process vha_disk_resv on release 8.2 (C) on Sun Sep 15 21:12:48 GMT 2013

      version: 8.2: Tue May 21 05:58:22 PDT 2013

      compile flags: x86_64

      recursive PANIC: page_t has no physical address

      cpuid = 0

      Uptime: 38s

       

      The operating system has halted.

      Please press any key to reboot.

       

      System halting...

      cpu_reset called on cpu#0

      --------------------------------------------------

       

      Any further advice?

      • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
        sgrandjean

        I have exactly the same problem. The setup guide was written for ESX 4.1, and things work differently on 5.x. The layout with all the small VMDKs does not work. Using VM Converter fixes this, or you can simply remove hard disk 4 and recreate it.

         

        But after changing the system ID, the node fails with a panic. Maintenance mode cannot be booted.

         

        So what's wrong here?

        • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
          sgrandjean

          From what I found, there are multiple problems, and they all seem to relate to the initial startup. Once the cristeen sim has been started, changing the system ID no longer works.

          Following what I found on the net, I deployed the files again and followed the script. This time it worked.

          Deploy the files and add the VM to inventory, but remove the sim VMDKs and recreate them as one big flat file (adjust the vmx and vmdk config files accordingly). Otherwise it will not work in ESX 5.x. Then boot, press <space> as in the script, and follow the steps to change the system ID. Boot ONTAP and join the cluster.

           

          The NetApp PDF needs some slight adjustments, and it should point out that you need to keep the tarball or the deployed VM. As soon as you start it, you have to redeploy.

           

          I now have a working two-node cDOT cluster and a single-node one.

          • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
            CHRISMAKI

            I guess since I figured this out a few days ago I should have posted an update.

             

            I was working from a thick-provisioned VMDK provided to me by a colleague at NetApp. When following the PDF more recently instead of just bashing ahead, I had much more success.

             

            Proceed as follows:

            1. Download the vsim_esx-cm.tgz and transfer it to your datastore.
            2. tar -xvzf vsim_esx-cm.tgz
            3. Now how many nodes do you want, two? Maybe a third for a replication "cluster"? Make one fewer copy of the vsim_esx-cm directory than the number of nodes you want, naming the copies accordingly. Once done, rename the initial directory (vsim_esx-cm) to the name of the final node so that the naming is uniform.
            4. Browse your datastore for the first node's directory and import the VMX file. Boot this node to start your cluster.
            5. Browse your datastore for the second directory, on the VERY FIRST BOOT enter the loader and change the SYSID stuff listed in the setup guide.
            6. Subsequent nodes in the same cluster will all require new SYSIDs but really two nodes in a cluster should be sufficient.
            7. On first full boot, i.e.: not having entered the loader, you'll want to hit up the maintenance menu for option 4.
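
Steps 2 and 3 can be sketched as a small shell helper that gives every node its own never-booted directory. The helper name make_node_dirs and the node names in the usage comment are hypothetical. Plain cp -R is also an assumption, since the post does not say which tool was used for the copies.

```shell
#!/bin/sh
# Sketch of steps 2-3: one untouched copy of the extracted directory per node.
# make_node_dirs and the node names below are placeholders, not from the guide.

make_node_dirs() {
    src=$1
    shift
    # At least one node name is required.
    [ "$#" -ge 1 ] || return 1
    # Every name except the last gets a copy of the original directory.
    while [ "$#" -gt 1 ]; do
        cp -R "$src" "$1" || return 1
        shift
    done
    # The last name takes over the original directory by rename.
    mv "$src" "$1"
}

# Usage, after running: tar -xvzf vsim_esx-cm.tgz
# make_node_dirs vsim_esx-cm cdot-node1 cdot-node2
```

Run it before any of the VMs has been booted, so that every directory still holds the files exactly as they were extracted.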

             

            I have a third node in a single-node cluster that I intend to use as a SnapMirror target, though I have yet to set that part up. Hopefully the fact that its SYSID matches Node 1 in the two-node cluster won't matter. If it does, I'll start from scratch, changing the SYSID on first boot.

            • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
              sgrandjean

              Dear Chris,

               

              Thank you for your feedback. Your email reached me just as I was heading home (late, again), but with a smile, as I managed to get this running.

               

              I also have a dual-node cluster and a single-node one now, including all the licenses. I was just starting to create the Vserver.

               

              The procedure is clear once you know what to do; it just takes some time to understand. Luckily I'm a VCP, so I understand some ESX.

               

              The first boot is crucial. Something happens when you first boot the VM.

               

              Thin or thick provisioning does not matter. The only issue is that the VMDKs that make up the sim flat file will not work out of the box.

               

              And yes I agree copying the cristeen files is very very important.

               

              But then again how will you create the “keys” for two extra nodes to make a four node cluster?

               

              I also found that moving the management LIF breaks my SSH session, which I did not expect… Maybe moving the LIF of the Vserver will be non-disruptive. ☺

               

              I'm going home now and will maybe start creating a Vserver tomorrow; I still have some work to do.

               

              Again thank you very much!

               

              With kind regards,

               

              Sebastian

              • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
                CHRISMAKI

                Sebastian,

                 

                Just for fun I will create a third node for the primary cluster today and see if the keys work. This is exactly why I have hesitated, as I'm pretty sure they won't.

                 

                Your SSH session drops because SSH is a stateful protocol: once the IP moves and the switch(es) re-ARP, the existing session gets dropped. There's nothing you can do about this.

                 

                Non-technical: Twice now you've used the word "cristeen" which isn't actually a word, I think you meant to type "pristine" perhaps? Either way, "a fresh copy of the vsim" is what is required.

                 

                I'll go deploy node 3 now and update soon.

                 

                Oh, also back on the technical side, I'm doing the following on my ESXi 5.1 box:

                -------------

                # cat /etc/rc.local.d/local.sh

                #!/bin/sh

                # -- Loading Module multiextent to support NetApp vSIM 8.1.1 --

                /sbin/vmkload_mod multiextent

                -------------

                 

                Not sure if this is still required, but it doesn't appear to be hurting. Also, when I do a "vmkload_mod -l", I've got hits under the "Used" column.
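
A slightly more defensive variant of the local.sh above loads the module only when it is not already listed. This is a sketch, assuming that module names appear as whole words in the "vmkload_mod -l" listing; the helper name module_listed is hypothetical. The guard around vmkload_mod simply lets the script run harmlessly on a machine that is not an ESXi host.

```shell
#!/bin/sh
# Load multiextent only if it is not already listed.
# Assumption: module names appear as whole words in the vmkload_mod -l output.

module_listed() {
    # Succeeds if the module name in $1 appears as a whole word on stdin.
    grep -qw -- "$1"
}

# Only attempt the load where vmkload_mod exists (i.e. on the ESXi host itself).
if command -v vmkload_mod >/dev/null 2>&1; then
    if ! vmkload_mod -l | module_listed multiextent; then
        vmkload_mod multiextent
    fi
fi
```

Whether the module is still needed for the 8.2 simulator is not settled in this thread; the sketch only avoids a redundant load attempt.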

  • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
    SAVBUTEAM

    I am trying to add the second node to the Data ONTAP simulator cluster on ESXi 5.1, and I found your blog when I googled the issue I encountered.

    I have the same issue you mentioned in your blog:

    PANIC: Can't find device with WWN 0x1400322304. Remove '/sim/dev/,disks/reservations' and restart. in SK process vha_disk_resv on release 8.2 (C) on Sun Sep 15 21:12:48 GMT 2013

    version: 8.2: Tue May 21 05:58:22 PDT 2013

    compile flags: x86_64

    recursive PANIC: page_t has no physical address

    cpuid = 0

    Uptime: 38s

    The operating system has halted.

    Please press any key to reboot.

    I have already switched disk 4 from “thin provision” to “thick eager provision”, and the other three disks are all under think provision.

    But after I change the system ID and the bootarg.nvram.sysid value according to the PDF guide, I still see the above error.

    Could you please give me some guidance?

    Thank you very much.

    Lin

    • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
      CHRISMAKI

      Hey everyone, I just wrote a document on how to do this in 17 steps, go here.

    • Re: Unable to add second node - 8.2 cDOT - ESXi 5.1
      sgrandjean

      Dear Lin,

       

      It is a bit silly how this works…

       

      The thing is that the procedure only works when followed exactly as documented, but they forget to tell you that you can only use clean files.

       

      So you have node one running. You cannot use these files for node two. You really have to download the initial files again.

       

      Thin or thick provisioning has nothing to do with it. But the initial files will not work on ESX 5.x.

       

      So first download the tgz, extract it, and “add it to inventory”. Do not start it!!!

       

      Then delete hard disk 4 in the ESX VM settings and re-create it with 250 GB of space.

       

      When you delete hard disk 4, please select "delete files from datastore" as well.

       

      Now you are set. The most important thing is that the NVRAM file stays untouched; everything depends on this.

       

      Open the console of the VM and then start the VM. Press <space>. Now follow the procedure and change the SYSTEM ID. Enter “boot” and choose option 4.

       

      If for some reason the VM boots, the NVRAM VMDK gets touched, and you can no longer change the SYSTEM ID because doing so will lead to a PANIC.

       

      I hope this will help you in getting it running, please let me know.

       

      With kind regards,

       

      Sebastian
