6 Replies. Latest reply: Dec 16, 2013 5:28 AM by ostiguy

Problems while collecting data for Switches

sanadmin_stadtdo Novice

Hi all,

 

We have been using OnCommand Insight for one month and do not have much experience with the tool. The installation and configuration were done by NetApp.

 

Now we have constant problems collecting data from switches. Often no data is collected, yet no error is indicated. After stopping and starting the data source, collection works again, but only for a few hours or days.

 

We are using 2 identical SAN fabrics with Cisco switches (pure FC switches like DS-C9509, C9124, C9134 and FCoE switches like Nexus-C5548). Switch port performance is displayed/collected in only one fabric, and there for only 2 of 9 switches. In the other fabric (with identical switches) no data is shown, and we cannot tell why.

 

Has anyone had similar problems or found a solution?

 

I would be grateful for any information.

Thanks a lot.

 

Michael

  • Re: Problems while collecting data for Switches
    ostiguy NetApp Employee Powerboat Racer

    Hey Michael,

     

    This is a bit weird. For Cisco devices, OCI uses SNMP for both inventory and performance collection. So, usually both inventory and perf work, or both do not work. Since inventory has apparently worked for both fabrics at times, I am a bit surprised you are not seeing at least some performance data. Nonetheless, this should not be happening - long, consistent outages in collection make me wonder if the OCI server is having a resources problem.

     

    Do you know if your NetApp contacts configured OCI to send OCI ASUP (autosupport)? OCI ASUP can send what we call extended logs, which are a collection of logs and .zip files from OCI's acquisition (data collection) attempts. If you are sending OCI ASUP, could you send me an email at ostiguy at netapp dot com with your OCI site name? You will see it in parentheses in the title bar of your OCI client, and also in the upper right-hand corner of the OCI HTTP management portal ( Site: ______ ).

     

    If we are receiving extended logs, I could take a look.

     

    Matt

    • Re: Problems while collecting data for Switches
      sanadmin_stadtdo Novice

      Hello Matt,

       

      Thanks for your answer.

      At the moment the server does not send ASUPs, but this is an internal problem with our mail admins.

      Which logs would be helpful? I found some zip logs on the Insight server under "SANscreen => log" which start with cognos_ or dwh_.

       

      Michael

      • Re: Problems while collecting data for Switches
        ostiguy NetApp Employee Powerboat Racer

        Hey Michael,

         

        This is a tricky one, as it might not be only a data collection problem. One question: are the Cisco datasources the only datasources that exhibit this?

         

        Let's start with:

         

        ../sanscreen/acq/log

         

        This is where acquisition does all of its logging

         

        There are some acq*.log* files - these are the master acquisition log files. Copy them to a folder.

         

        Then

         

        foundation_xyz....zip

         

        Where xyz is the OCI datasource name - copy those to the same folder.

         

        Then

         

        \..SANscreen\jboss\server\onaro\log

         

        From this folder :

        jboss.log

        server.log

        performance.log

         

        Copy those 3 files to the same folder.

         

        Create a zip of the folder - if it is under 10MB, you can email me the zip at ostiguy at netapp dot com .

         

        This will allow me to triage your system. The master acquisition logs will let me see what the "SANscreen Acq" Windows service is up to. The 3 server logs will tell me whether jboss is healthy (jboss.log), whether the server is processing acquisition reports correctly (server.log), and whether performance reports are being processed correctly (performance.log).
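
The collection steps in the reply above can be sketched as a small Python script. This is only an illustration, not a NetApp tool: the function names `collect_oci_logs` and `zip_folder` are hypothetical, the two log directories must be passed in by the caller because the thread gives them only relative to the install location, and the pattern `foundation_<name>*.zip` is an assumption based on the "foundation_xyz....zip" description.

```python
import glob
import os
import shutil
import zipfile

MAX_EMAIL_BYTES = 10 * 1024 * 1024  # the 10MB email limit mentioned in the thread


def collect_oci_logs(acq_log_dir, jboss_log_dir, datasource_names, out_dir):
    """Copy the logs named in the thread into out_dir; return the copied file names.

    acq_log_dir and jboss_log_dir are supplied by the caller (hypothetical
    parameters); the thread gives them only relative to the install folder.
    """
    os.makedirs(out_dir, exist_ok=True)

    # Master acquisition logs, plus one foundation_*.zip pattern per datasource.
    patterns = [os.path.join(glob.escape(acq_log_dir), "acq*.log*")]
    for name in datasource_names:
        # Assumed naming: foundation_<datasource name><anything>.zip
        patterns.append(
            os.path.join(glob.escape(acq_log_dir),
                         "foundation_" + glob.escape(name) + "*.zip"))

    sources = []
    for pattern in patterns:
        sources.extend(sorted(glob.glob(pattern)))

    # The three server-side logs listed in the thread.
    for fname in ("jboss.log", "server.log", "performance.log"):
        path = os.path.join(jboss_log_dir, fname)
        if os.path.isfile(path):
            sources.append(path)

    copied = []
    for src in sources:
        shutil.copy2(src, out_dir)
        copied.append(os.path.basename(src))
    return copied


def zip_folder(folder, zip_path):
    """Zip the files in folder; return (size_in_bytes, fits_under_email_limit)."""
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for fname in sorted(os.listdir(folder)):
            full = os.path.join(folder, fname)
            # Skip the archive itself in case it was created inside folder.
            if os.path.isfile(full) and os.path.abspath(full) != zip_abs:
                zf.write(full, arcname=fname)
    size = os.path.getsize(zip_path)
    return size, size < MAX_EMAIL_BYTES
```

Calling `collect_oci_logs(...)` followed by `zip_folder(...)` produces one archive and reports whether it is small enough to email. Files that do not exist are skipped silently, so it is worth checking the returned list before sending.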

  • Re: Problems while collecting data for Switches
    ostiguy NetApp Employee Powerboat Racer

    Hey all,

     

    Just want to close this thread out :

     

    Michael had 2 issues:

    #1. A pair of switches that were going to be decommissioned. These were highly unreliable for communication via SNMP for some reason, despite being on the same Ethernet subnet as the rest of his switches (which would tend to rule out WAN latency as a root cause). They were removed from his environment, and his datasources were highly reliable from that point forward. Prior to this, his datasources would fail with "partial success N-1 of N", because the datasources knew that each fabric had N switches.

     

    #2. A lack of vfc port statistics on the FCoE interfaces of his Cisco Nexus switches. A data source patch resolved it. This patch is NOT going to be part of OCI 6.4.2.0.1, as it was resolved after the freeze date for 6.4.2.0.1. It will be part of a future data source service pack for OCI 6.4.[1-2].
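
For anyone chasing a similar "partial success N-1 of N" symptom, one way to spot switches that answer SNMP unreliably is to probe each one several times and compare success rates. The sketch below is an assumption-laden illustration, not how OCI itself collects data: the function names are hypothetical, the default probe shells out to the Net-SNMP `snmpget` command (which must be installed), and SNMP v2c with a community string is assumed because the thread does not say which SNMP version was in use.

```python
import subprocess

SYS_UPTIME_OID = "1.3.6.1.2.1.1.3.0"  # sysUpTime.0 from standard MIB-II


def snmpget_probe(host, community="public", timeout_s=2):
    """Return True if a single snmpget for sysUpTime succeeds.

    Assumes the Net-SNMP snmpget CLI and SNMP v2c; the community string
    "public" is a placeholder, not a value from the thread.
    """
    cmd = ["snmpget", "-v", "2c", "-c", community,
           "-t", str(timeout_s), "-r", "0", host, SYS_UPTIME_OID]
    try:
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL,
                                timeout=timeout_s + 5)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def success_rates(hosts, probe, attempts=10):
    """Probe every host `attempts` times; return {host: fraction of successes}."""
    rates = {}
    for host in hosts:
        ok = sum(1 for _ in range(attempts) if probe(host))
        rates[host] = ok / attempts
    return rates


def unreliable(rates, threshold=0.9):
    """Return the hosts whose success rate is below threshold, sorted by name."""
    return sorted(h for h, r in rates.items() if r < threshold)
```

A call such as `unreliable(success_rates(switch_names, snmpget_probe))` would list the switches worth a closer look. The 0.9 threshold and 10 attempts are arbitrary starting points.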
