95 Replies. Latest reply: Apr 5, 2013 1:10 AM by vims

reallocate TR?

danpancamo

After fighting with our DBAs for months about IO performance, we finally narrowed down a growing performance issue to disk fragmentation. We accidentally discovered that a copy of a database was 2x faster to back up than the original, which led us to the conclusion that we did actually have an issue with IO.

 

 

We ran a reallocate measure which resulted in a threshold of 3.   I'm not sure exactly how this is calculated, but obviously the result is misleading. The results should have been 10 IMHO.

 

We ran reallocate start -f -p on the volume and it immediately cut the backup time in half. Disk util, disk caching, and latency were all significantly better after the reallocate completed.

 

 

It appears that the Sybase full reindex basically tries to optimize by rewriting data... This process somehow causes the disk fragmentation.

 

 

I've only been able to find limited information on the reallocate command. However, with this significant performance increase, there should be a whitepaper on the subject that includes the effects of reallocate on Cache, PAM, Dedup, VMware, Databases, Exchange, etc...

 

 

Is there a TR document in the works on reallocate?  If not, someone should start one.

  • Re: reallocate TR?
    __frostbyte_9045

    I'll second the need for further documentation. As part of my PM work (things are slow right now) I found that some of my volumes came back with 6s and 7s. However, the documentation does seem to be very light! I've been playing around but don't know if it is really helping, since we didn't do any benchmarking prior to reallocate being run.

  • Re: reallocate TR?
    BrendonHiggins

    Hi

     

    I am just posting as I would be keen to know more about reallocate. I have used the command a couple of times in the past and have had issues due to aggregates being created on 7.1 and the command being used on 7.3 filers.

     

    How did you get to "we finally narrowed down a growing performance issue to disk fragmentation"? Are you using statit and looking at chain lengths and RAID stats?

     

    Thanks

     

    Bren

  • Re: reallocate TR?
    jasonczerak

    I'd have to agree.  I'm looking into performance tuning now that our main business application will be moving to RAC and NetApp this Summer.  Highly transactional kind of stuff.

  • Re: reallocate TR?
    jeremypage

    Hey NetApp, please document this better - most of your customers are not willing to read these boards to find stuff like this and the docs in the 7.3 manual are very sparse. Would be nice to have a decent scheduling system set up too.

     

    And the same thing was true for our system, reallocate made a huge difference in sequential read type stuff.

    • Re: reallocate TR?
      jeremypage

      In fact I'd be happy just to know if aggr reallocates take care of everything under them. I can handle one large flood of snapshots if I'm ready for them but I'd prefer not to get ready if it's not worth my time...

      • Re: reallocate TR?
        aborzenkov

        In fact I'd be happy just to know if aggr reallocates take care of everything under them. I can handle one large flood of snapshots if I'm ready for them but I'd prefer not to get ready if it's not worth my time...


        According to the official NetApp manuals, aggregate reallocation does not optimize file layout (which is logical when you think about it: the aggregate knows nothing about files, which live too far above it). It compacts used blocks to create more contiguous free space.

         

        So aggregate reallocation may help with disk writes, but it shouldn't have any effect on large sequential disk reads.

        • Re: reallocate TR?
          jasonczerak

          We all know NetApp filers need massive help with writes

           

          I just have a problem with the resources that reallocate -A consumes. 7.3.3 is supposed to make it better, and 8.0 to completely solve it.

          • Re: reallocate TR?
            jeremypage

            Not sure what you mean by that, what problems do you have with writes? Sized properly the NVRAM should be handling most of that load.

            • Re: reallocate TR?
              jasonczerak

              It happens whenever large write workloads are kicked off: say, building up temp table space, table splits, or file copies. Once write throughput reaches 200MB/sec we start to see increased read latency; at 250MB/sec the NVRAM cannot keep up, even on disks that are barely utilized (under 10% IO and space utilization), and filer-wide latency increases. This is on a 6080 filer running 7.3.1.1. Average throughput from 9 to 5 is 400MB/sec on each 6080 node in the cluster, and at times we push well over 500. The average write workload is 50-75MB/sec. Write workloads just kill things when pushed.

               

              We've worked to limit writes to off hours and whatnot, so it's not a big deal.

              • Re: reallocate TR?
                jeremypage

                I gotcha. I think OnTap is probably tuned to expect the NVRAM to keep up with the writes, and it sounds like you're going well beyond its ability. Maybe you can get some inside-out PAM cards.

                 

                We're super (read "filer used as RAM because the DBA has no clue") read intensive here so I don't see that problem. Our biggest single system is an Oracle 10g DB that can peak in the 200 MB/s range but usually is between 85 and 100 - but 98% of that is reads and 90% of those are being serviced by the cache. Sad part is the AIX host that Oracle is running on has at least 10 gig of free memory.

                • Re: reallocate TR?
                  jasonczerak

                  We migrated from HP + Oracle 9 to Linux + Oracle 10g + RAC + NFS + 10GbE, and were new to NetApp at the same time. After a year we started to explore some tuning. Just doubling some SGA or whatever dropped IO usage 50% on the filer side. The DBAs were new to RAC and used "monolithic tuning" on RAC at first, which was a safe call. Plus the new env was 150% faster (before the memory changes) than the old, so there wasn't any more call to tune.

                   

                  We'll be doing some more tuning linux side and netapp side this summer if we can find some time.

                  • Re: reallocate TR?
                    jeremypage

                    I'd kill to get rid of our AIX+10g or at least move it to NFS. Right now the DBAs are terrified of IP storage (10 gigE), it's so slow compared to 2gig FC...

                     

                    Nevermind that they don't know what a zone or an MTU size is, they just know it's slower. Fibre is a pain in the butt.

                    • Re: reallocate TR?
                      jasonczerak

                      Tell the DBA's you'll handle the infrastructure. Put your foot down! LOL

                       

                      If they had a clue, they'd know that the majority of Oracle's own DB servers at Oracle run over NFS.

                       

                      A friend of mine needed some help to sell NFS to a client. The guy was scared of corrupting data when packets would go missing and stale file handles.    Yeah, if you used UDP and NFS version 1. Sure.   Not the case. He even suggested trying out FCoE, WTF? why? What a useless stop gap idea.

                       

                      You have to tune your Oracle to use bigger blocks and bump up the MTU. Save costs on infrastructure and win on flexibility. Why wouldn't you go NFS?

                       

                      Right now all we have on FC is Exchange on NetApp (Windows08 wouldn't do iSCSI LUNs and support Exchange 2010 when we deployed it) and some old Oracle DBs that were on aging EMC disk, soon to be moved to RAC.

                      • Re: reallocate TR?
                        __frostbyte_9045

                        Snip <Why wouldn't you go NFS?> snip

                        We are using FC because I can buy 3 whole sets of 8GB Fiber switches for what NetApp wants to charge for the NFS license for our 3140 cluster. Not to mention the cost of 10gE switch ports. Plus, being a SQL shop <not virtualized> we could only benefit from NFS on our vSphere infrastructure <everything but SQL>.

                         

                         

                         

                         

                        Also, I've enjoyed this discussion. It has provided some interesting insights as to the odd and undocumented aspects of WAFL.

                        • Re: reallocate TR?
                          jasonczerak

                          Folks are starting to virtualize SQL these days. It's on the drawing board over here.

                           

                           

                          Yeah, the NFS license is insane. Just about any software license is 40k.

                          • Re: reallocate TR?
                            __frostbyte_9045

                            We would discuss it if the processor licensing model changed to per physical processor rather than per vCPU. Because of the way it is licensed, we run a two node active/active SQL cluster. Isolation is facilitated by having multiple SQL instances, which works out to a manual load balancing act, much like NetApp's active/active cluster.

                            • Re: reallocate TR?
                              jeremypage

                              Our main SQL 2008 server and our DWH are both VMs, albeit larger than most of them. They are relatively low utilization though, the SQL averages in the 2k IOPS range and the DWH bounces up and down but never exceeds 4k (which is spindle limited but I am guessing will shoot through the roof when we get our PAMII cards).

                               

                              I did a very non-scientific test with an Intel X25 (in poor hardware, it was obviously bottlenecked at the controller) where I ran SQLIO against it under Win2008R2 x64 and averaged right at 2800 IOPS across 5 tests. Then I exported it via NFS and ran the same thing on a VM over my storage IP network: a single 1g from the workstation and then 2 10g to the filer. Just over 2700 IOPS after 5 tests.

                               

                              I *just* got a better system set up today to test with, I'll  report back on the results but with paravirtual devices and a well tuned VM you can certainly have a decent sized database running in a VM over NFS with minimal loss in efficiency. And you get all the goodies like snapshots and replication etc. Good stuff!

                              • Re: reallocate TR?
                                dnewtontmw

                                FWIW, in our environment (3160, FC 15K, 28 spindles) to our DW, I ran SQLIO tests recently -- making the tests long/large enough to saturate the cache -- and we're seeing 1800-4800 IOPs, depending upon the test.

                          • Re: reallocate TR?

                            For what it's worth, Bundles are great here -- ask your NetApp rep about them next time you're looking at an upgrade and/or new system. Basically if you're doing a couple key pieces of software you're now into Bundle territory where you get most stuff included (there's a couple different bundles....my favorite being the Complete Bundle...just as I can then know that the customer has every possible piece of software so I can purely talk technical architecture rather than getting side-tracked on pricing).

        • Re: reallocate TR?
          jeremypage

          I'm not the greatest storage administrator out there but I did work at NetApp for a few years and still have contact there. Although the aggregate level reallocate does not explicitly give you better read performance it can if you've added new disks (which is what I said earlier) because it DOES move data more or less evenly across them. So instead of reading only from the old spindles it will now be able to pull data from the newly added ones as well.

           

          I have not verified this first hand with testing but it makes sense. In addition it probably reduces seek times depending on how full your aggregates are simply because the heads don't have to travel as far to reach the next block, although that's purely speculation & I am not sure there would be a measurable difference there.

           

          As far as MSSQL, are you running it on a VM or a LUN?  Is it deduped or not? If you're running it on a non-ASIS LUN I'd do a reallocate measure and see, a volume level reallocate made a substantial difference on  our Oracle LUNs.

          • Re: reallocate TR?
            erick.moore

            Don't confuse aggregate reallocation vs volume level.  This should help explain it better at a high level:  http://www.theselights.com/2010/03/understanding-netapp-volume-and.html

            As for the NetApp not being able to handle writes, actually it is basically a write-optimized SAN. You will probably get better write performance on a NetApp system than you will on any other SAN. One of the biggest problems they faced was sequential read after random write, but as of OnTap 7.3 they added the read_realloc option to volumes, which will sequentially reallocate data after it has been read once. The best way to check the performance issues with such a heavy write workload is to do a: sysstat -x -s -c 60 1

             

            Look at the column CP ty. I am curious to see if you are experiencing any back-to-back CPs (B) or deferred back-to-back CPs (b).

            • Re: reallocate TR?
              jasonczerak

              Right now writes are not very back-to-back or deferred; we tuned our apps (and users). Since FlexShare is a piece of crap and fails, we have to manually handle things more than we figured we would.

               

              I'll see where I can induce sio load on the 3160 cluster that's not in prod yet and get it close to what I've seen on the 6080.

            • Re: reallocate TR?
              jeremypage

              I'm not confusing the different allocation methods, I am just trying to dispel some of the misinformation posted previously in this thread. I verified what I posted with the people who write those portions of Ontap so I'm reasonably certain it's accurate. In most cases an aggregate level reallocate does not do any good but when adding new disks to an existing RG it can be worth while. Ontap will eventually spread the data across all the disks anyways so it may not be worth the resources to run it, YMMV.

               

              In short:

              Running an aggregate level reallocate will make the filer attempt to make the blocks on disk contiguous. That is all; it knows nothing about file systems, so this is not going to give any benefits to sequential reads (unless the reduction in seek time makes a difference). It is supposed to spread the existing data across all the disks in the RGs that belong to the aggr, allowing your reads to be done against more spindles.

               

              Running a volume level reallocate will try and make the file systems contiguous - this is different than above since it actually should make you have more sequential reads and is far more likely to improve performance.

              • Re: reallocate TR?
                erick.moore

                Jeremy, I hate to say it, but you are incorrect about the aggregate reallocate; please read the link I posted.

                 

                You stated, "In most cases an aggregate level reallocate does not do any good but when adding new disks to an existing RG it can be worth while. "

                 

                This is from the NetApp manual regarding aggregate reallocation: "Do not use -A after growing an aggregate if you wish to optimize the layout of existing data; instead use `reallocate start -f /vol/<volname>' for each volume in the aggregate."

                 

                Doing an aggregate reallocate when you grow an aggregate will not spread existing data across the new disk.  On the HP EVA there is a process called "leveling".  This is basically what a volume reallocate does, it spreads the data out across all the spindles in the aggregate.  I think NetApp needs to change the terminology for these similar but different processes.  Perhaps aggregate reallocate should be "reallocate", and volume reallocate should be "redistribute".

                • Re: reallocate TR?
                  jeremypage

                  I think you're confusing optimizing the file system with spreading it across spindles. Aggregates can't optimize a filesystem since there is no concept of a filesystem at the aggregate level. That does not mean it can't help things run faster under certain conditions.

                   

                  Having power to my filer does not optimize my filesystems either but it makes it run a heck of a lot better.

                  • Re: reallocate TR?
                    erick.moore

                    OK, maybe we are saying the same thing, but I, like you, want to clear up any misinformation that is lingering in this post. You do not run an aggregate reallocate after growing an aggregate. That will not gain you anything, and it says as much in the manual. That is not me saying it, that is NetApp. If you are having performance issues with something like an SQL LUN, you would start by checking some LUN stats:

                     

                    lun stats -o -i 1 /vol/volname/lunname

                     

                    Checking the reallocation level of the volume or the file in the volume (a LUN as it were in this case)

                     

                    reallocate measure /vol/volname/lunname

                     

                    If you want to optimize performance to that LUN you would then run a reallocate against it:

                     

                    reallocate start -f -p /vol/volname/lunname

                     

                    Additionally you may want to setup scheduled reallocation jobs, with a threshold setting (3 in this case) to run during off-hours, like every Saturday night at 23:00:

                     

                    reallocate start -t 3 -p /vol/volname/lunname

                    reallocate schedule -s "0 23 * 6" /vol/volname/lunname

                     

                    Best Regards,

                     

                    Erick
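
The schedule string passed to `reallocate schedule -s` in the post above has four space-separated fields: minute, hour, day of month, and day of week (0 is Sunday), so "0 23 * 6" reads as Saturday at 23:00. As a rough illustration only, the Python sketch below decodes such a string. The function names are hypothetical helpers, not part of any NetApp tool, and the sketch assumes each day-of-week field is either `*` or a comma-separated list of numbers.

```python
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def parse_schedule(schedule):
    """Split a four-field schedule string into named fields (hypothetical helper)."""
    fields = schedule.split()
    if len(fields) != 4:
        raise ValueError("expected 4 fields: minute hour dayofmonth dayofweek")
    names = ("minute", "hour", "day_of_month", "day_of_week")
    return dict(zip(names, fields))


def day_names(day_of_week):
    """Translate a day-of-week field such as "6" or "3,0" into day names."""
    if day_of_week == "*":
        return list(DAY_NAMES)
    # Assumption: only comma-separated numbers are handled here.
    return [DAY_NAMES[int(d)] for d in day_of_week.split(",")]


if __name__ == "__main__":
    # Both strings appear in this thread.
    for text in ("0 23 * 6", "0 23 * 3,0"):
        sched = parse_schedule(text)
        days = ", ".join(day_names(sched["day_of_week"]))
        print(f"{text!r}: {sched['hour']}:{sched['minute'].zfill(2)} on {days}")
```

Decoding the string before applying it is a cheap way to confirm that the day-of-week number matches the night you intended.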

                    • Re: reallocate TR?
                      jeremypage

                      I certainly agree with all of those points. In addition, to be clear, the benefit of -A is not so much at the aggr level as at the RAID group level, because really that's where spindle count comes in, which is the only thing -A should affect (well, with the possibility of seek time, but I think that's a minimal impact and not an issue for most people - and if it is you probably should not be using a NetApp).

                    • Re: reallocate TR?
                      aborzenkov

                       

                      You do not run an aggregate reallocate after growing an aggregate.  That will not gain you anything, and it says as much in the manual.  That is not me saying it, that is NetApp.

                      E-h-h - no, it is not what NetApp is saying, it is how you read it. NetApp says: Do not use -A after growing an aggregate if you wish to optimize the layout of existing data. But that is exactly what Jeremy was telling you all the time. Aggregate reallocation won't improve layout of data - but it will improve distribution of data over disks.

                      If you are having performance issues

                      I would stop here and ask - which performance issues? Not all performance problems are alike. I have customers who never run reallocate and are quite happy - for their specific workload.

                    • Re: reallocate TR?
                      joebutchinski

                      Erick Moore wrote:

                       

                      Additionally you may want to setup scheduled reallocation jobs, with a threshold setting (3 in this case) to run during off-hours, like every Saturday night at 23:00:

                       

                      reallocate start -t 3 -p /vol/volname/lunname

                      reallocate schedule -s "0 23 * 6" /vol/volname/lunname

                       

                      This caught my attention.  The manpage says "Reallocation  processing  operates  as  a background task." so I've always scheduled with the assumption that file service would trump the reallocate.  I wonder if anyone has observed a negative impact on performance during reallocation.  I haven't formally tested this but have received no complaints about performance during a reallocate.

                      • Re: reallocate TR?
                        jasonczerak

                        It causes all kinds of latency issues on a 6080 cluster on 7.3.1.1L1P2 if it's doing more than one volume at a time ON THE FILER, not just per aggr.

                         

                        We'll be bumping to 7.3.3P-something this weekend.  I might kick off a few and see what happens

                    • Re: reallocate TR?
                      Igor Stojnov

                      Hello Erick,

                       

                      I ran the commands as you suggested against one of the LUNs here, setting the threshold to 4 and establishing a twice-a-week schedule - at 11PM on Sundays and Wednesdays:

                       

                      reallocate start -t 4 -p /vol/TEST/test.lun
                      reallocate schedule -s "0 23 * 3,0" /vol/TEST/test.lun

                       

                      I had expected the reallocation (optimization) process to commence automatically once the threshold is reached, but it doesn't. I only keep getting system messages in my Autosupport, advising me to run reallocate:

                       

                      Wed Dec 29 23:00:00 CET [wafl.scan.start:info]: Starting WAFL layout measurement on volume TEST.
                      Wed Dec 29 23:10:19 CET [wafl.reallocate.check.highAdvise:info]: Allocation check on '/vol/TEST/test.lun' is 4, hotspot 19 (threshold 4), consider running reallocate.

                       

                      Sun Jan  2 23:00:00 CET [wafl.scan.start:info]: Starting WAFL layout measurement on volume TEST.
                      Sun Jan  2 23:10:16 CET [wafl.reallocate.check.highAdvise:info]: Allocation check on '/vol/TEST/test.lun' is 5, hotspot 19 (threshold 4), consider running reallocate.

                       

                      Surely this should've been done automatically by now?

                       

                      Cheers,

                      Igor
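
The two `wafl.reallocate.check.highAdvise` messages in the post above each carry the measured allocation level, the hotspot value, and the configured threshold. A minimal Python sketch for pulling those numbers out of such lines follows. The regular expression is inferred only from the two messages quoted in this post, and the function names are hypothetical helpers, so treat it as a starting point rather than a complete parser for filer logs.

```python
import re

# Pattern inferred from the messages quoted in this thread (an assumption,
# not an official log format specification).
ADVISE_RE = re.compile(
    r"Allocation check on '(?P<path>[^']+)' is (?P<level>\d+), "
    r"hotspot (?P<hotspot>\d+) \(threshold (?P<threshold>\d+)\)"
)


def parse_allocation_check(line):
    """Return path, level, hotspot and threshold from a log line, or None."""
    match = ADVISE_RE.search(line)
    if match is None:
        return None
    return {
        "path": match.group("path"),
        "level": int(match.group("level")),
        "hotspot": int(match.group("hotspot")),
        "threshold": int(match.group("threshold")),
    }


def at_or_above_threshold(check):
    """True when the measured level has reached the configured threshold."""
    return check["level"] >= check["threshold"]


if __name__ == "__main__":
    line = ("Sun Jan  2 23:10:16 CET [wafl.reallocate.check.highAdvise:info]: "
            "Allocation check on '/vol/TEST/test.lun' is 5, hotspot 19 "
            "(threshold 4), consider running reallocate.")
    check = parse_allocation_check(line)
    print(check, at_or_above_threshold(check))
```

Collecting these values over several weeks would show how quickly the level climbs between runs, which is the kind of before-and-after evidence earlier replies wished they had.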

  • Re: reallocate TR?
    dnewtontmw

    Any updated documentation or thinking on this topic?

     

    We're running SQL Server 2005 on a NetApp FAS3160 server.  We do SQL index rebuilds on the weekends, and I wonder if it's same as Sybase under the covers, with regards to how it behaves at the storage level...

    • Re: reallocate TR?
      BrendonHiggins

      As part of this SQL LUN latency issue http://communities.netapp.com/thread/7456 I will be running the volume reallocate against the LUN tomorrow night. I will post back the result of the work at the end of the week. Should be a good test of whether or not it works as described.

       

      Bren

    • Re: reallocate TR?

      I just wanted to chime in (after a long hiatus) to say that I would REALLY love to see a TR on this. I'm currently working through reallocate questions for multiple customers with multiple scenarios (straight-up reallocate, reallocate after adding 1-2 disks to an aggr (disks were waiting until the 7.3 upgrade allowed a bit larger aggrs), dedup and reallocate, reallocate and VMware, etc.).

      • Re: reallocate TR?
        erick.moore

        Adding 1-2 disks to an aggregate will require a reallocate on every volume in that aggregate. NetApp PS recommends never adding fewer than 4 disks at a time to an aggregate, but depending on the rate of change even that could be too low for some workloads. Also, dedup blocks will never get reallocated.

        • Re: reallocate TR?
          jasonczerak

          We took 2 64-disk aggrs and added an entire RG of 16 disks to each. 6TB per aggr took nearly a week to reallocate, manually one volume at a time so as not to impact anything else, on a 6080.

           

          It's a good idea, it's just badly implemented. Like scrubbing, wouldn't it make sense to use idle-ish IO to keep data optimized?

          • Re: reallocate TR?
            erick.moore

            Well, you can enable read_realloc on volumes after 7.3.1. I agree that it would work better if we didn't have to manage this at all. Let the system realloc in the background while idle. I fully support a wizard when adding new disks to an aggregate that asks me if I want to reallocate all my volumes for that aggregate in the background.

            • Re: reallocate TR?
              jasonczerak

              Yeah, but that's a real-time CPU and IO hit if it has to reallocate after a read. And isn't that a bad idea when using dedup? As far as the wizard idea, so long as the reallocate DOES NOT impact latency, sure....

               

              There needs to be a chart of which "features" work well together and which ones don't, like read_realloc and extents and dedup + reallocate = bad, or something along those lines.

              • Re: reallocate TR?
                erick.moore

                Dedup blocks can't get reallocated.  They are stuck where they are once deduped.  We haven't seen any performance hit with read_realloc since it only reallocates the portions of the volume that are being asked to be read sequentially.  In fact, I don't even know if read_realloc actually reallocates blocks or if it just updates the metadata for the readset to include the blocks that it wouldn't normally get on that read request.

                • Re: reallocate TR?
                  jasonczerak

                  Interesting. That would make sense, I suppose.

                   

                  Still, it would be nice to have a chart of what works outside of the default options.

                   

                  Not everyone has a "test" NetApp. I've been lucky to have 2 3160 clusters for the past 6 months to do some major experiments with, but that luxury is going away soon for me; back to experimenting on production hardware ;-)

                • Re: reallocate TR?
                  radek.kubka

                  Dedup blocks can't get reallocated.  They are stuck where they are once deduped.

                  Interesting stuff.

                   

                  Are you sure this is also the case when using reallocate -p (physical) option?

                   

                  Regards,
                  Radek

                  • Re: reallocate TR?
                    erick.moore

                    I know for a fact it doesn't work with read_realloc per TR-3505, and I would assume the same applies to standard reallocation.

                  • Re: reallocate TR?
                    aborzenkov

                    Even if they were reallocated, you can’t please ‘em all. You have a block that is shared by dozen of files. (Logically) adjacent block is shared by dozen different files. You have no way to reallocate them both and keep all files that share these blocks sequentially laid out.

                     

                    So it is not that they are really set in stone – it is more that shuffling de-duplicated blocks around is not going to improve total picture much; de-duplicating one data set will inevitably fragment another.

                     

                     

                    ---

                    With best regards

                     

                    Andrey Borzenkov

                    Senior system engineer

                    Service operations

  • Re: reallocate TR?
    aborzenkov

    While cleaning up my disk I found tr3599, which is titled "REALLOCATE BEST PRACTICES GUIDE". It is now marked confidential and I am not able to find it on fieldportal (which does not mean anything; it is near impossible to find anything on purpose there). It is dated 2007, so I guess I have it from the old partner portal. It does not contain anything excitingly new that was not already covered in various discussions; the main value is that it represents the official NetApp position on the use of reallocation.

     

    I'd love to see it updated to include the new physical/aggregate reallocation. Does anyone know if a newer version exists or why it was removed?

     

     

  • Re: reallocate TR?
    danpancamo

    running: Release 7.3.2P2:

     

    We continue to see 50% (yes 50%) or more gains after some reallocate runs, mainly on heavy sequential reads and mainly on large database files that get updated frequently...

     

    TIP: We have discovered that if you are using an NFS volume you can run reallocate on individual files instead of the whole volume, which can take the reallocate time down from hours to minutes. However, you need to create a schedule for each file.

     

    TIP: We have seen reallocate measure results over 20, so 10 must not be the max..

     

     

     

    new stuff....

     

    I'm back posting today to try and understand the effects of running reallocate on our VDI VMWARE volumes which are deduped.   reallocate measure is reporting a 6 with a hotspot 19 with a recommendation to run reallocate.

     

     

     

    From reading the above posts, someone stated that deduped blocks can't be reallocated, so does that mean that on a volume with multiple duplicate VMs, reallocate won't do anything?

     

    We are seeing a volume latency issue (2 to 60ms) when deduplication and virus scans run at the same time. I initially thought that reallocating should speed up the virus scans; however, if deduped blocks are not reallocated, this may do nothing?

     

     

    what does hotspot 19 mean?

     

    Again I vote for a TR on this!

     

     

    Dan Pancamo

    • Re: reallocate TR?
      radek.kubka

      Hi Dan,

       

      I'm loving this thread more & more! Many people at NetApp are still in a constant state of denial that fragmentation even exists.

       

      Re dedupe & reallocation - what Andrey wrote above makes sense to me:

      Even if they were reallocated, you can’t please ‘em all. You have a block that is shared by a dozen files. The (logically) adjacent block is shared by a dozen different files. You have no way to reallocate them both and keep all files that share these blocks sequentially laid out.

      Having said this, if you can test it in practice, that would be awesome (is reallocate a silver bullet for every problem?)
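      A small brute-force sketch of the argument quoted above. It is illustrative only: the file names, block labels and the one-dimensional "physical position" model are assumptions made for the example, not a description of how WAFL lays out data. Three hypothetical files share one deduplicated block, and the code tries every physical ordering to see how many files can be sequential at once:

```python
from itertools import permutations

# Hypothetical example: three files that all start with the same
# deduplicated block "S", each followed by its own private block.
files = {
    "vm1": ["S", "a"],
    "vm2": ["S", "b"],
    "vm3": ["S", "c"],
}
blocks = sorted({block for layout in files.values() for block in layout})


def is_sequential(file_blocks, position):
    """True if the file's blocks sit at consecutive physical positions."""
    return all(
        position[nxt] == position[cur] + 1
        for cur, nxt in zip(file_blocks, file_blocks[1:])
    )


def best_layout(files, blocks):
    """Try every physical ordering; return the max number of sequential files."""
    best = 0
    for order in permutations(blocks):
        position = {block: index for index, block in enumerate(order)}
        count = sum(is_sequential(f, position) for f in files.values())
        best = max(best, count)
    return best


max_sequential_files = best_layout(files, blocks)
print(f"files that can be sequential at once: {max_sequential_files} of {len(files)}")
```

      In this toy model the shared block has only one physical successor, so no ordering keeps more than one of the three files sequential.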

       

      Regards,

      Radek

    • Re: reallocate TR?
      aborzenkov

      We continue to see 50% (yes 50%) or more gains after some reallocate runs. Mainly heavy sequential reads and mainly on large database files that get updated frequently...

      I have seen 200% on a backup (dump) of a volume with a database (50MB/s => 200 MB/s). This was a test run (dump to null); real life is limited by other factors, but it still cut the backup time in half.

      We have seen reallocate measure results over 20, so 10 must not be the max..

      Max is 32. I have seen over 20 on the system I mentioned as well.

       

      reallocate measure is reporting a 6 with a hotspot 19

      Could you show the exact message and where you get it? I do not remember having seen it.

      • Re: reallocate TR?
        rnugent2068

        The search for reallocation info continues . . .

         

        aborzenkov wrote:

         

        We have seen reallocate measure results over 20, so 10 must not be the max..

        Max is 32. I have seen over 20 on the system I mentioned as well.

         

        reallocate measure is reporting a 6 with a hotspot 19

        Could you show exact message and where do you get it? I do not remember having seen it.

         

        I don't think 32 is the top of the scale. I just got this from a reallocate measure:

         

        [server1: wafl.reallocate.check.highAdvise:info]: Allocation check on '/vol/TEST/test.lun' is 36, hotspot 0 (threshold 4), consider running reallocate.

         

        Does anyone know where I can find a definitive resource for reallocation info?  NetApp engineers - are you there ? 

         

        (Referred to this post from my original question: http://communities.netapp.com/message/47200#47200 )

         

        Thanks

        • Re: reallocate TR?
          aborzenkov

          That’s quite possible. In the past the target allocation size was 64K (resulting in 16), then it was increased to 128K (giving 32); I can well believe that today it has been increased even more. 256K does not sound unreasonable.
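          The arithmetic behind those ceilings can be sketched as follows. This assumes the measure value tops out at the number of blocks in one target allocation extent, and assumes a 4 KB WAFL block size; both are assumptions inferred from the figures in this thread, not something taken from documentation.

```python
WAFL_BLOCK_KB = 4  # assumption: 4 KB block size


def max_measure_value(target_extent_kb, block_kb=WAFL_BLOCK_KB):
    """Blocks per target extent, taken here as the ceiling of the measure scale."""
    return target_extent_kb // block_kb


for extent_kb in (64, 128, 256):
    print(f"{extent_kb}K target extent -> max value {max_measure_value(extent_kb)}")
```

          Under these assumptions a 256K target would put the ceiling at 64, which would be consistent with the reported value of 36.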

           

          Does anyone know of counters related to this (i.e. how many full extents were written)?

  • Re: reallocate TR?
    aborzenkov

    TR-3929 Reallocate Best Practices Guide

     

    https://fieldportal.netapp.com/viewcontent.asp?qv=1&docid=33904

     

    Not that it contains anything that was not already beaten to death here ...

    • Re: reallocate TR?
      radek.kubka

      Wow, I can't believe my eyes - it really happened (at last)!

       

      The doc itself is rather brief I would say & arguably doesn't cover many corner cases, which typically are the most confusing ones.

       

      That said, it is way better than nothing!

       

      (NB - not available via http://www.netapp.com/us/library/, so the subject is still "sensitive" I reckon)

       

      Regards,
      Radek

    • Re: reallocate TR?
      Igor Stojnov

      In so many words, we've 2 reallocate approaches:

       

           Traditional reallocate - does the job, introduces snapshot space impact, SnapMirror impact

           Physical reallocate - does the job, introduces snapshot read impact, no SnapMirror impact

       

      And then, there's also the read reallocate option for volumes (didn't know about this one), which follows the same principles but uses the regular read workload for optimization rather than a scanning mechanism, and is apparently recommended to work alongside regular reallocate processes. Interesting... "optimal layout of frequently accessed data".

       

      I wonder when exactly read reallocate takes place? If it's performance demanding, I wouldn't like it to kick in just anytime. Also, am I right to assume its performance impact will reduce over time, after running a few passes under the same workload and a slow build-up of data?

      • Re: reallocate TR?
        aborzenkov

        I was always uneasy about read reallocation. It appears to take place at exactly the wrong time – after we have already paid the penalty of a non-sequential read ☺ Depending on the environment, the next time we need to read the data it may have been fragmented again … so it apparently makes sense only in a highly static environment.

         

        I wish this TR explained how we can estimate (or better – get real counters for) how effective read reallocation was. Something about how often reallocated data was read subsequently before being fragmented again.

    • Re: reallocate TR?
      jakub.wartak

      Hi,

       

      I've started playing with this in a more repeatable/scientific way. Results are on my blog and more are about to come:

       

      1) WAFL performance VS sequential reads: part I, FC LUN performance from AIX vs FlexVol utilization <-- intro, contains a description of the env used; I know it is not ideal but I don't have anything better currently without requiring Change Controls. http://jakub.wartak.pl/blog/?p=316

      2) WAFL performance VS sequential reads: part II, FC LUN defragmentation <- a nice show case for "reallocate start -f -p".  http://jakub.wartak.pl/blog/?p=330

       

      -Jakub.

       


    • Re: reallocate TR?
      dnewtontmw

      Is anyone else having trouble getting to this doc?

       

      If I click on the TR link above, I get error messages about being unable to log in (to the iCentera site, which tries to redirect to the fieldportal URL.)...

      • Re: reallocate TR?
        radek.kubka

        Is anyone else having trouble getting to this doc?

        It's posted on Field Portal, which is accessible for NetApp insiders & resellers only (don't shoot the messenger though)

        • Re: reallocate TR?
          aborzenkov

          I checked the external library before posting the link, but it was not available there. The TR is not marked as NDA, so I presume it is just a matter of time.

          • Re: reallocate TR?
            lwei

            I just posted a blog on read_realloc, with some simple test results.

            http://blogs.netapp.com/pseudo_benchmark/2011/06/read_realloc.html

             

            Regards,

            Wei

            • reallocate TR?
              jakub.wartak

              Wei,

               

              It is much better than 6% over time. I started from something like 8MB/s in reads (worst case scenario) and it self-optimized to a state nearly like the one just after a fresh "reallocate start -f -p".

               

              Take a look http://jakub.wartak.pl/blog/?p=343 (WAFL performance VS sequential reads: part III, FC LUN performance from AIX vs read_realloc)

               

              -Jakub.

              • reallocate TR?
                lwei

                Jakub,

                 

                Thanks for the post and I'm glad you ran your own tests and found it's much better than 6%. I just used very simple tests to illustrate the effectiveness of read_realloc to some workloads. The improvement will probably vary under different scenarios. I think the worst case is probably 0% improvement. On the other hand, it could be much better than 6%.

                 

                Thanks,

                Wei

              • Re: reallocate TR?
                avbohemen

                I have another question about read_realloc: if I enable read_realloc, will it also optimize the volume layout if I do NDMP backups for a volume with a single LUN? NDMP reads a single file (or object, in this case a LUN) sequentially, so I tend to think that read_realloc will help if I do frequent (daily) full backups... or not? NDMP creates a snapshot first, so basically it backs up data in that snapshot, not the active filesystem. Will the volume get optimized if I turn on read_realloc and only do NDMP sequential I/O? The case I have in mind is a database LUN, which mostly does random I/O, but NDMP backups get slower and slower over time.

                 

                The second question is: does "read_realloc=space_optimized" work on SyncMirrored aggregates / MetroCluster? I know "reallocate -p" does not work on MetroCluster, and TR3929 tells me that the space_optimized setting for read_realloc is based on physical (-p) reallocation.

                • Re: reallocate TR?
                  avbohemen

                  Just found the answer to my second question: space_optimized is not possible on MetroCluster.... produces this error:

                   

                  filer1> vol options vol1 read_realloc space_optimized

                  vol options: UNKNOWN error reason: 226 (CR_VOLUME_IS_MIRRORED)

                   

                  Can anyone shed some light on whether reallocate is going to be supported on MetroCluster somewhere in the future?

                  • Re: reallocate TR?
                    lwei

                    Anton,

                     

                    Is the NDMP backup reading from a data LUN directly, or a snapshot? If it's a LUN, then turning on read_realloc may help. Note that if you just read the data only once, then read_realloc is not going to help, since on the first pass, it tries to optimize the layout. Only on the 2nd or later pass does it help.
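                    The "only the second or later pass benefits" point can be shown with a toy model. Everything in it is an assumption made for illustration (block numbers, the free-space start, and the idea that a read pass triggers a full contiguous rewrite); it is not how read_realloc is actually implemented.

```python
def count_extents(physical):
    """Number of contiguous physical runs met when reading in logical order."""
    return 1 + sum(1 for a, b in zip(physical, physical[1:]) if b != a + 1)


def read_pass(physical, read_realloc, free_start):
    """Read the object once; return (extents read, layout after the pass)."""
    cost = count_extents(physical)  # this pass still pays for the old layout
    if read_realloc and cost > 1:
        # Toy assumption: the blocks just read are rewritten contiguously.
        physical = list(range(free_start, free_start + len(physical)))
    return cost, physical


layout = [10, 53, 11, 97, 12, 40]  # hypothetical fragmented LUN
costs = []
for _ in range(3):
    cost, layout = read_pass(layout, read_realloc=True, free_start=200)
    costs.append(cost)

print(costs)  # extents read on pass 1, 2, 3
```

                    In this model the first pass reads 6 separate extents and later passes read 1, so a workload that reads the data only once sees no benefit.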

                     

                    Thanks,

                    Wei

                    • Re: reallocate TR?
                      KINYUAWANJUGUNA

                      Hi Wei,

                       

                      At least I am coming across something better than NetApp support.

                      My environment is as below:

                      Sun M5000 server connected to FAS2020/FC Direct with Data OnTap 7.3.2. The application is T24 running on Jbase DB.

                      Backups are taking too long, and backup times are deteriorating with time. The wafl scan measure layout returned 13.9 for the LUN.

                      We ran reallocate -f -p <lun> and later measure layout, which returned 1.3. The backup went fine, with backup time reduced from 4 hrs to 40 mins.

                      The next day backup time was 2 hrs, with measure layout at 6.9.

                      It seems that when there are writes during the day, the layout ratio deteriorates, hence the longer backup times. Basically the backup does a tar to a different location: it tars the contents of the LUN to a different LUN or to tape.

                       

                      Reads are slowing by the day!

                       

                      Can we run read_realloc on this LUN? What are the effects in terms of performance when read_realloc is on?

                      Should we run reallocate -f -p every day? The first one took 9 hrs off-peak, which we cannot afford. Can we run it in peak hours?

                       

                      I have searched for JBASE + Netapp/Data OnTap best practices, but they are nowhere to be found.

                       

                      Regards

                       

                      dan

                      • Re: reallocate TR?
                        jeremypage

                        Sounds like your volume/aggrs need more free space.

                        • Re: reallocate TR?
                          KINYUAWANJUGUNA

                          Hi Jeremy,

                           

                          Thanks for your feedback.

                          I can drop a few LUNs to have more free space. Will I need to run the reallocate command for the free space to be useful for the performance of my original remaining LUN?

                           

                          Regards

                          dan

                          • Re: reallocate TR?
                            jeremypage

                            Yes, but the issue is not really that the LUN is full; it's that the volume/aggr is full enough that when you write, WAFL cannot find contiguous blocks, so your system is becoming fragmented very quickly.

                             

                            You need free space at the aggr/volume level (I dunno how you are set up). An aggr show_space -h  can give you a good view.

                            • Re: reallocate TR?
                              KINYUAWANJUGUNA

                              My system is very transactional. That might explain the fragmentation.

                              On volume space issue, below is the output.

                              netapp-prod1*> aggr show_space -h
                              Aggregate 'aggr0'

                                  Total space    WAFL reserve    Snap reserve    Usable space       BSR NVLOG           A-SIS
                                       2187GB           218GB            98GB          1870GB             0KB             0KB

                              Space allocated to volumes in the aggregate

                              Volume                          Allocated            Used       Guarantee
                              vol0                                 12GB           765MB          volume
                              test_vol                           1220MB            61MB          volume
                              u2_vol                              285GB           247GB          volume
                              u1_vol                              342GB           279GB          volume
                              u3_vol                              285GB           214GB          volume
                              bu_vol                              171GB            77GB          volume

                              Aggregate                       Allocated            Used           Avail
                              Total space                        1096GB           819GB           773GB
                              Snap reserve                         98GB            11GB            86GB
                              WAFL reserve                        218GB            22GB           195GB

                              netapp-prod1*>

                               

                              We have also disabled snapshots for now.

                               

                              Regards

                              • Re: reallocate TR?
                                radek.kubka

                                If I am reading this correctly, you have tons of free space in your aggregate, so this is not the issue.

                                 

                                Any chance you can ditch the traditional backup (which is basically a large, sequential read not liked by NetApp filers) in favour of snapshots, followed by an NDMP backup to a secondary target?

                            • Re: reallocate TR?
                              aborzenkov

                              That is a misunderstanding of where fragmentation comes from.

                               

                              Let’s say you have a file with contiguous blocks 1,2,3,4,5,6. Now 2,4,6 are overwritten, maybe in the same CP timeframe. So you are left with

                               

                              1,hole,3,hole,5,hole

                              contiguous 2,4,6

                               

                              So the file is fragmented; no two of its logically adjacent blocks are located contiguously, even though there was enough space to write the new blocks sequentially.
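                              The example above can be replayed in a few lines. This is a minimal sketch under stated assumptions: the physical block numbers (101..106, and 500 onward for new writes) are made up, and the model only captures the "overwrites go to new locations" behaviour described in this post.

```python
def overwrite(layout, logical_blocks, next_free):
    """Write-anywhere overwrite: new data lands in fresh contiguous locations."""
    layout = dict(layout)
    for block in logical_blocks:
        layout[block] = next_free
        next_free += 1
    return layout, next_free


def contiguous_pairs(layout):
    """Count logically adjacent blocks that are also physically adjacent."""
    ordered = [layout[block] for block in sorted(layout)]
    return sum(1 for a, b in zip(ordered, ordered[1:]) if b == a + 1)


# Logical blocks 1..6 start out at hypothetical physical locations 101..106.
layout = {n: 100 + n for n in range(1, 7)}
before = contiguous_pairs(layout)

# Overwrite 2, 4, 6; the new copies are written contiguously at 500, 501, 502.
layout, _ = overwrite(layout, [2, 4, 6], next_free=500)
after = contiguous_pairs(layout)

print(f"contiguous neighbours before: {before}, after: {after}")
print([layout[block] for block in sorted(layout)])
```

                              The new blocks are contiguous with each other, yet reading the file in logical order now jumps between the old and the new region on every block.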

                               

                              It is hard to say whether read_reallocate will help. It happens at the wrong time (i.e. the data is accessed first and only then reallocated). So it depends on the workload; but it will create a permanent additional disk load, which is again hard to quantify.

                               

                              Try to run reallocation more often, maybe every day before backup, and check whether it has a noticeable impact on your system. The official statement is that the reallocation scan runs in the background and has low priority. I have seen multiple people say they used it during normal working hours without any performance impact.

                               

                              Or consider dropping tape backup in favor of snapmirror/snapvault. Maybe consider offloading tape backup to the snapmirror/snapvault destination. This will allow you to reallocate the destination without impact on the source and be more flexible with the backup window.

                              • Re: reallocate TR?
                                KINYUAWANJUGUNA

                                Hi Radek/Aborzenkov,

                                We are currently running reallocate 2 hrs before backup, and it works fine for now. But it seems it's more of a workaround than a real solution.

                                We have not tried read_reallocate; it's poorly documented, and the outcome for our scenario, where at backup time we must read all the data, cannot be predicted.

                                We are exploring the snapshot/NDMP backup option since it seems to be the long-term solution. In the meantime, just to bring you up to speed, this is how the application vendor has recommended we do the backup (it used to work perfectly with the old Sun StorageTek & V890):

                                Steps:

                                1. Run Close Of Business Before Backup

                                2. Run Pre-Batch Backup (basically a tar of the location holding the files)

                                3. Run End Of Day (EoD)

                                4. Run Post-Batch Backup (same tar as in 2, to take care of the changes of EoD in 3)

                                 

                                If I am to use snapshots, our approach will most likely look like this:

                                 

                                1. Run Close Of Business Before Backup

                                Take snapshot_pre-batch

                                2. Run Pre-Batch Backup - From snapshot_pre-Batch

                                3. Run End Of Day (EoD)

                                Take snapshot_post-batch

                                4. Run Post-Batch Backup (use snapshot_post-batch to take care of the changes of EoD in 3)

                                 

                                My question is: since snapshots do not copy data but only keep an inode reference to the original data, will backups using snapshots be any faster?

                                Thanks for your responses.

                                • Re: reallocate TR?
                                  aborzenkov

                                  Creation of a snapshot is almost instantaneous. It takes very little time.

                                   

                                  But if you ask whether using a snapshot to back up to tape will be faster: no, it won’t. Actually, a tape backup using NDMP starts by creating a snapshot, which is then written to tape.
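                                  A toy illustration of why reading through a snapshot is no faster. It assumes a snapshot is simply a frozen copy of the file's block pointers; the block numbers are hypothetical and the model ignores everything else about how snapshots work.

```python
def extents(block_map):
    """Contiguous physical runs met when reading the blocks in logical order."""
    ordered = [block_map[block] for block in sorted(block_map)]
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b != a + 1)


# Hypothetical fragmented file in the active filesystem (logical -> physical).
active = {1: 101, 2: 500, 3: 103, 4: 501, 5: 105, 6: 502}

# Assumption: taking a snapshot copies the pointers, not the data blocks.
snapshot = dict(active)

active_extents = extents(active)
snapshot_extents = extents(snapshot)
print(f"active: {active_extents} extents, snapshot: {snapshot_extents} extents")

# A later overwrite moves block 3 in the active filesystem only.
active[3] = 503
print(f"snapshot still points block 3 at {snapshot[3]}")
```

                                  Because the snapshot references the same physical blocks, a sequential read through it meets the same fragmentation as a read of the active filesystem at the time the snapshot was taken.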

  • Re: reallocate TR?
    aborzenkov

    As indicated by http://support.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=585280, you can also reallocate a directory. Wow!

    reallocate start -o -f -p /vol/<volname>/<dirname>

  • Re: reallocate TR?
    ASUNDSTROM

    According to the Reallocate Best Practices Guide TR-3929 published June 2012:

     

    Deduplication and Compression

    Starting in Data ONTAP 8.1 deduplicated data can be reallocated using physical reallocation or read_realloc space_optimized. Although data may be shared by multiple files when deduplicated, reallocate uses an intelligent algorithm to only reallocate the data the first time a shared block is encountered. Prior versions of Data ONTAP do not support reallocation of deduplicated data and will skip any deduplicated data encountered. Compressed data will not be reallocated by reallocate or read reallocate, and it is not recommended to run reallocate on compressed volumes.

  • Re: reallocate TR?
    danpancamo

    We still run reallocation on large Database/Vmware/VDI volumes to help with "fragmentation"...

     

    I hear that there are new options in Ontap 8.1.2, called Read-Ahead in WAFL and just-in-time CSC (continuous segment cleaning), that may help with fragmentation issues...

     

    All I'm able to find on the subject is the Netapp Patent information and some engineers that worked on the project...


    Can anyone shed some light on CSC?

    • Re: reallocate TR?
      Erick Moore

      CSC has been renamed to FSR (Free Space Reallocate). This setting is an always-on process that optimizes free space in an aggregate to allow WAFL to keep writing fast and efficiently even as the filesystem ages. This process is very similar to what reallocate -A does, only it does it in real time, by removing existing blocks from a soon-to-be-written location (the allocation area) so that our write allocator can lay the data down more efficiently. The existing data blocks are then written to disk alongside user data in a future CP operation.

       

      If you want to enable this on an existing aggregate, please check with NetApp support first or your local SE to make sure you have the necessary headroom. If your existing on disk structure is not optimal there are performance considerations as the FSR process tries to optimize the existing data.

       

      Thanks,

       

      Erick

  • Re: reallocate TR?
    korso

    Free space reallocation of aggregates

    Starting in Data ONTAP 8.1.1, you can enable free space reallocation on aggregates to improve write performance. Free space reallocation improves write performance by optimizing the free space within an aggregate.

     

    Free space reallocation works best on workloads that perform a mixture of small random overwrites and sequential or random reads. You can expect additional CPU utilization when you enable free space reallocation. You should not enable free space reallocation if your storage system has sustained, high CPU utilization.

    Note: You can use the statistics show-periodic command to monitor CPU utilization.

     

    For best results, you should enable free space reallocation when you create a new aggregate. If you enable free space reallocation on an existing aggregate, there might be a period where Data ONTAP performs additional work to optimize free space. This additional work can temporarily impact system performance.

     

    Always check with NetApp Support before enabling this feature...