
Fractional Reservation - LUN Overwrite

Posted by chriskranz on Mar 5, 2009 2:01:07 PM

I seem to get questioned about Fractional Reservation at least once a week, and find myself explaining it over and over. I have now found quite a simple way of explaining it; unfortunately, much of the documentation doesn't make it quite so simple to understand. I’ve got a much better understanding of what it actually is now. It makes more sense since NetApp have changed its description in some places: in Operations Manager 3.7 it’s now referred to as “Overwrite Reserved Space”.

 


This is easiest to explain with pictures. We should all have seen a standard snapshot graphic. When we snapshot the filesystem, we take a copy of the file allocation tables and this locks the data blocks in place. Any new or changed data is written to a new location, and the original versions of the changed blocks are preserved on disk.

 

snap00285.bmp


So basically a snapshot locks the data blocks of the data referenced by it in place. This means that any new or changed blocks (D1 in the above graphic) in the active file-system are written to a different location. This concept is fundamentally the same as what Fractional Reservation is.

 

As the LUN gets filled up with data, we take a snapshot and that data is locked in place. Potentially all this data could change, and we need to guarantee not only this existing data, but also the potential that we need to write totally new data blocks. Any changed data gets written into the Fractional Reservation area rather than into the area that the existing LUN data is in. (I know that in reality this is spread across all the disks and these areas don’t actually exist, but it is easier to visualise and understand when explained this way.) As changed data blocks are written, old data blocks get preserved in the snapshot reservation area. Fractional Reservation reserves space for the maximum rate of change we could potentially get.

 

snap00286.bmp

Don't confuse this with the snapshot reservation area. The snapshot reservation includes saved data blocks from previous snapshots, whereas the Fractional Reservation is protecting your Active File System (AFS in the above graphic) from its own potential rate of change.

 

 

So the reason a LUN may be switched offline if the fractional reservation area is set to 0, is that the filer needs to protect the existing data that is locked between the active file system and the most recent snapshot, plus any additional changes that happen to the active file system. If the volume / LUN / frac-res and snap reserve are full, then this space is not available and the filer needs to take action to prevent these writes from failing. The filer guarantees no data loss, but with no space free and nowhere to write the new data, it has to offline the LUN to prevent the writes from failing.

 

So fractional reservation is in constant use by the filer as an over-write area for the LUN. Without it, you need to make sure that sufficient space is free to allow for the maximum rate of change you would expect. The defaults are good, but if you trim them down you need to monitor the rate of change and make sure the worst-case scenario fits within the buffer of free space that you allow. If you reduce the Fractional Reservation to 0, you need to make sure the rate of change fits within the volume size, or that the volume can auto-grow when required, or even use snap auto-delete to reduce the reserved blocks and free up space (although I am not a huge fan of snap auto-delete for various reasons).
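The rule of thumb above can be written down as a quick check. This is only a toy model of the accounting described in this post, not how WAFL actually tracks blocks; the function names and parameters are made up for illustration, and it assumes that free space, the overwrite reserve and any remaining auto-grow headroom can all absorb overwrites.

```python
def overwrite_headroom_gb(lun_used_gb, frac_reserve_pct, free_gb, autogrow_gb=0.0):
    """Space available for overwriting snapshot-locked LUN blocks (toy model).

    Assumption: the overwrite reserve is frac_reserve_pct of the LUN data
    locked by a snapshot; free space and any remaining auto-grow headroom
    can also absorb overwrites.
    """
    reserve_gb = lun_used_gb * frac_reserve_pct / 100.0
    return reserve_gb + free_gb + autogrow_gb


def survives_change(lun_used_gb, change_pct, frac_reserve_pct, free_gb, autogrow_gb=0.0):
    """True if the worst-case change fits into the available headroom."""
    needed_gb = lun_used_gb * change_pct / 100.0
    return needed_gb <= overwrite_headroom_gb(
        lun_used_gb, frac_reserve_pct, free_gb, autogrow_gb
    )


# 200 GB of LUN data, FR at 0%, 50 GB free, 30% expected change -> 60 GB needed
print(survives_change(200, 30, 0, 50))      # False: 60 GB does not fit in 50 GB
print(survives_change(200, 30, 0, 50, 20))  # True: auto-grow adds another 20 GB
print(survives_change(200, 100, 100, 0))    # True: 100% FR covers a full overwrite
```

The point of the sketch is that with FR at 0 the whole burden falls on free space and auto-grow, so the rate of change has to be known before the reserve is trimmed.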

 

And that is Fractional Reservation!

 

Quick last thoughts... A-SIS won’t make any difference to the Fractional Reservation area as such, but it can help, as the data blocks within the LUN will get de-duped. The Fractional Reservation area per se is still required, as you need this LUN over-write area for changing data. If you reduce the footprint of the non-changing data with A-SIS, you reduce the potential reservation area required. Space savings aren't apparent when you have things thick provisioned. Reducing Fractional Reservation and thin provisioning can be a dangerous game.

 

The most important rule is to monitor and understand your data. If you understand your rate of change, you can tweak a lot of areas of the storage system.

 

 

 

Tags: fractional_reservation, lun_overwrite


Mar 6, 2009 1:41 AM ChrisHolloway says:

Great post.  This document in the resource library:

 

http://media.netapp.com/documents/tr-3483.pdf

 

does a really good job of explaining fractional reserves, snap reserves, thin provisioning etc.

 

Couple of things I wanted to add:

-The fractional reserve (although it is reserved at the time of the snap) is only used if all other space in the volume is gone, in order to allow you to overwrite the LUN.

-If the fractional reserve is less than 100% you can't change the volume guarantee to none.

 

 

Personally, I always reduce the fractional reserve to 0 and use volume auto-grow to protect against LUN overwrites.

 

It is worth pointing out that snapshot autodelete is available in SME and SMSQL, but the trigger for those auto-deletes is usage of the fractional reserve and not the volume, so in this instance I would use a small (around 10%) fractional reserve to allow SnapManager to detect space usage and delete snaps as appropriate.  This would only come into play when the volume has already autosized itself to the maximum size it can.  Don't use snap autodelete on the filer side with any SnapManager product, as they'll get very upset.

 

Cheers

Chris

Mar 6, 2009 9:54 AM chriskranz says in response to ChrisHolloway:

Cheers for the feedback Chris,

 

Another point you reminded me to make is with the SMAI auto-delete stuff. The triggers can sometimes be a bit too slow for it to successfully free up enough space. Snap auto-delete from within SMAI requires deleting not only snapshots, but also backup records and saved log files. This process could take seconds to minutes, and during this time the extra space could be essential (depending on the limits you have set).

 

I'd always set vol auto-grow to give you some padding in there, and I'd agree  with keeping frac-res to 10% in SMAI environments.

 

Also if you change the fractional reservation, yes you can't change the  volume guarantee (at all), so it's best to shrink the volume in actual size, and  use vol auto-grow as needed.

 

Although the documentation does a good job of explaining technically what  Fractional Reservation does, I get a lot of questions about it still. Frac-Res  still seems to be a bit of a dark art, and most people just accept it as a given  overhead, but the more we encourage people to reduce it, the more I think you  need to fully understand what it is, and how it affects the storage  system. Hence why I came up with the above description. Hopefully it adds some  clarity for people.

 

Cheers...



Mar 22, 2009 10:30 AM ianaforbes says in response to ChrisHolloway:

Hi Chris

 

Why does Fractional Reserve (FR) use any available space in the volume for overwrites FIRST, before using the FR space set aside? For example, let's say I have a 500GB volume with a 200GB lun (both guaranteed space). If I leave FR at 100%, the FR space can grow to potentially 200GB. This leaves me with 100GB in the volume for snapshots.

 

Now, let's say that I fill 200GB of my lun and create a snapshot. All of those blocks are locked by the snapshot and FR is 200GB. Now, I overwrite 100GB in the lun. Since all blocks in the lun are locked by the snapshot I need somewhere to guarantee the overwrites.

 

According to your statement, the first place that gets looked to store these overwrites is the available free space in the volume - which is the 100GB free space intended for snapshots. So, the overwrite blocks are written there. That leaves zero space left in the volume. I attempt to take a snapshot. This will fail because there is no room left in the volume. At this point I have to grow the volume to be able to create a snapshot.

 

If the overwrite was allowed to be written into the overwrite space and NOT the available free space (intended for snapshots) we wouldn't have been presented with a full volume.

 

So, why do overwrites get written to available free space FIRST and not directly in the FR space intended for just that?
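The scenario in this question can be walked through with a toy model. It is only a sketch of the behaviour as described in these comments (free space is consumed first, the fractional reserve last), not of real WAFL accounting; the class and its fields are hypothetical.

```python
class ToyVolume:
    """Toy space accounting for one volume holding one space-reserved LUN.

    Assumption (from the comments in this thread): overwrites of
    snapshot-locked blocks consume free space first and only fall back to
    the fractional reserve once free space is exhausted.
    """

    def __init__(self, size_gb, lun_gb, frac_reserve_pct):
        self.reserve_gb = lun_gb * frac_reserve_pct / 100.0
        self.free_gb = size_gb - lun_gb - self.reserve_gb

    def overwrite(self, gb):
        """Overwrite gb of snapshot-locked LUN data; return False if it cannot fit."""
        from_free = min(gb, self.free_gb)
        from_reserve = gb - from_free
        if from_reserve > self.reserve_gb:
            return False  # nowhere left to write: the LUN would go offline
        self.free_gb -= from_free
        self.reserve_gb -= from_reserve
        return True


vol = ToyVolume(size_gb=500, lun_gb=200, frac_reserve_pct=100)
print(vol.free_gb, vol.reserve_gb)  # 100.0 200.0
vol.overwrite(100)
print(vol.free_gb, vol.reserve_gb)  # 0.0 200.0
```

After the 100GB overwrite the model shows no free space left for a new snapshot while the full 200GB reserve is still untouched, which is exactly the situation being asked about.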

 

Thanks

Mar 22, 2009 10:44 AM ianaforbes says in response to ChrisHolloway:

Chris - I always hate trying to size FR when using SnapManager products. I've been wanting to implement the FR monitoring within SnapManager for some time, but haven't been exactly sure how to use it.

 

Could you please explain how to enable and monitor it so if it reaches a threshold I can either autogrow the volume first and/or delete a Snapmanager snapshot? I'm currently only storing a specified number of snapshots via Snapmanager policy settings. How does the snap autodelete co-exist with that policy?

 

Hmmm... When I clicked on Fractional Reservation Monitoring from within the SnapManager Exchange Console I got an error stating, 'Cannot obtain FSR Policies, please verify if the SnapManager for Exchange Monitoring Service is running'.

 

I'd like to also get to a point whereby I define FR to 10% and use FR monitoring to autodelete snapshots AND autogrow the volume, as needed.

Mar 22, 2009 10:59 AM chriskranz says in response to ianaforbes:

The FR area is a safety net to stop your active filesystem from having nowhere to write its data. In your example if you have marked the snap reserve area as 100g (20% of the 500g volume), then no, FR would not overwrite to that area, but may in some cases reserve all the spare space in the volume. If you are using something like SnapDrive and you have reduced your snap reserve to 0% and leave SnapDrive to sort it, then yes, it will use the calculated extra space for the FR. But then that simply means you haven't sized your volume correctly. If you have 50% rate of change, then you need to size the volume accordingly. But through your example, at no point has the filer put your data at risk. Your existing snapshots are protected, and your Active Filesystem is also protected and has a guaranteed area to make new writes. If a new snapshot was allowed, then both these areas could be in danger, at which point the filer would protect against any data loss by offlining the LUN (drastic, but the only way if you have no auto-del / auto-size policies in place).

 

Moving on to your next comment around SnapManager monitoring. As Chris Holloway mentioned, it's best to drop your FR down to 10% and then enable monitoring within SnapDrive and SnapManager. You need to make sure you have the correct versions of both SnapDrive / SnapManager and ONTAP to support this (I'll have to look it up if you want to know). But basically this will monitor the rate of change within the volume, which can be taken from FR (which is why it needs to still exist and not be 0%). If you have a very high rate of change, then 10% may not be enough to trigger an auto-size or auto-delete, so size accordingly (in your example, 50% rate of change is very high!).

 

I am told that the SnapManager Snap Auto-Delete feature can be a little slow in processing, so if you do have a higher rate of change, then it may not kick in quickly enough to give you the required free space. Really these tools should be used as a get out of jail card though, and not a daily routine. The volumes should be sized to accommodate all the snaps you want to store online at any time.

 

Personally I am not a big fan of the snap auto-delete feature. If you have an unexpected high-rate of change, what are the chances that this was caused by something abnormal in the system? If it is abnormal in a bad way, then the one thing you want most of all is as many online backups as possible, so for me, snap auto-delete could be slightly counter productive. But that is just my personal opinion

 

I like volume auto-grow and you can use this outside of any SMAI integration and use it direct either with SnapDrive or on the command line. This is much more reliable and is a great get out of jail free card on those odd occasions where something unexpected happens, but gives you head room to keep your systems online and stable. If you are looking to start playing with reducing FR and using these features, this is where I'd start.

 

On the volume options on the command line you can tell it whether to try vol auto-size or snap auto-delete first, and it will try both if it needs to. You can also tell it to defer deleting user created snapshots too, but that can be counter productive if you have a snapshot that you've forgotten about and is causing the volume to grow!

 

I would agree that the whole process can be quite confusing though. It's taken me awhile, and on some technical sides the above description may not be 100% technically accurate, but I have found that it is the easiest way to visualise and understand what the FR process is all about. Ultimately I think you need to understand that before you can start modifying the levels and policies.

 

I may well work on a follow up blog entry to this that goes over the finer points of space monitoring and the vol auto-size / snap auto-del policies. This is a hot topic at the moment! I went through with a customer the other day and we saved them 6TB simply by dropping the FR on their filer estate!

Mar 22, 2009 11:11 AM ianaforbes says in response to chriskranz:

Hi Chris

 

I should have explained that I do understand FR. I was just interested in your comment about how it uses available free space before actually using the FR space allocated. Let's assume we're dealing strictly with SnapDrive/SME implementations. I set snap reserve to 0% and leave FR at 100% as a best practice. I realize that I also need to set aside available space in the volume for snapshots. I haven't defined that space (i.e. snapshot reserve) but do need to have space available. I suppose you could call it thin provisioning snapshots :-)

 

So, back to my example. I've run all calculations that indicate my lun size is 200GB, FR is 200GB and I've calculated that I require 100GB space for snapshots. Therefore, I size my volume to be at least 500GB.
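The sizing arithmetic in this example is simple enough to sketch. The function name is made up for illustration, and it assumes snap reserve is 0% so that snapshot space is just free space set aside in the volume.

```python
def volume_size_gb(lun_gb, frac_reserve_pct, snapshot_gb):
    """Volume size = LUN + overwrite reserve + space set aside for snapshots."""
    reserve_gb = lun_gb * frac_reserve_pct / 100.0
    return lun_gb + reserve_gb + snapshot_gb


print(volume_size_gb(200, 100, 100))  # 500.0, the example above
print(volume_size_gb(200, 10, 100))   # 320.0 with FR reduced to 10%
```

The second line shows why reducing FR is attractive: the same LUN and snapshot allowance fit in 320GB rather than 500GB, provided the rate of change really is that low.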

 

Now, I go through my previous example. According to your statement, the overwrites will attempt to FIRST write to the available free space in the volume - which is the space I've set aside (not specifically, but "thin provisioned") for snapshots.

 

My example stated a 100GB overwrite. Are you saying that this 100GB overwrite would FIRST look to fill up this available free space in the volume and NOT the FR space (200GB)? That's where I'm a little confused. If it does look to fill up that available free space first, it will consume the 100GB that had been made available for storing snapshots. The volume will report full and need to be grown. Why couldn't the overwrite have been stored in the FR space, leaving the available free space alone? That's my question.

 

Thanks for the other information on the FR monitoring within SnapManager

Mar 22, 2009 11:23 AM chriskranz says in response to ianaforbes:

Okay cool!

 

So yes, you are right, it doesn't make entire sense! Chris Holloway followed up with a comment to highlight the point you are making. The FR is a safety net. So when all else has failed, we can still guarantee writes to your LUN volume. So yes, it will write to the free space (even if you have reserved this in your head for snapshots) for writing data to. In reality we don't need to guarantee 100%, so we can tweak this setting, and it has always seemed a little bizarre the way this works, but it is the best way for the filer to guarantee your data and backups.

 

Back to your example, yes you are completely right, you would run out of space before being able to take the next snapshot. It would probably be more likely to do this if you have snapshots already, but yes, ultimately you will run out of space. Which is why traditionally sizing with FR at 100% can seem very wasteful, and is sometimes a bit of effort to justify. The cost saving of having online backups is usually a good justification, but this is becoming a harder sell, so by reducing the FR, and having dedupe, thin provisioning and everything else, we have a fantastic message and we can really tweak every byte out of the system.

 

Hope this helps? But feel free to post some more

Mar 22, 2009 11:23 AM ianaforbes says in response to chriskranz:

Hehe. I'm a reseller also and do a bunch of SnapManager implementations. The struggle is ALWAYS around FR space. It can really be a pain. The problem with going with the change rate analysis is that many customers have no idea or this is a new solution architecture and there is no data to draw upon for accurate sizing.

 

That's why I usually go with 100% FR to 100% protect the customer and let them know that over time the actual amount will reveal itself and at that point you can scale down from 100% to something more reasonable. Kinda tough to tell them to go with 30% and receive a phone call the next weekend complaining that their luns are offline for some reason.

 

If I could go with something low like 10% and just enable autogrow volume that would be great. Is that pretty much what you do for your implementations? I'd be happy to leave any automatic snapshot deletion out of the picture.

 

Question - Can you have SnapManager monitor the FR and autogrow the volume if needed? Is that functionality all within SnapManager?

Mar 22, 2009 11:29 AM ianaforbes says in response to chriskranz:

Could you post an example of turning on autogrow on a volume? I'm assuming you autogrow by a bit - see if it helps, autogrow a bit more, etc. There must be a "maximum autogrow" setting as well...

Mar 22, 2009 11:33 AM chriskranz says in response to ianaforbes:

Totally understand you! I had a very unfortunate experience with a customer who didn't listen to any of our recommendations and just dropped everything to 0% and left it. Sure enough it all went offline! But they quickly learnt.

 

And I agree totally, it is very difficult to gauge rates of change with new customers. Especially with Exchange as it does online defrags all the time!!! I would generally ask a customer to run a system at 100% for at least a month and then we'd come back and review the setup and make the adjustments as necessary. Sometimes it can highlight poor system design. I still have customers insisting that they need to do entire SQL database reloads every week as that's how their application was coded!!! Crazy!!!

 

FR is sometimes a difficult concept to come in on new, especially if the customer has never used a NetApp system before as it is very different in many of its concepts. So the learning curve tends to be very steep, and so tweaking these settings isn't necessarily recommended for the first timer. But that is the main reason for the above blog post!

Mar 22, 2009 11:42 AM ianaforbes says in response to chriskranz:

Hehe... Sounds like we have similar experiences. I've been working with NetApps since 2000 (pre 7G and trad vols, etc). It's pretty much taken me until recently to fully (well, almost) understand FR, space reservations and the such. I always feel sorry for newbie customers because I put in this new solution for them and they see how the SnapManager software is a real great thing. They just don't comprehend what ONTAP and WAFL are doing in the background.

 

It's tough enough for us to understand - how the heck are customers supposed to get it? They ALWAYS want to ratchet down the space in the volume because they're hung up on how NTFS or EXT3 does file system accounting.

 

In any case, great post and great information. I hope you don't mind me bugging you with questions every now and then. If you could send me a quick example on the autogrow feature (i.e. how to enable and configure it on the volume) I'd appreciate it.

Mar 22, 2009 11:51 AM chriskranz says in response to ianaforbes:

No problem at all buddy, I'm always happy to have like-minded techies around, and the questions help us both learn. I actually have to think about stuff for a change!

 

Okay, quick example of enabling volume autosize

 

Firstly the syntax for you...

b2net-filer01> vol autosize
vol autosize: No volume name supplied.
usage:
vol autosize <vol-name> [-m <size>[k|m|g|t]] [-i <size>[k|m|g|t]]
        [ on | off | reset ]

 

Now I have my Exchange database volume. Currently it is 500g, but I want to enable volume autogrow to increment in 10g chunks to a maximum of 600g.

b2net-filer01> vol autosize exch01_db -m 600g -i 10g on
vol autosize: Flexible volume 'exch01_db' autosize settings UPDATED.

 

I also want to make sure that the filer will try the autosize first and not the snap autodelete (if anyone in NetApp is reading this, it'd be nice to use the same naming conventions throughout: autosize here, volume_grow there).

 

b2net-filer01> vol options exch01_db
nosnap=off, nosnapdir=off, minra=off, no_atime_update=off, nvfail=off,
ignore_inconsistent=off, snapmirrored=off, create_ucode=on,
convert_ucode=on, maxdirsize=20971, schedsnapname=ordinal,
fs_size_fixed=off, guarantee=volume, svo_enable=off, svo_checksum=off,
svo_allow_rman=off, svo_reject_errors=off, no_i2p=off,
fractional_reserve=100, extent=off, try_first=volume_grow

 

And we can see that "try_first" is by default set to volume_grow. Then finally we can verify the setting.

 

b2net-filer01> vol autosize exch01_db
Volume autosize is currently ON for volume 'exch01_db'.
The volume is set to grow to a maximum of 600 GB, in increments of 10 GB.
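To get a feel for how much headroom those settings give, here is a small sketch of the growth steps. It is an illustration only: the function name is made up, and it assumes each grow adds one full increment with the final step capped at the maximum, which the output above does not confirm.

```python
def autosize_steps(current_gb, max_gb, increment_gb):
    """Sizes the volume would pass through if it kept growing to its maximum.

    Assumption: each grow adds one full increment, and the last step is
    capped at the maximum size.
    """
    if increment_gb <= 0:
        raise ValueError("increment_gb must be positive")
    sizes = []
    size = current_gb
    while size < max_gb:
        size = min(size + increment_gb, max_gb)
        sizes.append(size)
    return sizes


steps = autosize_steps(current_gb=500, max_gb=600, increment_gb=10)
print(len(steps), steps[0], steps[-1])  # 10 510 600
```

With the settings used here that is ten grows of 10g before the volume hits its 600g ceiling, after which the next mechanism (snap autodelete, if enabled) would have to take over.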

 

And just to be clear, this process is totally independent of SnapDrive or SnapManager. Ultimately they are going to trigger a snapshot on the filer, and the above mechanism is based on snapshots, so we are all good, and it should work fine. The best thing is that it should be transparent to SnapDrive, SnapManager and the end user!

 

So if you have a new customer that doesn't really get it, set up some policies with enough headroom to cover the unexpected, and make sure the customer is monitoring things at least a little bit. Even if you are keeping FR at 100%, this mechanism really wouldn't hurt. What is more important to the customer, free space on the disks or Exchange / SQL being completely happy?

Mar 22, 2009 11:58 AM ianaforbes says in response to chriskranz:

Cool. Thanks a lot for the example Chris. Where do you configure the trigger for autogrow? For example, if the volume gets to 85% full kick off the autogrow.

Mar 22, 2009 12:11 PM ianaforbes says in response to chriskranz:

Actually, I do have another question to ask. How do you size snapinfo when sizing for SnapManager for SQL? The docs give zero calculations for the snapinfo lun and volume size.

Mar 22, 2009 12:23 PM chriskranz says in response to ianaforbes:

Setting the trigger is a very good question, and I'm going to have to look into it. I believe that the trigger is when a snapshot would fail, and so it triggers the mechanism. I'm not sure if you can define that you want your volume kept to 85%, although you can with snap autodelete funnily enough! The following article - https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb6973 - shows that the setting is defined by "wafl_reclaim_threshold", and I'm just trying to find what/where that actually is!

 

Sizing SnapInfo is fairly straightforward once you know how much log data they generate. The Windows volume: Logs DB size * number of logs to keep online + a bit for system databases and a bit more headroom. The NetApp FlexVol gets sized in a similar way, taking into account how many snapshots you'll be keeping online as opposed to how many logs you will be keeping; the default is 7 days' worth of snaps I believe.
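As a rough sketch of that rule of thumb for the Windows volume: every name and number below is a placeholder for illustration, not a NetApp formula, and it assumes the "bit more headroom" is applied as a percentage on top of the total.

```python
def snapinfo_lun_gb(log_size_gb, logs_kept_online, system_db_gb, headroom_pct):
    """Log size * logs kept online + system databases, plus headroom.

    Assumption: headroom is a percentage added on top of the subtotal.
    """
    base_gb = log_size_gb * logs_kept_online + system_db_gb
    return base_gb * (1 + headroom_pct / 100.0)


# Hypothetical figures: 2 GB of logs per backup, 7 kept online,
# 1 GB for system databases, 20% headroom.
print(round(snapinfo_lun_gb(2, 7, 1, 20), 1))  # 18.0
```

The real inputs have to come from measuring the customer's own rate of logging.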

 

It shouldn't be too tricky to calculate a customer's rate of logging, although remember SQL logs are a database, so they may change when you stick SMSQL in there as you may truncate them more often. In many cases, when you start taking snapshots, you'll want to do a db-shrink the first time to get it down to size. This needs a bit of thinking about and calculating; the way SQL does its logging is sometimes totally lost on customers (and some engineers, I've found).

Mar 22, 2009 12:58 PM ChrisHolloway says in response to ianaforbes:

Ian

 

SnapManagers don't have any visibility of vol autogrow, and as they only monitor FR space, the snap delete from SME/SMSQL only gets triggered when fractional reserve is beginning to be used.  As has already been mentioned by Chris in another comment, FR only gets used when all other space in the volume has been used, and if you're using vol autogrow, this would mean that the volume has been grown to its maximum size.  What I do is set the FR on the volume to be around 10% and set my trigger for deleting snapshots to be around 2-3%.  That way I've got a bit of a buffer if the snap delete from SM does take a little longer.

 

Finally, never use snap auto-delete from within ONTAP if you're using SnapManagers, as SnapManager will get very upset if you start deleting its snaps.

 

Chris

Mar 22, 2009 1:04 PM ianaforbes says in response to ChrisHolloway:

Hi Chris

 

Thanks for the response. I still think it makes more sense for overwrites to occupy the FR space BEFORE looking to use available free space in the volume. It'd be a more efficient use of space IMHO. Kind of doesn't make sense that a volume can be full when there is plenty of room in the FR.

 

Where and how do you set your trigger for deleting the SnapManager snapshots? I see the Fractional Reserve option in SME/SMSQL but I'm unsure how to enable it and make it work.

Mar 22, 2009 1:12 PM ChrisHolloway says in response to chriskranz:

Chris

 

The actual trigger varies according to the size of the volume, with larger volumes having a (slightly) higher trigger.  You can change these, but I've always been told it's a bad idea to do so (though what it would cause I don't know).  These are the defaults (or at least last time I noted them down):

 

Volume size < 20GB: trigger point 85%
Volume size 20-100GB: trigger point 90%
Volume size 100-500GB: trigger point 92%
Volume size 500GB-1TB: trigger point 95%
Volume size > 1TB: trigger point 98%

 

I'll try and find where these are set, I've got it written somewhere.

 

Chris

Mar 22, 2009 1:18 PM ChrisHolloway says in response to ChrisHolloway:

Found it:

 

It's a priv set diag level command.  You can view them with printflag:

 

wafl_reclaim_threshold_t = 85
wafl_reclaim_threshold_s = 90
wafl_reclaim_threshold_m = 92
wafl_reclaim_threshold_l = 95
wafl_reclaim_threshold_xl = 98

 

The letter at the end is the size (tiny, small, medium, large, extra large).
  I have no idea whether changing these would be supported.
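Putting the size bands and the flag names from these two comments together, a lookup might be sketched like this. The values are the defaults quoted above; the exact band boundaries (whether a volume of exactly 20GB counts as tiny or small, and 1TB taken as 1024GB) are assumptions, and the function names are made up.

```python
# Default trigger points as listed in the comments above (percent full).
# Assumption: each band includes its lower bound, and 1 TB is taken as 1024 GB.
RECLAIM_THRESHOLDS = [
    (20, "wafl_reclaim_threshold_t", 85),             # tiny: < 20 GB
    (100, "wafl_reclaim_threshold_s", 90),            # small: 20-100 GB
    (500, "wafl_reclaim_threshold_m", 92),            # medium: 100-500 GB
    (1024, "wafl_reclaim_threshold_l", 95),           # large: 500 GB - 1 TB
    (float("inf"), "wafl_reclaim_threshold_xl", 98),  # extra large: > 1 TB
]


def reclaim_threshold(volume_gb):
    """Return (flag name, trigger percent) for a volume of the given size."""
    for upper_gb, flag, pct in RECLAIM_THRESHOLDS:
        if volume_gb < upper_gb:
            return flag, pct
    raise ValueError("invalid volume size")


def trigger_point_gb(volume_gb):
    """Used space in GB at which the trigger would fire."""
    _, pct = reclaim_threshold(volume_gb)
    return volume_gb * pct / 100.0


print(reclaim_threshold(500))  # ('wafl_reclaim_threshold_l', 95)
print(trigger_point_gb(500))   # 475.0
```

So under these assumptions the 500g Exchange volume from the earlier example would trigger at roughly 475g used.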

 

Chris

Mar 22, 2009 1:20 PM ianaforbes says in response to ChrisHolloway:

Awesome! Thanks Chris!

Mar 22, 2009 1:32 PM ChrisHolloway says in response to ianaforbes:

Ian

 

SnapManager fractional reserve monitoring is under Fractional Reservation -> Policy Settings (I'm looking at SME 6 here).  You need to enable it, and then set the threshold you want to use and how many of the most recent snaps you want to retain.

 

Chris

Mar 23, 2009 8:01 AM ianaforbes says in response to ChrisHolloway:

Hi Chris, I'm running SME 5.0. When I attempt to configure the Fractional Space Reservations Settings I get the following error:

 

Cannot obtain FSR Policies. Please verify if the SnapManager for Exchange Monitoring Service is running. SME UI will attempt to reconnect with the service and refresh the data within 15 seconds.

 

A search on the knowledgebase came up with this:

 

https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb17622

 

I'm running ONTAP 7.3. I also don't see a windows service called SnapManager Monitor Service. Do you know what's going on?

Mar 24, 2009 2:31 PM chriskranz says in response to ianaforbes:

What version of SnapDrive are you using?

Mar 24, 2009 2:32 PM chriskranz says in response to chriskranz:

Matt Robinson has posted a great blog article which is probably a good follow-on for anyone reading through this - http://communities.netapp.com/blogs/ServiceBytes/2009/03/23/fsr-in-action

Mar 24, 2009 4:06 PM BrendonHiggins says in response to chriskranz:

A great post and I now understand FR enough not to be filled with fear when asked to explain it again.   Will point others this way.

 

Brendon

Apr 23, 2009 8:08 PM david.edborg says in response to BrendonHiggins:

Great info.  What I think would be helpful to further understand the situation is an explanation of why LUN overwrites need to be treated differently than NFS or CIFS overwrites.  Along with an explanation of SnapDrive space reclamation and an explanation of the rumors that Oracle has an overwrite fix in the wings.

Jun 23, 2009 6:31 AM marcconeley says in response to david.edborg:

(please ignore)

Jun 25, 2009 10:20 AM radek.kubka says:

In case someone peeks in here sooner than into my post - http://communities.netapp.com/message/12493#12493

 

Can frequent use of reallocate destroy this rosy picture (of 0%/few% FR being good enough)?

Oct 7, 2009 11:15 AM Sebastian.Goetze says:

My personal definition for Fractional Reserve and Snapshot Reserve  when I'm teaching courses about it is:

 

Snapshot Reserve:   Amount of space reserved for Snapshots to ensure backup functionality. AFS can't grow into it.

Fractional Reserve:  Amount of space reserved for the Active File System to ensure Host/Client functionality. Snapshots can't grow into it.

 

They neatly complement one another and my students seem to get the point fairly fast.

One important point to note is that this whole business of the FR space being used 'last' (seemingly having been tucked away before) makes it seem as if there's a certain 'place' pre-assigned for it, which isn't the case. It's just bookkeeping and we're making sure that the snapshots will never use more space than they're allowed to. In other words:

 

Snapshot Reserve:  Space reserved for Snapshots

Fractional Reserve: Space reserved for Not-Snapshots (the AFS...)

 

Hope that helped

 

Sebastian

Oct 7, 2009 11:24 AM chriskranz says in response to Sebastian.Goetze:

That's a decent high level view of FR, but not sure if it's too technically accurate.

 

The AFS is protected by the fact of it being a LUN, but if your snapshots get out of hand, they can indeed impinge on the available space that the AFS has to write into. That's a key area where fractional reserve gives you some head room. Snapshots can, and do regularly grow into the Fractional Reserve space.

 

Fractional Reservation is really space for the AFS to over-write existing data, which is why Operations Manager terms it the LUN Over-write space.

 

There is a pre-assigned place for it, but as with all things NetApp, this is just an area of space that could exist anywhere on the disk. Disk space assignment is always abstracted from physical placement on the disks, so in NetApp terms the space for FR and Snaps is definitely pre-assigned, just not physically. It is similar to memory / CPU reservations in VMware, where space is reserved, but not as physical blocks, just as the assumption that a certain quantity is not used by other areas.

 

Sorry to get granular with that, I think I'm splitting hairs a little. I've gotten wrapped up in FR a lot the past year or so, and I tend to get stuck on the detail!

Oct 7, 2009 12:46 PM Sebastian.Goetze says in response to chriskranz:

Hmmmmm,

"Snapshots can, and do regularly grow into the Fractional Reserve space."

The way I'm teaching it (and the way it's documented), snapshot creation would fail if the space left after the snapshot locks the blocks of the current active file system is less than the FR space.

 

The only way I can imagine the snapshots 'growing into the FR' is when (at the time of taking the snapshot) everything was still OK, but afterwards the AFS grew (and thereby also the FR) and the snapshots are now using up space that the (now bigger) FR 'should have protected'. Can't blame the snapshot, though. It's just that by definition the FR is a moving target (until reaching its maximum size of x% of the volume size as opposed to x% of the AFS size...).

 

Was that what you observed?

 

Sebastian

 

P.S. And yes, it's high level, but catchy... 

Oct 7, 2009 1:14 PM chriskranz says in response to Sebastian.Goetze:

Definitely agree with your description of events. I've seen it happen in a few other odd cases, but generally you're right, it's caused by the AFS growing quicker than expected and as such using up extra space, or causing the FR to reserve more space than expected.

 

I agree the simpler you can explain FR, the better and easier it is to understand. But a lot of the rules around FR and snapshots tend to be a little flexible, and tend to be at the whim of the AFS. Ultimately the Active File System is the most important thing to the filer, and the filer gives this ultimate priority.