(Please, don’t leave! This really is interesting. Seriously! I promise!)
Let’s start with a riddle: “If I lock down my electronic data so that it can’t be modified or deleted, how can I get the benefits of storage efficiency from NetApp deduplication and compression? Aren’t those two concepts mutually exclusive?”
Normally, this would be a riddle worthy of the Sphinx. But thanks to NetApp’s Unified Storage architecture, you CAN get both deduplication and compression benefits with our immutable storage solution, SnapLock. SnapLock offers WORM protection for your data at the volume level. This is not “protection from worms (of the virus variety),” but “immutable storage”: WORM stands for Write Once, Read Many, and is the technology most readily seen in CDs and DVDs. In the business world, compliance-related archives used to be kept on optical platters, which used similar technology. Today, NetApp offers WORM functionality on magnetic disk drives via a combination of software and hardware to ensure that the data can’t be modified or deleted prior to a specified retention date. We even include a unique, separate “compliance clock” to prevent a less-than-honest person from resetting the system time to the future in order to delete data.
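To make the compliance clock idea concrete, here is a toy sketch in Python. It is not how SnapLock implements its clock (this post doesn't go into that); `ComplianceClock` and `can_delete` are hypothetical names made up for illustration. The only point is that the retention check consults a separate clock that can only move forward by elapsed time, so changing the system time has no effect.

```python
class ComplianceClock:
    """Hypothetical clock that only advances by elapsed seconds.

    It is never set from the system time, so pushing the system
    clock into the future cannot shorten a retention period.
    """

    def __init__(self, start: int) -> None:
        self._now = start

    def tick(self, seconds: int = 1) -> None:
        # Only forward movement is allowed.
        if seconds < 0:
            raise ValueError("the compliance clock cannot run backwards")
        self._now += seconds

    @property
    def now(self) -> int:
        return self._now


def can_delete(retain_until: int, clock: ComplianceClock, system_time: int) -> bool:
    """Allow deletion only once the compliance clock reaches the retention date.

    system_time is accepted but deliberately ignored (an assumption of this sketch).
    """
    return clock.now >= retain_until


clock = ComplianceClock(start=0)
# A forged system time far in the future does not unlock the data.
print(can_delete(retain_until=100, clock=clock, system_time=10**9))
clock.tick(100)
# Only real elapsed time on the compliance clock does.
print(can_delete(retain_until=100, clock=clock, system_time=0))
```

The design choice worth noticing is that the deletion check never reads the system time at all.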
While other vendors offer similar solutions, NetApp is unique because SnapLock is not offered on a standalone platform, but is turned on via license key on any of our FAS systems. So, you can add it later to an existing system, as well as mix and match SnapLock and non-SnapLock volumes in the same unit. And because it’s fully integrated into our operating system, both our highly efficient block-level deduplication and our compression work with it. So, customers using an archive or Enterprise Content Management (ECM) application can benefit from our storage efficiency even for compliance data. In addition, only block-level deduplication on primary data can save space for applications that manage document versioning, since every new version of a document requires the application to store an entirely new copy of the file. Customers in those environments see up to 50% deduplication, and that same level of deduplication is supported with SnapLock as well.
Which gets us back to the riddle – how can we deduplicate data that is supposed to be immutable?!?!? Well, first you have to start with “How NetApp Deduplication Works - A Primer,” by Dr. Dedupe (Larry Freeman). From there you’ll see that what’s key to a file in the NetApp world are the “inode pointers” – that’s what our WAFL file system uses to reference the actual blocks of data (it’s also the key to NetApp Snapshot technology in general). With deduplication, when a unique block is written for the first time, we fingerprint the block and then avoid writing a block with the same fingerprint in the future. So, for SnapLock, the first block is written and “locked down,” which from our perspective means that we lock down the pointer and the block of data for the duration of the retention period. For future files containing the same block of data, we create a many-to-one relationship of pointers to the same block of data, and then retain the individual pointers until their respective retention dates expire. When all have expired, the data block can then be deleted.
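The many-to-one pointer idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not WAFL or SnapLock code: `WormStore` and its methods are hypothetical, SHA-256 stands in for whatever fingerprint the real system uses, retention dates are plain integers, and each "file" is a single block.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class WormStore:
    """Toy model: deduplicated blocks, each pointer with its own retention date."""

    blocks: dict = field(default_factory=dict)    # fingerprint -> block bytes
    pointers: dict = field(default_factory=dict)  # fingerprint -> {file name: retain_until}

    def write(self, name: str, data: bytes, retain_until: int) -> str:
        fp = hashlib.sha256(data).hexdigest()  # assumption: SHA-256 as the fingerprint
        # Store the block only the first time this fingerprint is seen.
        self.blocks.setdefault(fp, data)
        # Every file still gets its own locked pointer to the shared block.
        self.pointers.setdefault(fp, {})[name] = retain_until
        return fp

    def delete(self, name: str, fp: str, now: int) -> bool:
        """Release one pointer; refuse while its retention period is still running."""
        refs = self.pointers.get(fp, {})
        if name not in refs:
            raise KeyError(name)
        if now < refs[name]:
            return False  # this pointer is still under retention
        del refs[name]
        if not refs:
            # The last pointer has expired, so the shared block can go too.
            del self.pointers[fp]
            del self.blocks[fp]
        return True


store = WormStore()
fp = store.write("contract_v1.doc", b"identical block", retain_until=100)
store.write("contract_v2.doc", b"identical block", retain_until=200)
print(len(store.blocks))                              # one physical block, two pointers
print(store.delete("contract_v1.doc", fp, now=150))   # expired pointer is released
print(store.delete("contract_v2.doc", fp, now=150))   # still retained, refused
print(fp in store.blocks)                             # block survives until all pointers expire
```

In this sketch the block is never freed while any pointer remains, which is the property that lets deduplication and immutability coexist.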
Hard to visualize? No worries; see the videos on our NetApp YouTube channel for a play-by-play description.
Finally, the more compliance-savvy readers may ask, “OK, sounds reasonable, but does it meet very stringent regulatory requirements, like SEC 17a-4(f)?” Well, as you may already have guessed, the answer is a resounding “YES.” Back in 2009, Dr. Dedupe wrote a short blog post about a Cohasset evaluation, and an updated version (with some technical revisions) of the Cohasset Technical Report has just been released. Cohasset concluded: “the NetApp SnapLock Compliance Storage Solution and NetApp Deduplication capabilities meet all of the requirements for which they are directly responsible as they relate to the recording, identification, retention protection and availability of electronic records with a time-based retention period as specified in SEC Rule 17a-4(f).”
(Still there? Thought so!)