A friend of mine asked whether he could string together a bunch of flash-based iPods and build an enterprise storage array. While this seems like a crazy idea (imagine the mountain of discarded white earbuds), there is no question that flash memory in some form will become a big part of enterprise storage. Flash is fast – in IOPS, it falls somewhere between DRAM and 15k rpm disk. It is also expensive – at least 10x the cost/GB of the fastest disk, but prices are falling fast. The performance assures that flash will be of great benefit when used to store the most active data on an array. The cost will determine just how much of the disk market ultimately converts to flash. No matter what, it will be there in a very important way.
Flash will emerge in several forms in enterprise storage. Enterprise-quality SSDs are becoming available now as an alternative to 15k rpm disks in storage shelves. While performance varies greatly across vendors and read/write mix, they are very fast – 5,000 IOPS and up vs. about 300 IOPS for a 15k FC disk. This means a few SSDs will deliver more IOPS than a full shelf of partially filled 15k drives. They take less power and less space, too. NetApp is in the process of certifying enterprise-grade SSDs that you can use in our existing storage shelves.
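The arithmetic behind "a few SSDs beat a full shelf" is easy to check. Here is a back-of-the-envelope sketch in Python using the IOPS figures above; the shelf size of 14 drives is an assumption for illustration, not a spec.

```python
import math

# Figures from the text above; shelf size is an assumed example.
ssd_iops = 5_000        # one enterprise SSD (and up)
fc_disk_iops = 300      # one 15k rpm FC disk
shelf_disks = 14        # hypothetical fully populated shelf

shelf_iops = shelf_disks * fc_disk_iops          # 4,200 IOPS for the whole shelf
ssds_to_match = math.ceil(shelf_iops / ssd_iops)  # SSDs needed to equal it

print(f"Shelf of {shelf_disks} x 15k disks: {shelf_iops} IOPS")
print(f"SSDs needed to match: {ssds_to_match}")
```

On these numbers, a single SSD out-delivers the entire shelf – which is exactly why partially filled shelves of short-stroked drives start to look expensive.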
But flash is memory, and it is fast enough to serve as a layer of cache in a storage system. Imagine having a terabyte or more of very fast, low-latency cache holding the most frequently accessed data in your array. NetApp is shipping a plug-in DRAM cache card today, and we will offer a version using flash chips next year.
One compelling advantage of the cache approach is that you don’t have to manage another “tier” of storage – the system automatically promotes your most active data blocks into the fast flash storage.
This makes many disk-data-placement science projects unnecessary, since the most active data will remain in the large flash cache. Not just the data you ‘think’ will be hot – the data that actually is hot. Manually planning disk data placement for performance was fun in the ’80s, but the customers I talk to care much more about saving time and increasing the agility of their infrastructure than about mastering the eccentricities of their storage systems.
In addition, that cache can be deduplicated so that it won’t fill up with identical blocks from multiple VMware images (NetApp does this today). If you define a policy that certain data volumes are more important, they can either be pre-loaded into cache or designated never to be evicted. Or you can pin metadata in cache ahead of data. There are lots of ways to optimize here using policy, not people.
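To make the dedupe-plus-pinning idea concrete, here is a toy sketch in Python: a read cache keyed by block content hash (so identical blocks from cloned VM images are stored once) with a pin flag that protects blocks from eviction. This is a hypothetical illustration of the concept only, not NetApp's implementation.

```python
import hashlib
from collections import OrderedDict

class FlashCacheSketch:
    """Toy deduplicated read cache with pinning (illustrative only)."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # content hash -> (data, pinned)

    def insert(self, data, pinned=False):
        key = hashlib.sha256(data).hexdigest()
        if key in self.blocks:
            # Dedupe: an identical block is already cached; just refresh LRU order.
            self.blocks.move_to_end(key)
            return key
        while len(self.blocks) >= self.capacity:
            # Evict the least recently used UNPINNED block.
            victim = next((k for k, (_, p) in self.blocks.items() if not p), None)
            if victim is None:
                raise RuntimeError("cache is entirely pinned")
            del self.blocks[victim]
        self.blocks[key] = (data, pinned)
        return key

# Two identical "VM boot" blocks consume one cache slot, and the pinned
# block survives later evictions.
cache = FlashCacheSketch(capacity_blocks=2)
k_boot = cache.insert(b"shared boot block", pinned=True)
cache.insert(b"shared boot block")       # deduped: no new slot used
cache.insert(b"working set A")
cache.insert(b"working set B")           # evicts A, never the pinned block
```

The policy knobs in the post map directly onto this sketch: "pre-load" is calling `insert` ahead of demand, and "never kicked out" is the pin flag.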
For the next few years, you won’t be using a lot of flash capacity in your systems, and not just because of the cost. At 10x or more the IOPS rate of hard disks, it takes only a small number of SSDs in disk slots to saturate the performance of the array controller. It’s like trying to fly a model airplane in your living room – you’ll run into a system performance wall long before you hit capacity limits. This is another reason that flash as cache is economically efficient – it puts the necessarily small amount of very fast storage at the point in the architecture where its performance can best be exploited.
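The saturation point can be sketched with the same back-of-the-envelope style. The controller ceiling below is purely an assumed number for illustration; only the per-device rates come from the figures earlier in the post.

```python
# Assumed array-controller ceiling (hypothetical); per-device IOPS from above.
controller_limit_iops = 50_000
ssd_iops = 5_000
disk_iops = 300

ssds_to_saturate = controller_limit_iops // ssd_iops    # handful of SSDs
disks_to_saturate = controller_limit_iops // disk_iops  # shelves full of disks

print(f"SSDs to saturate the controller:  {ssds_to_saturate}")
print(f"15k disks to saturate it:        {disks_to_saturate}")
```

On these assumed numbers, ten SSDs hit the same wall that would take more than 160 spinning disks – hence only a small amount of flash capacity fits usefully behind one controller.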
Flash is hot. While there is probably more smoke than fire right now, it will definitely produce significant improvements in enterprise storage and application performance. SSDs will be the first wave and will be easy to plug in. But the real innovation will be in how enterprise array designs adapt to embrace flash. Then the fun starts.