Here is a quick and simple new way to look at storage: stop buying flash arrays that offer a bunch of bells and whistles. There are two main reasons: 1. it increases your $/TB and 2. it locks you into their platform. Let's dive deeper.
1. If you go out and buy an All Flash Array (AFA) from one of the 50 vendors selling them today, you will see a wide spectrum, not just in the media (eMLC, MLC, cMLC) but also in features and functionality. These vendors are all scrambling to put in as many features as possible in order to reach a broader customer base. That leaves you, the customer, checking which AFA has this or is missing that, and it can become an Excel Pivot Table from hell to manage. The vendor will then raise the price per TB on the argument that the extra features give you more usable storage or better data protection. But the reality is you are paying the bills for those developers who are coding the new shiny feature in some basement. That added cost is passed down to the customer and does increase your purchase price.
2. The more features you use on a particular AFA, the harder it is to move to another platform if you want a different system. This is what we call ‘stickiness’. Vendors want you to use their features more and more so that when they raise prices or want you to upgrade, it is harder for you to look elsewhere. If you have an outage, or something happens and your boss comes in and says “I want these <insert vendor name> out of here”, are you going to say, well, the whole company runs on that and it's going to take about 12-18 months to do that?
I bet you're thinking, well, I need those functions because I have to protect my data, or I get more storage out of them because I use this function. But what you can do is take those functions away from the media and bring them up into a virtual storage layer above it. This way you can move dumb storage hardware in and out as needed, and choose it based on price and performance rather than features and functionality. With the higher functionality in the virtual layer, the AFA can be swapped out easily, which lets you always look for the lowest-priced system based solely on performance.
Now you're thinking about the cost of licenses for this function and that feature in the virtualization layer, and how that is just moving the numbers around, right? Wrong! For IBM Spectrum Virtualize you buy a license for so many TBs and that license is perpetual. You can move storage in and out of the virtualization layer and you do not have to increase the number of licenses. For example, you purchase 100TB of licenses and you virtualize a 75TB Pure system. Your boss comes in and says, I need another 15TB for this new project that is coming online next week. You can go out to your vendors, choose a dumb AFA and insert it into the virtual storage layer. That puts you at 90TB, still under your 100TB license, and you still get all of the features and functions you had before. Then a few years go by and you want to replace the Pure system with a nice IBM flash system. No problem: with ZERO downtime you can insert the FlashSystem 900 under the virtual layer and migrate the data to the new flash, and the hosts do not have to be touched.
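To make the arithmetic in that example concrete, here is a minimal sketch in Python. It is only a toy model: the class name, the method names and the idea of raising an error when you go past the licensed capacity are all assumptions made for illustration, not how IBM Spectrum Virtualize actually tracks or enforces licenses.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualStorageLayer:
    """Toy model (hypothetical): one capacity license, many swappable arrays."""
    licensed_tb: int
    arrays: dict = field(default_factory=dict)  # array name -> virtualized TB

    def used_tb(self) -> int:
        # Total capacity currently sitting under the virtual layer.
        return sum(self.arrays.values())

    def headroom_tb(self) -> int:
        # Licensed capacity that is not yet consumed.
        return self.licensed_tb - self.used_tb()

    def add_array(self, name: str, capacity_tb: int) -> None:
        # Assumption: we simply refuse anything beyond the licensed TBs.
        if capacity_tb > self.headroom_tb():
            raise ValueError(f"{name} needs {capacity_tb}TB, "
                             f"only {self.headroom_tb()}TB of license left")
        self.arrays[name] = capacity_tb


# The numbers from the example above: 100TB licensed, 75TB Pure, 15TB extra.
layer = VirtualStorageLayer(licensed_tb=100)
layer.add_array("pure", 75)
layer.add_array("commodity-afa", 15)
print(layer.used_tb(), layer.headroom_tb())  # prints: 90 10
```

The point of the model is that the license is attached to the layer, not to either box, so the second array does not come with a second set of feature licenses.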
The cool thing that I see with this kind of virtualization layer is the simplicity of not having to know how to program APIs, or have a bunch of consultants come in for some long drawn-out study and then tell you to go to ‘cloud’. In one way this technology is creating a private cloud of storage for your data center. But the point here is that not having to buy licenses for features every time you buy a box lowers your $/TB and gives you true freedom to shop the vendors.
Cloud is changing the storage business in more ways than just price per unit. It is fundamentally changing how we design our storage systems and how we deploy, protect and recover them. For those fortunate companies that are just starting out, the cloud is easy, as there are no legacy systems or tried and true methods; everything has always been on the ‘cloud’.
For most companies that are trying to find ways to cut their storage cost while keeping some control of their storage, cloud seems to be the answer. But getting there is not an easy task, as most have seen. Data has to be transferred, code has to be rewritten, and systems and processes all have to be changed, just to report back to the CIO that they are using the cloud.
Now there are many ways to get to the cloud but one that I am excited about is using technology originally deployed back in the late 90s.
GPFS (errr, $1 in the naughty jar) Spectrum Scale is a parallel file system that can spread data across many different tiers of storage. From flash to spinning drives to tape, Scale can ease storage administration through policy-based movement of data. This movement is driven by file metadata: data is written, moved and deleted according to policies set by the storage admin.
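To give a feel for what policy-based movement on metadata means, here is a small Python sketch. It is not Spectrum Scale's own policy rule syntax; the `FileMeta` record, the `choose_tier` function and the 30-day and 365-day thresholds are all hypothetical values picked for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileMeta:
    """Hypothetical metadata record; real file systems track far more."""
    path: str
    last_access: datetime


def choose_tier(meta: FileMeta, now: datetime) -> str:
    """Pick a tier purely from metadata (thresholds are assumptions)."""
    age = now - meta.last_access
    if age <= timedelta(days=30):
        return "flash"   # hot data stays on the fastest media
    if age <= timedelta(days=365):
        return "disk"    # cooler data moves to spinning drives
    return "tape"        # cold data goes to the cheapest tier


now = datetime(2016, 6, 1)
files = [
    FileMeta("/projects/q2/report.dat", datetime(2016, 5, 25)),
    FileMeta("/projects/2015/audit.dat", datetime(2015, 11, 1)),
    FileMeta("/archive/2012/logs.tar", datetime(2012, 3, 14)),
]
placement = {f.path: choose_tier(f, now) for f in files}
```

The admin writes the rule once, and every file is placed by its metadata rather than by someone moving it by hand.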
So how does this help you get to the cloud? Glad you asked. IBM released a new plug-in for Scale that treats the cloud as another tier of storage. This could be from multiple cloud vendors like IBM Cleversafe, IBM SoftLayer, Amazon S3 or a private cloud (think OpenStack). The cloud provider is attached to the cloud node over Ethernet, which allows your Scale system to either write directly to the cloud tier or move data there as it ages/cools.
This will do a couple of things for you.
- Because the policy looks at the last read date, data that you still need to keep but are highly unlikely to read again can be moved automatically to the cloud. If a system needs the file/object, there is no re-coding to be done as the namespace doesn’t change.
- If you run out of storage and need to ‘burst’ out because of some monthly/yearly job, you can move data around to help free up space on-prem or write directly out to the cloud.
- Data protection such as snapshots and backups can still take place. This is valuable to many customers: they know the data doesn’t change often, but they like the idea that they do not have to change their recovery process every time they want to add new technology.
- Cheap Disaster Recovery. Scale does have the ability to replicate to another system, but as these systems grow beyond multiple petabytes, replication becomes more difficult. For the most part, what you are going to need to recover is the most recent (~90 days) data that runs your business. Inside of Scale is the ability to create mirrors of data pools. One of those mirrors could be the cloud tier, where your most recent data is kept in case there is a problem in the data center.
- It allows you to start small and work your way into a cloud offering. Part of the problem some clients have is they want to take on too much too quickly. Because Scale allows customers to have data in multiple clouds, you can start with a larger vendor like IBM and then, when your private cloud on OpenStack is up and running, you can use them both or just one. The migration would be simple as both share the same namespace under the same file system. This frees the client from having to make changes on the front side of the application.
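The idea that runs through these points, that data can move to a cloud tier while the application keeps using the same path, can be sketched roughly like this in Python. The `TieredNamespace` class and its methods are hypothetical, and the two dictionaries stand in for on-prem storage and a cloud object store; this is not the plug-in's real interface.

```python
class TieredNamespace:
    """Toy model (hypothetical): one namespace, two places data can live."""

    def __init__(self):
        self._local = {}  # path -> bytes held on-prem
        self._cloud = {}  # path -> bytes held in the cloud tier

    def write(self, path: str, data: bytes) -> None:
        self._local[path] = data

    def migrate_to_cloud(self, path: str) -> None:
        # Move the data; the path the application uses does not change.
        self._cloud[path] = self._local.pop(path)

    def location(self, path: str) -> str:
        return "local" if path in self._local else "cloud"

    def read(self, path: str) -> bytes:
        # The caller never says which tier; the namespace resolves it.
        if path in self._local:
            return self._local[path]
        return self._cloud[path]


ns = TieredNamespace()
ns.write("/finance/2014/ledger.csv", b"old but still needed")
ns.migrate_to_cloud("/finance/2014/ledger.csv")
data = ns.read("/finance/2014/ledger.csv")  # same path as before the move
```

The application code calling `read` is identical before and after the migration, which is the "no re-coding" benefit described in the list above.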
Today this feature is offered as an open beta only. General availability is coming soon; the team is still tweaking and fixing some bugs before release. Here is the link to the DevWorks page that goes into more about the beta and how to download a VM that will let you test these features out.
I really believe this is going to help many of my customers move into that hybrid cloud platform. Take a look at the video below and how it can help you as well.