I have a closet in my house where I keep all kinds of computer gear. Most of it is from some fun project I was working on or a technology that is past its prime. There is everything from Zip drives to coax terminators to an Ultra-Wide SCSI interface for an external CD-ROM. Why do I keep these things in a box in a closet? Great question, and it usually comes up once a year when some family member sticks their head in there looking for a toy or a coat, or just looking to make a point.
But on more than one occasion I have gone to the closet of ‘junk’ for something that helped me complete a project: a Cat5 cable for my son’s computer, a spare wireless mouse when my other one died. Yes, I could go through it all, sort it out and come up with some nice labels, but that takes time. It’s just easier to close the container lid and forget about it until I realize I need something, and it’s easy enough to grab.
Now this is not a hoarding issue like those you see on TV, where people fill their house, garage, sheds and barns with all kinds of things. Those people have taken the ‘collecting’ business to another level, and some call them ‘hoarders’. But if you watch shows like “American Pickers” on the History Channel, you will notice that most of the ‘hoarders’ know what they have and where it is, a metadata-level knowledge of their antiques.
When you look at how businesses store their data today, most are keeping as much as possible in production. Some of that data no longer serves a real purpose, but storage admins are too gun-shy to hit the delete button for fear of some VMware admin calling up to ask why their Windows NT 4 server is not responding. If you have tools that can move data around based on its age or last-accessed time, you have made a great leap toward real savings. But the older ILM systems cannot handle the growth of unstructured data in 2017.
Companies want to be able to create a container for the data and not have to worry whether the data is on-prem or off-prem, on disk or on tape. Set it and forget it is the basic rule of thumb. But this becomes difficult because data has many different values depending on who you ask. A two-year-old invoice is not as valuable to someone in Engineering as it is to the AR person using it as the basis for the next billing cycle.
One of the better ways to cut through the issue is a flexible platform that can move data from expensive flash down to tape and cloud without changing the way people access it. If users cannot tell where their data is coming from and do not have to change the way they get to it, why not put the cold data on something low-cost like tape or cloud tape?
This type of system can be accomplished by using the IBM Spectrum Scale platform. The file system has a global namespace across all of the different types of media and can even use the cloud as a place to store data without changing the way the end user accesses it. The file movement is policy based, which lets admins skip asking users whether the data is needed; it is simply moved to a lower-cost tier as it gets older and colder. The best part is that, under a newer licensing scheme, customers only pay the per-TB license for data that is on disk and flash. Any data that sits on tape does not contribute to the overall license cost.
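As a conceptual sketch only (real Spectrum Scale placement and migration rules are written in its own SQL-like policy language, not Python), the age-based decision that a policy engine automates looks something like this; the thresholds and pool names here are made-up examples:

```python
from datetime import datetime, timedelta

# Hypothetical age thresholds (in days) mapped to storage pools.
# Illustration only -- these are not actual Spectrum Scale defaults.
TIER_RULES = [(30, "flash"), (180, "disk")]
COLD_POOL = "tape"  # anything older falls through to the cheapest tier

def target_pool(last_access: datetime, now: datetime) -> str:
    """Pick the pool a file should live in based on how cold it is."""
    age_days = (now - last_access).days
    for max_age, pool in TIER_RULES:
        if age_days < max_age:
            return pool
    return COLD_POOL
```

A periodic policy scan applying rules like these is what lets admins demote cold data on a schedule instead of asking each user whether a file is still needed.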
For example: take 500TB of data, 100TB of which is less than 30 days old and 400TB of which is older. Stored on a Spectrum Scale file system, you only have to pay for the 100TB on disk, not the 400TB on tape. This greatly reduces the cost to store data without taking features away from our customers.
For more great information on IBM Spectrum Scale, follow this link and catch up.
Currently, I am working with a customer on their archive data, and we are discussing which is the better medium for data that never gets read back into their environment. They have about 200TB sitting on Tier 1 that is not being accessed, ever. The crazy part is that this data is growing faster than the database that is actually used by their main application.
This is popping up more and more as unstructured data eats up storage systems while being used very infrequently. I have heard this called dark data or cold data. In this case it’s frozen data.
We started looking at what it would cost them over a 5-year period to store their data on both tape and cloud. Yes, that four-letter word is still a very good option for most customers. We wanted to keep the exercise simple, so we agreed that 200TB would be the size of the data and that there would be no recalls. Most cloud providers charge extra for recalls, and the tape system doesn’t have that extra cost, so leaving recalls out kept it an apples-to-apples comparison. As close as we could get, anyway.
For the cloud we used Amazon Glacier pricing which is about $0.007 per GB per month. Our formula for cloud:
200TB X 1000GB X $0.007 x 60 months = $84,000
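The flat case is straightforward enough to sanity-check in a couple of lines of Python (using the post's numbers and the 1TB = 1000GB convention):

```python
# Flat 5-year cloud cost: 200TB at $0.007 per GB per month, no growth, no recalls.
GB_PER_TB = 1000
rate = 0.007   # $ per GB per month, the Glacier-class price quoted above
months = 60

cost = 200 * GB_PER_TB * rate * months
print(round(cost))  # 84000
```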
The tape side of the equation was a little trickier, but we decided we would just compare the tape media and tape library. I picked a middle-of-the-road tape library and the new LTO7 media.
Tape Library TS3200 street price $10,000 + 48 LTO7 tapes (@ $150 each) = $17,200
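The same sanity check for the tape side (48 cartridges at 6TB native each gives 288TB of raw capacity, comfortably over the 200TB requirement):

```python
# Tape hardware only: one TS3200 library plus 48 LTO7 cartridges at $150 each.
library_cost = 10_000
tape_cost = 48 * 150   # 48 x 6TB native = 288TB raw capacity

total = library_cost + tape_cost
print(total)  # 17200
```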
We then looked at the ability to scale and what would happen if we factored in their growth rate. They are growing at 20% annually, which translates to 40TB a year. Keeping the same platforms, what would their 5-year cost be? Cloud was:
Sum over 60 months of (200TB + 3.33TB of growth x months elapsed) x 1000GB x $0.007 = $125,258
Tape was calculated at:
$10,000 for the library + (396TB / 6TB LTO7 capacity = 66 tapes) x $150 per tape = $19,900
We all hear how cloud is so much cheaper and easier to scale, but after doing this quick back-of-the-napkin math I am not so sure. I know what some of you are saying: we didn’t calculate the server costs or the 4 FTEs it takes to manage a tape system. I agree this is basic, but in this example this is a small to medium-sized company trying to invest money in getting their product off the ground. The tape library is fairly small and should be a set-it-and-forget-it type of solution. I doubt there will be much more overhead for the tape solution than for the cloud. Maybe not as cool or flashy, but for the roughly $100,000 saved over 5 years they could buy their 5-person IT staff a $100 lunch every day, all five years.
So to those who think tape is a four-letter word, that thing in the corner that no one wants to deal with, I say embrace it and squeeze the value out of it. Most IT shops still have tape and can show their financial teams how to lower their costs without putting their data at risk in the cloud.
As is customary for many bloggers, at the end of the year we take time to reflect on the past year and make some predictions for the next. This is always fun because I get a chance to see what people predicted for this year and who was right or wrong. Some people are more right than wrong, but it’s fun to guess at what will happen next year nonetheless.
2011 was a great year for storage and IT as a whole. A couple of highlights I think were important points this year:
- SSD pricing drops significantly to approximately $3 per GB. With the flooding in Thailand, the price of spinning drives went sky-high back in October. Since then, prices have started to decline, but not as quickly as the SSDs.
- Tape is still around and is giving people options. There are only a handful of vendors that even like to talk about tape as another storage tier. Those who do have it in their bag of options leveraged it as something the others cannot provide as part of a full solution.
- Archive and backup were debated, debated again, and hopefully the marketing people have learned the difference. I think there are times when backups can be archives, but not the other way around. There are people out there who back up their archives, but that is a whole blog article unto itself.
- Mobile apps were plentiful: everything from Fruit Ninja to Facebook to business apps like QuickOffice flooded the marketplace. Not only were people developing for the iPad, iPod and iPoop platforms, but we saw the rise of Droid (Google) and BlackBerry (RIM), and even Windows is now reporting over 50,000 apps available for download.
- Clouds got a little more defined, and people are starting to see the benefits of having the right ‘cloud’ for the right job. This time last year the future of clouds was a little cloudier than it looks today. The funny thing I believe most people realized this year is that we have been doing cloud in IT for a long time, just under different names.
- Social media was the biggest part of IT in 2011 in my opinion. I saw a fundamental shift in how people got information and how that influenced their decisions. From CEOs blogging to Charlie Sheen going up in flames on Twitter, the warlocks and trolls out there were craving something more. Social media is the new dot com era and now we wait for the bubble to burst.
Now for the good part. Here is what I think 2012 will bring to the IT / Storage world. Note: If any of these do or don’t come true I will deny any involvement.
- Big Data moves into full swing. If you think you heard a lot of people talking about Big Data in 2011, then prepare yourselves for the avalanche of bloggers, researchers, analysts, marketers, sales people, you name it, bombarding you with not just what Big Data is but what to do with it. I suspect technologies like Hadoop and analytics will drive products more than typical structured data storage.
- Protection of remote data on mobile devices like tablets and phones will be a bigger concern. With the rise of these devices, people have started to move away from the traditional desktop or even laptop. There is already an uptick in the number of viruses in the wild designed for mobile users. The more data is generated and stored on these devices, the higher the risk companies face of losing it. I predict companies will either rely on public clouds like Amazon S3 or Dropbox to help protect against data loss, or go private and force users to back their data up to a central repository.
- Software will continue to drive innovation over hardware. Virtualization was a big part of 2011, and it will only continue to grow in the datacenter. Storage systems are, for the most part, made up of the same parts from the same suppliers. It’s the software put to use on top that drives the efficiency, performance and utilization of the hardware. The storage vendors who can get beyond hardware speeds and feeds will show you how their solution solves the problems of today. There are still some customers who want to know how many 15K RPM drives are in the solution; I think there will always be gear-heads who want to know these things, but they are getting fewer.
- Scale-out grid clustering with a global namespace will dominate the new dynamic of how to deploy units. NetApp should be releasing the long-anticipated Cluster-Mode of Data ONTAP sometime in 2012. I hear not everyone will be getting carte blanche on the download, so be ready to be told it’s still not ready for prime time, even though it took NetApp nine years to get a real product to market for general consumption. Other vendors like EMC, IBM and HP will all be touting their own version of scale-out / grid / you-name-it as the best way to drive up the efficiency numbers. Do your research and make sure to compare apples to apples; just because they all say the same thing doesn’t mean it’s done the same way.
- Tiering will be on everyone’s mind. Even with the price of SSDs coming down, they are still a bit pricey for an SSD-only solution. Companies like Violin Memory and Fusion-io will help keep I/O up at the server instead of hitting the storage system. Automatic file-based tiering, like the Active Cloud Engine from IBM, will use policies to determine where data is written and move it down the $/GB slope as it ages. IBM also has a great automatic tiering solution called Easy Tier, available on the DS8000, SVC and V7000, that takes the guesswork out of when to put data on faster disks.
- Consolidation of smaller systems into fewer islands of storage will be a key initiative in 2012. As the economy flatlines on the global operating table, IT budgets will be looking for ways to cut line items out of the cost of running the shop. This is a continuation from 2011, but with the push of ‘Big Data’, companies will be asked to take on more data demand with the same budget as last year. As customers pool their resources to meet these demands, they will pool their storage platforms for better utilization. Look for data migration tools like SVC from IBM to help make these transitions easier.
Finally, I send you the best for a new year full of exciting challenges. Happy New Year!