Yes, IBM is at it again with its storage innovation, receiving 12 new patents for tape systems. What? You thought tape was dead? Again? Tape is very much alive and kicking, and while you may be jaded one way or another, tape is still the cheapest, most reliable long-term storage platform out there.
IBM is known for its innovation and the patents it is awarded every year. For the last 23 years, it has been awarded more patents in the US than any other company. In 2015 alone, IBM was awarded 7,355 patents, compared to 7,852 patents for Google, Microsoft, GE and HP combined. Roughly 40% of the 18,172 patents awarded went to IBM.
When you look at the 12 storage patents (listed here), you notice they are all from 2010 to 2014/15. They range from how the data is written to abrasion checks. The people behind these technologies are brilliant, to say the least, and it shows in the details of the filings. While they are sometimes hard to read, the technology being introduced will save IBM customers time and money down the road.
IBM also uses its patents as a revenue source. Just in the last year, IBM sold patents to both Pure Storage and Western Digital. Since Pure and IBM compete in the all-flash array market, IBM must have gotten a huge sum of money for those patents to offset giving up the ability to crush a competitor. Nonetheless, IBM capitalizes on its investment in R&D by selling the technology to others who may be spending their money elsewhere (like marketing and selling).
If you want to learn more about the IBM Storage Patents, click over here to read about them in detail.
Great new blog from my friend Ravi Prakash. Follow him for all things Spectrum Control!
Today if you are a customer in a sector like financial, retail, digital media, biotechnology, science or government and you use applications like big data analytics, gene sequencing, digital media or scalable file serving, there is a strong possibility that you are already using IBM Spectrum Scale (previously called General Parallel File System or GPFS).
A question foremost in your mind may be: “If Spectrum Scale has its own element manager – the Scale GUI, what would I gain from using Spectrum Control?”
The Spectrum Scale GUI focuses on a single Spectrum Scale cluster. In contrast, the Spectrum Control GUI offers a single pane of glass to manage multiple Scale clusters. It gives you higher-level analytics, a view of the relationships between clusters, and the relationships between clusters and SAN-attached storage. In the future, we expect to extend this support to Spectrum Scale in hybrid cloud scenarios where Spectrum Scale may be backed…
I got a great question the other day regarding VMware Raw Device Mappings:
If an RDM is a direct pass-through of a volume from the storage device to the VM, does the VM need MPIO software like a physical machine does?
The short answer is NO, it doesn’t. But I thought I would show why this is so, and in fact why adding MPIO software may help.
First up, to test this, I created two volumes on my Storwize V3700.
I mapped them to an ESXi server as LUN ID 2 and LUN ID 3. Note the serials of the volumes end in 0040 and 0041:
On ESX I did a Rescan All and discovered two new volumes, which we know match the two I just made on my V3700, as the serial numbers end in 40 and 41 and the LUN IDs are 2 and 3:
I confirmed that the…
We have been getting this question about clustering the storage controllers on the V7000 Unified (V7kU) more and more as people start expanding their systems beyond their initial controllers. But let’s step back a few steps and understand what we are working with first.
- V7kU is a mixed protocol storage platform. It uses Spectrum Scale as the file system and Storwize as the operating system. This is important as people get interested in how they can adopt a high-speed, parallel file system with grace and ease. The V7kU comes preloaded, so there is no need to understand the knobs and switches of installing and configuring Spectrum Scale (formerly known as GPFS). The V7kU supports SMB (CIFS), NFS, FC, FCoE and iSCSI, and can be used with other building blocks like OpenStack to support object storage too.
- V7kU can scale up to 20 disk enclosures per controller. This platform can cluster up to four controllers, giving customers a chance to max out at around 7.5 PB of storage. The best part is you can mix and match drive types and sizes. You can have flash drives in the same enclosure as SAS and NLSAS drives.
- Single interface is the best part of this solution. You can provision both block and file access from the same GUI/CLI. Data protection is covered too: snapshots, flash copies, replication and remote cache copies.
- Policy based data management. One of my favorite parts of the solution is I can create policies to manage the data on the box. For example, I can create a policy that says if my flash pool becomes 75% full start moving the oldest data to the NLSAS pool. Not only does this make my job easier not having to manage the data move, but it frees up the flash pool and extends the buying power of the flash. Since flash is the most expensive part of the storage, I want the best bang for the buck there.
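To make the threshold idea concrete, here is a minimal Python sketch of that kind of rule. It is a toy model, not the V7kU policy engine or its policy language: the `File` class, the `migrate_oldest` helper, the file names and the pool sizes are all made up for illustration. The only thing taken from the bullet above is the rule itself: once the flash pool hits 75% full, start moving the oldest data to NLSAS. Stopping as soon as usage drops back under the threshold is my assumption.

```python
from dataclasses import dataclass


@dataclass
class File:
    name: str
    size_gb: float
    last_access_day: int  # smaller number = accessed longer ago


def migrate_oldest(flash, nlsas, flash_capacity_gb, threshold_pct=75.0):
    """Move the oldest files from flash to NLSAS while flash is at or above the threshold.

    Hypothetical helper: stopping once usage drops below the threshold
    is an assumption made for this sketch.
    """
    moved = []
    # Oldest first, so the least recently accessed data leaves flash first.
    for f in sorted(flash, key=lambda f: f.last_access_day):
        used_pct = 100.0 * sum(x.size_gb for x in flash) / flash_capacity_gb
        if used_pct < threshold_pct:
            break
        flash.remove(f)
        nlsas.append(f)
        moved.append(f.name)
    return moved


# Made-up sample data for a 1,000 GB flash pool.
flash_pool = [
    File("q1_report.db", 300, last_access_day=10),
    File("hot_index.db", 250, last_access_day=95),
    File("old_logs.tar", 200, last_access_day=2),
    File("scratch.tmp", 100, last_access_day=60),
]
nlsas_pool = []

# 850 GB used of 1,000 GB is 85% full, which is over the 75% threshold.
moved = migrate_oldest(flash_pool, nlsas_pool, flash_capacity_gb=1000)
print(moved)
```

In this sample only the least recently accessed file has to move before the flash pool is back under the threshold, which is the whole point: the expensive tier stays free for the data that is actually being used.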
Now comes the question: can we cluster these V7000s to make a bigger pool? Yes, we can. Not only can we cluster the systems (multiple IO groups), we can mix file and block independently. The best part is that as you add IO groups you add more performance and capacity, all while managing it from the same single interface.
This was taken from the V7000 Infocenter:
- Issue this CLI command to list the node candidates:
lsnodecandidate
This output is an example of what you might see after you issue the lsnodecandidate command:
id               panel_name UPS_serial_number UPS_unique_id    hardware
50050768010037DA 104615     10004BC047        20400001124C0107 8G4
5005076801000149 106075     10004BC031        20400001124C00C1 8G4
- Issue this CLI command to add the node:
addnode -panelname panel_name -name new_name_arg -iogrp iogroup_name
where panel_name is the name that is noted in step 1 (in this example, the panel name is 000279). The number is printed on the front panel of the node that you are adding back into the system. The new_name_arg is optional to specify a name for the new node; iogroup_name is the I/O group that was noted when the previous node was deleted from the system.
Note: In a service situation, add a node back into a clustered system using the original node name. As long as the partner node in the I/O group has not been deleted too, the default name is used if -name is not specified.
This example shows the command that you might issue:
addnode -panelname 000279 -name newnode -iogrp io_grp1
This output is an example of what you might see:
Node, id [newnode], successfully added
Attention: If more than one candidate node exists, ensure that the node that you add into an I/O group is the same node that was deleted from that I/O group. Failure to do so might result in data corruption. If you are uncertain about which candidate node belongs to the I/O group, shut down all host systems that access this clustered system before you proceed. Reboot each system when you have added all the nodes back into the clustered system.
- Issue this CLI command to ensure that the node was added successfully:
lsnode
This output is an example of what you might see when you issue the lsnode command:
id name  UPS_serial_number WWNN             status IO_group_id IO_group_name config_node UPS_unique_id    hardware
1  node1 1000877059        5005076801000EAA online 0           io_grp0       yes         20400002071C0149 8F2
2  node2 1000871053        500507680100275D online 0           io_grp0       no          2040000207040143 8F2
All nodes are now online.
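If you are adding several nodes and would rather not eyeball the status column each time, a few lines of Python can do the check. This is only a sketch that works on the sample lsnode output shown above pasted in as text. The `parse_table` and `offline_nodes` helpers are my own names, not part of any IBM tooling, and the parsing assumes every field is non-empty and contains no spaces.

```python
# Sample lsnode output copied from the example above.
SAMPLE_LSNODE = """\
id name  UPS_serial_number WWNN             status IO_group_id IO_group_name config_node UPS_unique_id    hardware
1  node1 1000877059        5005076801000EAA online 0           io_grp0       yes         20400002071C0149 8F2
2  node2 1000871053        500507680100275D online 0           io_grp0       no          2040000207040143 8F2
"""


def parse_table(text):
    """Turn whitespace-separated CLI output into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def offline_nodes(rows):
    """Return the names of nodes whose status is anything other than 'online'."""
    return [row["name"] for row in rows if row["status"] != "online"]


nodes = parse_table(SAMPLE_LSNODE)
print(offline_nodes(nodes))  # an empty list means every node reports online
```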
Currently, I am working with a customer on their archive data and we are discussing which is the better medium for their data that never gets read back into their environment. They have about 200TB of data that is sitting on their Tier 1 that is not being accessed, ever. The crazy part is this data is growing faster than the database that is being accessed by their main program.
This is starting to pop up more and more as unstructured data eats up storage systems without being used very frequently. I have heard this called dark data or cold data. In this case it’s frozen data.
We started looking at what it would cost them over a 5-year period to store their data on both tape and cloud. Yes, that four-letter word is still a very good option for most customers. We wanted to keep the exercise simple, so we agreed that 200TB would be the size of the data and there would be no recalls on the data. We know most cloud providers charge extra for recalls, and of course the tape system doesn’t have that extra cost, so leaving recalls out keeps the comparison as close to apples to apples as we could get.
For the cloud we used Amazon Glacier pricing which is about $0.007 per GB per month. Our formula for cloud:
200 TB x 1,000 GB/TB x $0.007 per GB per month x 60 months = $84,000
The tape side of the equation was a little trickier, but we decided that we would just look at the tape media and tape library in comparison. I picked a middle-of-the-road tape library and the new LTO7 media.
Tape Library TS3200 street price $10,000 + 48 LTO7 tapes (@ $150 each) = $17,200
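For anyone who wants to plug in their own numbers, the two flat-capacity formulas above fit in a few lines of Python. This is just the napkin math written out, using the prices quoted in this post; the constant names are mine.

```python
# Figures quoted above: 200 TB of frozen data held for 5 years, no recalls.
DATA_TB = 200
GB_PER_TB = 1000
GLACIER_PER_GB_MONTH = 0.007  # dollars per GB per month
MONTHS = 60

LIBRARY_PRICE = 10_000  # TS3200 street price used above
TAPE_COUNT = 48
TAPE_PRICE = 150        # per LTO7 cartridge

cloud_cost = DATA_TB * GB_PER_TB * GLACIER_PER_GB_MONTH * MONTHS
tape_cost = LIBRARY_PRICE + TAPE_COUNT * TAPE_PRICE

print(f"cloud: ${cloud_cost:,.0f}")
print(f"tape:  ${tape_cost:,.0f}")
```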
We then looked at the ability to scale and what would happen if they factored in their growth rate. They are growing at 20% annually, which translates to 40TB a year. Keeping the same platforms, what would be their 5-year cost? Cloud was:
Sum over 60 months of (200 TB + 3.33 TB x months elapsed) x 1,000 GB/TB x $0.007 per GB per month = $125,258
Tape was calculated at:
$10,000 for the library + (396 TB / 6 TB per LTO7 tape = 66 tapes) x $150 per tape = $19,900
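The growth case is easier to follow as a loop than as a one-line formula, so here is a small Python sketch of it. The function names are mine, and billing each month at the capacity held at the start of that month is my assumption; I chose it because it lands within a dollar of the cloud figure above. The tape side simply buys enough 6TB cartridges for the final 396TB.

```python
import math

GB_PER_TB = 1000
GLACIER_PER_GB_MONTH = 0.007  # dollars per GB per month, as quoted above


def cloud_cost_with_growth(start_tb, growth_tb_per_month, months):
    """Sum the monthly bill, with stored capacity growing by a fixed amount each month."""
    total = 0.0
    for month in range(months):  # month 0 is billed at the starting capacity
        stored_tb = start_tb + growth_tb_per_month * month
        total += stored_tb * GB_PER_TB * GLACIER_PER_GB_MONTH
    return total


def tape_cost(final_tb, tape_capacity_tb=6, tape_price=150, library_price=10_000):
    """One library plus enough cartridges to hold the final capacity."""
    tapes = math.ceil(final_tb / tape_capacity_tb)
    return library_price + tapes * tape_price


cloud = cloud_cost_with_growth(200, 3.33, 60)
tape = tape_cost(396)
print(f"cloud: ${cloud:,.2f}")
print(f"tape:  ${tape:,}")
print(f"difference: ${cloud - tape:,.2f}")
```

Change the growth rate or the per-GB price and rerun it; the gap moves, but the shape of the result is the same because the cloud bill is paid every month on everything stored, while tape media is bought once.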
We all hear how cloud is so much cheaper and easier to scale, but after doing this quick back-of-the-napkin math I am not so sure. I know what some of you are saying: we didn’t calculate the server costs and the 4 FTEs it takes to manage a tape system. I agree this is basic, but in this example this is a small to medium size company that is trying to invest money into getting its product off the ground. The tape library is fairly small and should be a set-it-and-forget-it type of solution. I doubt there will be much more overhead for the tape solution than for cloud. Maybe not as cool or flashy, but with the roughly $100,000 saved over 5 years they can go out and buy their 5-person IT staff a $100 lunch every day, all five years.
So to those who think tape is a four-letter word and is that thing in the corner that no one wants to deal with, I say embrace it and squeeze the value out of it. Most IT shops still have tape and can show their finance teams how they can lower their costs without putting their data at risk in the cloud with this:
IBM changed the way it is going to market with the Spectrum Storage family of software-defined storage platforms. Since the initial re-branding of the software formerly known as Tivoli, XIV, GPFS, SVC, TPC and LTFS, the plan has been to create a portfolio of packages that would aid in protecting and storing data on existing hardware or in the cloud. This lines up with how Big Blue is looking for better margins and cloud-ready everything.
These platforms, based on a heritage of IBM products, are now available as a suite where a customer can order the license (per TB) with unlimited usage for all six offerings. This now allows customers to move more rapidly into the SDS environment without a complex license agreement to manage. All of the Spectrum family is based on a similar look and feel, and support is all done through IBM.
Clients will have to license the software only for production capacity. Since all of the software is part of the suite, clients can also test and deploy different items and mix and match as they see fit. If you need 100TB of data protection, this allows you to have 50TB of Spectrum Protect and maybe 50TB of Spectrum Archive. If you then need to add storage monitoring, i.e. Spectrum Control, then your license count doesn’t start from 0 but at 100TB. If working with IBM has taught me anything, it is that the more you buy of the same thing, the cheaper per unit it will be in the end.
For more information on the Spectrum Storage Suite go to the IBM home here:
I had a customer recently upgrade their XIVs to the 11.6.1 code and wanted to know more about the real time compression that is now built into the code. IBM purchased the compression technology from a company named “Storwize” and adopted the name for their mid-range product family. One of the cool things that came out of that acquisition is the RACE engine that runs the compression algorithm.
Basically the compression is your standard LZ compression with some cool technology that keeps metadata about what you are writing. For XIV, this happens before the data is written to cache and turns all of those random writes into a sequential write. If you want to dig into the way RTC works, check out this link.
The XIV team took the compression technology and integrated it into the XIV. The GUI will even show you an estimate of which volumes will get better compression savings so you can fine-tune your workloads.
Q. How does the compression work? Is it on all the volumes/pools?
A. Compression can be turned on if a pool is set to “thin”. This thin pool can then have both thick and thin volumes in it. Once compression is turned on and licensed, you can convert any thin volume to a compressed volume by right-clicking on it and choosing compress. You can also tell XIV to compress all of your volumes or de-compress all of your volumes.
Q. Can I turn it off and on based on compression savings?
A. Yes, compression can be turned off on the whole box or on just one volume. You get to decide based on the compression savings.
Q. What size volumes can it handle?
A. The maximum is 10TB and the minimum is 52GB.
Q. What kind of performance hit will it have on the XIV?
A. There are some benchmarks in the redbook with compression turned on that can help you decide. Basically it comes down to two things: the workload and the model of XIV. If the Compression Savings field shows 30% or less, then you should not compress that data. If you have a new 314 model or a 214 model, compression can be turned on, but 114 models need to be checked to make sure there is enough horsepower.
Q. What type of data does not compress well?
A. I get this question a lot. The basic answer is any type of data that has already been compressed. Backups also seem to have lower compression savings. The better answer is to always look at the compression savings in the GUI and base your conclusion on that output.
Q. Can I compress volumes that are mirrored?
A. Of course. Mirrored volumes can be compressed on both sides. If the mirror already exists, then the mirror has to be broken, the data compressed and the mirror copy re-synced. We have seen a major performance improvement with compressed mirrored volumes, as the amount of data being transferred is cut in half or more.
If you have questions about running compression on XIV or any other IBM platform leave a comment and I will try to answer them here.