Let’s talk about the performance characteristics of the EMC DataDomain and the IBM ProtecTIER. I had the privilege for some months to have both an EMC DD890 and an IBM TS7650G onsite. Both are close to top of the range and were connected via multiple 8 Gbps fibres – it doesn’t get much better than this.
I blogged before that, in my tests, the ProtecTIER was slower than the DataDomain. I suppose that’s fighting talk, so here’s how I get to that conclusion.
Simulating a backup
What we managed to do is to generate a dummy bytestream that we send via TSM to the device like so:
dsmSendObj(handle, stBackup, NULL, &objName, &objAttrArea, NULL)
for (i = 1; i <= numblocks; i++)
sz += 2 * 1024 * 1024;
This way we don’t have to read data off disk when we back up, or write data to disk when we restore.
Next we can make our data (pseudo)random, by setting up the first dataBlkArea from above like so:
for(i = 0; rhubarbrhubarb...; i++)
int j = (random() % sz) - 1;
area[j] = (random() % sizeof(long));
Subsequent blocks are just permutations of this first block.
When we set our “random” parameter to 0%, and we back in 8 streams, each object being 8 GB in size, the DataDomain backs up at 1.4 GB/s and the ProtecTIER at 1.2 GB/s.
When we set it to 100% we get 1.2 GB/s on the DataDomain but only 450 MB/s on the ProtecTIER. Also, the ProtecTIER recovers faster than it backs up, but it’s still significantly slower with our 100% “random” setting.
Identity vs Similarity based deduplication
The DataDomain is an identity based deduplication device. That means, when it inspects the incoming stream of data it tries to find exact matches. If it finds a match it stores only a “pointer” to the match. Because we’re looking for exact matches the segments are typically smaller, the index becomes larger and resides on disk.
The ProtecTIER is a similarity based deduplication device. That means, when it inspects the incoming stream of data it tries to find similar data, not exact matches. It then finds the difference between the incoming segment and the matching segment and stores a “pointer” + the difference. That means we can have much bigger segments, and a smaller index. In fact the ProtecTIER (Diligent before being bought by IBM) was designed from the ground up to address 1 PB of data with a 4 GB index. However, because we have to do a byte wise comparison on each incoming segment to find the difference, we need to read it all back from disk.
As it turns out, both approaches have disk contention issues.
The DataDomain tries to mitigate the disk problem in three ways:
1. In memory index summary
Incoming segments are fingerprinted (using a hash function such as SHA-1) and the fingerprint is added to a Bloom filter which we can use to do a very quick search. If no match is found using the Bloom filter, there is definitely no matching segment to be found in the segment index on disk. Otherwise there’s a good possibility we have a matching segment and we have to go to disk to find it.
2. Exploiting spatial locality
When a backup is run a second time there’s a good probability the data is still in the same sequence. Even if you added, changed or deleted blocks, most of the segments will be in the same order as before. That means, if the DataDomain finds a match, chances are that the match has neighbours which will match the incoming segment’s neigbours.
The DataDomain tries to exploit this spatial locality and calls it stream-informed segment layout (SISL).
Searching through fingerprints is hard because of their random nature. The DataDomain uses the same spatial locality idea in its caching algorithm.
These evasive actions apparently avoid 99% of all disk IOs.
From my investigations, it appears the ProtecTIER tries to mitigate the disk contention issue in …, uhm, 0 (zero) ways. Here’s what I found:
The ProtecTIER gateway needs two disk pools. One is a small pool for meta data and needs to be fast. They ask for RAID10. In this one they maintain the state of the VTL and do all kinds of housekeeping. The other is the big user data pool.
The 4 graphs below show the behaviour of the two pools under various loads. The two graphs on the right hand side show the latency and it’s clear there’s no issue there (this disk is all sitting on IBM XIV behind IBM’s SVC).
The 2 graphs on the left show IOPS for the meta data pool (top graph) and for the user data pool (bottom).
We ran a backup (8 streams, 1 large 8 GB object in each) just after 09:00 with our random parameter set to 0%. It finished around 09:05. You can see we did up to 675 write IOPS to meta data and some IOs on the user data.
We now restore those same objects in 8 streams starting at 09:07 and finishing just after 09:10. Very little IO on the meta data and some reads on user data.
Here’s the clincher: We now run a backup (8 streams again, 1 large 8 GB object in each) just after 09:10 with our random parameter set to 100%. Now we only finish 14 minutes later and did a lot more reading off the user data (up to 5500 IOPS). However, the number of IOs to the meta data pool stays the same (the area under the graph is the same).
Lastly we restore those same objects and see massive reads off the user data pool.
the degree of “randomness” has no effect on the meta data pool
Let’s call this switchedfabric’s first law of why the ProtecTIER gateway has disk issues.
switchedfabric’s first corollary on that:
You can’t speed up your backups in the face of more random data
by making the meta data pool faster.
It’s the user data pool that needs to be fast. It also needs to be big. It also needs to be affordable.
Storage sales persons love to tell you that you can have any two of those, but not all three.