Posts Tagged ‘deduplication’
People who have been using VMware Data Recovery quickly discovered that this product had issues. VMware’s take on Data Recovery was that they wanted to have a backup product for the smaller shops with a short time-to-market. Too bad it was this product that drove a lot of users to Veeam or PHDvirtual because of its many problems. In secrecy VMware started working together with EMC’s BRS division to build a brand new backup product leveraging EMC’s Avamar technology under the codename “Project Toystory”. This product has seen the light of day as “vSphere Data Protection 5.1” or vDP for short. In this post I will be looking into vDP version 5.1, which is actually the initial release.
Introduction to vSphere Data Protection 5.1
This is the first release of vDP, so actually a 1.0 version. I will not be expecting a fully feature-rich product, but one that actually WORKS would be nice. After all, it is a “free” product Read the rest of this entry »
For a long time now I have been a fan of PHDs esXpress. It is still the only VMware backup solution I know that scales, has no single points of failure and works reliable with VMware snapshots. The solution has always been “other than others”: At first it appears to be a really weird piece of software, that creates its own appliances to perform its backups. Once you get to know it, esXpress’s way of working is great. So great in fact, that VMware themselves are now adopting this very way of working with their Disaster Recovery feature in vSphere 4, maybe even stepping away from their beloved VCB (VMware Consolidated Backup).
VCB in my opinion has never been that great, apart for some special uses in special environments. esXpress fits all, from single ESX hosts to large clusters. In contrast to VMware’s Disaster Recovery which is still buggy at the time of this blog, esXpress has been on this train for years now, and definitely knows the drill. EsXpress 3.1 is not the holy grail though. Some features were just not easy to use, there was no global GUI to manage all nodes easily and there was no Data deduplication available (not that I am that big a fan data dedup for backup, but ey, everbody does it!).
Enter esXpress 3.5
To make up for most of the shortcomings, esXpress version 3.5 has been introduced. The engine itself still is pretty much the same. And exactly there lies the power of esXpress: It still WORKS. It just works, it always works. Extra features have been added in such a smart and incredible simple way, that the product remains rock stable. No “waiting for the point 1 release” needed here!
I was over at a client who suffered a SAN failure (when upgrading firmware). They were in progress of failing over to their recovery site, when the administrator got an email from one of the production ESX hosts: esXpress had successfully completed its backups. What? All LUNs appeared unavailable at the production site. This host did not have its storage devices rescanned; it still kept on ticking. I think things like this are major plusses for both VMware ESX and esXpress showing their enterprise readiness.
Finally: A working global GUI
From the initial ESX 3.5 release, PHD also released a GUI to manage all esXpress instances from one central portal. In the old 3.1 (and before) days, you ended up copying configfiles between hosts; working, but not very user friendly. You might think that adding a central GUI took a lot of deep digging in the code of esXpress. But, they surprised once again: The GUI just holds the config files and, could it be more simple, the GUI appliance introduces a small NFS store. The NFS store is automagically mounted to the ESX servers, and presto! That is where the config files can be found. EsXpress itself just has to check the share for a new config, something already (partly) in existance in the previous version.
Even better: the GUI does a great job. I had some trouble with the first versions, some manual labor was needed to get it going (like manually needing to change the time zone and not being able to add a second DNS server). All these issues are fixed now, but even those early versions were already very effective. And things have become only better since then!
Because “everybody has it”: Deduplication
What should we do without deduplication nowadays? It is a major hype around storage and backup. If you don’t have it, you’re out of business it seems. But who ever thinks about the risks and limitations involved (see: The Dedup Dillema).
The idea of deduplication is brilliant, but the implementation has to be right. I must admit, I am not a big fan of deduplication. It is still your vital data you are talking about! Admit nr.2: EsXpress 3.5 managed to change my opinion to dedup a little.
The deduplication implementation of esXpress is in style with PHDs way of working: both effective and simple. A separate appliance is installed (which is in fact the same one as the GUI appliance. At first boot of the appliance you choose what the appliance will become. Smart!). The dedup appliance (called PHDD for PHd Data Dedup) can mount a datastore or an NFS store for storing its deduped data. It performs quite well, saving diskspace as you backup more of the same (or alike) data. It is now much “cheaper” to keep more backups of your VMs.
Only few changes appear to have been made to esXpress itself to allow PHDD as a backup target, so once again, stability guaranteed.
So now all your data lives inside the PHDD appliance. Now how do I get out this data the way I want it? PHD did something clever: They added a CIFS/SAMBA interface to the appliance, allowing you to browse, copy and backup your VMs as if they weren’t deduped at all! This last feature makes the mix of backup and dedup more acceptable, even effectively useable 🙂
When will the fun EVER stop? File level restore!
The best feature of the PHDD dedup target in my opinion, next to dedup itself, is the ability to perform file level restores. At last you can get out that one single file of a full VM without having to restore the whole thing. This option is so cool, you simply browse to the appliance, select your files, and save the collection you marked as a single zip file! Couldn’t be easier, another bulls eye for PHD, even in their first release of this piece of software.
Scaling esXpress 3.5 with dedup
Not all is bright and shiny with dedup. I found it hard to scale the solution: If there is only one PHDD target, scaling ends somewhere, and a SPOF (single point of failure) is introduced. Not good (although PHD is working on a way to link the dedup appliance to a secondary one). Still, one may consider to use two or more PHDD appliances in parallel. This will work, but the dedup effectiveness will drop sharply, especially when you use DRS and all VM backups end up on all PHDD targets in time (this happens when you design the often used strategy where one assings for a backup target to each ESX server individually with failovers to others). You can make it somewhat more effective by specifying a backup target for each VM (in the local config), a best practice that also stands when using multiple FTP targets btw. This will ensure that a backup of a particular VM will always end up on the same backup target, making things clearer and making dedup more effective (although far from ideal – Every PHDD target has its own library of data, meaning that identical blocks still get stored on EACH PHDD target instead of just one).
The limitations mentioned above are not a limit of esXpress though, but more a limitation of dedup in itself. PHD choose to use online dedup (basically you dedup while you write), which will use CPU power during backup and restores. CPU power might even be the limiting factor in your backup speed. Luckily CPU power is usually available in abundance nowadays. I will dive deeper into performance and scaling of deduped installations in the next blogpost, which will hopefully prove that dedup really performs (like the setup using multiple FTP targets simultaneously described in my blogpost Scaling VMware hot-backups using esXpress).
The new version of esXpress 3.5 is in terms of speed and reliability on par with its predecessor version 3.1. It is still the only backup solution I know that has no Single Point of Failure, scales (REALLY scales) up to whatever size you want without any issues, and best of all: Once it works it KEEPS working with hardly any problems around VM snapshotting like some other backup solutions do have.
On top of all the good things that already were, a global GUI is added which manages all esXpress installs at the same time, and there is a Data Deduplication appliance which features a very well working single file restore option. I would like to have seen a file restore option in a non-dedup target as well. From what I’ve seen, online deduping costs a lot of CPU power, and the backup speeds go down because of this. Once the database is built though, things do get better (less data to backup because more and more blocks are already backup up in th dedup appliance). Still, calculations have to be done.
In a smaller environment, the dedup appliance is no match for a set of non-dedupping FTP targets. This is a drawback from which any dedup system suffers… It is just the way the “thingy” works. Still I see a solid future for esXpresses PHDD dedup targets where speed is not of the utmost importance.
Make no mistake on backup speeds: IF esXpress and its backup targets are designed and configured properly, it is by far the fastest full-VM backup solution I’ve seen. It does not mess with taking backups through the service console network, it creates Virtual Appliances runtime that perform the backups – and many in parallel. If you want to see real backup speed from esXpress, do not test it on a single VM like some people tend to do when comparing. If you do, speeds are about on par with other 3rd party vendors. But when scaled up to make 8 or more backups in parallel to several backup targets with matched bandwidth, esXpress will start to shine and leave the competition far behind.
Everybody does it – and if you don’t, you can’t play along. What am I talking about? Data deduplication. It’s the best thing since sliced bread I hear people say. Sure it saves you a lot of disk space. But is it really all that brilliant in all scenarios?
The theory behind Data Deduplication
The idea is truly brilliant – You store blocks of data in a storage solution, and you create a hash which identifies the data inside the block uniquely. Every time you need to backup a block, you check (using the hash) if you already have the block in storage. If you do, just write a pointer to the data. Only if you have not got the block yet, copy it and include it into the storage dedup Dbase. The advantage is clear: The more equal data you store, the more you save in disk space. This is, especially in VMware, using equal VMs from templates a very big saving in disk space.
The actual dilemma
Certainly a nice thing about deduplication is, next to the large amounts of storage (and associated costs) you save, is that when you deduplicate at the source, you end up only sending new blocks across the line, which could dramatically reduce the bandwidth you need between remote offices and central backup locations. Deduplication at the source also means, you generally spread CPU loads better across your remote servers instead of locally in the storage solution.
Since there is a downside on every upside – Data Deduplication certainly has its downsides. For example, if I had 100 VMs, all from the same template, there surely are blocks that occur in each and every one of them. If that particular block gets corrupted… Indeed! You loose ALL your data. Continuing to scare you, if the hash algorithm you use is insufficient, two different data blocks might be identified as being equal, resulting in corrupted data. Make no mistake, the only way you can be 100% percent sure the blocks are equal, you need a hash number as big as the block itself (rendering the solution kind of useless). All dedup vendors use shorter hashes (I wonder why 😉 ), and live with the risk (which is VERY small in practice but never zero). Third mayor drawback, is the speed at which the storage device is able to deliver your data (un-deduplicated) back to you (which especially hurts on backup targets which have to perform massive restore operations). Final drawback: You need your ENTIRE database in order to perform any restore (at least you cannot be sure which blocks are going to be required to restore a particular set of data).
So – should I use it?
The reasons stated above always kept me a skeptic when it came to data deduplication, especially for backup purposes. Because at the end of the day, you want your backups to be functional, and not requiring the ENTIRE dataset in order to perform a restore. Speed can also be a factor, especially when you rely on restores from the dedup solution in a case of disaster recovery.
Still, there are definitely uses for deduplication. Most vendors have solved most issues with success, for example being able to access un-deduplicated data directly from the storage solution (enabling separate backups to tape etc). I have been looking at the new version of esXpress with their PHDD dedup targets, and I must say it is a very elegant solution (on which I will create a blog shortly 🙂