Exchange backups a problem?

74 Gigs should *NOT* take 24 hours to back up.  Keep in mind that we are not going to tape: we are backing up to Disk Storage Units and then copying the backup images from disk to tape later.  So tape bandwidth is not the issue here.

I’ve been working on an Exchange backup problem.  Now I know that the Exchange server in question was not set up according to “best practices”: it has a single information store (which used to be installed on the C: drive; we finally moved that) for about 350 users.  The new Exchange server is coming online soon (not soon enough for my tastes), but for now this is what I have to work with.

A single stream backup has taken about 24 hours to complete, even for differentials, using the default directive:

Microsoft Exchange Mailboxes:\

This creates a single stream for all mailboxes.

So knowing there has to be a better way to do this, I tried the usual wildcard, as follows:

Microsoft Exchange Mailboxes:\*

With disastrous results.  The system spawned 400+ backup streams, which held the entire backup environment hostage.  Half the backup jobs couldn’t run within the 5-hour window we had set for ourselves.  A little research on the Symantec/Veritas site (which is not exactly easy to sift through) turned up the following set of directives in the “Exchange Administrator’s Guide”:

NEW_STREAM
Microsoft Exchange Public Folders:\
NEW_STREAM
Microsoft Information Store:\
NEW_STREAM
Microsoft Exchange Mailboxes:\[a-e]*
NEW_STREAM
Microsoft Exchange Mailboxes:\[f-j]*
NEW_STREAM
Microsoft Exchange Mailboxes:\[k-o]*
NEW_STREAM
Microsoft Exchange Mailboxes:\[p-t]*
NEW_STREAM
Microsoft Exchange Mailboxes:\[u-z]*

Now the first two are easy: back up the public folders, and back up the information store as a whole.  Backing up the information store as well as the mailboxes can be said to be a bit redundant, but it is a big deal if you have to do a full restore, because restoring from the Information Store backup is much, much faster than restoring item by item from the mailbox backups.  It’s a waste of time until you’re down and need it, so it is worth the extra time / storage.

The remaining directives each group a collection of mailboxes into a single stream.  In our case we’ve put all mailboxes starting with A through all mailboxes starting with E in the first stream, F through J in the second stream, and so on.
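
For anyone who wants to experiment with different letter groupings, here is a minimal Python sketch that prints the mailbox part of the directive list from a set of letter ranges. The function name `mailbox_stream_directives` is made up for illustration; the directive strings are the ones quoted above, and the output is just text to paste wherever the directives are entered.

```python
def mailbox_stream_directives(ranges):
    """Build NEW_STREAM directives, one stream per (first, last) letter range."""
    lines = []
    for first, last in ranges:
        lines.append("NEW_STREAM")
        # Same pattern as the directives quoted above, e.g. [a-e]*
        lines.append(f"Microsoft Exchange Mailboxes:\\[{first}-{last}]*")
    return lines


# The five ranges used in this post.
ranges = [("a", "e"), ("f", "j"), ("k", "o"), ("p", "t"), ("u", "z")]
print("\n".join(mailbox_stream_directives(ranges)))
```

Changing the `ranges` list is all it takes to try a different number of streams.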

There is further tuning that can be done, moving sets of mailboxes from one stream to another to balance them out.
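
One rough way to pick more balanced ranges is sketched below in Python. It assumes you already have the total mailbox size per starting letter (the `size_by_letter` dictionary is hypothetical input; how you collect those numbers is up to you), and it keeps each stream as a contiguous letter range so it still fits the `[a-e]*` style of directive. It is a simple greedy cut, not an optimal split.

```python
import string


def split_letters(size_by_letter, streams):
    """Split a-z into `streams` contiguous ranges of roughly equal total size."""
    letters = string.ascii_lowercase
    if not 1 <= streams <= len(letters):
        raise ValueError("streams must be between 1 and 26")
    total = sum(size_by_letter.get(c, 0) for c in letters)
    target = total / streams  # ideal share per stream
    ranges = []
    start = 0
    running = 0
    for i, c in enumerate(letters):
        running += size_by_letter.get(c, 0)
        remaining_letters = len(letters) - (i + 1)
        remaining_streams = streams - len(ranges) - 1
        # Close the range once it reaches its share, while keeping at least
        # one letter available for every stream still to come.
        if remaining_streams > 0 and (
            running >= target or remaining_letters == remaining_streams
        ):
            ranges.append((letters[start], c))
            start = i + 1
            running = 0
    ranges.append((letters[start], letters[-1]))
    return ranges


# Made-up sizes in GB, purely for illustration: a few heavy letters.
sizes = {"a": 6, "b": 5, "c": 7, "d": 4, "j": 9, "m": 10, "s": 12, "t": 5}
print(split_letters(sizes, 5))
```

The resulting ranges are only a starting point; real job durations will show whether a stream still needs mailboxes moved in or out.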

Only time will tell if this really helps.  My testing with 5 streams shows per-stream throughput dropping from 500-600 Kps to 300-400 Kps, but multiplied by 5 streams that still looks like it might be an improvement.
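
The back-of-the-envelope arithmetic, as a small Python sketch. It assumes the five streams really do run in parallel and each one holds the per-stream rate I measured, which is optimistic; the numbers are in the same units as the figures above.

```python
def aggregate(per_stream_rate, streams):
    """Combined rate if every stream sustains the same per-stream rate."""
    return per_stream_rate * streams


single = (500, 600)      # single-stream rate range from my testing
per_stream = (300, 400)  # per-stream rate range when running 5 streams
streams = 5

low, high = (aggregate(rate, streams) for rate in per_stream)
print(f"aggregate: {low}-{high} vs single stream: {single[0]}-{single[1]}")
# Worst case compares the low aggregate to the best single-stream rate.
print(f"speedup: {low / single[1]:.1f}x to {high / single[0]:.1f}x")
```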

7 comments

    • on September 12, 2006 at 4:00 pm

    I’m assuming that the Exchange Server is also the NetBackup Media Server and you are using the Veritas drivers. If this is the case, there are a couple of performance tweaks you can try.

    http://support.veritas.com/docs/244652

    The above tweaks can drastically improve performance to tape, and may also help to disk.

    • on September 12, 2006 at 4:31 pm

    Also, it doesn’t appear that you have your Exchange Server set up as a Media Server. That will give you the biggest performance increase right there, provided you have the disk to host the disk backups, but in that case relocating the backups to tape after the fact will hit you on performance and add to the backup window. IMO I would just back up the databases or mailboxes directly to tape from the Exchange Server. What type of tape drives are you using? Using LTO-2 fibre channel drives connected to your Exchange Servers, you can easily attain a sustained 45MB/s throughput.

  1. It was time for me to post a follow-up to that anyway.

    My incremental backups completed last night in 2 hours, 8 minutes. Quite the improvement. 🙂 I’m anxious to see what the fulls do. If they make the window, I will probably consider the problem solved.

    I’m also looking at using the IP over FC option to get a little more network bandwidth, but watching the perf counters, I’m sure that it’s Exchange that is starving the backup, not the network. (As our network admin consistently says, “It’s never the network.”)

    Making the Exchange server a media server is a great idea, and now that we are migrating to our fibre-attached Exchange server (off the internal disks) it becomes more of an option. We thought about the Veritas Shared-Storage-Option, but decided that the policy would be Disk-->Disk-->Tape with no exceptions, because there is almost nothing I hate more than having to go back to tape for a single mail item restore.

    I think the other problem with making the Exchange server a media server is that during a message/mailbox-level backup/restore, you’re connecting in through RPC, which is where your bottleneck is.

    My kingdom for a sendmail/pine environment. 🙂

    • on September 12, 2006 at 8:48 pm

    I would tend to agree that D2D backups would be the way to go for a larger enterprise, but why with only a single 75GB database? The SSO is a great way to take advantage of your hardware and make it really work. What is the retention time on disk before moving to tape? If it’s a day or 2… why waste your time?

    I was the Enterprise Storage Architect at my previous company and I was backing up 4+ TB of an Exchange VCS using SSO in 4 hours or less with zero failures. You’re right, the bottleneck is the server when performing brick-level backups of Exchange; however, a mailstore being backed up by the Veritas agent has almost no performance hit at all, even when backing up during the day over the same HBA as the disk storage (I’ve tried).

    Just my opinion, but I think your set-up has a lot of things to troubleshoot if something goes wrong. A 5-hour window is plenty of time to back up 75GB of anything, including brick-level backups.

    The D2D backup is an excellent option for any type of database that can have logs and databases backed up separately, such as Oracle or MSSQL, or for an enterprise that needs to keep anything online (daily fulls for 2 weeks) that will equal terabytes of data. Disk is expensive to burn for 75GB (around $1000 per 73GB FC disk compared to a tape that can hold 400GB compressed for $50). Where is your D2D backup going: SAN or local?

    As for your directive: when the wildcard is used in a Veritas directive, each separate item backed up will spawn a new stream, whether it’s a mailbox or a flat file from a server.

  2. Did I mention that the 75G database was 25G three months ago? We’ll hit 100G by the end of this month.

    Our enterprise has gone from 25 employees to 350 employees since I was hired in March. We’re expecting that kind of growth going forward, so we have just completed the implementation of an Exchange cluster (remind me to slap Bill Gates next time I see him for generating that travesty) with about 600G of storage. The thinking is that that MIGHT get us through the year. We’ll go to our second cluster then.

    The cluster is the problem. I don’t think I’ve ever heard of a clustered disk media server (have you?)

    Long term goal is to move this to the Symm and to start doing snap backups of the filesystems. (*MY* long term goal is to get sendmail in and try and teach sales and corporate types to use PINE.)

    And someday pigs will fly.

    • on September 12, 2006 at 10:04 pm

    If you’re using NetBackup 6.0 you should be fine with clustering a Media Server out of the box. NB 6.0 keeps the media database for all Media Servers on the Master or on another server that you designate. NB 5.x kept the media DBs on their respective Media Servers, and this is where the issue comes into play: during a failover, unless you do A LOT of cluster customization to make tape drives, reg keys, and of course NB move, you would end up with 2 media databases (Media Server DBs). But yes, I have implemented this on 5.x, with a lot of troubleshooting and importing of tapes to counteract the multiple media DBs, and finally did get it to work in a VCS.

    If your environment continues to grow at this rate, I would recommend one of the following two options:

    1. Create a dedicated backup LAN that runs at Gigabit speed

    or

    2. Set up your exchange servers as SAN Media Servers.

    If you plan on growing this fast then you also need to look at your retention period for each type of schedule you have (daily, weekly, monthly, etc.). If you need to keep everything permanently, then all you are doing with D2D2T backups is wasting A LOT of money to buy yourself a couple of weeks.

    I currently work on TEIM Exchange Backups and can tell you from experience that the NetBackup client is far more stable, requires minimum overhead, and is more reliable than using Exchange Snaps from the frame especially for recoverability. If you have the cash, set up a global cluster and SRDF then you are really covered.

    • Jesse on September 12, 2006 at 10:19 pm
      Author

    Yeah, we’re running 6.0, and it’s definitely an option. We’ve already done the dedicated GIG-E backup network, and are actually looking to do IP/FC for the production servers (see another post on this site) so that we can free up the second Broadcom connection for load-balancing on the production network.

    The D2D2T backups are only buying us two weeks, but that’s the mandate I was given, and when someone with “Chief” in their title says they want it one way, that’s how I do it. Their main reason is that they want instant protection in the case of “oops, I accidentally deleted file(x)”, and in the event of a database crash they want me to be able to recover to within 15 minutes of the failure without having to wait for tapes that have most likely been taken off site. I prefer it if for no other reason than that the wide stripe I’m using on the back end is gaining me much-needed speed on the initial backup, and the duplicate then happens during the day without affecting production processing.

    I’ve done a few automated backup solutions in my time. When I was consulting at a certain movie studio in Burbank, California a while back, we were putting about 300TB of IBM DB2 database data to tape every night using TimeFinder BCVs (now called TimeFinder/Mirror). It’s probably a lot more now; that was in 2001.

    I dislike being handicapped by using Microsoft, but our *EX* CIO mandated that we were a .NET shop *ONLY*. (A fact that I was not happy about when I found out after I started.)

    The winds of change are hopefully blowing, because I refuse to put Microsoft on my resume.
