Strategies for Dealing With Exchange Storage
Exchange data is too important to delete, and too massive to keep online. Here's how to alleviate the headaches.
by Alan Maddison

January 25, 2005

Thanks to the growing use and importance of e-mail, IT managers have the unenviable task of keeping large volumes of files online or at least easily recoverable. E-mail storage is more than a convenience issue for users because organizations need to comply with myriad legal and security considerations (see the sidebar, “Legislative Burden on IT”). Fortunately, for organizations that have moved to Exchange 2003, Microsoft has stepped up to the plate and added features and enhancements that help with the day-to-day problems IT managers face.

However, even with the latest technology, it is important to document, document, document all procedures, processes, and daily activities. Exchange administration can be difficult. Daily routines are often filled with problems that need fixing and users who need help. Urgent tasks overwhelm important tasks. Low-urgency chores such as documenting procedures, making sure users know what is expected of them, and planning for problems never make it to the front burner.

However, the increasing complexity of Exchange environments means we can no longer afford to put off important tasks. Our lives end up being a whole lot simpler if everyone knows what to do and what is expected of them.

This article covers the areas you need to document to make your life simpler and to meet legal and security requirements. You'll also learn about the tools that keep data safe and, if something goes wrong, put you in a position to recover it.

Best Practices: Policies and Procedures
Successful Exchange management is more than getting through the day without annoying users. The best approach to e-mail administration is to look at the big picture and start managing e-mail systems actively. As mentioned previously, the big-picture approach requires documentation of policies and procedures, which many Exchange Administrators overlook.

The process of documentation formalizes and standardizes how IT departments are supposed to do their jobs. This, in turn, helps reduce downtime and errors. It also helps when problems occur, as formalization and standardization reduce the unknowns when troubleshooting. The process of developing and implementing policies that affect an Exchange environment's operation as well as how people use the system will involve many people beyond the messaging group. For example, defer security and legal requirements to the legal department and senior executives.

Your role in these matters is to advise on the technology required to meet legal and security requirements and monitor enforcement. For mundane matters such as e-mail storage limits, attachment sizes, and public folder creation, it is your responsibility to define limits based on your infrastructure. As always, prepare to deal with exceptions. Exceptions should always be well documented and understood, as they often prove to be the source of problems.

Creating procedures is a very different process, and it is one for which the Exchange team is solely responsible. Procedures define the tasks we do and how we carry them out. Typically, Exchange-related procedures can be grouped into four categories: backup and restoration, monitoring and analysis, system administration, and change and configuration management. We cover backup and restoration in a later section because of its importance. If all else fails and you have a well-documented and tested backup and restoration procedure, users will be spared.

The purpose of monitoring and analysis is to measure and understand your hardware and software systems. Monitoring and analysis are central to maintaining a healthy system and preparing you for future changes. For example, without the hard numbers provided by monitoring and analysis, it becomes difficult to justify spending money on new hardware or hiring another administrator. A historical record helps you understand your hardware and software and what is normal behavior. This, of course, allows you to spot signs of trouble as they occur rather than waiting until users complain or systems crash.
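
As a rough illustration of how a historical record helps you spot trouble early, the following Python sketch compares a new counter sample against a baseline built from earlier samples. The function name, the three-standard-deviation tolerance, and the CPU figures are assumptions made for this example; they are not Exchange defaults or recommended thresholds.

```python
from statistics import mean, stdev


def is_outside_baseline(history, sample, tolerance=3.0):
    """Return True if `sample` is more than `tolerance` standard deviations
    away from the mean of the historical samples."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any different value is a deviation.
        return sample != baseline
    return abs(sample - baseline) > tolerance * spread


# Hypothetical daily averages of CPU utilization (percent) for one server.
cpu_history = [31.0, 29.5, 33.2, 30.1, 28.7, 32.4, 30.8]

print(is_outside_baseline(cpu_history, 34.0))  # within normal variation: False
print(is_outside_baseline(cpu_history, 71.0))  # well outside the baseline: True
```

The same check works for any counter you collect regularly; the quality of the answer depends entirely on how much history you have kept.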

Monitoring and analysis can be divided further into performance monitoring and availability management. Performance monitoring covers all of the fundamental measurements required to know how your system is performing and identify when there might be a problem. Examples of counters to measure in performance monitoring include CPU utilization, memory usage, pages per second, average disk queue length, and message queue length. This information allows you to identify bottlenecks and pinpoint trends. Availability management is about maximizing system uptime and is tied closely to performance monitoring. Availability, reliability, and repair time are the key measurements to track.
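
One common way to put numbers on availability, reliability, and repair time is to work from a log of outages. The sketch below is a minimal Python example under stated assumptions: reliability is expressed as mean time between failures (MTBF), repair time as mean time to repair (MTTR), and the outage durations are invented for illustration.

```python
def availability_report(total_hours, outage_hours):
    """Summarize a measurement period.

    total_hours  -- length of the period being measured
    outage_hours -- list with the duration of each outage, in hours
    """
    downtime = sum(outage_hours)
    uptime = total_hours - downtime
    failures = len(outage_hours)
    return {
        "availability_pct": 100.0 * uptime / total_hours,
        # With no failures, MTBF is unbounded and there is nothing to repair.
        "mtbf_hours": uptime / failures if failures else float("inf"),
        "mttr_hours": downtime / failures if failures else 0.0,
    }


# Hypothetical 30-day month (720 hours) with three outages.
report = availability_report(720, [1.5, 0.5, 2.0])
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

Tracking these three figures month over month gives you the hard numbers needed when you argue for new hardware or additional staff.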

System administration procedures are tasks completed regularly. Regardless of the frequency of the tasks, the goal is to document the procedure so every administrator understands what needs to be done and how it should be done. Daily tasks include reviewing event logs, checking that backups are completed, adding or deleting users, and reviewing alerts generated by performance monitoring. Less-frequent tasks include security reviews, testing backups, monitoring tape rotation, and database maintenance. Because many system administration tasks are repetitive, one of your goals should be to automate as many of them as possible. This will help reduce errors and increase your availability to handle unexpected problems. A key part of good system administration is understanding the tools that you have at your disposal and how and when they should be used.

Change management involves documenting and monitoring how and when changes are made to your Exchange environment. This includes research, review, and rollback plans. The goal of successful change management is to modify your Exchange environment without impacting users during or after the change. Changes are usually measured by their scope and potential impact. A high-impact change is typically defined as one with companywide impact, and it may require input from colleagues outside the messaging team. An example is a migration from Exchange 5.5 to Exchange 2003.

Medium-impact changes affect one or more of the critical systems within the messaging environment. A great example of this is the patching and update process.

Low-impact changes are modifications in policies and settings that do not have any significant impact on your environment. Configuration management is tied to change management and is the process of maintaining records on software and hardware systems. Much of the information gathered as part of this process is critical if you are subject to an audit, such as a financial, legal, or security audit.

Data Backup and Recovery
Fortunately, many vendors want to tackle the problems we face. Recent third-party software enhancements include software that allows for improved pattern recognition for searching e-mail and improved auditing capabilities to show that data has been accessed by authorized personnel only. Several Exchange backup vendors tackle the issue of granular recovery. These tools will allow you to mount backups and restore to live Exchange environments without interruption to service.

On the hardware side, some vendors advocate disk-based backup solutions. Tape has many disadvantages. The biggest, of course, is restoration speed. With the price of Network Attached Storage (NAS) and Storage Area Networks (SAN) coming down, these solutions become attractive as a medium-term storage solution. When incorporated into your environment correctly, disk storage is cost-effective and provides multiple layers of redundancy.

Some vendors provide block-level backup tied to policy implementations that prevent data from being deleted in contravention of company policies. When tied with the ability to create secure audit trails, these products provide a comprehensive approach to regulatory compliance. However, it is unlikely that tape will disappear because it still plays an important role for long-term storage. Moreover, unless you are backing up over a WAN, tape offers a good method of getting oodles of data off-site. This, of course, is necessary for any adequate disaster recovery plan.

The downside to many spinning spindles is cost. Some organizations don't have the budget to replace hardware that seems to work properly by outward appearance. Unfortunately, most of us won't have the funding we want for the latest technology, so it is important not to overlook the fundamentals. Regardless of the tools you use, it is the procedures you follow and the standards you set that are more important than technology. In fact, the fundamentals of any well-run Exchange environment are universal.

Backup Strategies
Before you slap a tape into a drive to back up an Exchange server, you need to understand the physical file structure of an Exchange storage group and how Exchange manages data in these files. Each database in a storage group consists of two files: an EDB file and an STM file.

The STM file is a temporary storage point for SMTP mail that stores Multipurpose Internet Mail Extensions (MIME) content. Once an e-mail has been accessed by a MAPI client, such as Outlook, the content is stored in the EDB file.

In addition to these two file types, Exchange uses log files as an intermediate step in the write-to-database process. These log files help ensure data consistency, integrity, and performance. The log files are a critical component in the restoration process. In fact, it is only after a normal backup using an Exchange-aware backup client that these log files are deleted. One important item to note is that although Exchange supports circular logging, you should not use this feature in a production environment because of the possibility of data being overwritten. Also remember that the System State needs to be backed up to facilitate the recovery process in the event of complete system failure. You will also have to back up the IIS metabase if your organization uses Outlook Web Access.

While there are many Exchange-aware backup clients (including NTBackup, which is a solid tool for smaller shops), the task of developing a backup strategy is the same regardless of the tool. The three backup choices are full, incremental, and differential.

A full backup is the best option to choose if you have the storage space and a sufficiently large backup window to complete the task. A restoration from a full backup is straightforward. It is also the fastest method of restoring your Exchange environment in the event of problems.
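
Whether a full backup fits in your backup window is simple arithmetic. The Python sketch below shows the calculation; the store sizes, the sustained throughput figures, and the six-hour window are hypothetical values chosen for illustration, so substitute measurements from your own environment.

```python
def backup_hours(data_gb, throughput_mb_per_s):
    """Estimate how long a backup takes at a sustained throughput."""
    seconds = (data_gb * 1024) / throughput_mb_per_s
    return seconds / 3600


def fits_window(data_gb, throughput_mb_per_s, window_hours):
    """Return True if the estimated backup time fits in the window."""
    return backup_hours(data_gb, throughput_mb_per_s) <= window_hours


# Hypothetical scenarios: (store size in GB, throughput in MB/s).
for size_gb, rate in [(200, 30), (600, 20)]:
    hours = backup_hours(size_gb, rate)
    verdict = "fits" if fits_window(size_gb, rate, 6) else "does not fit"
    print(f"{size_gb} GB at {rate} MB/s: {hours:.1f} hours, {verdict} a 6-hour window")
```

Real throughput varies with hardware, network load, and the backup client, so treat the result as a planning estimate and confirm it against the duration of actual backup jobs.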

An incremental backup archives the data that changed since the last full or incremental backup. This is the fastest backup method. You should use this method if you have a short backup window and are comfortable handling the additional complexity of a restoration. In the event of a system failure, you will need to have the last full backup and every incremental backup since.

A differential backup represents a trade-off between a full and an incremental backup. Because it will archive all data that has changed since the last full backup, it does not provide the fastest backup method. A restore will only require the last full backup and the last differential backup.
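
The difference in restore complexity between the three backup types can be sketched in a few lines of Python. The Backup record and the restore_chain function below are hypothetical names invented for this illustration; they are not part of any backup product. The sketch assumes a rotation that uses only one kind of backup between full backups and deliberately rejects mixed schedules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int    # hypothetical day number within the rotation
    kind: str   # "full", "incremental", or "differential"


def restore_chain(backups):
    """Return the backups needed for a restore, in the order to apply them."""
    ordered = sorted(backups, key=lambda b: b.day)
    full_positions = [i for i, b in enumerate(ordered) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup available; nothing to restore from")

    last_full = ordered[full_positions[-1]]
    later = ordered[full_positions[-1] + 1:]
    kinds = {b.kind for b in later}

    if not kinds:
        return [last_full]                 # the full backup is all you need
    if kinds == {"incremental"}:
        return [last_full] + later         # the full plus every incremental since
    if kinds == {"differential"}:
        return [last_full, later[-1]]      # the full plus the latest differential
    raise ValueError("mixed or unknown schedules are outside this sketch")


incremental_week = [Backup(1, "full"), Backup(2, "incremental"),
                    Backup(3, "incremental"), Backup(4, "incremental")]
differential_week = [Backup(1, "full"), Backup(2, "differential"),
                     Backup(3, "differential"), Backup(4, "differential")]

print([b.day for b in restore_chain(incremental_week)])   # [1, 2, 3, 4]
print([b.day for b in restore_chain(differential_week)])  # [1, 4]
```

The longer the chain, the more media you must locate and the more places a restore can fail, which is the trade-off to weigh against the shorter backup window.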

While the process of backing up Exchange is straightforward, data recovery is not. The biggest challenge is recovering individual items or mailboxes. This is called brick-level backup and restoration. Unfortunately, the difficulty of brick-level recovery is one of Exchange's biggest drawbacks.

The API approach developed by Microsoft only provides backup at the database level. While many products allow brick-level backups, be aware that performance will be dramatically slower when performing such a backup. Perhaps more importantly, brick-level backups should not be considered foolproof, as they often have difficulty with open items, as well as third-party Exchange add-ins. Fortunately, Exchange 2003 makes the brick-level restoration easier than in earlier versions.

Exchange 2003 Tools
Data integrity is an Exchange administrator's biggest responsibility. To help ensure data is safe and available, Exchange 2003 provides several tools an administrator needs to know how to use. There is nothing worse than being in the middle of an emergency and realizing that to solve the problem you need to learn to use a new tool.

The simplest method of protecting data is to stop it from ever being deleted. Of course, users are human and administrators are expected to pick up the pieces. The first thing we must do in Exchange 2003 is use the Exchange Storage Manager to configure the message store property "Keep deleted items for … (days)" to a value that will provide adequate protection for your organization (see Figure 1).

Don't forget that this setting will impact your storage; you will need to monitor it carefully. The length of retention will be something you may need to adjust over time. The default setting of seven days may end up being too high or too low. Start low and work higher until you reach a suitable trade-off.

Another important setting associated with Mailbox store properties is the "Keep deleted mailboxes for … (days)" option. Used in conjunction with the Mailbox Recovery Center to actually restore the mailbox, this setting will protect you from your own mistakes and save you from embarrassment. The default value for this option is 30 days. Once again, this will impact your storage requirements, particularly in large environments, so monitor and adjust this setting accordingly.
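
To get a feel for how retention settings affect storage, you can run a rough estimate before changing them. The Python sketch below assumes a steady state in which every mailbox deletes about the same volume of mail each day; the function name and all of the figures are hypothetical, and only measurements from your own stores will tell you the real impact.

```python
def retention_overhead_gb(mailboxes, deleted_mb_per_mailbox_per_day, retention_days):
    """Rough estimate of the space held by deleted items awaiting purge."""
    total_mb = mailboxes * deleted_mb_per_mailbox_per_day * retention_days
    return total_mb / 1024


# Hypothetical organization: 500 mailboxes, each deleting about 5 MB per day.
for days in (7, 14, 30):
    overhead = retention_overhead_gb(500, 5, days)
    print(f"{days:2d} days of retention: about {overhead:.1f} GB")
```

Because the estimate grows linearly with the retention period, doubling the number of days roughly doubles the space you need to set aside.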

A third item to consider is the option not to delete mailboxes and items until the store has been backed up. As you will see in Figure 1, you simply check the associated box and Exchange flags any deleted mailboxes and items.

Mailbox Recovery Center
Microsoft provides the Mailbox Recovery Center (MRC) if you need to recover a mailbox that has been deleted within the time you defined above, as shown in Figure 2. The MRC keeps track of all deleted mailboxes.

The first step is to add the relevant Mailbox store to the MRC to see the deleted mailboxes associated with the store. If you have already associated your mailbox store with the MRC then as soon as you click on the MRC you will see all of the deleted mailboxes that are available for recovery.

The first menu option is "Find Match" when you right-click on the mailbox you wish to recover. Selecting this option launches the Exchange Mailbox Matching Wizard.

If the associated user account still exists within Active Directory, the wizard will restore the mailbox automatically and associate it with the correct account. If the account does not exist, you will have the option to select an account.

The final step is to reconnect the mailbox to the user account. Right-click on the mailbox that you just recovered and choose the Reconnect option. This will launch the Exchange Mailbox Reconnect Wizard.

Recovery Storage Group
If the retention period has passed, you must turn to the Recovery Storage Group (RSG) and your backup media. In earlier versions of Exchange, you needed to restore to a new server. With Exchange 2003, you can use the RSG to recover to the same server. The RSG is similar to other storage groups except that you cannot send or receive mail through this storage group. This means you can access this storage group either through the ExMerge utility or, if you are using Exchange 2003 Service Pack 1, the Recover Mailbox data task in System Manager.

The Recovery Storage Group has two main functions. The first is to provide the ability to restore individual items and mailboxes. The second is to facilitate rapid recovery from a failure at the storage group level. Instead of e-mail being unavailable for several hours, the Recovery Storage Group provides a restore target. This, in turn, permits your users to send and receive e-mail from their normal storage group during restoration. If you are also fortunate enough that your clients are using Outlook 2003 in Cached Exchange Mode, then your users will also have access to their old e-mail during the restoration.

To create an RSG, right-click on the appropriate server, select New, and then select Recovery Storage Group, as shown in Figure 3.

Once you select this option, you will be required to enter the log and system paths. Exchange will default to the Recovery Storage Group subfolder of your Exchange installation. The RSG is created after pressing OK. The next task is to add the database you wish to recover. First, right-click on the RSG and choose the Add Database to Recover ... option. Don't forget that this process can recover mailboxes and not Public folders.

The next step is to check to make sure the database is not mounted. You are now ready to begin recovery. Here, restore the latest Exchange backup. Your Exchange-aware backup software should restore to the RSG automatically. The associated log files from the backup will be replayed against the RSG. Check the backup software documentation for specific procedures.

The final step is to mount the database as shown in Figure 4. Once you have refreshed the RSG, you will see the mailboxes in the RSG.

Once you have recovered your data to the Recovery Storage Group, the next step is to merge the recovered data back into the production database. As mentioned earlier, there are two methods to achieve this: ExMerge and System Manager's Recover Mailbox data task in Exchange 2003 Service Pack 1. Both tools must be run by a user with Receive As permissions.

To begin recovery, right-click on the mailbox you wish to recover and select Exchange Tasks, which launches the Exchange wizard. After clicking Next twice, you will be asked to confirm the destination Mailbox store. Clicking Next brings you to the option to merge the recovered data or copy the recovered data. The final step is to schedule the recovery. After you click Next, the recovery process will run and the task will complete.

The process for recovering an entire information store is different. If the entire store has crashed, dismount the store and rename the edb and stm files associated with the corrupt information store. When you mount the information store, the system will not find the correct files and will create a new set of files. At this point, follow the steps described above to create a recovery storage group and restore your Exchange backup. Once the RSG has been mounted you have two choices. The first is to use the Exchange wizard or ExMerge to recover the data. The downside is that this process can take hours, particularly in large environments.

A quicker method is to dismount both the primary and recovery storage group stores, rename the files, and then copy the files to the appropriate location. This has the benefit of allowing you to recover only the new and much smaller information store that you created at the beginning of the recovery process. This speeds up the time required to complete a full restoration significantly.

Volume Shadow Copy Services
What Microsoft calls a shadow copy is more commonly known as a "snapshot." The goal of Volume Shadow Copy Services (VSS) is to provide accurate and complete snapshots of data at a given moment in time, overcoming the problems typically associated with backing up open files. VSS is a complex technology that has three key components: requesters, providers, and writers.

Requesters are third-party backup applications that are VSS-aware. Note that NTBackup cannot use VSS to back up Exchange. Providers are services that provide access to VSS functionality. There are three types of providers: system, hardware, and software. The final component, writers, are applications that are "aware" of VSS and include specific logic to take advantage of the service. In Exchange 2003, this is provided by exwriter.dll, which can pause writes to the EDB and log files as well as stop the STM file from growing to ensure an accurate snapshot.

When backing up using VSS, you have the option to do a full backup, which will truncate the log files, or a copy that does not purge logs. A copy is only suitable for testing purposes, perhaps as part of change-management testing.

The minimum selection as part of the backup is a storage group. When restoring, you have the option to recover the entire storage group or any database that is part of the storage group. Used in conjunction with the other tools discussed earlier, this can provide a fast and effective means to perform brick-level backups, among other things.

As costs come down, this technology will become more popular with messaging groups. The downside is that it actually stops writes to Exchange while a snapshot is being created, so users may experience slow responses. On long snapshots, this may prove problematic. This means that although Windows Server 2003 includes a VSS provider, you need a high-performance storage system, such as a SAN, to use VSS effectively. If you are fortunate enough to have a SAN, you need to investigate VSS fully and what it can do for you.

About the Author
Alan Maddison has more than 10 years of experience in the technology field, with a diverse background in areas such as system analysis, network design, project management, and product development. He currently runs network operations for an ASP. He previously managed IT in various industries, including 3D animation, software, K-12 education, and biotechnology.