Who moved my millions of morsels of cheese?

11 May 2018 by Jason Jacobo

Part 2: Bringing email data back under control

Welcome back to the tale of email past to present. As we mentioned at the end of our last post, the decentralization of email data quickly became a problem for many organizations. The need to solve that problem gave birth to a new type of enterprise solution.

The arrival of third-party email archiving solutions

Centralized email archiving solutions such as Enterprise Vault, Source One, EAS, Mimosa, and HPCA appeared on the market. These solutions supplemented production email systems by moving data off their expensive, limited storage and into a new system: one designed to accommodate the big data that email had grown into while providing an experience similar to a local archive and reducing the cost of storage. Not only that, this new system could also improve the stability and ease of recovery of the email servers generating this (still rapidly growing) data. The solutions were designed to fill the usage gap that email solutions appeared to simply ignore.

These enterprise-class software solutions were really more of a workaround; they sidestepped the underlying issue rather than addressing the root cause. Messaging solutions needed to accommodate the natural use of email in business without relying on external storage solutions to maintain stability. However, data continued to grow, and the source of that data was unable to accommodate it. Archiving systems bought both Exchange and its users more time and space. As a bonus, organizations interested in legal discovery or compliance for their email data could choose an archiving system that also met that need.

Although most archiving solutions were designed to be able to grow, they too had limitations. Soon, these systems felt the same pressures as the messaging solutions they supported and the local archives that came before them:

  • Perpetually unchecked growth
  • A failure to educate users on “proper use”
  • An artificial extension of storage limits that left the root cause of the issue unsolved
  • Hard limits in some archiving solutions on how much they could grow, which large organizations quickly found insufficient

As a result, the workaround also escalated:

  • For many, deploying, managing, and maintaining multiple instances of the same solution was the only answer (for example, one of our customers has 59 total instances over two solutions!).
  • Other solutions attempted to keep extending capacity by creating views, segmenting, single instancing, or compressing the data they were archiving (or a combination of all these actions).

No matter the approach, the result was the same: archiving solutions began to feel the impact of their own exponential growth.

Archiving solutions had also complicated the issue of growth. By artificially extending capacity, they left messaging solutions with a seemingly insurmountable task: supplying the (now enormous) storage needed to accommodate the data held in archiving solutions. Although modern mail systems have addressed many of the storage model issues of the past, mailbox limitations still leave them unable to address many of the challenges introduced by message archiving solutions. Large mailboxes, or folders containing an “excessive” number of messages, cannot be accommodated by solutions such as Microsoft’s Exchange Server. Journaling of messages for compliance and litigation searching created huge archives of data that were all delivered to the same mailbox, in the same folder.

A single mailbox folder holding millions of messages is not uncommon, and the occasional archive containing billions of items has been seen. The introduction of archiving solutions simply moved the issue further down the pipe and morphed it into a more complicated one.

So far in this series, we have covered the turbulent history of email and its explosive growth in popularity, usage, and volume. We also discussed how detaching email data from the messaging solution (either via PSTs or third-party archives) did not address the problem messaging servers faced, and instead introduced new challenges that must be confronted in order to bring that data back into a messaging solution.

In part 3 of this series, we will look at how these new challenges impact the migration solutions intended to “bring this data back” to the mail servers where it originated.