11 Dec 2019 by Mike Weaver
When migrating email and document archives to the cloud, the volume of data that actually has to be transmitted from a company’s network to a cloud datacenter (that is, the data ‘on the wire’) is significantly greater than the volume reported by the archiving system. For example, 1 TB of archive data in a repository is compressed and single-instanced. To migrate it to Office 365, the data is moved through a multi-instance, uncompressed API to the Office 365 datacenter, which creates 3 or 4 TB of network traffic on the wire.
So what causes this problem? Put simply, the underlying MAPI interface (Microsoft’s Messaging Application Programming Interface, designed in the early 1990s) only allows ingested data to be associated with one mailbox or personal archive at a time. If a given 100 KB email is sent to 5 recipients, that email needs to be ingested 6 times (5 recipients plus the sender, or 600 KB in total). The situation is further complicated by the need either to encode the items for transport to the API, which adds at least another third, or to allow MAPI to wrap the low-level structures.
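The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the use of Base64 as the transport encoding is an assumption chosen because it produces the "at least another third" overhead mentioned above.

```python
import base64


def copies_ingested(recipients: int) -> int:
    """One copy per recipient mailbox, plus one for the sender."""
    return recipients + 1


def wire_size_kb(item_kb: float, recipients: int,
                 encoding_overhead: float = 1 / 3) -> float:
    """Estimate the on-the-wire size of one email in KB.

    encoding_overhead is the fractional growth from transport encoding;
    1/3 is an assumed figure that matches Base64.
    """
    return item_kb * copies_ingested(recipients) * (1 + encoding_overhead)


# Base64 turns every 3 input bytes into 4 output characters,
# so a 300-byte payload becomes 400 bytes.
payload = bytes(300)
encoded = base64.b64encode(payload)
measured_overhead = len(encoded) / len(payload) - 1

# The example from the text: a 100 KB email sent to 5 recipients.
before_encoding = wire_size_kb(100, 5, encoding_overhead=0)
after_encoding = wire_size_kb(100, 5)
```

With these assumptions, the 100 KB email becomes 600 KB once it has been multi-instanced and roughly 800 KB after encoding, which is how a compressed, single-instanced archive can grow several times over on the wire.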
Incredibly, migration providers still use old-style MAPI technology to move data around. MAPI-based implementations limit the ingest capability of other tools to something like 5 TB/day when information is moved into Office 365 (with caveats around ‘per server’ or ‘per task’). The fact is that MAPI was never designed for migration. It was designed to enable clients (like Microsoft Outlook) to interact with Exchange. An interface intended for client use is optimized for the user experience: it will happily open messages that are corrupt, with hidden properties missing or silently removed, without the application even noticing. More on that in my next blog. Microsoft have long since re-written Exchange’s internals to stream data in and out of mailboxes using alternate methods and protocols rather than MAPI (which is still heavily used by Outlook).
Whilst performance testing two years ago, we hit a wall: MAPI couldn’t be pushed to the kind of transmission speeds we wanted. We couldn’t optimize the protocol, and we had to pull crazy tricks like using hundreds of concurrent processes to get any sort of reasonable performance. So we took inspiration from a famous computer scientist, Alan Kay. During a 1982 conference, he said “People who are really serious about software should make their own hardware.” That principle holds true here: people who are really serious about migration should design their own foundation libraries. So we threw away MAPI and started from the ground up.
Microsoft provide a great set of open specifications covering all of the protocols and formats used with Office 365, so whilst we were charting new waters by doing this ourselves, we had reference material to make sure everything we were doing was fully supported. The intelligence in writing the foundation libraries isn’t just in interpreting the specifications and producing efficient logic to implement them; it’s in ensuring performance and a sensible overall architecture. With 900+ items/sec being handled by this library in a single process, smart memory management, threading, and logic are essential. Just imagine an inefficient implementation loading 1000 items into memory per second at 200 KB each: that is about 200 MB of memory movement per second for a single pass, and every redundant in-memory copy multiplies it towards gigabytes per second… disastrous for performance.
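To put rough numbers on that memory pressure, here is a minimal sketch. It is not the library described here: the function names, the `copies` parameter, and the 64 KB chunk size are all assumptions made for illustration. The first function estimates memory traffic for a given item rate; the second shows the general idea of streaming an item in fixed-size chunks so that the working buffer stays small regardless of item size.

```python
import io


def memory_traffic_mb_per_s(items_per_s: int, item_kb: int,
                            copies: int = 1) -> float:
    """Memory moved per second in MB (decimal units).

    copies is a hypothetical count of how many times each item is
    duplicated in memory on its way through the pipeline.
    """
    return items_per_s * item_kb * copies / 1000


def stream_item(src, dst, chunk_size: int = 64 * 1024) -> int:
    """Copy one item from src to dst in fixed-size chunks.

    Only one chunk is held at a time, so the working buffer is bounded
    by chunk_size rather than by the size of the item.
    Returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total


single_pass = memory_traffic_mb_per_s(1000, 200)           # one copy per item
ten_copies = memory_traffic_mb_per_s(1000, 200, copies=10)  # ten copies per item

# Stream a 200 KB in-memory item as a stand-in for a real message.
copied = stream_item(io.BytesIO(bytes(200_000)), io.BytesIO())
```

Under these assumptions a single pass moves about 200 MB per second, and an implementation that copies each item ten times reaches about 2 GB per second, which is why avoiding redundant copies matters at this item rate.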
So what does having our own foundation library allow?
Our advice to other migration providers? If you want significant performance, true item integrity, and to really start reducing customer bandwidth, it’s time to build your own foundations too. Without them, you simply can’t guarantee a high-performing, efficient, safe migration.