Handling email domains during an Office 365 Tenant Migration: Part Two
Retaining primary SMTP addresses
This blog is the second post in a two-part series on ‘Handling email domains during an Office 365 tenant migration’ – read part one here.
The more challenging aspect of migrating email addresses in an Office 365 tenant to tenant scenario is where the primary SMTP email address (or reply address) must be retained.
The core business driver for this is to ensure that when communicating with external people no change is perceived. For example, the business may wish to ensure that the acquired business will still be the company they know and respect.
Rewriting SMTP addresses
We can deal with this scenario by managing the primary SMTP address throughout the transition outside of Office 365. By moving the inbound and outbound mail gateway to a third-party service, we can map individual email addresses to the Office 365 tenant we choose; when an outbound email passes through the mail gateway, we can re-write the SMTP address to the domain we choose.
You’ll see in the simplified diagram above that this approach allows the identity – contoso.com, in this case – to be retained to the outside world. Alongside this process we may have set up a range of other components, such as a shared global address book, migration software, and more.
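To make the mapping idea concrete, here is a minimal sketch of the kind of lookup a gateway performs in each direction. It is not taken from any particular product: the table, the function names and the example.* addresses are all hypothetical, and it assumes addresses are compared case-insensitively.

```python
from typing import Dict, Optional

# Hypothetical gateway table: the public address the outside world sees,
# mapped to the mailbox address in whichever tenant currently hosts it.
ROUTES: Dict[str, str] = {
    "alice@contoso.com": "alice@source.example",     # still in the source tenant
    "bob@contoso.com": "bob@destination.example",    # already migrated
}


def route_inbound(recipient: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the tenant address inbound mail should be delivered to, if known."""
    return routes.get(recipient.lower())


def rewrite_outbound(sender: str, routes: Dict[str, str] = ROUTES) -> str:
    """Return the public address to present externally for an outbound sender."""
    wanted = sender.lower()
    for public_address, tenant_address in routes.items():
        if tenant_address == wanted:
            return public_address
    # No rule for this sender: leave the address exactly as it was.
    return sender
```

Because both directions read the same table, changing one entry when a mailbox moves updates inbound routing and outbound re-writing together.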
However, this approach isn’t without issues. Aspects you need to consider include:
- Who runs the service? – Exchange Online Protection with Advanced Threat Protection (ATP) is a market leading service.
Unless you select a market leading alternative (or already use a third-party service) the organization is likely to receive more spam and phishing messages. You may also lose useful functionality that is increasingly necessary to ensure message delivery, like DMARC/DKIM.
If the mail gateway’s primary purpose is to co-ordinate a tenant to tenant migration, then it is very likely to add risk, both in terms of spam and malware threats and in terms of reliability.
- How will you coordinate mail routing updates? – As you move mailboxes from one tenant to another, you should still ensure that mail forwarding takes place from the source tenant to the destination tenant. However, you’ll also want routing updates to take effect quickly, so that re-write actions are applied from the moment a user accesses their migrated mailbox.
The first issue is potentially the one whose importance is easiest to understate. On paper, building out several Exchange servers running the Edge role in Azure, or perhaps Linux servers running your favourite Message Transfer Agent, may seem like a great option. Or perhaps you consider a provider that offers temporary routing alongside a tenant to tenant migration. The downside to these is that they move the anti-spam and (to a certain degree) the anti-malware capabilities back to what could be considered the stone age.
Out of the box or basic configurations lack the intelligence and capabilities that a mature first-hop mail service benefits from – for further evidence, examine technologies like greylisting, or how real-time blocklists provide significant protection.
Therefore, the recommendation is usually to rely on a market-leading third-party SMTP gateway during that transition. Only a short time ago, many organisations retained these services after migrating to Office 365 – however, an increasing number choose Microsoft’s ATP instead. So, this will only become more of an issue in the short to medium term, assuming Microsoft will produce a solution in the next 12 to 18 months.
With a re-write solution, you are going to need to cut over the domain at some point – usually the end of the migration. With a third-party gateway solution in place, you retain full control over the mail flow for your domain, and by the time you move the custom domain across, you may have migrated all your resources to the destination, and thus can remove the custom domain from the source Office 365 tenant without risk.
After you move the custom domain across, you follow a similar process to the cutover scenario below. Migrated mailboxes and other recipients will have their source email addresses added and set as the primary SMTP address. Soon after this is complete, the mail gateway performing re-write operations will not be required.
Cutover of custom domains
Another business driver may be tied to the user experience. As discussed above, it is common for the sign-in address to match the email address.
An example of a business driver for this might be a business unit that runs separately and whose staff identify with that brand rather than the group. Although it makes sense to consolidate to a single tenant, the business may wish to ensure that staff do not incorrectly perceive the consolidation as a change to the business itself.
Generally, this process follows Microsoft’s guidance as all mailboxes are pre-synchronized using a third-party tool prior to the cutover, with delta synchronizations taking place to ensure no mail received subsequent to the first sync is left behind.
On the day of cutover, mailboxes exist in both the source and destination Office 365 tenants. Before the custom domain itself can be removed from the source Office 365 tenant, all traces of it must first be removed from that tenant. In advance of removing the custom domain, the TTL (time to live) value for the MX (mail exchanger) record is lowered in DNS. Reducing this value means that subsequent updates to the record should be honoured rapidly.
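One timing detail is worth spelling out: a resolver that cached the MX record just before the TTL was lowered may keep that copy for the full old TTL, so the lower value only helps once the old one has aged out. A rough sketch of that arithmetic follows, assuming well-behaved resolvers; the function name and the example values are illustrative only.

```python
from datetime import datetime, timedelta


def earliest_safe_mx_change(ttl_lowered_at: datetime, old_ttl_seconds: int) -> datetime:
    """Earliest time at which compliant resolvers no longer hold the old, long-TTL record.

    A resolver may have cached the record an instant before the TTL was
    lowered, so it can legitimately keep that copy for the whole old TTL.
    """
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


# Example: the TTL was one hour and is lowered at 09:00, so the MX record
# should not be changed before 10:00 if the low TTL is to be relied upon.
lowered_at = datetime(2024, 1, 1, 9, 0)
safe_from = earliest_safe_mx_change(lowered_at, old_ttl_seconds=3600)
```

In practice it is sensible to lower the TTL well ahead of this minimum, since not every sender honours TTL values.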
Prior to removing the custom domain, the MX record is updated to point to an invalid DNS name, such as “invalid-address.contoso.com”. This record should not point to any valid email server – otherwise email will be rejected.
The theory behind this method is that the sending email system will assume that a temporary issue has occurred with the destination email system, and the sending email system should hold that message back and retry later.
Once the custom domain has been successfully added to the target Office 365 tenant and all mailboxes and other recipients updated with the correct email addresses, the MX record must be updated to re-point at Office 365, as shown below. In theory, new email will soon be delivered, and messages sent during the outage will eventually be received by the recipients.
There are of course a lot of “should” and “in theory” caveats in this process, and that’s where it falls apart somewhat. Disadvantages of this approach include:
- You won’t know what mail you have lost – You will, almost without a doubt, lose email. That may be limited to spam senders that only attempt once – usually intentionally to make best use of their resources. However, it isn’t only spammers that may not follow the normal rules of SMTP email.
Not all email servers are built to follow standard rules around what they will do when they encounter an invalid domain. Badly set up email systems might be configured to NDR, or bounce, email messages earlier than you expect. The defaults used by services like Exchange Server are similar to those of most email systems, but administrators may have changed them. In my time as an email admin I found that one thing could be relied upon – if people can change the defaults in something, they will. If mail was lost during an operation like this, the blame would lie with me for not considering this.
- You will find that stale DNS records are more common than you expect – Updating the TTL to a low value, such as five minutes, is supposed to ensure that within five minutes every sending email server will send to the updated MX record.
However, you will find that this is not the case. If you have moved between anti-spam providers in the past you will almost certainly have seen that sometimes, even days later, mail delivery will be attempted to the old MX records.
This will potentially mean a number of additional messages are delivered to the source Office 365 tenant (solved with a final delta) before you complete clean-up of email addresses. It will, however, also mean it’s very likely that some mail is rejected by Office 365 whilst the domain is registered to neither the source nor the target tenant. You may also find that mail continues to be delayed from some senders for longer than you expect if they cache the invalid address for longer than the TTL value.
- Custom domains can take longer than you expect to move – In general, if all records are removed, the process can take as little as five minutes. In the real world – especially if something has been missed – this process can take much longer, sometimes 24 hours or more. This matters because sending servers only queue undelivered mail for so long. For example, with Exchange Server set to defaults, once a message has been queued for more than 4 hours it will, like most email servers, inform the sender that the email has been delayed. After 2 days the message will expire, and the sender will receive an NDR.
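The thresholds in the last point can be expressed as a small sketch showing what a sender is likely to see for a given length of outage. It uses the Exchange Server defaults quoted above (4 hours and 2 days) and deliberately ignores retry intervals; the function name and return labels are made up for illustration, and other systems may well use different values.

```python
from datetime import timedelta

# Defaults quoted in the text for Exchange Server; other senders may differ.
DELAY_NOTICE_AFTER = timedelta(hours=4)
EXPIRE_AFTER = timedelta(days=2)


def sender_experience(
    outage: timedelta,
    delay_notice_after: timedelta = DELAY_NOTICE_AFTER,
    expire_after: timedelta = EXPIRE_AFTER,
) -> str:
    """Classify what a sender using these queue settings sees for an outage."""
    if outage >= expire_after:
        return "ndr"             # message expired and was bounced to the sender
    if outage >= delay_notice_after:
        return "delayed-notice"  # sender told of the delay; retries continue
    return "silent-retry"        # message queued and retried without any notice
```

For example, a domain move that overruns to five hours would leave senders on default settings with a delay notification, even though the mail is eventually delivered.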
You can mitigate against the impact of these disadvantages in several ways.
In one of our earlier scenarios, where we migrate to a new primary SMTP address, the old domain may be less used by the time you perform this process, and your users could have been asked to update important mailing list subscriptions and to ask the people they communicate with to use the new address. With some mitigations in place, the risk may be minimal.
However, in the scenario here where the primary SMTP address is being moved across in one operation, a large element of risk does exist, and can be mitigated using one of the following methods:
- Follow the third-party gateway approach – Use a third-party mail gateway providing equivalent functionality to Exchange Online Protection and ATP to control mail delivery during the transition in a similar way to the SMTP re-write example. The configuration can remain straightforward as email is transitioned en masse for the domain and does not need to re-write outbound email.
With this approach the facility to halt mail delivery whilst you perform the move of the custom domain will be helpful, as it is crucial – even with this approach – to manage the period of time where the custom domain does not exist in either Office 365 tenant.
- Provide a retry-later service – If a third-party gateway is not an option, implementing a temporary retry-later SMTP service may be considered. It works in a similar way to using an invalid domain name for an MX record but uses a tried and tested approach for gracefully requesting sending servers to retry later. The SMTP protocol defines reply code 451, one of the 4xx transient failure codes, which in effect tells the sending server “Please try later”. This method is not uncommon, and is widely used in anti-spam solutions to implement greylisting, but in a similar way to using an invalid domain it too has potential disadvantages. Crucially though, a record of attempted delivery will be available should it be required – and email security will not be compromised by using a third-party gateway designed for transition only that doesn’t match up to Exchange Online Protection and ATP.
Both of the above solutions require some configuration to utilize but they mitigate either wholly or partly against the risk of lost email during a cutover transition.
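As an illustration of the retry-later idea, the sketch below is a deliberately minimal holding service, not a production mail server: it accepts connections, logs each command it receives, and answers every recipient with a 451 reply so that compliant senders queue the message and try again. It uses only the Python standard library; the host name, port and reply text are assumptions chosen for the example.

```python
import logging
import socketserver

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("retry-later")

HOSTNAME = "retry.example.invalid"  # hypothetical name announced to senders
TRY_LATER = "451 Temporarily unavailable, please try again later"


def reply_for(line: str) -> str:
    """Map one SMTP command line to the reply this holding service sends."""
    verb = line.strip().upper()
    if verb.startswith(("HELO", "EHLO")):
        return f"250 {HOSTNAME}"
    if verb.startswith("MAIL FROM:"):
        return "250 OK"
    if verb.startswith(("RCPT TO:", "DATA")):
        # Transient failure: a compliant sender keeps the message and retries.
        return TRY_LATER
    if verb.startswith(("RSET", "NOOP")):
        return "250 OK"
    if verb.startswith("QUIT"):
        return "221 Bye"
    return "502 Command not implemented"


class RetryLaterHandler(socketserver.StreamRequestHandler):
    """Handle one SMTP connection, logging every command for later review."""

    def handle(self) -> None:
        self.wfile.write(f"220 {HOSTNAME} ESMTP\r\n".encode("ascii"))
        for raw in self.rfile:
            line = raw.decode("ascii", errors="replace")
            # This log is the record of attempted deliveries.
            log.info("%s %s", self.client_address[0], line.strip())
            reply = reply_for(line)
            self.wfile.write((reply + "\r\n").encode("ascii"))
            if reply.startswith("221"):
                break


def serve(host: str = "0.0.0.0", port: int = 2525) -> None:
    """Run the holding service until interrupted."""
    with socketserver.ThreadingTCPServer((host, port), RetryLaterHandler) as server:
        server.serve_forever()
```

Calling `serve()` starts the listener. The log of `MAIL FROM:` and `RCPT TO:` lines is what provides the record of attempted delivery, which the invalid MX approach cannot offer.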
In this article, we’ve explored the key scenarios organizations face when migrating email between two Office 365 tenants and the main challenges associated with these. As you can see, there’s not yet a one-size-fits-all solution – and as adoption of Office 365 services increases, the solutions will remain a moving target. Hopefully, though, you’ll be better armed with the possible solutions should you need them.
This blog was guest written by Steve Goodman. Steve works at Content and Code as Principal Technology Strategist, spending half his time helping customers understand how best to utilize Office 365, and the other half of his time hands-on with some of the more complex problems associated with migrating to the cloud or to newer versions of Exchange Server. Outside of his day job, you’ll find Steve talking about Exchange, Teams, and Office 365 at user groups, conferences, on various blogs, and podcasts.