How to ensure Chain of Custody for email and archive migrations
Who has handled the evidence?
In the world of email and email archive migrations you’ll often see reference to Chain of Custody (CoC) as a requirement. So what exactly do we mean by that?
CoC is about traceability and being able to demonstrate that the source hasn’t been tampered with. It’s evidence that the wood you buy from your timber merchant comes from certified forests, or that drug trial specimens haven’t been contaminated or mixed up with others. Items in transit are locked down at every stage, with auditable records that will stand up to legal scrutiny.
More specifically, in our field, CoC is a “process used to maintain and document the chronological history of the handling, including the transfer of ownership, of any arbitrary digital file from its creation to a final state version” (US National Digital Stewardship Alliance). CoC tracks the movement of evidence through its collection, safeguarding, and analysis lifecycle by documenting each person who handled it, the date/time it was collected or transferred, and the purpose for the transfer. With email migrations, CoC records are vital for legal defense once your old source has been decommissioned.
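As a rough illustration of the record-keeping described above, here is a minimal Python sketch of a custody log that captures who handled an item, when, and for what purpose. The `CustodyEvent` class, its field names and the sample handlers are hypothetical, chosen for illustration only; they are not taken from any Quadrotech product.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class CustodyEvent:
    item_id: str    # hypothetical identifier of the message or file
    handler: str    # person or service that handled the item
    action: str     # e.g. "collected", "transferred", "ingested"
    purpose: str    # why the item was handled
    timestamp: str  # ISO 8601 timestamp in UTC


def record_event(log, item_id, handler, action, purpose, when=None):
    """Append one custody event to the log and return it."""
    when = when or datetime.now(timezone.utc)
    event = CustodyEvent(item_id, handler, action, purpose, when.isoformat())
    log.append(event)
    return event


def history(log, item_id):
    """Return the handling history of one item in chronological order."""
    events = [e for e in log if e.item_id == item_id]
    # ISO 8601 strings with the same UTC offset sort chronologically.
    return sorted(events, key=lambda e: e.timestamp)


log = []
record_event(log, "msg-001", "ingest-service", "ingested", "archive migration",
             datetime(2024, 1, 5, 9, 30, tzinfo=timezone.utc))
record_event(log, "msg-001", "export-service", "collected", "archive migration",
             datetime(2024, 1, 5, 9, 0, tzinfo=timezone.utc))

print(json.dumps([asdict(e) for e in history(log, "msg-001")], indent=2))
```

Events are immutable once written (`frozen=True`), which mirrors the idea that custody records are appended to, never edited after the fact.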
When you move email records, you need to be able to account for every single message that was stored in your source system and what’s happened to each one at every step of migration. If the message is not contained in the target destination, you need to be able to explain why not, and what happened to that message. CoC records are most likely to be required retrospectively, so there’s only one chance to get them right. That’s a good reason to use specialist tools to manage your migration project, rather than relying on manual methods.
CoC is an absolute ‘must have’ in any email or archive migration project. Your vendor should be able to show you the kinds of CoC records and reports generated by their products during the process. Here’s how we go about it at Quadrotech.
Chain of Custody for email archive migrations
Let’s assume you have a working archive in something like Enterprise Vault (EV), and you want to migrate it to Office 365. Our ArchiveShuttle tool was created with CoC in mind. We keep a digital fingerprint of every item, at every stage of the process. From extraction to ingestion you can interrogate the metadata in the database to ensure there’s been no interference during migration; files in the target destination are fully reconciled against the source.
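To make the idea of fingerprinting and reconciliation concrete, here is a minimal Python sketch. It assumes the fingerprint is a SHA-256 hash of each item's raw bytes; the article does not say how ArchiveShuttle computes its fingerprints, so the algorithm, the `fingerprint` and `reconcile` functions and the sample data are all illustrative assumptions.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of an item's raw bytes (assumed scheme)."""
    return hashlib.sha256(data).hexdigest()


def reconcile(source: dict, target: dict) -> dict:
    """Compare every source item against the target by fingerprint.

    Both arguments map an item ID to that item's raw bytes.
    """
    report = {"matched": [], "altered": [], "missing": []}
    for item_id, data in source.items():
        if item_id not in target:
            report["missing"].append(item_id)      # never arrived
        elif fingerprint(data) == fingerprint(target[item_id]):
            report["matched"].append(item_id)      # arrived unchanged
        else:
            report["altered"].append(item_id)      # arrived, but content differs
    return report


source = {
    "msg-001": b"Quarterly report",
    "msg-002": b"Lunch?",
    "msg-003": b"Contract v2",
}
target = {
    "msg-001": b"Quarterly report",
    "msg-002": b"Lunch!",
}

print(reconcile(source, target))
# {'matched': ['msg-001'], 'altered': ['msg-002'], 'missing': ['msg-003']}
```

Anything that lands in `altered` or `missing` is exactly what a CoC report would need to explain.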
Where ArchiveShuttle differs significantly from most comparable tools is that our proprietary Advanced Ingestion Protocol (AIP) streams the migrated archive into Exchange and Office 365 fully preserved. This makes CoC much easier to validate. A number of other vendors use protocols like Exchange Web Services (EWS) for ingestion, which break the data into smaller pieces (authors, recipients, dates and other properties). EWS can’t even preserve some information – like the date the item was created – so there’s a much greater risk of information being lost or corrupted.
Live mail and PST Chain of Custody
No matter what toolset you use, the migration of PST and live mail files poses greater difficulties for CoC. It’s unusual to be able to extract data without alteration. Native PST files are often corrupted to start with and require repair, de-duplication and so on before they are in a fit state to be migrated.
The extraction, rationalization and preparation process is automated in our PST FlightDeck and MailboxShuttle tools, with exceptions being flagged for manual intervention. The tools record how many items could not be fixed, but in practice these file fragments are generally lost to the world. All you can tell is that they existed on the source system once. We do provide reports on corrupt/unrecoverable items to customers if they want to attempt to rescue them themselves prior to decommissioning the source.
Full CoC kicks in once the original item has been repaired. At that stage you are effectively treating the exercise in a similar way to an archive migration, migrating the individual items into Exchange.
Our experience of Chain of Custody
We’re sometimes asked if there is a workable solution that addresses CoC for those unrecoverable PST and live mail files. Certainly, you could migrate the unresolved fragments, but they’d still be of no practical use. Because CoC can be demonstrated for usable items once they’ve been repaired – and you can demonstrate there were unusable or unrecoverable elements in the source data – your CoC compliance will stand up to the most rigorous scrutiny.
Other questions we get asked relate to archive migration and journal data in EV. Sometimes the source archive is itself corrupt or damaged, and you have to carry out repair work before migration. And yes, CoC is more complicated to monitor if you are consolidating many archives from different sources into a single repository – but then our digital fingerprinting applies at each stage, from each source, so everything is fully reconciled.
A CoC database can contain many millions of items. It’s your single point of truth for what happened to the source data, so it should be kept safe and ready for interrogation. Auditors may choose to test it by looking for specific messages in the report. Legal and compliance officers may wish to do a complete analysis to make sure every message has been migrated.
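The two kinds of interrogation described above, a spot check on a specific message and a complete analysis, can be sketched against a small SQLite database. The `custody` table, its columns and the status values are hypothetical and purely illustrative; a real CoC database will have its own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE custody (
        item_id     TEXT PRIMARY KEY,
        source_hash TEXT NOT NULL,
        target_hash TEXT,            -- NULL if the item never reached the target
        status      TEXT NOT NULL,   -- e.g. 'migrated' or 'unrecoverable'
        note        TEXT             -- explanation for anything not migrated
    )
    """
)
conn.executemany(
    "INSERT INTO custody VALUES (?, ?, ?, ?, ?)",
    [
        ("msg-001", "a1", "a1", "migrated", None),
        ("msg-002", "b2", "b2", "migrated", None),
        ("msg-003", "c3", None, "unrecoverable", "corrupt PST fragment"),
    ],
)


def lookup(item_id):
    """Auditor's spot check: what happened to one specific message?"""
    return conn.execute(
        "SELECT status, note FROM custody WHERE item_id = ?", (item_id,)
    ).fetchone()


def unaccounted():
    """Complete analysis: items that did not arrive intact and have no explanation."""
    rows = conn.execute(
        "SELECT item_id FROM custody "
        "WHERE (target_hash IS NULL OR target_hash <> source_hash) "
        "AND note IS NULL ORDER BY item_id"
    ).fetchall()
    return [row[0] for row in rows]


def summary():
    """Count items by final status."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM custody GROUP BY status"
    ).fetchall()
    return dict(rows)


print(lookup("msg-003"))
print(unaccounted())
print(summary())
```

An empty result from `unaccounted()` is the outcome you want: every source item either arrived with a matching fingerprint or has a recorded reason for its absence.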
Whatever migration you aim to achieve, Quadrotech’s toolset ensures your CoC will be robust, demonstrable and defensible.