What is the Microsoft 365 Substrate?

[Image: Microsoft 365 running on Windows, viewed through a magnifying glass. Credit: IB Photography/Adobe Stock]

Microsoft 365 has come a long way from the first Office 365 tenants to today’s massive cloud service. Much of the work behind this transition lies under the surface, in a set of shared services and storage referred to as the Microsoft 365 Substrate. You may also see older references to the Office 365 Substrate.

Put in place to provide a common set of services and storage, the Substrate is most obvious as the common search provider for Microsoft 365 and the framework used to host its data governance features. Building on the flexible data storage tooling used by Exchange, it’s perhaps best thought of as a common NoSQL storage layer for Microsoft 365 that hosts local data and provides digital twins of external data.

Bringing everything into a common data layer that’s a foundation for Microsoft 365 services makes sense for Microsoft. That role explains the name; in semiconductor fabs, the substrate is the layer that chips are built on top of. In Microsoft 365, the Substrate may not be that powerful on its own, but without it, nothing else could work.

A common data layer doesn’t mean there’s only one store for your data. In Azure, Microsoft has been working on a common query framework across data lakes, where relational and non-relational data, structured and unstructured, stays in its own format rather than being slowly translated into a lowest-common-denominator service.

The hidden Microsoft 365 service

Oddly for a service as important as the Microsoft 365 Substrate, there’s little documentation. You’ll find it mentioned in the Microsoft 365 roadmap, but that’s about it. Then again, this is a foundational technology with no user-facing functions, so perhaps it’s best there’s no possibly confusing documentation, just outputs we can use to help manage our Microsoft 365 instances.

Hidden away underneath the familiar Exchange and SharePoint services, the Microsoft 365 Substrate behaves like the productivity software version of a data lake. It ensures content is stored, if not in its original form then as a digital twin of the original, in familiar formats that can be accessed through well-known application programming interfaces. Newer services can take advantage of cloud-native Azure tooling for scale and global reach, with their data still accessible from within Microsoft 365’s SharePoint and Exchange services.

Here, the Substrate is the mix of technologies needed to manage those stores, providing an intelligent bridge between the new and the familiar. There’s a long-term aim to bring all Microsoft 365 data into a common storage layer that underpins the entire platform, much like Power Platform’s Dataverse. That’s a complex task and one that will take time, most likely building on the Extensible Storage Engine used in Exchange and other Office servers.

What’s important about the Microsoft 365 Substrate is what it does, not how it works. Microsoft will talk about it and its plans for the service, but the actual mechanics aren’t that important. All you need to know is that it’s there and that it works. Then you can use its output to manage your users and their data.

The Microsoft 365 Substrate and compliance

One of the more important roles for the Microsoft 365 Substrate is supporting compliance tasks. It brings different services into a common search and index layer, using Exchange mailboxes as the store.

For example, Microsoft Teams chats are built on top of a Cosmos DB service, which provides a consistent, global, near real-time way for Teams to render chats and channels. That’s all very well for Teams’ internal operations, as it’s a cloud-native service.

But what if you need to search those channels for e-discovery purposes, or put a legal hold on the conversations that have taken place? This is where the Microsoft 365 Substrate has a role: it copies all of a user’s messages into a mailbox, with all channel messages going into a group mailbox. Files from channels go into SharePoint and OneDrive.
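
The routing described above can be pictured as a simple lookup from content type to compliance store. The following is a minimal sketch only: the names `ContentKind`, `TeamsItem` and `compliance_location`, and the location strings they return, are hypothetical and are not part of any Microsoft API.

```python
from dataclasses import dataclass
from enum import Enum


class ContentKind(Enum):
    """Hypothetical categories of Teams content, for illustration only."""
    CHAT_MESSAGE = "chat_message"        # a user's chat message
    CHANNEL_MESSAGE = "channel_message"  # a message posted in a channel
    CHANNEL_FILE = "channel_file"        # a file shared in a channel


@dataclass(frozen=True)
class TeamsItem:
    kind: ContentKind
    owner: str  # assumption: a user name for chats, a team name for channel content


def compliance_location(item: TeamsItem) -> str:
    """Return the store that would hold the searchable copy of an item."""
    if item.kind is ContentKind.CHAT_MESSAGE:
        return f"user-mailbox:{item.owner}"
    if item.kind is ContentKind.CHANNEL_MESSAGE:
        return f"group-mailbox:{item.owner}"
    if item.kind is ContentKind.CHANNEL_FILE:
        return f"sharepoint-or-onedrive:{item.owner}"
    raise ValueError(f"unknown content kind: {item.kind}")
```

The point of the sketch is that an administrator only needs the output of this kind of mapping, the mailbox or site to target, and never the original Cosmos DB data.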

Compliance applications developed for use with SharePoint and Exchange can then be used with Teams, working with the mailbox copies rather than the original Cosmos DB data. Data in mailboxes can now be managed by the e-discovery tools in Exchange, locking down the copies when a legal hold is applied. You can then apply rules to that data — for example, deleting messages after a set amount of time to ensure sensitive data is controlled.
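
To make the interaction between holds and retention rules concrete, here is a small illustrative model of a mailbox holding copied messages. It assumes, as a simplification, that a legal hold blocks all deletion; the `Mailbox` and `MailboxCopy` classes and the `apply_retention` method are invented for this sketch and do not reflect how Exchange implements either feature.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class MailboxCopy:
    """A copy of a Teams message stored in a mailbox (hypothetical model)."""
    message_id: str
    created: datetime


@dataclass
class Mailbox:
    items: list[MailboxCopy] = field(default_factory=list)
    on_legal_hold: bool = False

    def apply_retention(self, max_age: timedelta, now: datetime) -> list[str]:
        """Delete copies older than max_age; return the ids that were deleted.

        Assumption for this sketch: a legal hold locks every copy, so
        nothing is deleted while the hold is in place.
        """
        if self.on_legal_hold:
            return []
        expired = [m for m in self.items if now - m.created > max_age]
        self.items = [m for m in self.items if now - m.created <= max_age]
        return [m.message_id for m in expired]
```

With a 30-day rule, a year-old copy survives for as long as `on_legal_hold` is set, and is removed on the next retention pass once the hold is lifted.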

Without the Microsoft 365 Substrate, this would have been hard to implement, requiring new features in both Teams and Cosmos DB. As Cosmos DB is a foundational technology for Azure, that could have been a complex process requiring significant engineering effort and adding overhead to a service that needs to be fast.

You don’t need to know that the Substrate is managing the data from one service, copying it into another. All you need to know are the locations of the Exchange mailboxes and SharePoint stores you need to target with holds and other e-discovery tools.

The Substrate extends this functionality to managing compliance at the application level. One tool based on it is Microsoft 365’s Information Barriers feature. This lets you ensure that teams that need to be segregated for compliance purposes, such as retail banking and investment banking staff, are unable to communicate directly. Information Barriers stops them from sharing documents or using Teams for conversations; it even stops users in one group from searching for users in another, building on the address book policy tools in Exchange.
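
The behavior of an information barrier can be sketched as a check on which segments two users belong to. This is an illustrative model under stated assumptions, not the Information Barriers API: the `BarrierPolicy` class, its methods, and the rule that unsegmented users are unrestricted are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BarrierPolicy:
    """Hypothetical model of segments and the pairs blocked from each other."""
    segment_of: dict[str, str]  # user name -> segment name
    blocked_pairs: set[frozenset[str]] = field(default_factory=set)

    def block(self, segment_a: str, segment_b: str) -> None:
        """Forbid communication between two segments, in both directions."""
        self.blocked_pairs.add(frozenset({segment_a, segment_b}))

    def can_communicate(self, user_a: str, user_b: str) -> bool:
        seg_a = self.segment_of.get(user_a)
        seg_b = self.segment_of.get(user_b)
        if seg_a is None or seg_b is None:
            # Assumption for this sketch: users outside any segment are unrestricted.
            return True
        return frozenset({seg_a, seg_b}) not in self.blocked_pairs

    def visible_users(self, searcher: str, candidates: list[str]) -> list[str]:
        """Filter a people-search result down to users the searcher may see."""
        return [u for u in candidates
                if u != searcher and self.can_communicate(searcher, u)]
```

Applying the same check to chat, document sharing and people search is what keeps the barrier consistent: a retail banker neither messages an investment banker nor finds them in a directory lookup.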

The future of the Microsoft 365 Substrate

Microsoft has talked about using the Microsoft 365 Substrate as a foundation for applying machine learning to the various communication channels we use, allowing relevant information to be surfaced by some future set of client tools, whether in Outlook, SharePoint, or the new Viva services. We’ll still call the tools we use Exchange and SharePoint, but they’ll be familiar APIs and tools layered over a next-generation common data layer.

Building a cloud service like Microsoft 365 from what were discrete servers is a long and complex process, and tools like the Substrate are essential to delivering a consistent and coherent environment. They also provide a basis for the future evolution of the service, removing silos between service data and supporting new machine learning-powered applications and services. There’s an interesting future here for a service that began as a way to move content from new services to familiar places: it offers a new architecture for the whole of Microsoft’s productivity platform.
