9 - What are the latest data protection trends you’re seeing, such as AI, automation and machine learning?
“OK, so AI is obviously the hot topic everywhere in tech at the moment. It does have a role to play.
So in Purview, we’ve got things like trainable classifiers. These are machine learning algorithms that we train to recognise specific types of sensitive content that can come in different formats, such as invoices or CVs. Once we can recognise them, we can classify them, label them, and ensure they’re protected appropriately.
So how does the technology recognise sensitive content, such as credit card details, CVs and invoices?
Credit card details are actually much easier than things like CVs and invoices. A credit card number is very predictable. It’s typically 16 digits. It conforms to a checksum, the Luhn algorithm. You’ve usually got some supporting information near it.
It will usually come with an expiration date as well. So, simple pattern matching using something called ‘sensitive information types’ is usually enough for that.
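The pattern-matching approach described above can be sketched in a few lines of Python. This is only an illustration, not how Purview's sensitive information types are implemented: the names `CARD_PATTERN`, `EXPIRY_PATTERN`, `luhn_valid` and `find_card_numbers` are hypothetical, and the sketch assumes 16-digit numbers and an `MM/YY` expiry date within 50 characters of the match.

```python
import re

# Assumption: 16 digits, optionally grouped in fours with spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")
# Assumption: supporting evidence is an MM/YY expiry date near the number.
EXPIRY_PATTERN = re.compile(r"\b(0[1-9]|1[0-2])/(\d{2})\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str, window: int = 50) -> list:
    """Find Luhn-valid card-like numbers and note whether an expiry date is nearby."""
    findings = []
    for match in CARD_PATTERN.finditer(text):
        candidate = match.group()
        if not luhn_valid(candidate):
            continue  # Right shape, wrong checksum: probably not a card number.
        start = max(0, match.start() - window)
        nearby = text[start:match.end() + window]
        findings.append({
            "number": re.sub(r"[ -]", "", candidate),
            "has_expiry_nearby": bool(EXPIRY_PATTERN.search(nearby)),
        })
    return findings
```

For example, `find_card_numbers("Card: 4111 1111 1111 1111, exp 09/27")` reports the well-known test number with an expiry date nearby, while a 16-digit string that fails the checksum is ignored. Requiring both the checksum and nearby supporting information is what keeps false positives down.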
Things like CVs, invoices and contracts are actually more challenging because you haven’t got those same predictable patterns. They come in all kinds of different formats.
If you think about the number of CVs you’ve seen, and the different ways they’re presented and laid out, that needs a more flexible approach. That’s where the machine learning approach comes in: you’re basically showing it several hundred examples of a CV so it learns to recognise CVs in different formats, and you’re using that to recognise and classify that content.
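The learn-from-examples idea can be illustrated with a toy text classifier. This is a minimal sketch, not the model behind Purview's trainable classifiers: it assumes a simple naive Bayes word-count approach, uses four made-up training snippets instead of several hundred real documents, and the names `NaiveBayesClassifier`, `tokenize` and `TRAINING_EXAMPLES` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Toy multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of examples
        self.vocabulary = set()

    def train(self, examples):
        """Learn word frequencies from (text, label) pairs."""
        for text, label in examples:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocabulary.update(tokens)

    def predict(self, text: str):
        """Return the label whose examples best explain the words in `text`."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Start from how common the label is, then add evidence per word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up snippets standing in for the hundreds of real examples.
TRAINING_EXAMPLES = [
    ("Curriculum vitae: work experience, education, skills, references", "cv"),
    ("Professional profile, employment history, qualifications, skills", "cv"),
    ("Invoice number 1001, amount due, payment terms, VAT total", "invoice"),
    ("Tax invoice: bill to, unit price, quantity, total amount due", "invoice"),
]

classifier = NaiveBayesClassifier()
classifier.train(TRAINING_EXAMPLES)
```

With this training set, `classifier.predict("Education and employment history with key skills")` returns `"cv"`, even though that exact wording never appeared in the examples. That is the flexibility fixed patterns lack: the model weighs the vocabulary it has seen rather than matching a layout.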
Can Microsoft Copilot help with data governance challenges?
OK, so I wouldn’t say Copilot can help with data governance. I think data governance can help with Copilot. So if you’re rolling out a generative AI tool like Copilot for Microsoft 365, it’s drawing on all the data you’ve got access to from across Microsoft 365, and anything else that you’ve connected to it.
You’re going to get better responses from Copilot if the data you’re storing is accurate, not duplicated, and up to date. So for example, if I ask Copilot a question about annual leave entitlements, but in SharePoint I’m storing four different versions of our annual leave policy, Copilot’s not going to know what to do with that. Chances are we’re going to get some bad advice.
So getting data governance right definitely plays a role in getting ready to make the most of tools like Copilot.”