
Your backups have been collecting forensic evidence this whole time. Those "humble" audit and sign-in logs from Entra ID are actually detailed behavioral fingerprints for every user, and they show when someone deviates from their normal pattern: different country, weird hours, new device... you name it.
 

 

Instead of letting that data collect digital dust, you can build a surprisingly effective anomaly detection system from what you already have. Here is roughly how this all works.

 

Step 1: Get Your Data Ready

First, you'll need to dig into those sign-in and audit logs from your Veeam backups. Microsoft's log format is deeply nested - it's layered like an onion and about as fun to work with. Flatten that mess into something actually usable and drop it into MongoDB (or any other data store you prefer).
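As a rough illustration of the flattening step, here is a minimal sketch in Python. The record below is hypothetical: the field names are modeled on what a sign-in entry typically looks like, and the exact shape of the logs inside a backup may differ. Writing the result to MongoDB is left out; any store that accepts flat documents would do.

```python
from typing import Any


def flatten(record: dict[str, Any], parent: str = "", sep: str = ".") -> dict[str, Any]:
    """Collapse nested dicts into a single level, joining keys with `sep`."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects such as location or deviceDetail
            flat.update(flatten(value, name, sep))
        else:
            # Scalars and lists are kept as they are
            flat[name] = value
    return flat


# Hypothetical sign-in record, for illustration only
raw = {
    "userPrincipalName": "sarah@example.com",
    "createdDateTime": "2024-05-01T09:02:11",
    "location": {"city": "Chicago", "countryOrRegion": "US"},
    "deviceDetail": {"operatingSystem": "Windows", "browser": "Edge"},
    "status": {"errorCode": 0},
}

doc = flatten(raw)
```

After this, `doc` has keys like `location.city` and `status.errorCode` instead of nested objects, which is much easier to query and to turn into model features.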

Step 2: Teach the System What Normal Looks Like

This is where machine learning earns its keep. Feed some historical data (I chose 60 days) to the system and let it figure out each user's habits. Does Sarah always log in from Chicago around 9 AM? Does Mike never use his phone? The system builds these behavioral fingerprints, and we spot the outliers using isolation forests.
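To make the "behavioral fingerprint" idea a bit more concrete, here is a small sketch of how per-user habits could be turned into numbers a model can work with. The event fields (`user`, `time`, `country`, `device`) and the three features are assumptions chosen for illustration, not the feature set actually used.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def build_baselines(history):
    """Collect the countries, devices and login hours seen for each user."""
    baselines = defaultdict(lambda: {"countries": set(), "devices": set(), "hours": []})
    for event in history:
        profile = baselines[event["user"]]
        profile["countries"].add(event["country"])
        profile["devices"].add(event["device"])
        profile["hours"].append(datetime.fromisoformat(event["time"]).hour)
    return dict(baselines)


def to_features(event, baselines):
    """Return [hours away from usual login time, new country?, new device?]."""
    profile = baselines.get(event["user"])
    if profile is None:
        # Unknown user: no baseline yet, so country and device count as new
        return [0.0, 1.0, 1.0]
    hour = datetime.fromisoformat(event["time"]).hour
    gap = abs(hour - median(profile["hours"]))
    hour_gap = min(gap, 24 - gap)  # wrap around midnight
    return [
        float(hour_gap),
        float(event["country"] not in profile["countries"]),
        float(event["device"] not in profile["devices"]),
    ]


# Hypothetical history for one user
history = [
    {"user": "sarah", "time": "2024-05-01T09:02:00", "country": "US", "device": "laptop"},
    {"user": "sarah", "time": "2024-05-02T09:10:00", "country": "US", "device": "laptop"},
    {"user": "sarah", "time": "2024-05-03T08:55:00", "country": "US", "device": "laptop"},
]
baselines = build_baselines(history)

usual = {"user": "sarah", "time": "2024-05-04T09:05:00", "country": "US", "device": "laptop"}
odd = {"user": "sarah", "time": "2024-05-04T03:00:00", "country": "BY", "device": "phone"}
usual_features = to_features(usual, baselines)
odd_features = to_features(odd, baselines)
```

The usual login comes out as all zeros, while the 3 AM login from a new country on a new device stands out on every feature. Vectors like these are what an outlier detector would be trained on.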
 

Imagine you have a big pile of socks. Most are common colors like black or white, but there are a few really unique, brightly patterned ones. The Isolation Forest doesn’t try to identify every single sock. Instead, it focuses on finding the “odd ones out.” It does this by randomly picking a feature (like color or pattern) and then a random value to divide the socks. The unusual socks are usually easier to separate because there are fewer of them and they look different from the typical ones. The model essentially makes many simple “decisions” and counts how many it takes to isolate each data point. If a login event gets separated after only a few splits, it’s considered out of the ordinary. The great thing is that this technique is very efficient, which makes it well suited to processing thousands of sign-in log entries.
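The sock analogy can be sketched in a few lines of plain Python. This is a toy version of the isolation idea only, with made-up data: pick a random feature, pick a random split value, keep the side the point falls on, and count the splits until the point is alone. Production libraries (scikit-learn's IsolationForest, for example) add subsampling and score normalization on top of this, and the post does not say which implementation was used.

```python
import random


def path_length(point, data, rng, depth=0, max_depth=10):
    """Count the random splits needed to isolate `point` from `data`."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    # Only features that still vary in the remaining data can be split on
    splittable = [
        f for f in range(len(point))
        if min(row[f] for row in data) < max(row[f] for row in data)
    ]
    if not splittable:
        return depth  # remaining rows are identical, nothing left to separate
    feature = rng.choice(splittable)
    low = min(row[feature] for row in data)
    high = max(row[feature] for row in data)
    split = rng.uniform(low, high)
    # Keep only the rows that land on the same side of the split as the point
    same_side = [
        row for row in data
        if (row[feature] < split) == (point[feature] < split)
    ]
    return path_length(point, same_side, rng, depth + 1, max_depth)


def average_path_length(point, data, trees=200, seed=0):
    """Average the path length over many random trees (lower = more unusual)."""
    rng = random.Random(seed)
    return sum(path_length(point, data, rng) for _ in range(trees)) / trees


# Toy data: [login hour, new country flag]; the last row is the odd sock
logins = [[9, 0], [9, 0], [8, 0], [10, 0], [9, 0], [8, 0], [10, 0], [9, 0], [3, 1]]

normal_score = average_path_length([9, 0], logins)
outlier_score = average_path_length([3, 1], logins)
```

The 3 AM login from a new country needs noticeably fewer splits on average than the ordinary 9 AM login, and that gap is the signal the model uses to flag it.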
 

 

Step 3: Watch for the Weird Stuff

Now the system is ready to receive new logs from our backups. We run them through the same flattening process and then use the trained model to detect outliers. When someone logs in from Belarus at 3 AM (and they've never left Ohio), or when there are 50 failed login attempts in 10 minutes, the system flags these as anomalies. Each one gets a severity score so you know what to tackle first.
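Not every check needs a trained model. The "50 failed login attempts in 10 minutes" case can be caught with a simple sliding window, sketched below. The event shape (`user`, `time`, `success`) is an assumption for illustration, and the threshold and window are simply the numbers from the example above.

```python
from collections import deque
from datetime import datetime, timedelta


def failed_login_bursts(events, threshold=50, window=timedelta(minutes=10)):
    """Return (user, time, count) for each failure that brings a user to
    `threshold` or more failed logins within `window`."""
    recent = {}  # user -> timestamps of recent failures
    alerts = []
    for event in sorted(events, key=lambda e: e["time"]):
        if event["success"]:
            continue
        failures = recent.setdefault(event["user"], deque())
        failures.append(event["time"])
        # Drop failures that have fallen out of the time window
        while event["time"] - failures[0] > window:
            failures.popleft()
        if len(failures) >= threshold:
            alerts.append((event["user"], event["time"], len(failures)))
    return alerts


# Hypothetical burst: 50 failures, one every 10 seconds, starting at 3 AM
start = datetime(2024, 5, 1, 3, 0, 0)
events = [
    {"user": "mike", "time": start + timedelta(seconds=10 * i), "success": False}
    for i in range(50)
]
alerts = failed_login_bursts(events)
```

Here the 50th failure lands well inside the 10-minute window, so it produces a single alert for that user. A rule like this can run next to the model-based scoring, and both can feed the same list of findings.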

Step 4: Make It Visual

Raw alerts are useless if nobody can understand them. Tools like Metabase turn your findings into dashboards easily. 

The beauty of this whole approach? You're not buying expensive new tools or collecting more data. You're just being smarter about the backup data that's already there, turning what most people ignore into an early warning system!

Interesting concept, Ben. Great to see you can use existing data and backups. Watching the video now, which is great. 👍


Thanks for the overview video explaining this Ben. A couple questions → you said you created a blog post going into more detail (what you used to script with...what dashboard, etc), but you didn’t place the link to the blog here. Do you mind sharing?

BTW, yes...Columbus OH is a city in Ohio (I don’t live there anymore, but I grew up in & am from Ohio 😉)

This is a really interesting use of Veeam logging. Could this somehow be a ‘Veeam fling’ or incorporated into VONE in a future release? 🤔

Thanks for sharing!


Thanks ​@Chris.Childerhose and ​@coolsport00 

No code released yet, still TBC on that. I love the fling idea though. There is a slightly more technical writeup with a bit more data on the blog. I deliberately didn't link it here as we like the content here in the community, but you know where to look :)


Yep..understood 😉 Thanks Ben!


A good security feature, and also helpful for audits with stringent security requirements.

