Introducing a Veeam Backup & Replication Log Anonymization Script

Forum|Forum|2 years ago
September 29, 2023

JMousqueton
Comes here often

Hello Veeam Community!

I'm excited to introduce a Python script that I've developed for anonymizing Veeam Backup & Replication logs. Protecting sensitive information in log files is crucial, and this script simplifies the process while maintaining the integrity of your logs.

Acknowledgments:

Before diving into the details, I'd like to express my gratitude:

Bertrand: Thank you for the original idea that inspired this script and for your valuable improvement suggestions. Your input was instrumental in making this script more robust and feature-rich.
Eric: A big thank you for your unwavering support and encouragement throughout the development process. Your feedback and insights helped shape this tool.

Disclaimer:

I want to clarify that I'm not a developer by profession, but rather a member of the Veeam community who saw the need for a tool like this. The script has been created out of a passion for data privacy and a desire to contribute to our community.

Key Features:

Anonymization: The script can anonymize sensitive information such as server names, user accounts, IP addresses, and more, helping you comply with data privacy regulations.
Mapping Table: It generates a mapping table of original and anonymized values, making it easy to trace back anonymized data when needed.
Extensible: The script is highly extensible, allowing you to add custom anonymization patterns or adapt it for other log formats.
Open Source: The script is open-source and available for the Veeam community to use and contribute to.
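
As an illustration only (this is not the script's actual code), the anonymization and mapping-table features above could be sketched as follows. The `PATTERNS` table, the `anonymize` function, and the `IP_1` / `USER_1` placeholder format are all assumptions made for this example; the patterns are deliberately simple and would miss or over-match values in real logs.

```python
import re

# Hypothetical, deliberately simple patterns; real logs need broader ones.
PATTERNS = {
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "USER": re.compile(r"\b[A-Za-z0-9_-]+\\[A-Za-z0-9._-]+\b"),  # DOMAIN\user
}


def anonymize(lines, mapping):
    """Return anonymized copies of lines; mapping is filled as original -> placeholder."""

    def replacer(kind):
        def replace(match):
            original = match.group(0)
            if original not in mapping:
                # Number placeholders per kind: IP_1, IP_2, USER_1, ...
                seen = sum(1 for v in mapping.values() if v.startswith(kind + "_"))
                mapping[original] = f"{kind}_{seen + 1}"
            return mapping[original]
        return replace

    anonymized = []
    for line in lines:
        for kind, pattern in PATTERNS.items():
            line = pattern.sub(replacer(kind), line)
        anonymized.append(line)
    return anonymized


mapping = {}
sample = ["Connecting to 10.0.0.5 as CORP\\jdoe", "Retrying 10.0.0.5"]
print(anonymize(sample, mapping))
print(mapping)  # the mapping table: original value -> placeholder
```

Because the same original value always maps to the same placeholder, the anonymized log stays internally consistent, and the mapping can be kept privately to trace values back.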

How to Get Started:

I've posted the script on GitHub, along with detailed documentation and usage instructions. You can find it here: https://github.com/JMousqueton/VeeamLogAnonymizer

Feedback and Contributions:

I welcome your feedback, suggestions, and contributions to make this script even better. Feel free to open issues on GitHub or submit pull requests.

I hope this script proves valuable to the Veeam community in maintaining data privacy and compliance. Give it a try, and let me know your thoughts!

Julien Mousqueton

Iams3le
September 29, 2023

Can’t wait to test this script! Great initiative @JMousqueton

BertrandFR
Influencer
September 29, 2023

A brilliant project @JMousqueton, thank you for taking the time to make it! I’m pretty sure it could be useful for companies or public institutions that can’t easily share logs with Support due to internal security restrictions.

I hope that in the future Veeam can offer it as a built-in feature, and for the DB too. It could really help some customers get better support and so, obviously, a better Veeam experience.

@Mildur @HannesK Do you need a feature request on the R&D forum? :)

coolsport00
Veeam Legend
September 29, 2023

Fantastic endeavor @JMousqueton !! And kudos to your fellow moral supporters 🙌🏻

Shane Williford - Veeam VMCA/VMCE | Veeam Legend | VUG Leader | VCP-DCV | Twitter: @coolsport00

Geoff Burke
Veeam Vanguard
September 29, 2023

This is golden! Thanks!

VMCA2024, VMCE2023, VMCE2024-SP,CKA, LFCS

HangTen416
Influencer
September 29, 2023

Excellent script! Confidentiality is so important! Thanks!

VMCE2024 | VMCA2022

Scott
Veeam Legend
September 29, 2023

Wow, I’m impressed. Great for so many uses. Now we can even post examples on forums/blogs and not have to worry about any sensitive data ending up in the examples.

I’ll have to test this out when I get a chance.

Chris.Childerhose
Veeam Legend, Veeam Vanguard
September 29, 2023

Very cool - security will love me now. 😂

Thanks for sharing this one.

JMousqueton
Author
Comes here often
September 30, 2023

> Can’t wait to test this script! Great initiative @JMousqueton

I'm thrilled to hear that you're eager to test the script! If you have any feedback or suggestions after testing it out, please don't hesitate to share. I'm always open to improving my script based on user experiences.

HannesK
Comes here often
October 2, 2023

@BertrandFR :

1. Putting an official feature request on the R&D forums is a good idea. Currently there is only an internal thread, where customers cannot add a “plus 1”.
2. We already track that feature request, and I will add you to it.
3. There is a support tool that does what you ask for. But it’s slow. So the main question is: how fast per GByte of log files is that script?

JMousqueton
Author
Comes here often
October 3, 2023

> 3. There is a support tool that does what you ask for. But it’s slow. So the main question is: how fast per GByte log files is that script?

Hi @HannesK

I’ve never heard of such a tool, either as a partner or as a customer. Where can we find it?

Unfortunately, as mentioned before, I’m not a developer, so I expect the script is slow and could be improved.

Moreover, to get the “dictionary” feature I have to do two passes, which also slows down the process.
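
For readers wondering why a dictionary can cost a second pass, here is a minimal sketch of one possible two-pass layout. It is an assumption about the general technique, not the script's real code: `IP_PATTERN`, `build_dictionary`, and `apply_dictionary` are hypothetical names, and only IP addresses are handled.

```python
import re

IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")  # illustrative pattern only


def build_dictionary(lines):
    """Pass 1: collect every distinct value and assign it a placeholder."""
    dictionary = {}
    for line in lines:
        for value in IP_PATTERN.findall(line):
            dictionary.setdefault(value, f"IP_{len(dictionary) + 1}")
    return dictionary


def apply_dictionary(lines, dictionary):
    """Pass 2: rewrite each line using the finished dictionary."""
    return [IP_PATTERN.sub(lambda m: dictionary[m.group(0)], line) for line in lines]


logs = ["a 10.0.0.1 b 10.0.0.2", "c 10.0.0.1"]
dictionary = build_dictionary(logs)
print(apply_dictionary(logs, dictionary))
```

In this layout every line is scanned twice, which is where the extra time goes. Building the dictionary while replacing would need only one scan, at the cost of not having the complete dictionary before the first line is written.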

I’ll run some benchmarks during the week so I can give you feedback based on the amount of logs I have in my lab.

Julien

HenriqueA
Comes here often
October 3, 2023

Great project, @JMousqueton. Sometimes I have trouble sharing logs with support when needed, due to internal security. I think this will be the best way to do it.
Thanks for sharing with us.

HannesK
Comes here often
October 4, 2023

@JMousqueton: it should be available from Support by opening a ticket and asking for it. I know it existed some years ago. I’m not 100% sure it was adapted for V12.

JMousqueton
Author
Comes here often
October 5, 2023

My benchmark:

I used my MacBook M1 on battery (on the plane) with the full set of options:

python3 VeeamLogAnonymizer.py -d ./log -o anonymizedlog -f -m -v -D

First batch of logs:

650 MB / 395 files: 22 minutes 9 seconds

Second batch of logs:

486 MB / 535 files: 11 minutes 2 seconds

HannesK
Comes here often
October 6, 2023

I guess that shows the challenge… log bundles are often 5GB - 50GB, I even heard of 200GB. I’m also not a developer, so I’m technically not able to improve the code to make it faster.
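
To put the benchmark figures above in perspective, here is a rough back-of-the-envelope extrapolation. It assumes the sizes are megabytes, that 1 GB = 1024 MB, and that run time scales linearly with size, which the two batches themselves suggest is only approximately true. The function names are made up for this example.

```python
def throughput_mb_per_s(size_mb, minutes, seconds):
    """Average processing rate for one benchmark batch."""
    return size_mb / (minutes * 60 + seconds)


def estimated_hours(size_gb, rate_mb_per_s):
    """Naive linear extrapolation of run time for a larger log bundle."""
    return size_gb * 1024 / rate_mb_per_s / 3600


# (size in MB, minutes, seconds) as reported in the benchmark post
batches = {"first": (650, 22, 9), "second": (486, 11, 2)}

for name, (size_mb, minutes, seconds) in batches.items():
    rate = throughput_mb_per_s(size_mb, minutes, seconds)
    print(f"{name} batch: {rate:.2f} MB/s, "
          f"5 GB in about {estimated_hours(5, rate):.1f} h, "
          f"50 GB in about {estimated_hours(50, rate):.1f} h")
```

The two batches work out to roughly 0.5 and 0.7 MB/s, so on that hardware a 5 GB bundle would take on the order of two to three hours.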

BertrandFR
Influencer
October 9, 2023

Hello,

@HannesK I’ve never heard of such a tool as a customer either, and I’m not sure Veeam sales are aware of it. For some public institutions it could be a prerequisite, and the result could be no deal.

I will open a support case during the week, for science.

For the FR on the R&D forum, do you need one FR per request? I mean, could I create one root FR for all of these, or should I open one each for:

Anonymization Logs
Anonymization DB VBR (Backup)
Anonymization DB Veeam ONE (Backup)
Anonymization DB EntMan (Backup)

I will probably have the same request for Kasten. Where can I post the FR?

For performance, it depends on many factors (compute, storage type/speed), but as discussed with @JMousqueton, the script could be improved. For 10GB of logs, the log exporter is already too slow for me, even on a large VBR server with full NVMe. Customers who need this kind of functionality are aware that it could take more time to export logs to Support. I think that’s the game :)

HannesK
Comes here often
October 9, 2023

> For the FR on the R&D forum, do you need one FR per request? I mean, could I create a root FR for all or

I would start with one thread in the VBR forums. A list of attributes that should be anonymized would be useful.

For the database, that seems to open a can of worms. If foreign keys are involved, that would cost huge amounts of development resources.
