Hello @vmiss33, why not automate your testing process? Before automating, create a small pilot group of devices or applications and test against it. If those tests succeed, you can roll the patches out to the rest of the systems….
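A pilot-group rollout like that can be sketched in a few lines. This is only an illustration; the host names and the patch command are placeholders, not any real tool's API:

```python
# Sketch of a ring-based patch rollout: patch a small pilot group first,
# verify it succeeded, and only then continue to the remaining systems.
# Host names and the patch invocation are illustrative placeholders.

import subprocess

PILOT = ["test-vm-01", "test-vm-02"]
PRODUCTION = ["app-vm-01", "app-vm-02", "db-vm-01"]

def patch_host(host: str) -> bool:
    """Run the patch command on one host; return True on success."""
    # Placeholder: swap in your real remote-patch invocation here.
    result = subprocess.run(["echo", f"patching {host}"], capture_output=True)
    return result.returncode == 0

def roll_out(hosts: list[str]) -> list[str]:
    """Patch every host in the group; return the hosts that failed."""
    return [h for h in hosts if not patch_host(h)]

failed = roll_out(PILOT)
if failed:
    print(f"Pilot failed on {failed}; halting rollout.")
else:
    print("Pilot OK; continuing to production.")
    roll_out(PRODUCTION)
```

The point of the structure is the gate between the two groups: production is never touched until the pilot group comes back clean.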
Usually, if I need to patch a production system on the fly, I just pull a backup before patching it. Please let us know the application or systems in question so we can suggest solutions accurately!
To answer your question!
> So I ask you all - what makes patch testing so terrible?
Doing it manually! Patching thousands of VMs and servers one after the other is the terrible part. Beyond that, the process is smooth when prior testing has already been done on a pilot group.
Fully agree with @Iams3le
Manual testing works for one app, but not at scale.
I think most people don't even know where to start with documenting the use cases that need to be tested, and even then, they don't know how to automate AND report on those automated tests.
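For anyone wondering where to start, the "automate AND report" part can begin as something very small: a list of named checks and a loop that prints pass/fail. A sketch, with illustrative hosts and ports that you would replace with your own:

```python
# Minimal post-patch smoke tests with a pass/fail report.
# The hosts and ports below are assumptions for illustration only.

import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Each documented use case becomes one named check.
CHECKS = [
    ("web frontend answers", lambda: port_open("localhost", 80)),
    ("database listener up", lambda: port_open("localhost", 5432)),
]

def run_checks(checks) -> dict[str, bool]:
    """Run every check and collect the results for the report."""
    return {name: fn() for name, fn in checks}

results = run_checks(CHECKS)
for name, ok in results.items():
    print(f"{'PASS' if ok else 'FAIL'}: {name}")
```

Once checks are named like this, the reporting side is just a matter of shipping the `results` dict somewhere: email, a dashboard, a ticket.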
> Manual testing works for one app, but not at scale.
Great input as well, thank you!
Patching is a nightmare, OS or application….
Like you guys said before, a large number of VMs to patch is awful,
and blue screens or OS failures are always a possibility; even if you test it a few times, some servers won't like it at all!
And when patching applications, you must meet with or ask the application owner to test, and sometimes they don't have time…. Or don't want to test the app that isn't your responsibility, but you still worry about it and you take care of patching and testing.
Patching can be terrible for non-clustered applications. First, you will always need a downtime/maintenance window, and if the patching fails you can't easily get those applications back online. While a test environment is great and can easily be automated with Veeam, it will still cost a lot of time depending on how far you go: only testing your critical applications vs. testing every OS update. I'm glad that OS patches don't fall under my responsibility, because those take a huge amount of time and are error-prone, especially on Windows.
Meaningful testing is the hardest part. Software developers have struggled for years to get QA cycles to properly capture regressions, and with Microsoft simplifying its own testing process from a variety of actual hardware to templated VMs, testing feels squarely pushed onto the customer. If anyone is interested in how the Microsoft test cycle changed, the below is a good video by a former Microsoft employee:
In essence, good testing requires the involvement of the entire business, but historically the approach to patching has been disjointed.
The responsibility for installing patches sits squarely with IT, though sometimes other departments are the ones notified about a new release (in my experience this mainly happens in finance, when new regulations require updates). However, IT can't, and shouldn't, be the owner of testing.
Ensuring that core business processes continue to work after patching needs to sit with the departments that rely on the applications, but most businesses do not budget the staff overhead required to effectively run tasks like this in parallel. I've been fortunate enough to work at organisations where, during patching, one person would be "shadow processing" the same inputs and confirming they got the same results; once all the business-critical functions were validated, the patch could be applied to production. But for every company I see that takes this seriously, it feels like the next nine don't.
At a bare minimum, at my last job at a Veeam Cloud Connect service provider, I would deploy SureBackup. Our patch management tools had Windows updates scheduled to install on reboot, so seeing any SureBackup jobs fail was a good indication that a disaster was coming. But that still only catches core OS/driver updates.
A SQL service with no usable business functions is hardly any better than the service failing to start.
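That distinction, a service that starts versus a service that actually works, maps to shallow vs. deep health checks. A minimal sketch, using SQLite as a stand-in for a real SQL server and a hypothetical `orders` table:

```python
# Shallow check: the service answers. Deep check: a real business query
# succeeds. SQLite stands in for the real SQL server; the "orders" table
# is an illustrative assumption.

import sqlite3

def service_running(conn) -> bool:
    """Shallow check: can we reach the database at all?"""
    try:
        conn.execute("SELECT 1")
        return True
    except sqlite3.Error:
        return False

def business_function_ok(conn) -> bool:
    """Deep check: can we read the data the business relies on?"""
    try:
        conn.execute("SELECT COUNT(*) FROM orders").fetchone()
        return True
    except sqlite3.Error:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
print("service up:", service_running(conn))        # True
print("business OK:", business_function_ok(conn))  # True

conn.execute("DROP TABLE orders")  # simulate a patch breaking the schema
print("service up:", service_running(conn))        # still True
print("business OK:", business_function_ok(conn))  # now False
```

After the simulated breakage, the shallow check still passes while the deep check fails, which is exactly the gap a service-status-only monitor leaves open.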
> Or don’t want to test the app that isn’t your responsibility, but you still worry about it and you take care of patching and testing.
Definitely spot on! Before you even think of patching, you should be concerned only with your own area of responsibility. I no longer configure routers and switches, so why should I now be concerned with patching those devices…?
> and patching applications, you must meet or request the application owner to test it,
I agree with you, though, and will reaffirm that: if you have to request permission to patch, then good security and best practices aren't being followed here. System admins should be delegated this responsibility, the appropriate change request/management process must be followed, and the maintenance window communicated to the users of the applications so they are aware.
> Patching can be terrible for non-clustered applications.
Absolutely spot on!
> If anyone is interested in how the Microsoft test cycle changed, the below is a good video by a former Microsoft employee:
Just for our information - some of the information Barnacules Nerdgasm shared in his YouTube videos is no longer true! This has nothing to do with his knowledge; it's because of how Windows has transformed since then….
The real problem is that the likelihood of a patch breaking something is very high, and sadly 2021 has taught us that.
Obviously, I am referring to products outside of Veeam.
Maybe the rush to close major flaws has reduced the patch testing time for the devs who develop them, or developers' skill level has dropped by a lot, Microsoft first and foremost.