Roll Back or Resolve Forward? How to Deal With Database Errors

Jeff Pell Big Data Solutions, CI/CD, Cloud Technology, DevOps

When you apply database changes, whether automated as part of a CICD process, or manually in the legacy solution, what do you do when an error occurs? Do you undo all the changes applied so you can analyze the issue or do you search for an immediate solution so you can finish all the changes?  

Both options have pros and cons, and your choice likely depends on the team dynamic, the manager’s preference, or perhaps to conform to a company policy. Let’s discuss both options.

Rollback

Performing a rollback requires removing all the changes of the current set that applied prior to experiencing the error. This returns the database to a known state—but it can be difficult to perform. 

To apply quickly, you must develop and test the rollback process prior to every release. 

Your rollback process may include executing data definition language (DDL) and data manipulation language (DML) that reverses all the changes applied. This may be straightforward when the change was adding a new column but more complex if you modified the data type of a column or modified the data itself or even removed a column. Developing and testing the complex changes may require a great deal of time and effort. Depending on the tools utilized, this type of rollback can be automated, including if a CICD process is in place.

Another possible rollback option is to restore the entire database to a point immediately prior to deployment of the changes. This option is relatively easy, but has the downside of requiring a complete outage for the end-user during updates or the acceptance of potentially lost data in the event of a rollback.

Resolve Forward

Resolving forward can only occur if your staff can immediately investigate, troubleshoot, and resolve the problem when it occurs. In the end, the release process is complete, and the new version is deployed.

The resolve forward process needs to be completed as quickly as possible. It requires identification of the problematic DDL or DML, analysis of the root cause of the error, development of the solution, and quick implementation. Additionally, all of this must happen within a very short time to avoid undue service interruptions.  

Which is Best?

Developing and testing the rollback process is a consistent time requirement for EVERY release. Every DDL and DML statement will require a matching statement that can undo the changes. Some of these changes will be complex, time-consuming to design, develop and test, and may still result in possible data loss.

Scrambling to find a solution when release errors occur hopefully happens infrequently. Will you save time and effort compared to the total time expended for developing the rollback process for every change? Perhaps. Will there be substantial stress during the resolve forward process? Most likely.

The final Choice is project-specific. Some teams will want to return to the last known “stable” state and then ensure the problem does not reoccur when the same release is attempted again.  Other teams will want to push forward so the new release is available and work on the next one can start. Which is best for you?

Blue Sentry has extensive experience in using industry leading tools to implement both practices. We can help you decide which option is best for you and assist you in the solution implementation. For more information contact us at 336.604.6991.