Advanced Deployment – Reduced upgrade time from 50 to 5 hrs

A Tableau Server backup can take a very long time. Mine takes 20+ hrs, even after putting the File Store on the initial node and archiving content very aggressively. Part of the reason, of course, is that Tableau does not offer incremental backup.

There are two problems with a long backup. One is that your Disaster Recovery (DR) data lags far behind Prod data. The other is that the upgrade process takes a long time, likely the whole weekend.

This post describes how to reduce upgrade time from about 50 hrs to 5 hrs.

  • A large Tableau Server upgrade used to take 50 hrs pre-TSM
  • The same upgrade takes about 30 hrs with TSM (v2018.2 or newer)
  • The same upgrade can be done within 5 hrs with TSM

1. The upgrade used to take 50 hrs:

[Figure: upgrade1]

2. Thanks to TSM (v2018.2 or newer), the new server version can be installed while the current version is still running. This cuts upgrade time almost in half for upgrades from v2018.2 or above to a newer version. Here is what it looks like:

[Figure: upgrade6]

3. When I looked at the timeline above, I asked myself: is it possible to skip the cold backup so that the upgrade can be done within 5 hrs? It is, with two options:

  • Option 1: Assume a hot backup has already been taken. If the cold backup is skipped and the upgrade goes south, you have to restore from that hot backup. That means you lose about 20 hrs of server changes: anything changed after the hot backup started is gone, and you have no way of knowing what those changes were. Are IT and the business willing to take this risk? If yes, you are lucky and can just skip the cold backup.
  • Option 2: Most likely, many others can’t take that risk. An upgrade can fail, but IT has to be able to get the data, workbooks and permissions back for the business if it does. At a minimum, IT has to know what the changes were. Here is what we did, and I call it the BIG idea: track all the changes for a period of about 24 hrs:

[Figure: Upgrade4]

4. How do you skip the cold backup but still track changes?

  • You take two hot backups. Hot backup 1 is restored to DR, while hot backup 2 is saved but not restored (there is no time to restore it before the upgrade).
  • Skip the cold backup, then complete the upgrade within 5 hrs.
  • If the upgrade fails in such a way that a restore is needed to get the server back, you can restore from hot backup 2, which misses about 20 hrs of data (from point 1 to point 2). You then need to tell the impacted publishers about those changes, so they can manually redo the missing items from the list provided by the Server team:
        • Workbooks
        • Data Sources
        • Projects
        • Tasks
        • Subscriptions
        • Data-driven alerts
        • Permissions
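
To make the notification step more concrete, here is a small Python sketch that groups tracked changes by publisher, so each publisher gets a single to-do list. The record fields (`owner`, `type`, `name`) and the sample values are hypothetical. Use whatever your own change tracking actually produces.

```python
from collections import defaultdict


def group_changes_by_publisher(changes):
    """Map each publisher to the (object type, object name) pairs they must redo."""
    todo = defaultdict(list)
    for change in changes:
        todo[change["owner"]].append((change["type"], change["name"]))
    return dict(todo)


# Hypothetical change records for the window that a restore would lose
changes = [
    {"owner": "alice", "type": "Workbook", "name": "Sales Overview"},
    {"owner": "bob", "type": "Data Source", "name": "Orders Extract"},
    {"owner": "alice", "type": "Subscription", "name": "Weekly Sales Mail"},
]

for publisher, items in group_changes_by_publisher(changes).items():
    print(publisher, items)
```

Each publisher then only sees the items they own, in the order the changes were recorded.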

5. The net result is that the server upgrade is done within 5 hrs. Wow! That is huge! If things go south, the IT server team has all the changes tracked, which works much like an incremental backup. The difference is that the business publishers will most likely have to redo those changes themselves.

[Figure: Upgrade5]

6. How to track the changes for the following objects?

  • Workbooks
  • Data Sources
  • Projects
  • Tasks
  • Subscriptions
  • Data-driven alerts
  • Permissions

Just query your PostgreSQL database directly. You can easily get all of them from one point in time to another, since all of those objects have timestamps. The exception is Permissions, which is very tricky.
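
As a rough illustration of that kind of query, here is a minimal Python sketch. It assumes the repository tables are named `workbooks`, `datasources`, `projects`, `tasks`, `subscriptions` and `data_alerts`, and that each has `created_at` and `updated_at` columns. Check these names against your own Tableau version before relying on them. The helper names and the `placeholder` parameter are made up for this example. Any Python DB-API connection to the repository should work, for example one opened with psycopg2.

```python
# Sketch only: the table and column names are assumptions about the
# Tableau repository schema; verify them for your server version.
TRACKED_TABLES = (
    "workbooks",
    "datasources",
    "projects",
    "tasks",
    "subscriptions",
    "data_alerts",
)


def build_change_query(table, placeholder="%s"):
    """Return SQL selecting rows created or updated inside a time window."""
    if table not in TRACKED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    p = placeholder
    return (
        f"SELECT * FROM {table} "
        f"WHERE (created_at > {p} AND created_at <= {p}) "
        f"OR (updated_at > {p} AND updated_at <= {p})"
    )


def fetch_changes(conn, start, end, placeholder="%s"):
    """Collect changed rows per table between start (point 1) and end (point 2)."""
    changes = {}
    cur = conn.cursor()
    for table in TRACKED_TABLES:
        # The window bounds are bound as parameters, never formatted into the SQL.
        cur.execute(build_change_query(table, placeholder), (start, end, start, end))
        changes[table] = cur.fetchall()
    cur.close()
    return changes
```

With psycopg2 the default `%s` placeholder applies. Other drivers may need a different one. Note that rows deleted during the window no longer exist in these tables, so a query like this cannot list them.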

Like it or not, Tableau’s permission tables do not have timestamps! I have personally given this feedback to the Tableau Dev team, but it is what it is.

You can find a Tableau permission workbook at https://community.tableau.com/message/940284. One option is to run it twice and diff the results.
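
One way to do that diff, sketched in Python: assume each run of the permission workbook is exported to CSV with identical columns. The column names and sample rows below are made up. Each row is treated as a tuple, and the two runs are compared as sets.

```python
import csv
import io


def load_rows(csv_text):
    """Parse CSV text into a set of row tuples, skipping the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # drop the header
    return {tuple(row) for row in reader if row}


def diff_permissions(before_csv, after_csv):
    """Return (added, removed) permission rows between two snapshots."""
    before = load_rows(before_csv)
    after = load_rows(after_csv)
    return sorted(after - before), sorted(before - after)


# Hypothetical export columns: project, grantee, capability, permission
snapshot_1 = """project,grantee,capability,permission
Sales,alice,View,Allow
Sales,bob,View,Allow
"""
snapshot_2 = """project,grantee,capability,permission
Sales,alice,View,Allow
Sales,bob,View,Deny
Finance,carol,View,Allow
"""

added, removed = diff_permissions(snapshot_1, snapshot_2)
print("added:", added)
print("removed:", removed)
```

A changed permission shows up as one removed row plus one added row, which is usually enough to tell the publisher what to redo.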

Recap: both business and IT are extremely happy with the 5 hr upgrade process, which used to take 50 hrs, or at least 30 hrs.
