What? You can run FileStore on network storage? Yes, it is doable, with fairly good performance. The benefit is DR data that is only about 2 hrs behind prod, where it used to be 50 hrs behind for a large, extract-heavy deployment.
Before you continue: the intent of this blog is NOT to teach you how to configure your Tableau Server to run on network storage, because that is not supported by Tableau yet. Instead, the intent is to share with you the possibility of an awesome new feature coming in the future…
The Problem Statement: When your server FileStore gets close to 1TB (it happens on large enterprise servers with extract-heavy deployments, even if you are doing aggressive archiving), the backup and the restore can each take 20 hrs. That means DR data is at least 50 hrs behind once file transfer time is included.
- The server upgrade can take the whole weekend
- Server users see 2-day-old data whenever user traffic is routed to DR (for example, during weekly maintenance)
The Solution: Configure FileStore on network storage so that all extract files can be snapshotted to DR much faster, leveraging the network storage’s built-in snapshot technology.
Impact: The DR data can be about 2 hrs behind prod instead of 50 hrs.
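To make the arithmetic behind these numbers explicit, here is a minimal Python sketch. It assumes backup, transfer and restore run one after another. The 10 hrs transfer time is my inference from the figures above (50 minus 20 minus 20), not a measured number, and the function and variable names are purely illustrative.

```python
def dr_lag_hours(backup_h: float, transfer_h: float, restore_h: float) -> float:
    """Return how far DR lags prod when the three steps run back to back."""
    return backup_h + transfer_h + restore_h


# Figures from this post: 20 hrs backup and 20 hrs restore for a ~1TB FileStore.
# ASSUMPTION: 10 hrs transfer, inferred from the "at least 50 hrs" total.
full_backup_lag = dr_lag_hours(backup_h=20, transfer_h=10, restore_h=20)

# With FileStore snapshotted by the storage layer, only the repository still
# goes through backup/send/restore: about 2 hrs in total, per this post.
REPO_ONLY_LAG_H = 2

improvement_factor = full_backup_lag / REPO_ONLY_LAG_H
print(f"full backup path: {full_backup_lag} hrs, "
      f"snapshot path: {REPO_ONLY_LAG_H} hrs, "
      f"about {improvement_factor:.0f}x fresher DR data")
```

Plug in your own timings; the point is only that the FileStore backup and restore dominate the lag, so taking them off the critical path is where the gain comes from.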
How does it work?
- After it is configured, the server works the same way as with FileStore on local disk. No user should notice any difference as long as you use Tier 0 (most expensive) SSD network storage (NetApp or EMC, for example).
- Server admins should see no difference either when using TSM or the Tableau Server admin views.
- Does it work for Windows or Linux? I am running it on Linux, after working with Tableau Dev for months. Tableau Dev may have an alpha config for Windows, but I don’t know.
- Can we run the repository on network storage as well? That was what we had initially, but it also means a single repository for the whole cluster, which poses additional risk. I am running the repository on local disk and have two repositories.
- Does it mean that you can’t have a 2nd FileStore in the cluster? You are right: single FileStore only on network storage. Is it risky? It carries risk, but it is common enterprise practice for many other large apps.
New process for backup and restore:
- The regular tsm maintenance backup handles both the repository and the FileStore extracts nicely together. Now we do not want to back up the FileStore anymore, so use the --pg-only option:
tsm maintenance backup --file <backup_file> --pg-only
- Unfortunately, when you use a pg-only backup on its own, the restore will fail because the repository and FileStore are not in a ‘stable’ state.
- What happens is that the repository and FileStore sync with each other constantly inside Tableau Server. For example, when a new extract is added to the FileStore, the handle of the extract has to be added to the repository. When an extract is deleted from the FileStore (old extract, user workbook deletion, etc.), the handle has to be deleted from the repository; otherwise Postgres will fail to start during integrity checks after the restore.
- One critical step is to stop the sync job between the repository and FileStore before the backup happens, so that both are in a ‘stable’ state and can be sent to DR separately.
- Of course, after the backup is done, restart the repository and FileStore sync jobs so the sync can catch up.
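The order of operations above (stop sync, pg-only backup, restart sync) can be sketched in Python as follows. This is only an illustration under assumptions: this post does not name the commands that stop and restart the sync jobs, so they are passed in as hypothetical callables (`stop_sync`, `start_sync`), and the only real command used is the `tsm maintenance backup` line shown above.

```python
import subprocess
from typing import Callable, List, Sequence


def build_pg_only_backup_cmd(backup_file: str) -> List[str]:
    """Build the repository-only backup command quoted in this post."""
    return ["tsm", "maintenance", "backup", "--file", backup_file, "--pg-only"]


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def backup_repository(
    backup_file: str,
    stop_sync: Callable[[], None],    # HYPOTHETICAL hook: pause repository/FileStore sync
    start_sync: Callable[[], None],   # HYPOTHETICAL hook: resume the sync afterwards
    run: Callable[[Sequence[str]], None] = run_checked,
) -> None:
    """Take a pg-only backup while the repository/FileStore sync is paused."""
    stop_sync()
    try:
        run(build_pg_only_backup_cmd(backup_file))
    finally:
        # Restart the sync even if the backup fails, so it can catch up.
        start_sync()


# Dry run: record the steps instead of touching a real server.
steps: List[str] = []
backup_repository(
    "repo_backup",  # hypothetical backup file name
    stop_sync=lambda: steps.append("stop sync"),
    start_sync=lambda: steps.append("start sync"),
    run=lambda cmd: steps.append(" ".join(cmd)),
)
print(steps)
```

The `try`/`finally` is the design choice that matters here: whatever happens to the backup, the sync jobs are restarted, so a failed backup does not leave the repository and FileStore drifting apart.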
What does Tableau’s new 2019.3 Amazon RDS External Repository mean for this?
When Tableau’s repository can run on an Amazon RDS external database, it potentially means a future reduction in how far the repository on DR lags behind prod. Hopefully the 2 hrs needed to back up, send and restore the repository can be reduced to minutes. I have not tried this config yet.
Re-cap: You can run FileStore on network storage for much better DR, potentially going from 2 days behind prod to 2 hrs.