Automation – VizAlert for Slow Render Workbooks

Tableau Server performance management is everyone’s concern in a shared self-service environment, since nobody wants to be impacted by others. This is especially true when each business unit sets its own publishing criteria and the central IT team does not gate the publishing process.

How to protect the shared self-service environment? How to prevent one badly designed query from bringing all servers to their knees?

My previous blog, Governed Self-Service Analytics: Performance Management (7/10), talked about a few actions. This blog covers one technical implementation: how to send an automatic alert to the workbook owner when the average workbook render time exceeds a defined threshold. This helps because most workbook owners are willing to take action. My next blog will talk about enforcement when no action is taken over time.

What is the problem statement? 

  • Let workbook owners know their workbooks are a lot slower than many other workbooks on the same server

Is this necessary at all? Don’t they know already? 

  • Some workbook owners may know, but others may not. A workbook that once performed well can become much slower over time due to data changes
  • Some workbook owners know their workbook is slow but do not know how slow it is compared with other workbooks on the server
  • Other workbook owners know their workbook is slow but do not bother to improve it. If their users do not complain and the server admin team does not complain, they have no incentive to improve it

Why does the server admin team care about my slow views if my users are happy with them?

  • This is a valid argument. One time, I did research on my server: the 1% of views with render time > 30 seconds consumed about 20-30% of server CPU & memory resources!

“It used to take one day to get the answer outside Tableau, but now it takes only 5 mins on Tableau Server. I am happy about it!”

  • This is a valid argument too. My counter answer is that you are working in a shared server environment, and your slow view impacts others. If you had your own dedicated server, the admin would not care if your view took 5 mins or much longer.

Now tell me the auto alert solution: create a VizAlert that is sent automatically to the workbook owner about slow workbook views on a weekly basis. More details and tips for the VizAlert:

  1. Use the weekly average render time to avoid a lot of one-off outliers
  2. Define your alert threshold; I use 30 sec as the weekly average
  3. Include the # of weeks the slow view/workbook has been on the alert
  4. If a workbook is on the alert in week 1 and then drops off the alert the following week, that is great!
  5. However, if the workbook is on the alert in weeks 1, 2 and 3, a conversation may be needed, or you may want to trigger the enforcement covered in my next blog
  6. Be sure to include best practices for designing effective Tableau workbooks and a list of other internal resources.
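
The logic behind tips 1 to 3 can be sketched in a few lines of Python. This is only an illustration of the calculation, not the VizAlert itself: the render_log records, workbook names and helper functions are all made up for the example, and in practice the numbers would come from your own server's performance data.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

THRESHOLD_SECONDS = 30  # weekly average threshold from tip 2

def week_start(day):
    """Return the Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())

def weekly_averages(render_log):
    """Map (workbook, week_start) -> average render time in seconds."""
    buckets = defaultdict(list)
    for workbook, day, seconds in render_log:
        buckets[(workbook, week_start(day))].append(seconds)
    return {key: mean(values) for key, values in buckets.items()}

def weeks_on_alert(averages, workbook, current_week):
    """Count consecutive weeks, ending at `current_week`, above the threshold."""
    count = 0
    week = current_week
    while averages.get((workbook, week), 0) > THRESHOLD_SECONDS:
        count += 1
        week -= timedelta(days=7)
    return count

# Hypothetical sample data: (workbook name, render date, render seconds).
render_log = [
    ("Sales Overview", date(2024, 1, 1), 45.0),
    ("Sales Overview", date(2024, 1, 3), 35.0),
    ("Sales Overview", date(2024, 1, 8), 50.0),
    ("Sales Overview", date(2024, 1, 10), 40.0),
    ("Ops Dashboard", date(2024, 1, 8), 5.0),
    ("Ops Dashboard", date(2024, 1, 9), 95.0),  # one-off outlier
    ("Ops Dashboard", date(2024, 1, 10), 5.0),
    ("Ops Dashboard", date(2024, 1, 11), 5.0),
]

averages = weekly_averages(render_log)
current_week = date(2024, 1, 8)  # a Monday
for workbook in ("Sales Overview", "Ops Dashboard"):
    weeks = weeks_on_alert(averages, workbook, current_week)
    print(workbook, averages[(workbook, current_week)], weeks)
```

In this made-up data, the single 95-second render of "Ops Dashboard" does not push its weekly average over 30 seconds, while "Sales Overview" stays above the threshold two weeks in a row.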


Automation – Timeout long subscriptions and auto send email to workbook owner

Why are subscriptions not preferred on Tableau Server?

  1. Subscriptions send out emails at defined intervals, which is convenient for some users. However, Tableau’s strength is interactivity, and subscriptions are counter-interactive.
  2. There is nothing wrong with users getting Tableau views in their inbox. The problem is that server admins and workbook owners have no way to tell whether users open the emails at all. Over time, admins can’t tell if users are actually using Tableau Server or not.

What is the worst scenario for subscriptions?

I found that some ‘smart’ publishers misuse subscriptions: they know their workbook is too slow (like a 30-minute render) for users to open interactively, so they use subscriptions to email the views to users instead. Why is this bad?

  • It can take up a lot of backgrounder process time
  • It is simply bad behavior by publishers who should spend more time tuning their dashboards for better performance
  • It impacts other extract jobs. As a matter of fact, the backgrounder priority is subscription, incremental extract, full extract. Imagine that a 30-min subscription job to 20 people will potentially block a lot of small (2-3 mins) important extracts…
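
A rough back-of-the-envelope sketch of that last example, in Python. It assumes that each of the 20 subscribers triggers its own 30-minute render, which is a simplification and may not match how every server version schedules the work; the function names are made up for the illustration.

```python
def backgrounder_minutes(render_minutes, subscribers):
    """Total backgrounder time if every subscriber triggers its own render (an assumption)."""
    return render_minutes * subscribers

def extracts_displaced(busy_minutes, extract_minutes):
    """How many extracts of a given length would fit in the same backgrounder time."""
    return busy_minutes // extract_minutes

busy = backgrounder_minutes(render_minutes=30, subscribers=20)
print(busy)                         # total minutes of backgrounder time
print(extracts_displaced(busy, 3))  # 3-minute extracts that fit in that time
print(extracts_displaced(busy, 2))  # 2-minute extracts that fit in that time
```

Under that assumption, one slow subscription ties up as much backgrounder time as a few hundred small extracts.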

How to implement a server governance process that lets offending publishers feel the pain?

  • Reduce subscriptions.timeout from the default 30 mins to a shorter time, like 2-5 mins.
  • This value applies separately to each view in the workbook, so the total length of time to render all the views in a workbook (the full subscription task) may exceed this timeout value.
  • For example, if subscriptions.timeout is 3 mins and the workbook has 4 views, the workbook may not time out until 3×4 = 12 mins
  • The command is  tsm configuration set -k subscriptions.timeout -v 180  (where 180 seconds is 3 minutes). Run  tsm pending-changes apply  afterwards for the change to take effect.
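
The per-view arithmetic above can be written as a tiny Python helper. The function name is made up for this sketch; the only real setting involved is subscriptions.timeout.

```python
def worst_case_subscription_seconds(timeout_seconds, view_count):
    """Upper bound on a subscription task when the timeout applies per view."""
    return timeout_seconds * view_count

timeout_seconds = 180   # value passed to: tsm configuration set -k subscriptions.timeout -v 180
views_in_workbook = 4

total = worst_case_subscription_seconds(timeout_seconds, views_in_workbook)
print(total / 60)  # worst case in minutes
```

So when you pick a timeout value, keep in mind that a workbook with many views can still hold a backgrounder for several multiples of that value.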

What is the problem with this subscriptions.timeout?

The problem is that Tableau Server sends the following failure notification to subscribers, who have no idea what happened. A lot of subscribers assume that something is wrong with the server, while the workbook owner has no idea about the timeout.

The view snapshot in this email could not be properly rendered.

To see the view online, go to link to the workbook view

After 5 consecutive failures like this, the subscription is skipped (suspended).

What is the solution to streamline the subscriptions.timeout message?

  1. Create a VizAlert to look for subscriptions.timeout errors
  2. Find the workbook owner based on workbook_id
  3. Create a VizAlert message about the subscription timeout policy
  4. Send the VizAlert to both the subscribers and the workbook owner
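
The four steps can be sketched as plain Python to show the logic. This is not VizAlerts configuration: the failed_subscriptions records, the workbook_owners lookup, the field names and the message wording are all hypothetical, and in a real setup the data would come from a Tableau view that VizAlerts reads.

```python
# Hypothetical failed-subscription records; field names are illustrative only.
failed_subscriptions = [
    {"workbook_id": 101, "subscriber_email": "ann@example.com", "error": "timeout"},
    {"workbook_id": 101, "subscriber_email": "bob@example.com", "error": "timeout"},
    {"workbook_id": 202, "subscriber_email": "cat@example.com", "error": "other"},
]

# Hypothetical workbook_id -> workbook name and owner lookup.
workbook_owners = {
    101: {"name": "Slow Sales Workbook", "owner_email": "owner1@example.com"},
    202: {"name": "Ops Workbook", "owner_email": "owner2@example.com"},
}

TIMEOUT_MINUTES = 3

def build_timeout_alerts(failures, owners, timeout_minutes):
    """Return one alert per workbook that has timed-out subscriptions."""
    recipients_by_workbook = {}
    for failure in failures:
        if failure["error"] != "timeout":  # step 1: keep only timeout errors
            continue
        emails = recipients_by_workbook.setdefault(failure["workbook_id"], set())
        emails.add(failure["subscriber_email"])

    alerts = []
    for workbook_id, subscribers in sorted(recipients_by_workbook.items()):
        workbook = owners[workbook_id]  # step 2: owner from workbook_id
        body = (                        # step 3: message about the timeout policy
            f"The subscription for '{workbook['name']}' was stopped because a view "
            f"took longer than {timeout_minutes} minutes to render. "
            "Please contact the workbook owner about improving its performance."
        )
        to = sorted(subscribers | {workbook["owner_email"]})  # step 4: subscribers + owner
        alerts.append({"workbook_id": workbook_id, "to": to, "body": body})
    return alerts

alerts = build_timeout_alerts(failed_subscriptions, workbook_owners, TIMEOUT_MINUTES)
for alert in alerts:
    print(alert["to"])
    print(alert["body"])
```

Sending one message per workbook to the subscribers and the owner together means everyone sees the same explanation instead of a generic failure notice.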

Conclusion:

  1. Governance processes are necessary to scale Tableau to the enterprise
  2. The core concept of governance is to encourage good behaviors and discourage bad behaviors.
  3. Subscribing users to a super slow workbook is bad behavior
  4. One solution to counter this bad behavior is to reduce subscriptions.timeout to a few minutes (for example:  tsm configuration set -k subscriptions.timeout -v 180) and use a VizAlert to send a message to the workbook owner about the subscriptions.timeout.