My previous post (Automation – Set Usage Based Extract Schedule) describes a practical server governance approach that automatically reschedules self-service publishers' extracts based on workbook usage.
This post covers how to handle old workbooks that nobody has used for a while. The keyword is archiving, which many server admins already do. The tips below go beyond the usual approach, which is why I call it advanced archiving.
1. Do not archive but delete
The common IT approach is to copy old workbooks and data sources somewhere else, so that the business owners can download them when needed. This is the old way of doing things, and it creates more support work for the technical team (for example, workbook owners cannot find the archive URL or the workbook). A much better way is to skip archiving and simply delete, then send the deleted workbooks to their owners.
2. Send old workbooks to owners automatically
For each workbook that meets the deletion criteria, call the server API (GET /api/api-version/sites/site-id/workbooks/workbook-id/content) to download it. If the workbook is a .twb, perfect. If it is a .twbx, rename it to .zip, unzip it, ignore the .hyper (or .tde) file, and keep only the .twb. Then email the .twb as an attachment to the workbook owner (and project leaders if needed). Key benefits are as follows:
- Workbook owners can always search their email inbox for the deleted workbook if they need to republish it later.
- The .hyper (or .tde) file is not emailed, because of its size and data security concerns.
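The unzip step above can be sketched in Python as follows. This is a minimal illustration and not the author's actual script: the function name `extract_twb` and the `fallback_name` parameter are hypothetical, and it assumes the downloaded content is already in memory as bytes.

```python
import io
import zipfile
from typing import Tuple


def extract_twb(content: bytes, fallback_name: str = "workbook.twb") -> Tuple[str, bytes]:
    """Return (filename, twb_bytes) from downloaded workbook content.

    Hypothetical helper: a .twbx is a zip archive, so it is opened in memory
    instead of being renamed to .zip on disk. A plain .twb is returned as-is.
    """
    buffer = io.BytesIO(content)
    if not zipfile.is_zipfile(buffer):
        # Not a zip archive, so treat the download as a plain .twb.
        return fallback_name, content
    with zipfile.ZipFile(buffer) as archive:
        for name in archive.namelist():
            # Skip .hyper/.tde extracts and any other packaged files.
            if name.lower().endswith(".twb"):
                return name.rsplit("/", 1)[-1], archive.read(name)
    raise ValueError("no .twb file found inside the packaged workbook")
```

Working in memory avoids leaving extract files on disk, which fits the data security concern mentioned above.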
3. Delete first, then send notifications
A common mistake is for the server admin to send owners a list of workbooks to confirm before archiving, which creates unnecessary clicks on the server. Instead, delete the old workbooks from the server first, then send the notification with the .twb attached and a link to the policy in the email body.
4. Delete more aggressively for larger workbooks
How do you define old? Some use 180 days, but I use 15 to 90 days depending on the size of the workbook:
- Regular workbooks get deleted after 90 days without usage
- Workbooks of 2 GB or more get deleted after 30 days without usage
- Workbooks of 5 GB or more get deleted after 15 days without usage
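The size tiers above can be expressed as a small lookup. The sketch below is only an illustration: the names `RETENTION_RULES`, `unused_days_allowed` and `should_delete` are hypothetical, and it assumes that "2 GB or more" means at least 2 GiB in bytes and that a workbook becomes eligible once its unused days reach the threshold.

```python
GIB = 1024 ** 3

# (minimum size in bytes, days without usage), ordered from largest to smallest.
# The tiers mirror the 15/30/90-day policy; the exact byte cutoffs are an assumption.
RETENTION_RULES = [
    (5 * GIB, 15),
    (2 * GIB, 30),
    (0, 90),
]


def unused_days_allowed(size_bytes: int) -> int:
    """Return how many days a workbook of this size may go unused."""
    for min_size, days in RETENTION_RULES:
        if size_bytes >= min_size:
            return days
    return RETENTION_RULES[-1][1]


def should_delete(size_bytes: int, days_unused: int) -> bool:
    """True when the workbook has been unused for at least its allowed days."""
    return days_unused >= unused_days_allowed(size_bytes)
```

For example, `should_delete(6 * GIB, 20)` is true because a workbook that large is only allowed 15 unused days, while `should_delete(1 * GIB, 60)` is false.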
5. Delete published data sources as well
When you delete workbooks, some published data sources end up with no connected workbooks over time:
- Delete a standalone data source only if it was created more than 2 (or 4) weeks ago, since you do not want to delete recently published data sources
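The data source rule above comes down to two checks: no connected workbooks, and old enough. Here is a minimal sketch, assuming the connected-workbook count and creation time have already been looked up; the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def is_deletable_data_source(
    connected_workbooks: int,
    created_at: datetime,
    now: datetime,
    min_age_weeks: int = 2,
) -> bool:
    """True for a standalone data source created more than min_age_weeks ago."""
    if connected_workbooks > 0:
        # Still used by at least one workbook, so keep it.
        return False
    # Protect recently published data sources that have no workbooks yet.
    return now - created_at > timedelta(weeks=min_age_weeks)
```

Passing `now` in explicitly keeps the check easy to test. Set `min_age_weeks=4` for the more conservative variant.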
6. Technical implementation details
Use the historical_events table to drive the usage calculations. Include the number of days without usage in the email body alongside the policy threshold, so the workbook owner does not have to guess why the workbook was deleted. If you use size criteria as well, include the workbook size in the email body too.
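A notification along these lines could be assembled with Python's standard `email` package. This is a sketch under assumptions, not the author's implementation: the function name, the sender address, the wording of the body and the policy URL are all hypothetical placeholders, and sending the message (for example through `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_deletion_email(
    owner_email: str,
    workbook_name: str,
    days_unused: int,
    policy_days: int,
    size_bytes: int,
    policy_url: str,
    twb_name: str,
    twb_bytes: bytes,
    sender: str = "tableau-admin@example.com",  # hypothetical sender address
) -> EmailMessage:
    """Build the notification with usage, size and policy link in the body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = owner_email
    msg["Subject"] = f"Workbook deleted: {workbook_name}"
    size_gb = size_bytes / 1024 ** 3
    msg.set_content(
        f"Your workbook '{workbook_name}' was deleted from the server.\n"
        f"Not used for {days_unused} days (policy limit: {policy_days} days).\n"
        f"Workbook size: {size_gb:.1f} GB.\n"
        f"Policy: {policy_url}\n"
        "The attached .twb can be republished if you need it again.\n"
    )
    # Attach only the .twb, never the .hyper/.tde extract.
    msg.add_attachment(
        twb_bytes, maintype="application", subtype="xml", filename=twb_name
    )
    return msg
```

Stating both the actual unused days and the policy limit in the body lets the owner see at a glance why the workbook qualified.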
7. Get buy-in from business management for those policies
You want to get buy-in from business leaders for those policies, document the policies, and then always include a link to the policy in the email notification. Getting buy-in is a lot easier than most people think. Why? The business loves the fact that server deletion makes interactors' lives much easier, because active content is easier to find. The higher the level you go to, the easier it is to get buy-in.
8. How to identify workbooks not used for a long time?
One way is to use the following query:
-- last_used is the number of days since the workbook was last accessed
select views_workbook_id, (now()::date - max(last_access_time)::date) as last_used
from _views_stats
group by views_workbook_id
having (now()::date - max(last_access_time)::date) > 90
Download the Tableau Workbook Archiving Recommendation.twb
Updated on June 8, 2019: Please read Automation – Data Source Archiving