Advanced Deployment (9/10): Optimize Backgrounder (Extract and Subscription) Efficiency

Are your Tableau Server backgrounder jobs suffering long delays? You always have a limited number of backgrounders. How do you cut the average extract/subscription delay without adding more? This webinar covers the following five things:

  1. Suspend extract for inactive content
  2. Reduce extract frequency per usage
  3. Dynamically swap VizQL and backgrounder
  4. Incremental or smaller extracts run first
  5. VIP extract priority
  • Download slides here

  • Watch recording here

  1. Suspend extract for inactive content

I used to suspend extracts for inactive content using Python, but thanks to Tableau v2020.3 this is now an out-of-the-box feature. It should be the first thing every server admin turns on. The good thing is that this feature is ON by default, with a 30-day threshold.


2.  Reduce extract frequency per usage

Challenge: There are many unnecessary extract refreshes, because all schedules are available to every publisher, who has complete freedom to choose whatever schedule they want. Workbooks not used for weeks at all will be suspended by the new v2020.3 feature, but what about a workbook on an hourly or daily extract that is only used once a month? Maybe usage was high initially, but it declined over time and the publisher never bothered to reduce the refresh frequency. They have no incentive to do so at all.

Solution: Set a usage-based extract schedule – reschedule the extract frequency based on actual usage.

For example:

  • Hourly refresh changes to daily if the workbook is not used for 3 days
  • Daily changes to weekly if the workbook is not used for 3 weeks
  • Weekly changes to monthly if the workbook is not used for 2 months

A few implementation notes:


  • Here is how it works:
    • Find the last_used (in days) from the repository:
      select views_workbook_id, (now()::date - max(last_view_time)::date) as last_used
      from _views_stats
      group by views_workbook_id;
    • Find the refresh schedule by joining the tasks table with the schedules table
    • Do the calculation and comparison – see the sketch below
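As a sketch of that calculation-and-comparison step, using the example thresholds above (3 days, 3 weeks, 2 months):

def target_frequency(current, days_since_last_use):
    """Return the recommended schedule tier for one workbook."""
    if current == "hourly" and days_since_last_use >= 3:
        return "daily"
    if current == "daily" and days_since_last_use >= 21:
        return "weekly"
    if current == "weekly" and days_since_last_use >= 60:
        return "monthly"
    return current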

How do you change the schedule frequency?

  • Manual approach: change the workbook or data source refresh schedule based on the attached schedule-change recommendation workbooks
  • Automation: there is no API to change schedules, but the change can be made by updating tasks.schedule_id (tasks is the table name, schedule_id is the column name):

UPDATE tasks
SET schedule_id = xxx
WHERE condition;
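For illustration, here is a minimal Python sketch of that update, assuming direct write access to the repository's workgroup database (for example via the internal tblwgadmin account – again, unsupported by Tableau). Host, credentials, task ids and schedule ids are placeholders:

import random
import psycopg2

# Unsupported: writes directly to the Tableau repository ("workgroup" database).
conn = psycopg2.connect(host="tableau-server", port=8060, dbname="workgroup",
                        user="tblwgadmin", password="***")

DAILY_SCHEDULE_IDS = [11, 12, 13]  # placeholder ids of your daily schedules

def demote_to_daily(task_id):
    """Move one extract task to a randomly chosen daily schedule."""
    with conn.cursor() as cur:
        cur.execute("UPDATE tasks SET schedule_id = %s WHERE id = %s",
                    (random.choice(DAILY_SCHEDULE_IDS), task_id))
    conn.commit()

Picking the daily schedule at random implements the load-spreading tip in the notes below.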

A few additional notes:

  • How do you figure out which schedule_id to change to? Say you have 10 daily schedules. When you change a task from hourly to daily, the best way is to randomly choose one of the 10 daily schedules, so that over time jobs do not pile up on one specific schedule.
  • What if a publisher changes back from daily to hourly? Publishers do have the freedom to change their extract schedules at any time – this is the world of self-service. However, they will not beat your automated scripts over time. On the plus side, this feature will help you get buy-in from the business.
  • How much improvement can you expect from this automation? It depends on your situation; I have seen 50%+ delay reductions.
  • Is the automation approach supported by Tableau? NO. You proceed at your own risk, but for me the risk is low and the return is high.

3.  Dynamically swap backgrounders and VizQL

Tableau's backgrounder handles extract refreshes and subscriptions, and VizQL handles viz rendering. Often VizQL has more idle time during the night, while the backgrounder has more idle time during the day. Is it possible to automatically configure more cores as backgrounders during the night and more as VizQL during the day? The dream comes true with Tableau TSM, from v2018.2.

  • How to identify the right time to swap?
    • Get extract delays by hour from the background_jobs table
    • Get VizQL usage by hour from the http_requests table
    • The usage pattern will show the best time to swap
  • Can I have the scripts to swap? Click Backgrounder swap with VizQL scripts.txt.zip to download a working version of the scripts; a simplified sketch also follows this list
  • What happens to in-flight tasks when the backgrounder is gone? The tasks will fail and get restarted automatically from v2019.1
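For illustration, a stripped-down sketch of such a swap (the node name and process counts are placeholders; it assumes an authenticated tsm session, and the downloadable scripts above are the working version):

import subprocess

def tsm(cmd):
    """Run a tsm command; raise if it fails."""
    subprocess.run(f"tsm {cmd}", shell=True, check=True)

def night_mode():
    # Shift cores toward extract processing overnight.
    tsm("topology set-process -n node1 -pr backgrounder -c 4")
    tsm("topology set-process -n node1 -pr vizqlserver -c 1")
    tsm("pending-changes apply")

def day_mode():
    # Shift cores back to viz rendering for business hours.
    tsm("topology set-process -n node1 -pr backgrounder -c 1")
    tsm("topology set-process -n node1 -pr vizqlserver -c 4")
    tsm("pending-changes apply")

Schedule night_mode and day_mode from cron (or Task Scheduler) at the swap times your usage pattern suggests.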

4.  Incremental or smaller extracts run first


  • Make sure to educate your publishers, since this feature is a great incentive for them
  • How do you configure 'incremental goes first'? There is nothing to configure – it is an out-of-the-box Tableau feature
  • How do you configure 'smaller full extracts go first'? (a configuration sketch follows this list)
    • v2019.3: no configuration required
    • v2019.1 & v2019.2: backgrounder.enable_task_run_time_and_job_rank & backgrounder.enable_sort_jobs_by_job_rank
    • v2018.3 or older: backgrounder.sort_jobs_by_run_time_history_observable_hours -v 180 (180 hours recommended, to cover weekly jobs)
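As a sketch, applying those settings via tsm could look like the following (the true/false values are my assumption – verify against your version's documentation before applying):

import subprocess

def tsm_set(key, value):
    """Set one server configuration key via tsm."""
    subprocess.run(f"tsm configuration set -k {key} -v {value}",
                   shell=True, check=True)

# v2019.1 / v2019.2 (boolean values are my assumption):
tsm_set("backgrounder.enable_task_run_time_and_job_rank", "true")
tsm_set("backgrounder.enable_sort_jobs_by_job_rank", "true")

# v2018.3 or older (180 hours covers weekly jobs):
tsm_set("backgrounder.sort_jobs_by_run_time_history_observable_hours", "180")

subprocess.run("tsm pending-changes apply", shell=True, check=True)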

5.  VIP extract priority

Challenge: if you have to give higher priority to some extracts, the problem is that a new revision of the workbook or data source resets the extract priority back to the default of 50.

Solution: no API is available, but you can automate it with a direct repository update:

UPDATE tasks
SET priority = xx
WHERE condition;
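A minimal sketch of that automation, using the same unsupported repository write access as the rescheduling sketch above (task ids and the priority value are placeholders):

import psycopg2

# Unsupported: writes directly to the Tableau repository ("workgroup" database).
conn = psycopg2.connect(host="tableau-server", port=8060, dbname="workgroup",
                        user="tblwgadmin", password="***")

VIP_TASK_IDS = [101, 102]  # placeholder ids of the VIP extract tasks

# Re-apply VIP priority on a schedule, so new workbook/datasource revisions
# that reset priority back to 50 are corrected automatically.
with conn.cursor() as cur:
    cur.execute("UPDATE tasks SET priority = 10 WHERE id = ANY(%s)",
                (VIP_TASK_IDS,))
conn.commit()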

Read more @ https://enterprisetableau.com/extract/

Re-cap: Although Tableau does not give server admins enough control over extract refresh schedule selection for a given workbook or data source, there are still ways to govern your Tableau Server backgrounder jobs:

  • Reducing extract frequency per usage cuts out unnecessary refreshes. This can increase your backgrounder efficiency by 50%.
  • Dynamically swapping VizQL and backgrounder gives you more machine power. This can yield 50% more backgrounder capacity, depending on your usage pattern.
  • 'Incremental or smaller extracts run first' is an out-of-the-box feature. Make sure publishers know about it, as it is an incentive for them to design efficient extracts.
  • VIP extract priority may not help backgrounder efficiency as much as the other three items, but it is one of the things you may have to do per business need.

Governed Self-Service Analytics: Data Governance (8/10)

I was on a panel discussion about self-service analytics with a group of executives at Tableau Conference 2015. Guess what the no. 1 most frequently asked question was – data governance. How do you make sure data does not get out of hand? How do you make sure self-service analytics does not break the organization's existing processes and policies around data protection and data governance?

Data governance is a big topic. Tableau's Data Management Add-on (Data Catalog, Tableau Prep Conductor) is making great progress toward data management. This blog focuses on the following 3 things:

  • Data governance for self-service analytics
  • How to enforce data governance in a self-service environment
  • How to audit a self-service environment

1. Data governance for self-service analytics

First of all, what is data governance?

Data governance is a business discipline that brings together data quality, data management, data policies, business process management, and risk management surrounding the handling of data.

The intent is to put people in charge of fixing and preventing issues with data so that the enterprise can become more efficient.

The value of enterprise data governance is as follows:

  • Visibility & effective decisions: consistent and accurate data visibility enables more accurate and timely business decisions
  • Compliance, security and privacy: enables the business to efficiently and accurately meet growing global compliance requirements

What data should be governed?

Data is any information in any of our systems. It is a valuable corporate asset that indirectly contributes to the organization's performance. Data in a self-service analytics platform (like Tableau) is definitely part of the data governance scope. All of the following data should be governed:

  • Master data: data that is shared commonly across the company in multiple systems, applications and/or processes. Master data should be controlled, cleansed and standardized at one single source. Examples: customer master, product item master. Master data enables information optimization across systems, enables data enrichment and cleansing, and increases reporting accuracy.
  • Reference data: structured data used in an application, system, or process – often common lists set once a fiscal year or updated periodically. Examples: currency codes, country codes, chart of accounts, sales regions, etc.
  • Transactional data: the information recorded from transactions. Examples: user clicks, user registrations, sales transactions, shipments, etc. The majority of enterprise data is transactional. It can be financial, logistical or work-related, involving everything from a purchase order to shipping status to employee hours worked to insurance costs and claims. As part of a transactional record, transactional data is grouped with its associated master data and reference data: it records a time and the relevant reference data needed for the particular transaction.

What are data governance activities?

  • Data ownership and definition: the data owner decides and approves the use of data, such as data sharing/usage requests from other functions. Typically data owners are the executives of the business areas. One data owner is supported by many data stewards, who are the operational points of accountability for data, data relationships and process definitions. The steward represents the executive owners and stakeholders. Data definition is the data steward's responsibility, although many people can contribute to the definitions. In a self-service environment where data is in many analysts' hands, it is a business advantage to leverage those analysts' knowledge and know-how by allowing each self-service analyst to comment on and tag the data, and then finding a way to aggregate those comments/tags. This is again the community concept.
  • Monitoring and corrective actions: this is an ongoing process to define process flows, data flows, quality requirements, business rules, etc. In a self-service environment where more and more self-service developers can change metadata and create calculated fields to transform data, this can be an advantage – and can also become chaos if data sources and processes are not defined within one business group.
  • Data process and policy: this is about exception handling.
  • Data accuracy and consistency: commonly known as data quality. This is where most of the time and effort is spent.
  • Data privacy and protection: there are too many examples where data leakage damages the brand and costs organizations millions. Some fundamental rules have to be defined and enforced for a self-service enterprise to have peace of mind.

2. How to enforce privacy and protection in a self-service environment?

The concept is to apply thought leadership about your most sensitive data before making data available for self-service consumption. To avoid potential chaos and costly mistakes, here is the high-level approach I use (a sketch follows the list):

  • Define the top sensitive datasets for your organization (for example, Personally Identifiable Information (PII) tier classifications).
  • Then use Tableau's Data Lineage feature to find any workbooks using that PII data.
  • Send alerts to workbook owners with the list of workbooks using PII.
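As a sketch of steps 2 and 3, the Metadata API's GraphQL endpoint can surface workbooks whose upstream tables are on your PII list. The server URL, token and PII table names are placeholders, and the GraphQL field names should be verified against your server's schema:

import requests

SERVER = "https://tableau.example.com"  # placeholder
TOKEN = "..."  # X-Tableau-Auth token from a REST API sign-in

QUERY = """
{
  workbooks {
    name
    owner { username }
    upstreamTables { name }
  }
}
"""

resp = requests.post(f"{SERVER}/api/metadata/graphql",
                     json={"query": QUERY},
                     headers={"X-Tableau-Auth": TOKEN})
resp.raise_for_status()

PII_TABLES = {"customer_pii"}  # placeholder tier-1 table names
for wb in resp.json()["data"]["workbooks"]:
    hits = {t["name"] for t in wb["upstreamTables"]} & PII_TABLES
    if hits:
        # Step 3: alert the owner (email delivery omitted here).
        print(f"{wb['name']} (owner {wb['owner']['username']}) uses {sorted(hits)}")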

3. What are the additional considerations for data privacy governance?  

  • No private or sensitive data is allowed on the self-service server – SSNs, federal customer data, credit cards, etc. Most self-service platforms (like Tableau) are designed for ease of use and do not have sophisticated data encryption technologies.
  • Remove sensitive data fields (like addresses and contacts) at the database level before making the data available for self-service consumption. It is really hard to control those data attributes once you open them up to business analytics super users.
  • Use sites as partitions to separate data, users and contents for better data security. For example, finance is a separate site that has finance users only; sales people have no visibility into the finance site.
  • Create a separate server instance for external users if possible. Put the external server instance in the DMZ, where a different level of network security applies as an additional layer of protection.
  • Create a site for each partner/vendor to avoid potential problems. When multiple partners or vendors access your Tableau server, never put two vendors on the same site; create one site per vendor to avoid surprises.

4. How to audit the self-service environment?

You can't enforce everything, and you do not want to – enforcement comes with disadvantages too, like inflexibility. Choose the most critical things to enforce, and leave the rest as best practices for people to follow. Knowing that the self-service analytics community always tests the boundaries, you should have auditing in your toolbox – and, most importantly, let the community know that you have an auditing process.

  • What to audit:
    • All the enforced contents should be in the audit scope, to make sure your enforcement works in the intended way
    • All the policies that your BU or organization agreed upon
    • Any other ad-hoc items as needed
  • Who should review the audit results:
    • The self-service governance body should review the results
    • BU data executive owners are the main audience for audit reports. It is possible that executives gave special approval in advance for self-service analysts to work on datasets they do not normally have access to. When there are too many exceptions, it is an indication of a potential problem.
  • Roles and responsibilities: normally IT provides the audit results, while the business evaluates risks and makes decisions about process changes.
  • How to audit: unfortunately Tableau does not have many server audit features. This is where a lot of creativity comes into play. VizAlerts can be used. Often creating workbooks directly against the Tableau repository database is the only way to audit.

Please read the next blog about content management.

Governed Self-Service Analytics: Performance Management (7/10)

Performance management is everyone's concern when it comes to a shared self-service environment, since nobody wants to be impacted by others. This is especially true when each business unit decides its own publishing criteria and the central IT team does not gate the publishing process.

How do you protect the shared self-service environment? How do you prevent one badly designed query from bringing the server to its knees?

  • First, set server parameters to enforce policy.
  • Second, create daily alerts for any slow dashboards.
  • Third, make performance metrics public to your internal community, so everyone has visibility of the worst-performing dashboards – some well-intended peer pressure.
  • Fourth, hold site admins or business leads accountable for self-service dashboard performance.

You will be in good shape if you do those four things. Let me explain each of them in detail.


  1. Server policy enforcement

The server policy settings are for enforced policies. For anything that can be enforced, it is better to enforce it so everyone can have peace of mind. The enforced parameters should be agreed upon by business and IT, ideally in the governance council. The parameters can always be reviewed and revised when the situation changes.

Some super useful enforced parameters are (please refer to my presentation Zen Master Guide to Optimize Server Performance for details; a configuration sketch follows this list):

  • Set the VizQL session timeout to 3 minutes vs the default 30 minutes
  • Set the Hyper session memory limit to 10 GB vs the default of no limit
  • Set the process memory limit to 60% of system memory, instead of waiting until it reaches ~95% when it is too late
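As a sketch, the first two map to tsm configuration keys roughly as follows – the key names and units are my best mapping, so verify against your version's documentation before applying:

import subprocess

def tsm_set(key, value):
    """Set one server configuration key via tsm."""
    subprocess.run(f"tsm configuration set -k {key} -v {value}",
                   shell=True, check=True)

# VizQL session timeout: 3 minutes instead of the default 30.
tsm_set("vizqlserver.session.expiry.timeout", "3")

# Hyper session memory limit: 10 GB instead of unlimited
# (key name and unit format are my assumption).
tsm_set("hyper.session_memory_limit", "10g")

subprocess.run("tsm pending-changes apply", shell=True, check=True)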
  2. Exception alerts

There are only a few parameters you can control through enforcement; everything else has to be governed by process. Alerts are the most common approach, serving as exception management:

  • Performance alert: create an alert when dashboard render time exceeds the agreed threshold (see the sketch after this list).
  • Extract size alert: create an alert when extract size exceeds the defined threshold (extract timeout can be enforced on the server, but size cannot).
  • Extract failure alert: create an alert for failed extracts. Very often stakeholders will not know an extract failed. It is essential to let owners know their extracts failed so actions can be taken in time.
  • You can create many more alerts: CPU usage, overall storage, memory, etc.
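For the performance alert, here is a sketch of the kind of daily repository query that can drive it. It uses the read-only repository user; the action filter and the 10-second threshold are my assumptions – adjust them to your agreed policy:

import psycopg2

# Read-only repository access (enable the "readonly" user via
# "tsm data-access repository-access enable" first).
conn = psycopg2.connect(host="tableau-server", port=8060, dbname="workgroup",
                        user="readonly", password="***")

SLOW_VIEWS_SQL = """
select currentsheet,
       avg(extract(epoch from (completed_at - created_at))) as avg_seconds
from http_requests
where action = 'bootstrapSession'
  and created_at >= now() - interval '1 day'
group by currentsheet
having avg(extract(epoch from (completed_at - created_at))) > 10
order by avg_seconds desc
"""

with conn.cursor() as cur:
    cur.execute(SLOW_VIEWS_SQL)
    for sheet, seconds in cur.fetchall():
        print(f"Red alert candidate: {sheet} averaged {seconds:.1f}s")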

How do you build the alerts? There are multiple choices. My favorite is VizAlerts for Tableau: https://community.tableau.com/groups/tableau-server-email-alert-testing-feedbac

Who should receive the alerts? It depends. A lot of alerts are for the server admin team only: CPU usage, memory, storage, etc. However, most extract and performance alerts are for the content owners. One best practice for content alerts is to always include site admins and/or project leaders. Why? Workbook owners change jobs, so the original owner may not be responsible for the workbooks anymore. I was talking with a well-known Silicon Valley company recently; they told me that many workbook owners had changed in the last 2 years, and they had a hard time figuring out whom to go after for workbook issues. The site admin should be able to help identify the new owners. If site admins are not close enough to the workbook level in your implementation, choose project leaders instead.

What should the threshold be? There is no universal answer, but nobody wants to wait more than 10 seconds. The rule of thumb is that anything under 5 seconds is good, while anything over 10 seconds is not. I got a question when I presented this at a local Tableau event: what if one specific query used to take 30 minutes, and the team made great progress reducing it to 3 minutes – do we allow this query to be published and run on the server? The answer is: it depends. If the view is critical for the business, it is of course worth waiting 3 minutes for the results to render. Everything has exceptions. However, if the 3-minute query chokes everything else on the server and users click the view often enough to trigger it, you may want to rethink the architecture. Maybe the right answer is to spin off another server for this mission-critical 3-minute application only, so the rest of the users are not impacted.

Yellow and red warnings: it is good practice to create multiple warning levels, like yellow and red, with different thresholds. Yellow alerts are warnings, while red alerts call for action.

You may say: Mark, this all sounds great, but what if people do not take the actions?

This is exactly where some self-service deployments go wrong, and where governance comes into play. In short, you need strong, agreed-upon process enforcement:

  • Some organizations use a charge-back process to motivate good behavior. Charge-back influences people's behavior but cannot enforce anything.
  • The key process enforcement is a penalty system for when red-alert actions are not taken in time.

If the owner does not take corrective actions within the agreed period for a red warning, a meeting should be arranged to discuss the situation. If the site admin refuses to take action, the governance body has to decide on the agreed-upon penalty actions. The penalty can go as far as site suspension. Once a site is suspended, nobody can access any of its contents anymore except server admins. The site owners have to work on the improvement actions and show compliance before the site can be re-activated. The good news is that all the contents are still there when a site is suspended, and it takes a server admin less than 10 seconds to suspend or re-activate a site.
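Suspension itself can even be scripted. A sketch via the REST API's Update Site call (server URL, API version, token and site LUID are placeholders):

import requests

SERVER = "https://tableau.example.com"  # placeholder
VERSION = "3.14"                        # placeholder API version
TOKEN = "..."    # X-Tableau-Auth token from a server-admin sign-in
SITE_ID = "..."  # LUID of the site to suspend

# Suspend the site: contents stay intact, but only server admins can sign in.
resp = requests.put(f"{SERVER}/api/{VERSION}/sites/{SITE_ID}",
                    json={"site": {"state": "Suspended"}},
                    headers={"X-Tableau-Auth": TOKEN,
                             "Accept": "application/json"})
resp.raise_for_status()

# Re-activate later by sending {"site": {"state": "Active"}}.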

I had this policy agreed with the governance body, and I communicated it to as many self-service developers as I could. I never got push-back on it. It is clear to me that the self-service community likes a strong, clearly defined governance process that ensures everyone's success. I once suspended a site for other reasons, but I never had to suspend a site due to performance alerts. Why not? That is thanks to my third trick: worst-performing dashboard visibility.

  3. Make performance metrics public

It takes some effort to make your server dashboard performance metrics public to your whole internal community, but it turns out to be one of the best things a server team can do. It has a few benefits:

  • It serves as a benchmark for the community to understand what is good and good enough, since the metric compares your site's overall performance with the other sites on the server
  • It shows all the long-rendering dashboards, providing peer pressure
  • It shows patterns that help people focus on the problematic areas
  • It creates great opportunities for the community to help each other. This is the most important success factor. It turns out the problematic areas are often teams newly on-boarded to the server, and the community always has many ideas to make dashboards perform a lot better. This is why we never had to suspend any sites: when a lot of red alerts appear and the community is aware of them, it is the whole community that makes things happen, which is awesome.
  4. Hold site admins accountable

I used to manage a Hewlett Packard product assembly line early in my career. Hewlett Packard has some well-known quality control processes. One thing I learned was that each assembler is responsible for his or her own quality. Although there is QA at the end of the line, each workstation has a checklist before passing work to the next station. This simple philosophy applies to today's software development and self-service analytics environments. The site admin is responsible for the performance of the workbooks on the site, and can in turn hold workbook owners accountable for the shared workbooks. Flexibility comes with accountability.

I believe in Theory Y (people have good intent and want to perform better) and I have been practicing Theory Y for years. The whole intent of server dashboard performance management is to give the community and content owners visibility into performance, so owners know where the issues are and can take action.

What I often see is that a well-performing dashboard becomes bad over time due to data changes and many other factors. The alerts catch all of those exceptions, no matter whether your dashboards were released yesterday, last week, last month or last year – this approach is a lot better than a gated release process, which is the common IT practice.

During a recent Run-IT-as-a-business meet-up, the audience was skeptical when I said that IT does not gate any workbook publishing process – it is completely self-service. They started to realize it made sense when I talked about the performance alerts that catch everything. What business likes most about this approach is the freedom to push urgent workbooks to the server even when they do not perform well yet – owners can always come back later to tune them, both for a better user experience and to be good citizens.

Please continue to the next blog, about data governance.

Governed Self-Service Analytics: Multi-Tenancy (5/10)

Tableau's multi-tenancy strategy is called sites. I have heard many people ask whether they should use sites, and when. For large Tableau deployments, people also ask whether to create separate Tableau instances. These are all Tableau architecture questions – the multi-tenancy strategy.

 

How do you approach this? I use the following Goal – Strategy – Tactics framework to guide the decision-making process.

It starts with goals. The self-service analytics system has to meet the following expectations, which are the ultimate goals: fast, easy, cost-effective, data security, self-service, structured and unstructured data.

Now keep those goals in mind while scaling Tableau out from individual teams to a department, and then from a department to the enterprise.

 

How do you maintain self-service, fast and easy, with solid data security and cost effectiveness, while dealing with thousands of users? This is where you need well-defined strategies to avoid chaos.

First of all, each organization has its own culture, operating principles and business environment. Some strategies that work very well in one company may not work for others. You just have to figure out the best approach for your business requirements. Here is some food for thought:

  1. Do you have to maintain only one Tableau instance in your organization? The answer is no. For an SMB the answer may be yes, but I have seen many large organizations run multiple Tableau instances for better data security and better agility. I am not saying that Tableau Server can't scale out or scale up – I have read the Tableau architecture white paper on how many cores one server can scale to. However, there are many other considerations; you just do not want to put every application in one instance.
  2. What are the common use cases where you may want a separate instance? Here are some examples:
    • You have both internal employees and external partners accessing your Tableau server. Tableau allows both on the same instance. However, if you have to create a lot of data security constraints to let external partners in, the same constraints apply to all internal users, which may cause extra complexity. Depending on the constraints, if the fast and easy goals are compromised, you may want a separate instance that completely separates internal users from external users – that way you have complete peace of mind.
    • Network separation. It is getting common for corporations to separate the engineering network from the rest of the corporate network for better IP protection. When this is the case, creating a separate Tableau instance within the engineering network is an easy and simple strategy.
    • Network latency. If your data source is in APAC while your Tableau server is in the US, you will likely have challenges with dashboard performance. You should either sync your database to the US or run a separate Tableau server instance in APAC to achieve your 'fast' goal.
    • Enterprise mission-critical applications. Although Tableau starts as ad-hoc exploration for many users, some Tableau dashboards become mission-critical business applications. If you have any of those, congratulations – you have a good problem to deal with. Once some apps become mission-critical, you have no choice but to tighten up change control and related processes, which unfortunately are killers for self-service and exploration. The best way to resolve this conflict is to spin off a separate instance with more rigor for the mission-critical apps, while leaving the rest of Tableau as fast, easy self-service.

What about Tableau Server licenses? Tableau Server has a seat-based license model and a core-based license model. The seat-based model goes by users, so separating instances should not have much impact on the total number of licenses.

Now let's say you have 8 core-based licenses for existing internal users and plan to add some external users. If you would have to add 8 more cores for the external users anyway, a separate instance has no impact on licenses. What if you only want a handful of external users? Then you have to make a trade-off decision. Alternatively, you can keep your 8 cores for internal users and get a handful of seat-based licenses for the external users only.

How about platform cost and additional maintenance cost when you add a separate instance? VMs and hardware are relatively cheap today. I agree there is some additional work initially to set up a separate instance, but server admin work does not double because you have another instance. On the other hand, when your server is too big, maintenance, upgrades and everything else require a lot more coordination with all the business functions. I have seen large corporations that are happier with multiple instances than with one huge instance.

How about sites? I have a blog about how to use sites. In summary, sites are useful for better data security, easier governance, empowering self-service and distributing administrative work. Here are some cases when sites should not be used:

  • Do not create a new site if the requested site will use the same datasets as an existing site; create a project within the existing site instead, to avoid duplicate extracts (or live connections) running against the same source database. Since 2020.1, projects can lock or unlock any sub-project, so many content segmentation goals can be achieved with projects/sub-projects instead of sites.
  • Do not create a new site if the requested site's end users overlap heavily with an existing site; create a project within the existing site instead, to avoid duplicating user maintenance work.

In summary, as you scale Tableau from department to enterprise, you do not have to put all of your enterprise users on one huge Tableau instance. Keep the goals in mind while deciding the best strategy for your business: easy, fast, simple, self-service, data security, cost effectiveness. The strategies are separate instances and sites.

 

Please read the next blog about the release process.

Tableau Metrics Deep Dive

Tableau 2020.2 released the long-awaited Metrics feature. Metrics make it easy to monitor key performance indicators from the web or mobile, and they are super easy to create. When you use Tableau Mobile to check a Metric, you can adjust the trend-line date range and even compare measures across time frames.

Marc Reid (Zen Master)'s dataviz.blog has a nice summary of how to use Metrics. This blog talks about how Metrics work on Tableau Server, answering the following questions:

  • Who can create Metrics
  • What is the entry point to create Metrics
  • What is the relationship between a Metric and its connected view
  • How Metrics permissions work
  • How Metrics are refreshed
  • What the differences are between Metrics and subscriptions/data-driven alerts
  • How to find out Metrics usage
  • Can Metrics be turned off for a site, project or workbook
  • Is there a Metrics revision history


Who can create Metrics

Only publishers can create Metrics.

Metrics authoring is actually a publishing process, similar to saving Ask Data results to the server or publishing a workbook. The Metrics author must have ALL of the following 5 things to create a Metric:

  1. Site role of Creator or Explorer (can publish)
  2. Publisher permission to a project on the server/site
  3. For v2020.2-2021.2, the Download Full Data permission on the view; from v2021.3 onwards, the Create/Refresh Metrics permission
  4. The workbook has an embedded password
  5. The workbook has no row-level security or user filters

What is the entry point to create Metrics?


A view or custom view is the only entry point for creating a Metric – unlike Ask Data, which uses a data source as its entry point.

What is the relationship between a Metric and its connected view?

A Metric is created from a view. However, as soon as the Metric is created, it becomes a more-or-less 'independent' object with its own owner and its own permissions. The Metric stays on the server even when the connected view is deleted, although it can no longer be refreshed. This is very similar to the relationship between a published data source and a connected workbook.

How do Metrics permissions work?

For those who know me, I have spent hours and hours testing and validating Tableau permissions. Here is how Metrics permissions work:

  • A Metric has its own independent permissions
  • Metric permissions are controlled by the Metric owner and the project leader of the project where the Metric is published
  • The Metric owner decides who can access/overwrite the Metric
  • The Metric owner can grant permission to any Explorer who has no permission at all on the original connected view. This is an important behavior to be aware of, similar to the published data source and connected workbook permission process (like it or not)
  • For example, John's dashboard granted Allan permission. If Allan is a publisher on the server, Allan can create a Metric and grant 1,000 other users access to it without John's approval, or even without John's knowledge.
  • Another important behavior: John, as workbook owner, has a new Connected Metrics tab for visibility into Metrics. However, if Allan does not grant John access to the Metric, John has no idea such a Metric is connected to his view (yes, that is how the permission works, similar to connected workbooks and published data sources)


Can I turn off Metrics for a site, project or workbook?

If there is a data security concern and you want to turn off Metrics, here is how:

  • Metrics can be turned OFF at the site level, although the default is ON – the admin goes to the site settings for the flag
  • There is no feature to turn off Metrics at the project level, but my tip is to set project permissions to uncheck Download Full Data for workbooks and LOCK the project permissions. This way none of the workbooks can be used to create Metrics
  • There is no feature to turn off Metrics at the view or workbook level. However, as long as you do not grant the Download Full Data permission, Metrics can't be created

How are Metrics refreshed?


  • Live connection: refreshed hourly
  • Extract: refreshed after each extract run. This is handled by the Tableau Server backgrounder's new 'Update all metrics in the view' process
  • Server admins can change the live connection refresh interval (metricsservices.checkIntervalInMinutes) – the default value is 60 minutes, as sketched below
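A sketch of changing that interval (the 30-minute value is just an example):

import subprocess

# Refresh live-connection Metrics every 30 minutes instead of the default 60.
subprocess.run(
    "tsm configuration set -k metricsservices.checkIntervalInMinutes -v 30",
    shell=True, check=True)
subprocess.run("tsm pending-changes apply", shell=True, check=True)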

How do Metrics refreshes handle warnings and errors?

  • If a Metric's refresh fails 10 times in a row, Tableau Server sends a notification email to the Metric owner. This count of 10 is configurable
  • If a Metric's refresh fails 175 times in a row, Tableau Server stops refreshing it. The Metric owner has to manually resume the refresh after the problem is solved

How to find out Metrics usage?

  • There is a built-in feature, similar to finding view usage
  • Admins can also find usage details in the admin views

Is there a Metrics revision history?

No, unlike workbooks or data sources. If you need a new version of a published Metric, you can replace the existing one if you are the owner, a project leader, or a user with overwrite permission. The Metric's permissions remain unchanged after it is replaced with a new revision. If you save the new Metric under a different name, it is a completely new Metric. You can create many Metrics from the same view, but you can't combine two views' measures into the same Metric. As a matter of fact, a Metric handles only a single measure.

Re-cap:

  • Metrics is a great new Tableau innovation to track KPIs instantly and on mobile.
  • A Metric is created from an existing view on the server, and refreshes hourly for live connections or whenever the extract refreshes for views using extracts.
  • You can't create Metrics from Desktop.
  • Metrics can create new data access and data governance challenges, because one can create a Metric and grant anyone else access permission without approval from, or the knowledge of, the original workbook/view owner. This is very different from data-driven alerts and subscriptions, which follow the exact view permissions.
  • It is already hard enough today to audit 'who has what access' for a team/project with many workbooks. The Metrics feature makes this problem worse, although Metrics is a great feature.
  • Workbooks have a new Connected Metrics tab, but if the Metric owner did not give the workbook owner access permission to the Metric, the workbook owner will not know the Metric exists, even though it may be shared with many other server users.
  • Do not get me wrong: my intent is not to discourage you from using Metrics. Actually, I strongly encourage everyone to use them. My point is to make sure admins/project leaders/publishers are fully aware of the permission and data security behavior, so you can put the necessary controls in place to avoid potential data security chaos.
  • What are the potential controls? Some ideas from Mark:
    • For very sensitive data, put the workbooks into a separate project, lock the project permissions and do not give the Download Full Data permission to any of the workbooks
    • For one or a few sensitive workbooks, if you do not use the locked-project approach, you can control workbook-level permissions as well – not giving Download Full Data
  • Again, repeating: for very sensitive workbooks on v2020.2-2021.2, I strongly recommend locking the project permissions and not giving the Download Full Data permission, so Metrics can't be created.
  • If your server is v2021.3 or newer, you no longer have the potential Metrics permission cascading issue. Read https://enterprisetableau.com/metrics2/
  • Data Catalog data lineage does not include Metrics yet as of the v2020.2 release, but as far as I know Tableau Dev is working on it.
  • Enjoy cool Metrics!